

METROLOGY AND MEASUREMENT SYSTEMS Index 330930, ISSN 0860-8229

www.metrology.pg.gda.pl



# A HIGH-SPEED FULLY DIGITAL PHASE-SYNCHRONIZER IMPLEMENTED IN A FIELD PROGRAMMABLE GATE ARRAY DEVICE

#### Robert Frankowski, Dariusz Chaberski, Marcin Kowalski, Marek Zieliński

Nicolaus Copernicus University, Faculty of Physics, Astronomy and Informatics, Grudziądzka 5, 87-100 Toruń, Poland (⊠ robef@fizyka.umk.pl, +48 56 611 3343, daras@fizyka.umk.pl, markow@fizyka.umk.pl, marziel@fizyka.umk.pl)

#### Abstract

Most systems used in quantum physics experiments require efficient and simultaneous recording of different multi-photon coincidence detection events. Single-photon gated counting systems are applicable in such experiments. The main sources of error in these systems are the instability of the clock source and its imperfect synchronization with the excitation source. We propose a solution that improves the metrological parameters of such measuring systems: a novel integrated circuit, including a phase synchronizer module, dedicated to registering signals from photon-number-resolving detectors. This paper presents the architecture of a high-resolution (~60 ps) digital phase synchronizer module cooperating with a multi-channel coincidence counter. The main characteristic feature of the presented system is its ability to synchronize quickly (within one clock period) with the measuring process. It is therefore designed to work with various excitation sources over a very wide frequency range. Implementing the phase synchronizer module in an FPGA device reduced the synchronization error from 2.857 ns to 214.8 ps.

Keywords: phase synchronizer, delay line, coincidence counting, quantum information, time interval measurement, time-to-digital converters, field programmable gate arrays.

© 2017 Polish Academy of Sciences. All rights reserved

#### 1. Introduction

The rapid development of quantum technologies has allowed quantum optics experiments to deliver more valuable information about the fundamental nature of light. Experiments in areas such as characterization of the spectral state of a single photon, quantum cryptography and the implementation of quantum teleportation protocols remain relevant and are likely directions for future research [1–4]. Methods of reconstructing single- and multi-photon statistics using an optical fibre-loop detector have been reported [5–6]. For this purpose, indirect single-photon counting techniques with picosecond resolution are practicable. The preferred solution in these experiments is to use a box-car-like module [7], which is in fact a modified single-photon gated counting system with a counting rate limited to one. In this configuration, coincidences of two or more electrical pulses can be obtained easily.

The most popular coincidence measurement systems were built of separate electronic modules, such as analogue time-to-amplitude converters, discriminators and manually or digitally controlled delay gates. Most of them were characterized by slow data-processing electronics and required manual synchronization of signals between their electronic stages [8–9]. Recently, coincidence measurement systems have been constructed in a more integrated form, *i.e.* using high-performance *Field Programmable Gate Array* (FPGA) devices [10–13]. The rapid development of FPGA technology makes it possible to implement fully digital high-resolution programmable delay lines in these structures and guarantees very short rise and fall times of the propagated signals. This is very important for achieving the coincidence window quickly.

Article history: received on Jun. 07, 2016; accepted on Apr. 29, 2017; available online on Sep. 01, 2017; DOI: 10.1515/mms-2017-0037.

In a traditional digital box-car module, the delay between the trigger and the opening and closing gate pulses (delayed by the digital delay line) depends mainly on the internal clock period. If the delay is relatively long (*i.e.* it contains a large number of clock periods), high stability of the internal clock must be guaranteed [14], because the uncertainty in determining that time interval grows with the number of clock periods. For short time intervals, in contrast, the problem of synchronizing the physical process with the measuring system becomes very important [15]. In that situation, the synchronization error (without compensation) is comparable to the internal clock period *T*. It can therefore be described by a uniform probability distribution  $\Theta_{CLK}$  over the range from -T/2 to T/2. If such a large synchronization error is not compensated, the effective width of the time-gate increases. Consequently, the modification of the ideal time-gate function by the  $\Theta_{CLK}$  function cannot be neglected and should be taken into account when the width of the coincidence window is determined. In this paper, the problem of compensating this time-interval error is solved by applying a *fully-digital phase synchronizer* (DPS) module in an FPGA structure.

Well-known synchronization methods employ PLL and DLL control modules [16–17]. Most of them are used in wired and wireless sensor network systems for event ordering and efficient communication scheduling [18–19]. Others can be used for effective compensation of the synchronization error in gated counting systems; this problem has been solved by using a start-able PLL oscillator as the internal clock [20]. Similar solutions use a DLL loop to stabilize the delay line parameters against temperature or power supply voltage variations [21]. The known concept of applying a PLL to CPU-coprocessor synchronization [22] also remains valid in constructing various types of precise TDC systems [23], where it guarantees a high level of synchronization between the internal signals and the appropriate functional blocks of the system. In modern FPGAs this can be achieved *e.g.* by DCM blocks operating in the precise phase-shifting mode [24]. Another class comprises synchronizers that align asynchronous input pulses with the nearest clock edges. Their main purpose is to reduce the likelihood of metastable states in the period counters, which can be achieved by generating synchronized *enable* signals with single- or dual-edge double synchronizers [25].

In contrast to the above solutions, we have designed a purely digital module that synchronizes the internal clock signal with the leading edge of the trigger pulse by means of a digital delay line. In our approach, a high operating frequency is achieved by using a direct time conversion method.

The paper is organized as follows. Section 2 explains how a DPS module can be applied in quantum physics experiments, using as an example the optical equipment for experimental characterization of the statistics of photon pairs. Section 3 presents a simplified block diagram of the DPS module and the essential information about the proposed synchronization technique; it also includes the experimental characterization of the *phase detection* (PDM) and *delay selection* (DSM) modules and the simulation results obtained from the DPS model. Section 4 presents the experimental results of testing the real DPS, which confirm the reduction of the effective gate width, together with their interpretation. Finally, we summarize our work.

#### 2. Box-car-like architecture

The main aim of the research was to analyse the concept of constructing and implementing, in a Virtex4 FPGA programmable structure, a multi-platform DPS module which cooperates with a multi-channel *coincidence counter* (CC). The basic task of the DPS module was fast synchronization of the measurement system with the trigger pulses produced by a pulse laser (RegA 900, Coherent, 165 fs FWHM, 774 nm). A satisfactory level of synchronization (the differences between successive time intervals from the trigger pulse position to the time-gate activation) must stay below 300 ps during the entire measurement process and must be independent of the trigger source repetition rate. An example of a practical application of the DPS module is shown in Fig. 1. This construction was used in a physical experiment in which the parameters of single-photon sources are determined.



Fig. 1. A multichannel coincidence meter as an example of application of the digital phase synchronizer.
DCM – the *digital clock manager* module built in a XC4VFX12 programmable structure; CLK\_REF – the *clock signal synchronized* with the rising edge of trigger pulse; CU – *control unit*; FL – *focusing lens*; XSH – BBO crystal for generation of the second harmonic; IL – *imaging lens*; DM – *dichroic mirrors*;
BG – *blue glass* filter; X – down-conversion crystal; IF – *interference filter*; FC – *fibre coupling* stage, HWP – *half-wave plate*; ND – *neutral density* filter; FLD – *fibre-loop detector*; D1 and D2 – single photon counting detectors.

A typical photon source consists of a nonlinear crystal (BBO –  $\beta$ -barium borate) which is pumped by a pulse laser. The most important element of the optical part of measuring equipment (depicted in Fig. 1) is a *fibre-loop detector* (FLD) [26]. A single photon may be propagated by it in eight distinct ways. The minimal delay in the proposed FLD construction is about 100 ns, but could be shorter. It depends on the laser repetition rate (for instance, the standard frequencies of pulse lasers are within a range of 1 kHz – 100 MHz). Inside the FLD, photons are divided by 50/50 fibre couplers. Thus separated in time, the photons are then registered by two photo-detectors (D1 and D2). The signals received from the photo-detectors are compatible with the Low Voltage CMOS I/O standard supported by Virtex-4 devices. Therefore, they can be directly connected to the CC inputs.

Unlike the well-known box-car constructions [27], the presented CC system stores information only about the occurrence of photons in appropriate time-gates. The basic modules of the system include: a *control unit* (CU), a *phase synchronizer* (DPS), a *time-gate generator* (TGG) with a fast counter (not shown in Fig. 1) and FIFO memory blocks. The internal clock signal (350 MHz) which supplies the DPS module is synthesized by the *digital clock manager* (DCM) blocks built into the XC4VFX12 structure [28]. Its parameters determine the internal structure of the DPS module (*i.e.* the required number of delay cells). At its output, the high-resolution DPS module produces the reference clock signal CLK\_REF. This signal is synchronized with the rising edge of the trigger pulse and provides the time base of the measurement system. Thus, it is responsible for the precise control of the activation time of the time-gates. In this way, the CC system gains the ability to synchronize quickly with the incoming trigger pulses.

Fig. 1 presents a two-channel CC meter containing eight programmable time-gates in each channel. The digital time-gate parameters (*i.e.* the delay and width of the gate) are written to the CU by a PowerPC processor through the processor local bus interface. The PowerPC processor can modify this information at the beginning of the measuring process; the time-gate parameters remain unchanged during an ongoing single measurement cycle. Each measurement cycle is triggered by the signal from the pulse laser. The signal which stores the collected data in FIFO memory is produced on the basis of internal flags (not shown in Fig. 1) delivered by the time-gate generator module, which confirm the closing of a respective group of time-gates. The same signal (delayed in time) erases the information collected from the time-gates and resets the time-gate generator. The next step of data processing is executed by the PowerPC processor [29]. A CC system constructed in this way enables precise collection of information about incoming photons and their coincidences using various types of excitation sources (limited by the construction of the FLD).

### 3. Digital phase synchronizer module

The resolution of gated counting systems is limited mainly by the instability of the reference clock signal and by the lack of synchronization with the trigger pulse. These two factors have a significant impact on the form of the *effective gate function* (EGF). The EGF describes the probability of wrong allocation of the input pulses to the respective time-gates. In the general case, the EGF is the convolution of the Gaussian probability density function (describing the instability of the main clock source) and the ideal gate function. In the CC system, when the measuring system is synchronized with the excitation source, we also face the additional problem of compensating the time interval between the real trigger pulse position and the beginning of the digital time-gate. This synchronization error (described by a random variable) has a uniform probability distribution  $\Theta_{CLK}$  and significantly modifies the time-gate function. In consequence, the effective time-gate width increases and the time resolution of the CC system deteriorates. The main purpose of implementing the phase synchronizer module is to minimize the synchronization error.

### 3.1. Circuit description

A block diagram of the high-resolution DPS is shown in Fig. 2. The DPS architecture consists of two high-resolution *delay lines* (DL) [30–34]. One of them is used to construct the PDM, while the second one is implemented in the DSM. The CLK signal is connected to both modules. The PDM module is used for fast phase measurement of the standard clock CLK signal. Therefore, the total propagation time of the delay elements in the PDM module must be equal to the standard clock period ( $T_{CLK}$ ). The information stored in the PDM module ( $\psi_0, \psi_1, \dots, \psi_{N-1}$ ) is represented by a pseudo-thermometric code. From this code, the *phase decoder* (PD) module detects the first rising edge of the clock signal. On this basis, the number of delay elements used in the DSM module to perform the synchronization process is calculated.



Fig. 2. The Digital Phase Synchronizer module: a block diagram (a); the principle of operation (b).

The synchronization process is performed by delaying the clock signal through an appropriate number of delay elements (built in the CLB blocks). The correct number of delay elements is selected by the *carry chain multiplexers* (MUXCY). For the ideal DPS module, the time characteristics of the DLs in the PDM and DSM modules have the same parameters. Hence, it would be possible to directly assign the relevant time channels in both modules, which simplifies the construction of the PD block. The main task of the PD is to convert the data from the pseudo-thermometric code to a one-of-M code, as follows:  $n\tau_D = (M - n)\tau_W$ . This allows only one of the MUX elements to be activated at a given moment. Unfortunately, in a real situation, the time characteristics of the PDM and DSM blocks are different. These differences are the main source of synchronization errors in the DPS module.
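The decoding step can be illustrated with a minimal Python sketch. It assumes that tap 0 of the PDM holds the most recent clock sample, so that the rising edge sits at the end of the first run of ones in the snapshot. The function names `decode_phase` and `one_of_m`, the bit ordering and the example snapshot are illustrative assumptions, not the authors' hardware description.

```python
def decode_phase(psi):
    """Return n, the number of PDM delay cells covered since the last rising
    clock edge, from a pseudo-thermometric snapshot psi[0..N-1].

    Assumption: psi[0] is the tap closest to the line input, so the rising
    edge is located at the last '1' of the first run of ones.
    """
    for k in range(len(psi) - 1):
        if psi[k] == 1 and psi[k + 1] == 0:
            return k + 1          # n counts cells, hence 1-based
    return None                   # no edge captured in this snapshot


def one_of_m(n, m):
    """Build the one-of-M select word that enables only tap M - n of the DSM."""
    if not 1 <= n <= m:
        raise ValueError("phase number out of range")
    select = [0] * m
    select[m - n] = 1
    return select


# Leading zeros are what make the code "pseudo"-thermometric.
snapshot = [0, 0, 1, 1, 1, 1, 0, 0]
n = decode_phase(snapshot)
print(n, one_of_m(n, len(snapshot)))
```

In this sketch a larger measured phase number n selects a shorter DSM path (M − n segments), which mirrors the relation given above for equal cell delays.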

### 3.2. Delay line implementation

Modern FPGAs provide an array of fast *configurable logic blocks* (CLB) whose component logic resources are characterized by very short propagation times. Each CLB contains four *slices* (SL) which are grouped into two pairs (of two SLs each). Fig. 2 shows only part of two pairs of SLs. One pair is used for the construction of the PDM, while the second pair is used for the DSM. They make it possible to implement high-resolution DLs in FPGA structures. In our case, the DLs used in the DPS are built from logic elements placed inside the SLs (such as *xor* gates or carry chain multiplexers) and fast carry chain interconnections. To decrease the dead time (the minimum time between two measurements) and to allow operation with high-frequency excitation sources, we decided to use a direct coding delay line in the PDM module. The time parameters of both DLs were measured by a statistical test method [35]. The results are shown in Fig. 3.
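As an illustration of how a statistical test yields delay-line characteristics, the sketch below uses the standard code-density idea: hits that are uncorrelated with the clock fall into each delay cell in proportion to its width. Whether the method of [35] follows exactly this procedure is not stated here, so the function `code_density`, its arguments and the four-cell histogram are hypothetical.

```python
def code_density(counts, span_ps):
    """Estimate cell widths, DNL and INL from a code-density histogram.

    counts[k] is the number of random hits registered in delay cell k;
    span_ps is the time span covered by all cells together (assumed known).
    """
    total = sum(counts)
    widths = [span_ps * c / total for c in counts]   # width is proportional to hit count
    lsb = span_ps / len(counts)                      # average cell width
    dnl = [w / lsb - 1.0 for w in widths]            # differential nonlinearity, in LSB
    inl, running = [], 0.0
    for d in dnl:
        running += d
        inl.append(running)                          # cumulative DNL, in LSB
    return widths, dnl, inl


# Hypothetical four-cell histogram, for illustration only.
widths, dnl, inl = code_density([90, 110, 100, 100], span_ps=240.0)
print(widths, dnl, inl)
```

With this definition the INL of the last cell returns to zero, because the cell widths are normalized to the total span.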



Fig. 3. The *integral nonlinearity characteristics* (INL): the phase detection module (a); the delay selection module (b).

In the experiment, a standard clock generator of frequency  $F_D = 100$  MHz was used. This element was an integral part of the ML403 evaluation board [36]. The standard clock signal has been delivered to the DLs' inputs. The clock phase was measured by a DF1650B function generator of frequency 1.832 MHz. In order to obtain the DLs' characteristics, six series of 135 thousand measurements were performed for both lines. From the above results, an average delay of each DL was calculated. In our case, the average delay of PDM module is equal to 60 ps, whereas for the DSM module it is equal to 62 ps. Based on these parameters, the number of delay elements in the PDM module could be determined. The calculated value (in relation to the standard clock period) was 48. Using the average DL values, the INL and DNL characteristics were calculated [37]. The maximal nonlinearity deviation for both cases did not exceed the value of 1 LSB.
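The quoted number of delay elements can be checked with a short calculation. The sketch assumes that the "standard clock period" used for sizing the PDM is the period of the 350 MHz internal clock mentioned in Section 2, and that the count is rounded up so that the line covers a full period; both points are assumptions of this example.

```python
import math

F_CLK_HZ = 350e6        # internal clock frequency quoted in Section 2
TAU_PDM_PS = 60.0       # average PDM cell delay from this subsection

t_clk_ps = 1e12 / F_CLK_HZ                   # clock period in picoseconds
cells = math.ceil(t_clk_ps / TAU_PDM_PS)     # cells needed to span one period
print(f"T_CLK = {t_clk_ps:.0f} ps, cells needed = {cells}")
# prints: T_CLK = 2857 ps, cells needed = 48
```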



Fig. 4. The time deviation of the reference clock signal on an appropriate multiplexer's input.

The clock signal is delivered to the DSM module over the global clock connections dedicated to this purpose. Such signal distribution guarantees small time deviations in comparison with other CLB connections. The time deviations of the CLK distribution to the appropriate inputs of the MUX elements in the DSM were determined by a direct optical method [38]. The optical method makes it possible to measure deviations of the clock distribution (in the DSM module) with a 0.34 ps resolution. The results are presented in Fig. 4. The mean value of the differences between the propagation times to each of the MUX components is equal to 58.9 ps. It should be noted that commercially available instruments do not allow measurements as precise as those possible with the above-mentioned optical delay line.

The obtained data will be used in Subsection 3.3 to identify and interpret the decisive factors which may affect the level of synchronization errors.

### 3.3. DPS simulation

The main sources of errors in CC systems are the internal clock source instability and the imperfect synchronization of the CC system with the excitation source. This subsection therefore presents a short analysis of the CC system (whose resolution is determined by the CLK period), followed by the simulation results of the DPS module.

a) Influence of the clock time parameters on determining the time-gate activation time. The internal clock is synthesized by the built-in DCM blocks. The FX outputs of the DCM's digital frequency synthesizer drive the global clock routing network in the Virtex4 device. Such a distribution minimizes the clock skew due to the differences in distance. To obtain essential information about the level of synchronization error and to prepare a DPS simulation model, a separate measurement of the DCM's clock jitter parameters had to be made. The jitter parameters were measured directly at the XC4VFX12 outputs by a digital oscilloscope, using a *Jitter and Timing Analysis* (JTA) toolkit. The measured results are shown in Fig. 5a. In the CC system, each of the time-gates is activated after a specified time interval  $\Delta t_N$, determined by the internal clock parameters. In our experiment, the measurement was performed for specified numbers *N* of clock cycles, whose values depend mainly on the FLD configuration. The photons propagated separately through the three main fibre-loop paths (shown in Fig. 1) could therefore be measured by three out of the eight available time-gates. For this reason, the selected time-gates were activated after 35, 70 and 140 clock cycles.



Fig. 5. The experimental results of: the reference clock's phase fluctuations (the signal generated by the DCM module) (a); the time-gate fluctuations (obtained by a Tektronix DPO7054 digital oscilloscope) without the phase synchronizer module (b).

The observed DCM clock fluctuations are a consequence of the specific construction of the DLL block. The DLL is an internal part of the DCM block and is used to complete the digital frequency synthesis process. The obtained results indicate a deterministic character of the phase fluctuations. This deviation (about 40 ps) corresponds to the resolution of the DL that was used to build the DCM block.

A simple model of the CC system, used to determine the time-gate activation time, is presented below. The time interval  $\Delta t_N$  responsible for the time-gate activation can be described using a random variable  $\varepsilon$  in the following way:

$$\Delta t_N = NT_{CLK} + \sum_{n=1}^N \varepsilon_n \,, \tag{1}$$

where:  $\varepsilon$  – is a deviation from the average value of reference clock period; N – the number of clock cycles for determination of a specified time interval;  $T_{CLK}$  – a reference clock period. For the standard clock, the phase fluctuations have a normal distribution. Therefore, the probability density function  $\varphi(\varepsilon)$  of the random variable  $\varepsilon$  is described by:

$$\varphi(\varepsilon) = \frac{1}{\sigma_{\varepsilon}\sqrt{2\pi}} \exp\left\{-\frac{(\varepsilon - x_{\varepsilon})^2}{2\sigma_{\varepsilon}^2}\right\},\tag{2}$$

where:  $x_{\varepsilon}$  is an average value of  $\varepsilon$ ,  $\sigma_{\varepsilon}^2$  is variance of the random variable  $\varepsilon$ .

In our case, the phase fluctuations are determined by the DCM construction and depend on its DL parameters. Thus, the probability density function  $\varphi(\varepsilon)$  is approximated by the sum of two Gaussian probability density functions.

Using the above assumption, the PDF of  $\Delta t_N$  variable can be described as:

$$\varphi(\Delta t_N) = \frac{1}{\sigma_{\Delta t_N} \sqrt{2\pi}} \exp\left\{-\frac{\left(\Delta t_N - NT_{CLK}\right)^2}{2\sigma_{\Delta t_N}^2}\right\},\tag{3}$$

where:  $\sigma_{\Delta t_N}^2$  is variance of the random  $\Delta t_N$  variable.
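A Monte Carlo sketch can make Eqs. (1) and (3) concrete. It draws the per-cycle deviation from a two-lobe Gaussian mixture and compares the spread of the accumulated interval with the value expected for independent cycles. The lobe separation (40 ps, taken loosely from the deviation mentioned above), the lobe width (5 ps), the independence of successive cycles and all names are assumptions made for this example, not measured parameters.

```python
import math
import random
import statistics

T_CLK_PS = 2857.0   # period of the 350 MHz internal clock
SEP_PS = 40.0       # assumed separation of the two Gaussian lobes
SIGMA_PS = 5.0      # assumed width of each lobe (hypothetical)


def draw_epsilon(rng):
    """One period deviation drawn from a symmetric two-Gaussian mixture."""
    centre = SEP_PS / 2 if rng.random() < 0.5 else -SEP_PS / 2
    return rng.gauss(centre, SIGMA_PS)


def gate_delay(n_cycles, rng):
    """Eq. (1): N nominal periods plus the accumulated deviations."""
    return n_cycles * T_CLK_PS + sum(draw_epsilon(rng) for _ in range(n_cycles))


def predicted_sigma(n_cycles):
    """Spread of the sum for independent cycles: sqrt(N * var(epsilon))."""
    return math.sqrt(n_cycles * (SIGMA_PS ** 2 + (SEP_PS / 2) ** 2))


rng = random.Random(1)
for n in (35, 70, 140):
    samples = [gate_delay(n, rng) for _ in range(2000)]
    print(n, round(statistics.pstdev(samples), 1), round(predicted_sigma(n), 1))
```

Even though a single deviation is bimodal in this sketch, the sum over tens of cycles is close to Gaussian, which is consistent with the form of Eq. (3).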

Knowing the  $\varphi(\varepsilon)$  function parameters, a model of real clock signal will be prepared in Subsection 3.3c.

b) Synchronization of the CC system with the trigger pulse position.

In practice, the time-gate activation process is governed by two parameters: the number N of standard clock periods and the index P of a quantization step. The required number N of standard clock periods is obtained directly from the fast counter (not shown in Fig. 1) for every time-gate. For simulation purposes, it can be calculated as the integer part of the  $\Delta t_N$  interval divided by the standard clock period. The precision of the time-gate settings depends on the resolution of the DL which has been implemented in the TGG module (not presented in this paper).

According to the previous considerations, there are two main factors that determine the level of time-gate activation error: the standard clock instability (measured in Subsection 3.3a) and the trigger pulse synchronization in relation to the time base of CC system. The synchronization error (described by the  $t_{SYNC}$  random variable) between the CC system and the trigger position depends on the standard clock period and has a major impact on the level of time-gate fluctuations in gated counting systems. Because any value of the time interval in a range of 0 to  $T_{CLK}$  has the same probability (equal to  $1/T_{CLK}$ ), the PDF function of synchronization error may be described by a uniform distribution of probability  $\Theta_{CLK}$ . Thus, the PDF of time-gate activation error can be expressed as:

$$\Gamma(\Delta t_N) = \Theta_{CLK}(t_{SYNC}) \otimes \varphi(\Delta t_N). \tag{4}$$
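The convolution in Eq. (4) has a closed form when the uniform distribution is centred on zero and the Gaussian has zero mean: it is the difference of two normal cumulative distribution functions divided by the width. The sketch below evaluates it with the time axis expressed as the deviation from the nominal value. The centring, the function names and the Gaussian spread of 100 ps are assumptions made for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gate_error_pdf(t, width, sigma):
    """PDF of (uniform on [-width/2, width/2]) + (zero-mean Gaussian, std sigma)."""
    upper = normal_cdf((t + width / 2) / sigma)
    lower = normal_cdf((t - width / 2) / sigma)
    return (upper - lower) / width


T_CLK_PS = 2857.0   # width of the uncompensated synchronization error
SIGMA_PS = 100.0    # hypothetical accumulated clock jitter

for t in (0.0, T_CLK_PS / 2, T_CLK_PS):
    print(t, gate_error_pdf(t, T_CLK_PS, SIGMA_PS))
```

With these numbers the plateau height is close to 1/T_CLK and the density falls to half of that at the edges of the uniform window, so the width of the result is dominated by the clock period rather than by the jitter.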

As stated in Section 1, the problem of compensating the time interval error between the trigger pulse position and the beginning of measuring cycle (*i.e.* the time-gate position) can be effectively solved by a fast and portable DPS module which is constructed from the most popular FPGA logic components. In order to ensure a short conversion time, the time interval quantization process is based on the direct time conversion method. The quantization error being a result of the quantization process in PDM, is proportional to the propagation times of appropriate delay cells. Therefore, we assumed that the distribution function of the quantization error  $\Theta_{KW}$  has a uniform distribution, where the width parameter  $W_{KW}$  is equal to the average quantization step of PDM module. This is described by the following relation:

$$\Theta_{KW}(t) = W_{KW}^{-1}\,p, \quad \text{where} \quad p = (-0.5\,W_{KW} \le t \le 0.5\,W_{KW}). \tag{5}$$

The value of the predicate *p* is equal to one when the relation  $(-0.5 W_{KW} \le t \le 0.5 W_{KW})$  holds; otherwise, it is equal to zero. An additional source of errors introduced by the DPS module during the conversion process is the difference between the INL characteristics of the two delay lines. The resulting errors are described by the  $\Theta_{INL}(t_{INL})$  function, which depends on the differences of the non-linearity errors of both DLs ( $\Delta R_{INL}$ ). Since the random variable  $t_{INL}$  has a discrete nature, we can postulate that the probability distribution function  $\Theta_{INL}(t_{INL})$  takes the following form:

$$\Theta_{INL}(t) = \sum_{n=1}^{M} P(n)\delta(t - \Delta R_{INL}(n)), \qquad (6)$$

where: P(n) is the probability of an INL error with the value  $\Delta R_{INL}$  (assigned to the *n*-th time channel);  $\delta$  is the Dirac delta function. For simplicity, the  $\Theta_{INL}(t_{INL})$  function has been approximated by a uniform distribution over the range between the minimum and maximum values of the  $s = t - \Delta R_{INL}(n)$  parameter. Thus, the  $\Theta_{SYNC}(t_{SYNC})$  function, which describes the properties of the synchronization process in the DPS module, is obtained by convolution of the  $\Theta_{KW}(t_{KW})$  and  $\Theta_{INL}(t_{INL})$  functions. The final form of the PDF of the time-gate activation error may be written as:

$$\Gamma_{SYNC}(\Delta t_N) = \Theta_{SYNC}(t_{SYNC}) \otimes \varphi(\Delta t_N). \tag{7}$$
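When both  $\Theta_{KW}$  and  $\Theta_{INL}$  are taken as uniform, their convolution is a trapezoid. The sketch below evaluates that trapezoid under the assumption that both distributions are centred on zero; the function name `theta_sync` is hypothetical, and the example widths are the fitted values reported later in this subsection.

```python
def theta_sync(t, w_kw, w_inl):
    """Convolution of two zero-centred uniform PDFs of widths w_kw and w_inl."""
    a, b = sorted((w_kw, w_inl))       # a is the narrower width
    x = abs(t)
    if x <= (b - a) / 2:               # flat top of the trapezoid
        return 1.0 / b
    if x < (a + b) / 2:                # linear slopes, each of width a
        return ((a + b) / 2 - x) / (a * b)
    return 0.0                         # outside the total base a + b


for t in (0.0, 50.0, 80.0, 103.5):
    print(t, theta_sync(t, 60.4, 146.6))
```

The base of the trapezoid is the sum of both widths, and the narrower distribution sets the width of the slopes.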

c) The procedure of DPS simulation.

The following steps have been taken to obtain the DPS simulation:

- At the beginning, a model of the real clock signal was prepared. For this purpose the von Neumann method [39] was applied. Using this model, it was possible to produce a stimulus vector that successfully emulates the reference clock signal. The stimulus vector represents the successive (not quantized) phases of the reference clock signal with a specified time period and imposed clock instability. As a result, a wide spectrum of the clock's phase dispersion could be simulated. The results are shown in Fig. 6a. The generated data were then used to simulate the PDM and DSM modules.
- In the next step, the quantization process performed by the PDM module was simulated. For this purpose, the real delay line characteristics (discussed in Subsection 3.2) and the model of the real clock signal were used. In this situation, the widths of the *quantization steps* (QSs) are equal to the propagation times of the appropriate delay cells in the PDM. Therefore, for each QS, we have to determine its beginning  $t_D^B$  and end  $t_D^E$. Based on the randomly generated clock phases  $P_i$  we can calculate the quantized standard clock phase values (phase numbers) as the numbers  $n_i$  of the QSs,  $n \in [1, N]$, into which the generated phases  $P_i$  fall. This can be written in the following way:  $t_{Dn_i}^B \leq P_i < t_{Dn_i}^E$,  $i \in [1, K]$, where K indicates the number of stimulus vector elements. Therefore, a phase number n represents the quantized standard clock phase value. In practice, this measured time interval corresponds to the sum of propagation times of delay elements  $n\tau_D$, where n is the highest position of a flip-flop storing the high state at its output. The simulation result of the PDM quantization process obtained by using the ideal DL characteristic (*i.e.* the propagation time of each delay cell is equal to the average QS of the PDM module) is presented in Fig. 6b.



Fig. 6. The results of simulation: the normalized *probability density function* (PDF) of the reference clock phase fluctuations (a); the PDM module simulation (b).

- The number of delay line segments M-n in the DSM module was chosen on the basis of the phase numbers determined above. In this case, no nonlinearity correcting operations were applied.
- To verify the DSM module, the non-quantized random  $\Delta t_N$  values, calculated by the method described in Subsection 3.3a, were generated. Considering the DL parameters specified in Subsection 3.2, *e.g.* the propagation times of delay elements ( $\tau_W$ ) and the propagation times of MUX elements ( $\tau_{MUX}$ ) in the DSM module, the corresponding time intervals  $\sum_{i=M-n}^{M} \tau_{MUX}^{i} + \tau_{MUX}^{M-n}$  were assigned to the previously generated data.
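The first three steps of the procedure above can be sketched in a few lines of Python. The von Neumann method is implemented here as plain acceptance-rejection sampling, the quantization uses a binary search over cumulative cell edges, and an ideal 48-cell line stands in for the measured characteristics. The two-lobe jitter density, its parameters and all function names are hypothetical and only illustrate the flow.

```python
import bisect
import itertools
import math
import random


def rejection_sample(pdf, lo, hi, pdf_max, rng):
    """Von Neumann acceptance-rejection draw from an (unnormalised) pdf on [lo, hi]."""
    while True:
        x = rng.uniform(lo, hi)
        if rng.uniform(0.0, pdf_max) <= pdf(x):
            return x


def cell_edges(delays):
    """Cumulative edges of the quantization steps: QS n spans [edges[n-1], edges[n])."""
    return [0.0] + list(itertools.accumulate(delays))


def phase_number(phase, edges):
    """Return n in [1, N] such that edges[n-1] <= phase < edges[n]."""
    n = bisect.bisect_right(edges, phase)
    return min(max(n, 1), len(edges) - 1)   # clamp phases outside the line


def jitter_pdf(eps):
    """Hypothetical two-lobe jitter density (lobes at +/-20 ps, sigma 5 ps)."""
    return math.exp(-((eps - 20.0) ** 2) / 50.0) + math.exp(-((eps + 20.0) ** 2) / 50.0)


T_CLK_PS = 2857.0
M = 48
edges = cell_edges([T_CLK_PS / M] * M)      # ideal delay line with equal cells
rng = random.Random(0)
for _ in range(5):
    jitter = rejection_sample(jitter_pdf, -60.0, 60.0, 2.0, rng)
    phase = (rng.uniform(0.0, T_CLK_PS) + jitter) % T_CLK_PS
    n = phase_number(phase, edges)
    print(f"phase = {phase:7.1f} ps -> n = {n:2d}, DSM segments = {M - n}")
```

Replacing the equal cell delays passed to `cell_edges` with measured per-cell values would reproduce the effect of a non-ideal line on the phase numbers.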

This procedure was used to validate the value of the synchronization error between the measuring system and the observed physical process. The result of the computer simulation is presented in Fig. 7. It was obtained by generating a series of 50000 test vectors with appropriate phase numbers. The applied phase noise of the reference clock signal follows the distribution shown in Fig. 6a. The  $\Gamma_{SYNC}^M(\Delta t_N)$  curve was fitted to the collected data (obtained from the DPS model). The fitted uniform distribution parameters are equal to  $W_{KW} = 60.4$  ps and  $W_{INL} = 146.6$  ps, respectively. Hence, we can conclude that the differences of the non-linearity errors of both lines used in the DPS construction have a significant influence on the level of time-gate fluctuations. The minimal deviation from the value achieved by the PDM module was 4.7 ps, whereas the maximum deviation was 159.7 ps. The average value was equal to 78 ps.



Fig. 7. The results of DPS module simulation (with the real DPS characteristics): the probability distribution of time-gate fluctuations (a); the synchronization errors (b).

The distribution parameters of  $\Theta_{KW}$  and  $\Theta_{INL}$  functions have been used to define the synchronization process accuracy. For this purpose, the expanded uncertainty has been estimated for the assumed confidence level of  $\alpha = 0.997$ , based on:

$$\sigma_{SYNC}(t) = W_{KW} + W_{INL} - 2\sqrt{(1-\alpha)W_{KW}W_{INL}}. \tag{8}$$

Based on the simulation results obtained from the DPS module, we can efficiently minimize the level of phase fluctuations of the time-gates from 2.857 ns to 196.7 ps. The same procedure in relation to the measurement data has been applied in Section 4.
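The simulated value can be reproduced from Eq. (8) with the fitted widths. One reading of Eq. (8), which is not spelled out in the text, is that it gives the width of the central interval containing a fraction α of a trapezoidal distribution formed by two zero-centred uniform distributions. The sketch below evaluates the formula and checks that reading by random sampling; the function name and the sampling check are additions made for this example.

```python
import math
import random


def expanded_width(w_kw, w_inl, alpha):
    """Eq. (8): width covering a fraction alpha of the trapezoidal distribution."""
    return w_kw + w_inl - 2.0 * math.sqrt((1.0 - alpha) * w_kw * w_inl)


W_KW, W_INL, ALPHA = 60.4, 146.6, 0.997
width = expanded_width(W_KW, W_INL, ALPHA)
print(f"{width:.1f} ps")   # prints: 196.7 ps

# Sampling check: fraction of summed uniform errors inside +/- width / 2.
rng = random.Random(0)
trials = 200_000
inside = sum(
    abs(rng.uniform(-W_KW / 2, W_KW / 2) + rng.uniform(-W_INL / 2, W_INL / 2)) <= width / 2
    for _ in range(trials)
)
print(inside / trials)     # expected to be close to ALPHA
```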

#### 4. Measurement results

The phase fluctuations of the time-gates, *i.e.* the fluctuations of the time-gate activation time in relation to the triggering signal, were measured using Tektronix DPO7054 and LeCroy LT374 digital oscilloscopes. The tests were performed for two cases: with and without the DPS module. In both cases the gates were opened after the same time interval in relation to the excitation pulse, so the accumulated jitter has the same parameters in both cases.



Fig. 8. The experimental results of time-gate fluctuations (LeCroy LT374). The coincidence meter: using the DPS module (a); without the DPS module (b).

The measurement results are presented in Fig. 8a and Fig. 8b. The convolutions of the Gaussian $\varphi(\Delta t_N)$ with either the $\Theta_{CLK}$ or the $\Theta_{SYNC}$ distribution have been fitted to the measurement results. For this purpose, the *conv2* and *fminsearch* functions from the MATLAB<sup>®</sup> toolkit have been used.
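The model used in such a fit can be sketched for the unsynchronized case only, assuming that $\Theta_{CLK}$ is a uniform distribution of width $T_{CLK}$. Instead of a numerical convolution (as *conv2* would perform), the sketch uses the closed-form convolution of a uniform and a Gaussian density, and a coarse scan over the width stands in for the simplex search of *fminsearch*. The jitter value of 50 ps, the time grid and all function names are hypothetical, and the "measured" histogram is synthetic.

```python
import math


def gauss_uniform_pdf(t, width, sigma):
    """Density of U(-width/2, +width/2) convolved with N(0, sigma**2)."""
    s = sigma * math.sqrt(2.0)
    return (math.erf((t + width / 2.0) / s)
            - math.erf((t - width / 2.0) / s)) / (2.0 * width)


def sum_sq_error(width, sigma, ts, ys):
    """Least-squares cost between the model and histogram densities ys."""
    return sum((gauss_uniform_pdf(t, width, sigma) - y) ** 2
               for t, y in zip(ts, ys))


# Synthetic "histogram": the model itself at T_CLK = 2857 ps and a
# hypothetical Gaussian jitter of 50 ps (not a value from the paper).
T_CLK, SIGMA = 2857.0, 50.0
STEP = 20                                            # bin spacing in ps
ts = [float(t) for t in range(-2000, 2001, STEP)]    # time axis in ps
ys = [gauss_uniform_pdf(t, T_CLK, SIGMA) for t in ts]

area = sum(ys) * STEP                                # should be close to 1

# Coarse scan over the width, standing in for a proper simplex search.
candidates = [T_CLK + d for d in range(-200, 201, 10)]
best_width = min(candidates, key=lambda w: sum_sq_error(w, SIGMA, ts, ys))

print(best_width, round(area, 3))
```

With real data, `ys` would be the normalized oscilloscope histogram, and both the width and the jitter would be left free in the minimization.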

The measurement results show that the DPS module reduces the fluctuations of the time-gate opening from 2.857 ns to 214.8 ps, which is a more than thirteen-fold improvement (2857 ps / 214.8 ps ≈ 13.3). The time-gate was opened 2800 clock cycles after the trigger event. Similar measurements were also carried out for other numbers of clock cycles, and the accompanying changes of the variations were observed.

Without the synchronization module, the width $W$ of the uniform distribution is equal to the period $T_{CLK}$ of the reference clock signal delivered by the DCM module. When the DPS module is used, the fluctuation of the time-gate opening is determined mainly by the INL errors. The additional noise introduced by the synchronization process results from the non-linearity of the real DL characteristics and from the voltage and temperature fluctuations of the delay line. The latter may cause minor differences between the simulation and experimental results.

#### 5. Conclusions

Further development of quantum-enhanced technologies requires robust, cost-effective and easy-to-configure high-speed electronics capable of processing information streams on the fly. We proposed a novel, fully synchronized gated counting system for processing signals from avalanche photodiodes as a solution for registering signals from photon-number-resolving detectors. Employing a high-resolution digital phase synchronizer module makes it possible to synchronize the internal reference clock source with the external physical process under investigation and to suppress dark counts. The synchronization error has been reduced from 2.857 ns to 214.8 ps, which is a more than thirteen-fold improvement. The proposed DPS module can be operated with a wide spectrum (< 350 MHz) of excitation sources without deterioration of its metrological parameters. It is also characterized by an extremely low dead time, equal to one clock cycle. Therefore, the implementation of the DPS module in a gated counting system, together with the use of the Virtex4 XC4VFX12 resources, makes it possible to effectively collect complete information about the total number of photo-counts and the history of registered events. This allows our device to be applied in quantum cryptography and in systems for diagnosing single-photon sources.

#### Acknowledgements

We acknowledge the insightful discussions with K. Banaszek and W. Wasilewski and the opportunity to use the National Laboratory for Atomic, Molecular and Optical Physics.

#### References

- [1] Wasilewski, W., Kolenderski, P., Frankowski, R. (2007). Spectral density matrix of a single photon measured. *Phys. Rev. Lett.*, 99, 123601.
- [2] Gisin, N., Ribordy, G., Tittel, W., Zbinden, H. (2002). Quantum cryptography. *Reviews of Modern Physics*, 74(1), 145–195.
- [3] Wu, L., Chen, Y. (2015). Three-Stage Quantum Cryptography Protocol under Collective-Rotation Noise. *Entropy*, 17(5), 2919–2931.
- [4] Pirandola, S., Eisert, J., Weedbrook, Ch., Furusawa, A., Braunstein, S.L. (2015). Advances in Quantum Teleportation. *Nature Photonics*, 9, 641–652.
- [5] Wasilewski, W., Radzewicz, Cz., Frankowski, R., Banaszek, K. (2008). Statistics of multiphoton events in spontaneous parametric down-conversion. *Phys. Rev. A*, 78, 033831.
- [6] Achilles, D., Silberhorn, Ch., Sliwa, C., Banaszek, K., Walmsley, I.A. (2003). Fiber-assisted detection with photon number resolution. *Optics Letters*, 28(23), 2387–2389.
- [7] Zieliński, M., Karasek, K., Płóciennik, P., Dygdała, R. (1996). Digital Gated Single-Particle Counting Systems, design and applications. *Metrologia i Systemy Pomiarowe*, 3(3–4), 199–211.
- [8] Simms, P.C. (1961). Fast coincidence system based on a transistorized time-to-amplitude converter. *Rev. Sci. Instrum.*, 32(8), 894–898.
- [9] Gaertner, S., Weinfurter, H., Kurtsiefer, C. (2005). Fast and compact multichannel photon coincidence unit for quantum information processing. *Rev. Sci. Instrum.*, 76, 123108.
- [10] Frankowski, R., Wasilewski, W., Kowalski, M., Zieliński, M. (2008). High resolution two channel Box-Car system implemented in single FPGA structure Virtex4 for apply in quantum physics. *Elektronika*, 49(5), 21–23.
- [11] Zhu, F., Hsieh, S.C., Yen, W.W., Chou, H.P. (2011). A digital coincidence measurement system using FPGA techniques. *Nuclear Instruments and Methods in Physics Research Section A*, 652(1), 454–457.
- [12] Antonioli, S., Miari, L., Cuccato, A., Crotti, M., Rech, I., Ghioni, M. (2013). 8-Channel acquisition system for Time-Correlated Single-Photon Counting. *Rev. Sci. Instrum.*, 84, 064705.
- [13] Park, B.K., Kim, Y.S., Kwon, O., Han, S.W., Moon, S. (2015). High-performance reconfigurable coincidence counting unit based on a field programmable gate array. *Applied Optics*, 54(15), 4727–4731.
- [14] Zieliński, M., Kowalski, M., Frankowski, R., Chaberski, D., Grzelak, S., Wydźgowski, L. (2009). Accumulated jitter measurement of standard clock oscillators. *Metrol. Meas. Syst.*, 16(2), 259–266.
- [15] Zieliński, M. (2000). Digital Gated Single-Particle Counting System, The Errors Analysis. *IEEE Trans. on Instrum. and Measurement.*, 49(5), 1069–1076.
- [16] Tonietto, R., Zuffetti, E., Castello, R., Bietti, I. (2006). A 3MHz Bandwidth Low Noise RF All Digital PLL with 12ps Resolution Time to Digital Converter. *Solid-State Circuits Conference*, ESSCIRC, 150–153.
- [17] Santos, D.M., Dow, S.F., Flasck, J.M., Levi, M.E. (1996). A CMOS Delay Locked Loop and Sub-Nanosecond Time-to-Digital Converter Chip. *IEEE Transactions on Nuclear Science*, 43(3), 289–291.
- [18] Derogarian, F., Canas, J., Grade, V.M. (2014). A Time Synchronization Circuit with an Average 4.6 ns One-Hop Skew for Wired Wearable Networks. *17th Euromicro Conference on Digital System Design (DSD 2014)*, 146–153.
- [19] Buevich, M., Rajagopal, N., Rowe, A. (2014). Hardware Assisted Clock Synchronization for Real-Time Sensor Networks. *Real-Time Systems Symposium (RTSS)*, 268–277.
- [20] Chu, D.C. (1978). The triggered phase-locked oscillator. *Hewlett-Packard J.*, 8–9.
- [21] Dudek, P., Szczepański, S., Hatfield, J.H. (2000). A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line. *IEEE Trans. Solid-State Circuits*, 35(2), 240–247.
- [22] Johnson, M.G., Hudson, E.L. (1988). A Variable Delay Line PLL for CPU Coprocessor Synchronization. *IEEE Journal of Solid-State Circuits*, 23(5), 1218–1223.
- [23] Kim, H., Kim, S.Y., Lee, K.Y. (2013). A low power small area cyclic time-to-digital converter in all-digital PLL for DVB-S2 application. *Journal of Semiconductor Technology and Science*, 13(2), 145–151.
- [24] Szplet, R. (2009). Auto-tuned counter synchronization in FPGA-based interpolation time digitisers. *Electronics Letters*, 45(13), 671–672.
- [25] Jansson, J.P., Mäntyniemi, A., Kostamovaara, J. (2009). Synchronization in a Multi-level CMOS Time-to-Digital Converter. *IEEE Transactions on Circuits and Systems*, 56(8), 1622–1634.
- [26] Rehacek, J., Hradil, Z., Haderka, O., Perina, J. Jr., Hamar, M. (2003). Multiple-photon resolving fiber-loop detector. *Physical Review A*, 67(6), 061801.
- [27] Dygdała, R., Fuso, F., Arimondo, E., Zieliński, M. (1995). Modular digital box-car for applications in pulsed laser spectroscopy. *Rev. Sci. Instrum.*, 66(6), 3507–3512.
- [28] Frankowski, R., Kowalski, M., Zieliński, M. (2011). The phase fluctuations of the clock signal generated in the digital frequency synthesis process. *Electrical Review*, 87(9a), 95–100.
- [29] Product specification. (2006). Xilinx Corp. User Guide for EDK: ML40x EDK Processor Reference Design, UG082 (v5.0).
- [30] Zieliński, M. (2009). Review of single-stage time-interval measurement modules implemented in FPGA devices. *Metrol. Meas. Syst.*, 16(4), 641–647.
- [31] Wu, J. (2010). Several key issues on implementing delay line based TDCs using FPGAs. *IEEE Trans. Nucl. Sci.*, 57(3), 1543–1548.
- [32] Szplet, R., Jachna, Z., Kwiatkowski, P., Rożyc, K. (2013). A 2.9 ps equivalent resolution interpolating time counter based on multiple independent coding lines. *Measurement Science and Technology*, 24(3), 35904–15.
- [33] Chaberski, D. (2016). Time-to-digital-converter based on multiple-tapped-delay-line. *Measurement*, 89, 87–96.
- [34] Frankowski, R., Gurski, M., Płóciennik, P. (2016). Optical methods of the delay cells characteristics measurements and their applications. *Optical and Quantum Electronics*, 48(3), 1–19.
- [35] Mota, M., Christiansen, J. (1999). A high-resolution time interpolator based on a delay locked loop and an R-C delay line. *IEEE J. Solid State Circuits*, 34(10), 1360–1366.
- [36] Product specification. (2006). Xilinx Corp. User Guide: ML401/ML402/ML403 Evaluation Platform, UG080 (v2.5).
- [37] Cova, S., Bertolaccini, M. (1970). Differential linearity testing and precision calibration of multichannel time sorters. *Nuclear Instruments and Methods*, 77(2), 269–276.
- [38] Frankowski, R., Chaberski, D., Kowalski, M. (2015). An optical method for the time-to-digital converters characterization. *Proc. IEEE ICTON 2015*, Budapest, Hungary, paper We.P.14, 1–4.
- [39] Forsythe, G.E. (1972). Von Neumann's Comparison Method for Random Sampling from the Normal and Other Distributions. *Mathematics of Computation*, 26(120), 817–826.