# Phase linearity and uniformity in optoelectronic-VLSI receiver arrays

Michael B. Venditti, Joshua D. Schwartz, and David V. Plant Department of Electrical and Computer Engineering, McGill University<sup>\*</sup>

# ABSTRACT

Parallel synchronous digital links require tight control over inter-channel skew in order to obtain maximum performance. In parallel optical interconnects (POIs), inter-channel skew can arise from differences in optical and electrical path lengths as well as latency variations through transmitters and receivers. The receiver pre-amplifier is particularly susceptible to latency variation. In POI applications such as optoelectronic-VLSI, pre-amplifiers can exhibit variable latency and gain resulting from differences in optical power levels across the receiver array due to transmitter, laser, detector, or optical system non-uniformity. This establishes an input DC photocurrent and pre-amplifier operating point that varies across the array, resulting in gain and phase non-uniformity. We consider different receiver pre-amplifier designs and the susceptibility of their gain and phase delay uniformity to operating point variations. Feedback circuitry that stabilizes the pre-amplifier operating point is considered as a potentially robust approach for array-scale POI skew reduction. Optical ring oscillators are used to characterize the phase delay of different pre-amplifier designs by measuring the period of oscillation as a function of optical system power throughput.

Keywords: CMOS, parallel optical interconnects, optoelectronic-VLSI, optical receivers, latency

# **1** INTRODUCTION

Parallel synchronous digital links require tight control over inter-channel skew in order to obtain the highest possible performance. In parallel optical interconnects (POIs), channel-to-channel and intra-channel skew can arise from differences in optical and on-chip electrical path lengths as well as latency variations through transmitters and receivers.

The receiver pre-amplifier, which converts an input photocurrent into a small voltage signal, is susceptible to latency variations induced by changes in its operating point due to varying input DC photocurrents. Such variations arise from differences in incident optical power levels across a receiver array caused by transmitter or optical system non-uniformity. The resulting input DC photocurrent, and hence the pre-amplifier operating point, can vary across the array, producing gain and phase non-uniformity. These problems can be considerable in POI applications involving high optical power or optical systems with significant power throughput non-uniformity.

We consider different receiver pre-amplifier designs and the susceptibility of their gain and phase delay uniformity to operating point variations. Feedback circuitry that stabilizes the pre-amplifier operating point is considered as a potentially robust approach for array-scale POI skew reduction. Chip-to-chip optical ring oscillators are used to characterize the phase delay of different pre-amplifier designs by measuring the period of oscillation as a function of optical system power throughput.

#### 1.1 Characteristics of optoelectronic-VLSI receivers

In large-scale POI applications such as optoelectronic-VLSI (OE-VLSI), optical connections number in the hundreds or thousands<sup>1,2</sup>, and require large N × M arrays of vertical-cavity surface emitting lasers (VCSELs) and photodiodes (PDs) to be integrated with underlying receiver circuitry using some form of heterogeneous integration procedure such as flip-chip bonding<sup>1,3</sup>. The tight pitch of PDs in the array – typically 125  $\mu$ m, and possibly smaller – restricts the area available to implement the receiver, and can limit the circuit complexity that can be implemented. Some dedicated circuit stages and features typically found in receivers for telecommunications applications, such as automatic gain/offset control, clock/data recovery, and equalization may need to be omitted in OE-VLSI applications due to these restrictions.

<sup>\*</sup> {mikev, joshs, plant}@photonics.ece.mcgill.ca; phone 1 514 398-2531; fax 1 514 398-3127; http://www.photonics.ece.mcgill.ca; Dept. of Electrical and Computer Engineering, McGill University, 3480 University Street, Montreal, Quebec, Canada H3A 2A7.

On-chip passive components such as resistors, capacitors, and inductors often used in very-high data rate receiver applications generally must be sacrificed in OE-VLSI due to the same space constraints. As a consequence of these space restrictions, amplifier stages for OE-VLSI receivers must employ a conservative and simple design. In OE-VLSI applications, passive circuit elements are usually implemented using active devices – resistive feedback elements in receiver pre-amplifiers are a typical example<sup>4</sup>.

OE-VLSI systems generally provide optical I/O to digital processing circuitry, which usually processes data synchronously and in parallel, placing constraints on latency and skew for receivers that form logical channels. Additionally, OE-VLSI systems inherently differ from their telecommunications counterparts in terms of optical losses and required receiver sensitivity. In long-haul telecommunications applications, where hundreds or thousands of km of fiber are traversed, receiver sensitivities on the order of -25 dBm are common. In OE-VLSI, losses may be small, depending on the optical system implementation. Additionally, OE-VLSI systems tend to employ optical systems with poor optical system throughput (OSTP) uniformity, with as much as $\pm 35\%$ variation from the mean within localized groups of optical data links<sup>5,6</sup>.

#### 1.2 Skew in optoelectronic-VLSI receivers

There are two main causes of skew in OE-VLSI receivers: load imbalances and incident optical power variations across a group of receivers. In OE-VLSI applications involving large receiver arrays and a modular system architecture<sup>7</sup>, it is common for the performance of the receiver to be limited ultimately by its final driving stage (typically a CMOS inverter/buffer), which must drive the on-chip interconnect between it and the digital processing circuitry, which may be hundreds or even thousands of microns away. The interconnect RC delay, $\tau$, of a long interconnect line with length L, width W, and thickness T is given approximately by<sup>8</sup>:

$$\tau \approx \left(3 \times 10^{-18} s\right) \frac{L^2}{W \cdot T} \tag{1}$$

The delay of a CMOS inverter/buffer will thus depend strongly on the length of the interconnect to be driven, on parasitic capacitances along the line from neighboring interconnect lines and other structures, and on the final capacitive load at the end of the interconnect line. Differences in interconnect lengths or mismatches in surrounding parasitic structures can therefore result in effective load imbalances for the final driving stage of a group of receivers, which lead directly to skew.
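As a rough illustration of how a length mismatch turns into skew, the following minimal sketch evaluates only the line term of equation (1). The function names and the line dimensions are hypothetical, and driver resistance and neighboring parasitics are ignored.

```python
# Sketch of equation (1): tau ~ (3e-18 s) * L^2 / (W * T).
# All dimensions are hypothetical and must share one length unit,
# since L^2 / (W * T) is dimensionless.

def interconnect_rc_delay(length, width, thickness, k=3e-18):
    """Approximate RC delay in seconds of a long on-chip line, per equation (1)."""
    return k * length**2 / (width * thickness)

def load_skew(length_a, length_b, width, thickness):
    """Delay difference in seconds between two lines that differ only in length."""
    return abs(interconnect_rc_delay(length_a, width, thickness)
               - interconnect_rc_delay(length_b, width, thickness))

# Two assumed lines, 1 um wide and 1 um thick, 1000 um and 2000 um long.
tau_short = interconnect_rc_delay(1000.0, 1.0, 1.0)
tau_long = interconnect_rc_delay(2000.0, 1.0, 1.0)
skew = load_skew(1000.0, 2000.0, 1.0, 1.0)
print(f"{tau_short * 1e12:.1f} ps, {tau_long * 1e12:.1f} ps, skew {skew * 1e12:.1f} ps")
```

Because the delay grows with the square of the length, doubling the line length quadruples this term, which is why modest routing differences within a receiver group can matter.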

A typical tradeoff in OE-VLSI receiver design is the choice of the pre-amplifier bias current. If this current is large, the aggregate power dissipation in the array becomes appreciable, which is undesirable. The smaller the bias current, however, the more sensitive the pre-amplifier is to changes in the input DC photocurrent, which generally enters the pre-amplifier circuit and affects its operating point. Further, "passive" elements implemented with active devices, such as the feedback resistance in a trans-impedance amplifier, have properties that depend strongly on the DC current flowing through them. This can change the pre-amplifier gain and phase characteristics, even at midband frequencies. In the presence of optical system throughput variations across a group of receivers, the result is skew, which is the focus of the sections that follow.

# **2** RECEIVER DELAY VARIATIONS

#### 2.1 Gain and phase analysis

Many characteristics of transistor amplifiers, such as midband gain, pole frequency locations, and phase delay or latency, are dependent on their operating point. In optical receiver applications, the input DC photocurrent generally affects the DC biasing of one or more transistors in the pre-amplifier. In long-haul telecommunications applications, the magnitude of the input DC photocurrent is generally on the order of micro-amps, and the nominal biasing currents within the pre-amplifier can simply be made large with respect to this magnitude to ensure that the pre-amplifier operating point is not sensitive to changes in the input DC photocurrent. In OE-VLSI applications, however, input photocurrents can be much larger – on the order of tens or hundreds of micro-amps – and there is a desire to maintain a smaller biasing current in the pre-amplifier from a perspective of overall power dissipation in the array. Thus, the pre-amplifier operating point will be inherently more sensitive to changes in the input DC photocurrent.

Figure 1. Stick diagram schematics of (a) TIA and (b) CGA preamplifiers.

Figure 1 shows generic schematics for a classical trans-impedance amplifier (TIA) and a common-gate amplifier (CGA). Expressions for the midband transresistance gain of these amplifiers are given in (2):

$$\frac{v_{o}}{i_{ph}}\Big|_{TIA} = -\frac{R_{F} - \frac{1}{g_{m1}}}{1 + \frac{1}{g_{m1}(r_{o1} \| R_{D})}}, \qquad \frac{v_{o}}{i_{ph}}\Big|_{CGA} \approx \frac{g_{m1}r_{ol}}{1 + g_{m1}r_{ol}} \cdot R_{D} \tag{2}$$

where $g_{m1}$ and $r_{o1}$ are the transconductance and output resistance of transistor M1, respectively, and $r_{ol}$ is the output resistance of the tail current source for the CGA amplifier. In integrated circuit form, the $R_D$ and $R_F$ resistances would be implemented using active devices, and would introduce additional small-signal parameters into (2). All of the small-signal parameters in (2) are dependent on the operating points of the amplifiers and the magnitude of the input DC photocurrent. In the presence of input DC photocurrent variations across a group of optical receivers, the amplifier operating points will change correspondingly and can affect the small-signal transistor parameters in (2), resulting in a variation in midband transresistance gain across the group of receivers.
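To make the operating-point dependence concrete, the sketch below evaluates the two expressions in (2) for assumed component values. It reads the TIA denominator as $1 + 1/(g_{m1}(r_{o1} \| R_{D}))$, which is the dimensionally consistent form; the function names and all numerical values are hypothetical and are not taken from the receiver designs in this paper.

```python
def parallel(r_a, r_b):
    """Parallel combination of two resistances in ohms."""
    return r_a * r_b / (r_a + r_b)

def tia_gain(r_f, g_m1, r_o1, r_d):
    """Midband transresistance of the TIA expression in (2), in ohms."""
    r_load = parallel(r_o1, r_d)
    return -(r_f - 1.0 / g_m1) / (1.0 + 1.0 / (g_m1 * r_load))

def cga_gain(g_m1, r_ol, r_d):
    """Midband transresistance of the CGA expression in (2), in ohms."""
    return g_m1 * r_ol / (1.0 + g_m1 * r_ol) * r_d

# Assumed values: R_F = 10 kOhm, r_o1 = 50 kOhm, R_D = 5 kOhm, r_ol = 100 kOhm.
# g_m1 is evaluated at two assumed operating points, 1.0 mS and 1.2 mS.
for g_m1 in (1.0e-3, 1.2e-3):
    print(f"g_m1 = {g_m1 * 1e3:.1f} mS: "
          f"TIA {tia_gain(10e3, g_m1, 50e3, 5e3):.0f} ohm, "
          f"CGA {cga_gain(g_m1, 100e3, 5e3):.0f} ohm")
```

In both expressions the gain approaches its ideal value ($-R_F$ for the TIA, $R_D$ for the CGA) only when $g_{m1}$ is large, so any shift in $g_{m1}$ caused by the DC photocurrent shows up directly as a gain change.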

It can be shown that, due to the dominating effect of the photodetector capacitance, both the TIA and CGA preamplifiers have a frequency response with a dominant pole proportional to the ratio of  $g_{m1}$  to the total capacitance at the input node,  $C_{IN}$ , as shown in (3).

$$\omega_{3dB} \propto \frac{g_{m1}}{C_{IN}} \tag{3}$$

In the case of the TIA pre-amplifier, $C_{IN}$ includes the photodetector capacitance, the gate-source and gate-bulk capacitances of M1, the parasitic capacitance of the feedback element, and a portion of the gate-drain capacitance of M1. For the CGA pre-amplifier, $C_{IN}$ includes the photodetector capacitance, and the gate-source and source-bulk capacitances of M1. The exact expression for $\omega_{3dB}$ for each pre-amplifier includes additional small-signal parameters. Thus, input DC photocurrent variations across a group of optical receivers can result in variations in pre-amplifier $\omega_{3dB}$ across the group, in a manner similar to their effect on the midband transimpedance gain described above. Similarly, for receiver operation at a given frequency or data rate, the phase delay through a group of pre-amplifiers can also vary in response to variable input DC photocurrents.

Figure 2. Illustration of optical ring oscillator. Dashed boxes indicate a single board consisting of a receiver, a transmitter, and either an inverter or buffer.
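The link between the pole location in (3) and phase delay can be sketched with a single-pole model. This is a simplifying assumption: the proportionality constant `k` is set to 1, higher-order poles are ignored, and the capacitance, transconductance, and frequency values are hypothetical.

```python
import math

def phase_delay(freq_hz, g_m1, c_in, k=1.0):
    """Phase delay in seconds of a single-pole stage with its pole at k*g_m1/c_in (rad/s)."""
    omega = 2.0 * math.pi * freq_hz
    omega_3db = k * g_m1 / c_in
    return math.atan(omega / omega_3db) / omega

# Assumed input capacitance of 500 fF and a 250 MHz signal component.
c_in = 500e-15
delay_low_gm = phase_delay(250e6, 1.0e-3, c_in)
delay_high_gm = phase_delay(250e6, 1.2e-3, c_in)
print(f"{delay_low_gm * 1e12:.0f} ps at 1.0 mS, {delay_high_gm * 1e12:.0f} ps at 1.2 mS")
```

Under this model, two otherwise identical pre-amplifiers whose $g_{m1}$ values differ because of different DC photocurrents have different phase delays at the same signal frequency, which is the skew mechanism described above.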

#### 2.2 Simulations

To evaluate the sensitivity of receiver phase delays to input DC photocurrent, an approach using simulated optical ring oscillators (OROs) in SPICE was employed, mimicking an experimental setup to be implemented later. Figure 2 illustrates a simulated board-to-board link, with each board comprising an optically and electrically differential receiver (including PDs) whose output drives an optically differential transmitter (including VCSELs). The transmitter outputs of one board were connected to the receiver inputs of the other board, forming a ring. Variable optical attenuation was implemented using voltage-controlled voltage sources, which allowed the throughput non-uniformity of an optical system for an OE-VLSI application to be modeled. The overall ring was made inverting by having the receiver output of one of the boards inverted before driving the transmitter. The resulting ring oscillates with a period of oscillation ( $T_{OSC}$ ) given by the sum of all delays in the ring:

$$T_{OSC} = 2 \cdot (\text{RX delay} + \text{TX delay} + \text{inverter/buffer delay}) \tag{4}$$

As the optical attenuation is varied, T<sub>OSC</sub> changes correspondingly to changes in the delays of the various components in the ring.

The receiver circuits consisted of a pre-amplifier stage followed by four post-amplifier stages and a line driver output stage. Two receiver pre-amplifier circuits – one TIA and one CGA – were designed (design details are given in section 2.3) and characterized in simulation via the ORO. With attenuation ranging from 95% to 90%, corresponding to OSTPs between 5% and 10%, the dependence of $T_{OSC}$ on OSTP for each receiver design is shown in figure 3. This OSTP range approximately corresponds to an optical system with 7.5% nominal throughput with $\pm 35\%$ variations from the mean, and will be used as a reference optical system throughout this paper.
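The correspondence between the simulated OSTP range and the reference optical system can be checked with a few lines of arithmetic. The helper name below is hypothetical; the numbers are those quoted in the text.

```python
def ostp_range(nominal, fractional_variation):
    """Lowest and highest throughput for a given nominal value and +/- variation."""
    return (nominal * (1.0 - fractional_variation),
            nominal * (1.0 + fractional_variation))

# 7.5% nominal throughput with +/-35% variation from the mean.
low, high = ostp_range(0.075, 0.35)
# Attenuation is the complement of throughput.
attenuation_high, attenuation_low = 1.0 - low, 1.0 - high
print(f"OSTP from {low:.3%} to {high:.3%}")
```

The computed limits fall just outside the simulated 5% to 10% window, which is why the text describes the correspondence as approximate.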

The focus will be on changes to  $T_{OSC}$  as a result of the different optical throughputs considered. This is effectively described in (5):

$$\Delta T_{OSC} = 2 \cdot (\Delta \text{RX delay} + \Delta \text{TX delay} + \Delta \text{inverter/buffer delay}) \tag{5}$$

Proc. of SPIE Vol. 4788 61

Figure 3. Simulated dependence of the optical ring oscillator period of oscillation on the optical system throughput for the TIA and CGA receiver pre-amplifier designs.

Intuitively, the effect of optical attenuation on the delay of the transmitter should be minimal, and this is confirmed through simulation, where the transmitter delay was found to change by less than one hundredth of one percent for OSTPs between 2.5% and 10%. Further, because the receiver outputs were rail-to-rail CMOS signals for all OSTP values considered, the delay of the inverter/buffer does not change significantly, either. Thus, (5) simplifies to (6):

$$\Delta T_{OSC} \approx 2 \cdot \Delta RX \text{ delay} \tag{6}$$

The results of figure 3 show that receiver skews of 54.5 ps and 270 ps can exist for the TIA and CGA receiver preamplifier designs, respectively, when the reference optical system is used.
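The way a receiver delay change is read out of the ring can be sketched as follows, using equation (4) as written for the period and equation (6) for the inversion. The function names and the delay values are hypothetical placeholders, not simulation results from this paper.

```python
def oro_period(rx_delay, tx_delay, logic_delay):
    """Oscillation period in seconds, following equation (4) as written."""
    return 2.0 * (rx_delay + tx_delay + logic_delay)

def rx_delay_change(period_a, period_b):
    """Receiver delay change in seconds implied by two periods, per equation (6).

    Assumes the transmitter and inverter/buffer delays are the same in both cases.
    """
    return (period_a - period_b) / 2.0

# Assumed delays: RX 1.0 ns, TX 0.5 ns, inverter/buffer 0.2 ns.
base = oro_period(1.0e-9, 0.5e-9, 0.2e-9)
# Same ring with the receiver assumed 100 ps slower at a lower OSTP.
slow = oro_period(1.1e-9, 0.5e-9, 0.2e-9)
delta_rx = rx_delay_change(slow, base)
print(f"period {base * 1e9:.2f} ns -> {slow * 1e9:.2f} ns, "
      f"receiver delay change {delta_rx * 1e12:.0f} ps")
```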



Figure 4. Simulated receiver stage delays for a receiver (CGA pre-amplifier design) in the ORO as a function of optical system throughput.




Figure 5. Receiver design. (a) Block-diagram overview. Transistor-level schematics of (b) CGA and (c) TIA pre-amplifiers, and (d) post-amplifier and (e) decision stage.


The effect of a variable OSTP on the delay through each receiver stage was considered, with results shown in figure 4 for a receiver with a CGA pre-amplifier. As one would expect, given that it is most directly affected by a variable OSTP, the pre-amplifier exhibits the largest change in delay of all the receiver stages. The changes in delay of the other receiver stages are due mainly to the variations in output voltage levels and signal magnitudes, which affect their operating points.

#### 2.3 Receiver Designs

The receiver designs were targeted for Peregrine Semiconductor's $0.5 \,\mu\text{m}$ Ultra Thin Silicon CMOS (UTSi®) process technology<sup>9</sup>. The general receiver architecture is shown in figure 5 (a). The pre-amplifier is followed by three identical post-amplifier stages for gain, then by a decision stage for logic thresholding, and finally by an output stage to drive the on-chip interconnect and the load represented by the inverter/buffer and transmitter. All circuit stages are fully differential except for the decision and output stages. A basic common-mode feedback (CMFB) circuit is used to provide bias inputs to the fully differential amplifiers and the decision stage, with its inputs taken after the first post-amplifier stage.

Transistor-level schematics for the CGA and TIA pre-amplifiers are shown in figures 5 (b) and (c), respectively. The CGA pre-amplifier design follows that of figure 1 (b), using a diode-connected PMOS load and source-follower stages for voltage level-shifting. The TIA pre-amplifier is a differential version of figure 1 (a), and uses gate-shorted PMOS devices as resistive feedback loads. Transistor-level schematics for the post-amplifier and decision stage circuits are shown in figures 5 (d) and (e), respectively. The post-amplifier is a simple fully differential amplifier with a diode-connected load and a source follower for voltage level shifting. The decision stage is a current-mirror op-amp<sup>10</sup>, and the receiver output stage is a simple CMOS inverter cascade.

# **3** MINIMIZATION OF RECEIVER DELAY VARIATION

#### 3.1 Background

As described in section 2, receiver delay variations can arise from non-uniformity in optical system throughput, which causes changes in the operating points of the various receiver circuit stages. By attempting to stabilize the operating points of each receiver stage, receiver delay variations can be minimized.

One approach used to deal with a variable input DC photocurrent component is AC-coupling<sup>11,12</sup>, which utilizes large on-chip DC-block capacitors. As described in section 1.1, such an approach is not appropriate for OE-VLSI applications due to space constraints. Another approach has made use of optical filtering to protect against the detrimental effects of ambient lighting<sup>13</sup>, effectively creating an optical passband. Such filters, although efficient, are expensive and impractical for implementation in an OE-VLSI environment due to the scale of the optical interconnect.

An integrated-circuit approach to input DC photocurrent rejection (DCPR) has been proposed using feedback for current shunting for an electrically differential receiver with a single-ended optical input<sup>14</sup>. This approach uses feedback to control a current shunting transistor which draws away the input DC photocurrent from the photodetector before it enters the input node of the pre-amplifier. We have chosen to implement a variation of this approach due to its integrated circuit compatibility.



Figure 6. Block-diagram overview of receiver design with feedback-based DCPR circuitry.


#### 3.2 Analysis of feedback-based DCPR solution

Figure 6 shows the block diagram of the modified receiver. Compared to the original receiver from figure 5 (a), an additional CMFB block is added with its inputs taken from the outputs of the pre-amplifier. This CMFB block controls two current-shunting transistors intended to draw the DC component of the input photocurrent away from the photodetectors before it enters the input nodes of the pre-amplifier, enhancing the stability of the pre-amplifier operating point and reducing variations in pre-amplifier latency in response to a variable OSTP.
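The stabilizing action of the shunting loop can be illustrated with a linearized DC model. This is a sketch under stated assumptions rather than a description of the fabricated circuit: the pre-amplifier is represented by a hypothetical DC transresistance `r_t`, the CMFB block and shunt transistor by a hypothetical transconductance `g_fb`, and all values are invented for illustration.

```python
def residual_dc_current(i_dc, r_t, g_fb):
    """DC current in amperes still entering the pre-amplifier with the loop closed.

    Assumed linear loop: the sensed common-mode shift is r_t * i_residual and the
    shunt transistor removes g_fb times that shift, so
    i_residual = i_dc - g_fb * r_t * i_residual.
    """
    return i_dc / (1.0 + g_fb * r_t)

# Hypothetical DC photocurrents at the two ends of an OSTP range (amperes).
i_low, i_high = 50e-6, 100e-6
# Assumed loop parameters; the loop gain g_fb * r_t is 20.
r_t, g_fb = 5e3, 4e-3

spread_open = i_high - i_low
spread_closed = (residual_dc_current(i_high, r_t, g_fb)
                 - residual_dc_current(i_low, r_t, g_fb))
print(f"DC current spread: {spread_open * 1e6:.1f} uA without shunting, "
      f"{spread_closed * 1e6:.2f} uA with shunting")
```

In this model the spread in DC current reaching the pre-amplifier shrinks by a factor of one plus the loop gain, so the operating point moves correspondingly less across the array.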

The ORO simulations described in section 2.2 were repeated, this time using the receiver designs outlined in this section, tabulating the measured $T_{OSC}$ for various OSTP values. Figure 7 presents the results of the ORO simulations both with and without feedback-based DCPR for the (a) TIA and (b) CGA pre-amplifier designs. For both pre-amplifier designs, a reduction in receiver latency variation is exhibited when DC photocurrent rejection is used. For the reference optical system, the receiver skew is reduced from 54.5 ps to 27 ps for the TIA pre-amplifier design and from 270 ps to 47.5 ps for the CGA pre-amplifier design. The use of feedback-based DCPR also tends to reduce the overall receiver latency. For the TIA pre-amplifier design, the average latency over the indicated OSTP range is reduced by 27 ps; for the CGA pre-amplifier design, it is reduced by 356 ps.
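The skew figures quoted above can also be expressed as relative reductions. The short check below uses only the numbers given in the text; the helper name is hypothetical.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Simulated receiver skews in picoseconds, without and with DCPR.
tia = percent_reduction(54.5, 27.0)
cga = percent_reduction(270.0, 47.5)
print(f"TIA: {tia:.1f}%, CGA: {cga:.1f}%")  # prints "TIA: 50.5%, CGA: 82.4%"
```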



Figure 7. Simulated ORO oscillation period dependency on OSTP both with and without feedback-based DCPR circuitry for the (a) TIA and (b) CGA pre-amplifier designs.

# 4 EXPERIMENTAL VERIFICATION

To verify the capabilities of the considered feedback-based DCPR approach experimentally, an integrated circuit was fabricated containing the receiver and transmitter designs considered in sections 2 and 3 for use in an experimental ORO configuration. The electrical path between the receiver and transmitter contained a logical XNOR gate, which was used to invert or buffer the signal in response to an external control signal. With two chips configured for use in the ORO setup, one would have the XNOR gate set to invert, while the other chip would have it set for buffering, establishing the ring as an overall negative logic path.
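The role of the XNOR gate can be sketched with its truth table: holding the control input high passes the signal unchanged, while holding it low inverts it. The function names below are hypothetical, and the sketch assumes the receiver and transmitter themselves are non-inverting.

```python
def xnor(a: bool, b: bool) -> bool:
    """Logical XNOR: true when both inputs are equal."""
    return a == b

def board_logic(signal: bool, control: bool) -> bool:
    """Signal path through one chip: control high buffers, control low inverts."""
    return xnor(signal, control)

def ring_is_inverting(controls) -> bool:
    """A ring oscillates only if it contains an odd number of inverting chips."""
    inversions = sum(1 for control in controls if not control)
    return inversions % 2 == 1

# Two-chip ORO: one chip set to buffer (True), the other set to invert (False).
print(ring_is_inverting([True, False]))
```

With both chips buffering or both inverting, the loop has no net inversion and would latch instead of oscillating, which is why exactly one of the two chips is set to invert.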

The four variations of receiver-transmitter circuits (TIA and CGA, each with and without feedback-based DCPR circuitry) were located in the four corners of the chip. Chips were packaged in a 100-pin pin-grid-array (PGA) package along with a $1 \times 4$ bar of VCSELs and a $1 \times 4$ bar of PDs straddling an appropriate chip corner. To facilitate chip-to-chip alignment, the VCSEL and PD bars were abutted with the sides of the chip, and their bond pads were aligned with the bond pads of the chip, as illustrated in figure 8. This was possible because the substrate of the CMOS chip is made of non-conductive sapphire.

Figure 8. Photograph of packaged chip and VCSEL and PD bars imaged using ORO setup.

For chip imaging and signal extraction from the ORO, two 25 mm collimating lenses and two 50/50 beam splitters were inserted along the chip-to-chip optical path, resulting in a nominal OSTP of approximately 18%. Additional attenuation to reach the OSTP ranges considered in sections 2 and 3 is achieved by inserting neutral density filters between the two beam splitters. The oscillation period is measured by tapping the ORO at one of the beam splitters, spatially filtering one of the four available optical signals, converting it into an electrical signal with an external detector, and then using a digitizing oscilloscope to measure the oscillation frequency. Figure 9 shows a photograph of the nominal experimental ORO optical setup.



Figure 9. Photograph of experimental ORO setup.

The period of oscillation measured using the optical ORO setup will be greater than that determined via simulation due to the additional delay of the chip-to-chip optical time of flight. Thus,  $T_{OSC}$  will correspond to (7):

$$T_{OSC} = 2 \cdot (\text{RX delay} + \text{TX delay} + \text{inverter/buffer delay} + \text{Time of flight}) \tag{7}$$

However, because the time of flight is independent of OSTP, the expression for changes in  $T_{OSC}$  represented by (6) is still valid.
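The cancellation can be written out explicitly. Taking the difference of (7) at two throughput values, here labeled $\text{OSTP}_a$ and $\text{OSTP}_b$ (labels introduced only for this illustration), and assuming as in section 2.2 that the transmitter and inverter/buffer delays are essentially independent of OSTP:

$$\begin{aligned} \Delta T_{OSC} &= T_{OSC}(\text{OSTP}_a) - T_{OSC}(\text{OSTP}_b) \\ &= 2 \cdot (\Delta \text{RX delay} + \Delta \text{TX delay} + \Delta \text{inverter/buffer delay} + \Delta \text{Time of flight}) \\ &\approx 2 \cdot \Delta \text{RX delay}, \qquad \text{since } \Delta \text{Time of flight} = 0 . \end{aligned}$$

The time of flight therefore adds a constant offset to the measured period but does not affect the period differences from which the receiver delay variation is extracted.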

The experimental ORO setup is currently being refined, and chips are being packaged. Experimental results will be presented at the conference.

# 5 CONCLUSION

We have presented feedback-based DC photocurrent rejection circuitry that can be used to improve the operating point stability of optical receiver pre-amplifier circuitry, and reduce variations in receiver latency arising as a result of input DC photocurrent variations. We have also presented a novel simulation and experimental approach to measuring receiver latency using an optical ring oscillator.

Simulation-based optical ring oscillator results indicate that a significant reduction in receiver latency variation (corresponding to reduced skew in receiver arrays) can be achieved when the DC photocurrent rejection circuitry is used. An improvement of 50.5% (from 54.5 ps down to 27 ps) for the TIA pre-amplifier design and 82.4% (from 270 ps down to 47.5 ps) for the CGA pre-amplifier design was achieved. Additionally, an overall reduction in receiver latency can be achieved when the DC photocurrent rejection circuitry is used: 27 ps on average for the TIA pre-amplifier design, and 356 ps on average for the CGA pre-amplifier design.

# ACKNOWLEDGEMENT

This work was supported by graduate and undergraduate scholarships from the Natural Sciences and Engineering Research Council of Canada and by BAE Systems under a contract with the DARPA/Army Research Laboratory VLSI-Photonics program, DAAL01-98-C-0074. Support from the CoOp/Peregrine foundry program as well as packaging assistance from Jean-Philippe Thibodeau is also gratefully acknowledged.

# REFERENCES

- 1. D. V. Plant, M. B. Venditti, E. Laprise, J. Faucher, K. Razavi, M. Châteauneuf, A. G. Kirk, and J. S. Ahearn, "256-channel bi-directional optical interconnect using VCSELs and photodiodes on CMOS," IEEE Journal of Lightwave Technology **19**, pp. 1093-1103, 2001.
- 2. M. B. Venditti, J. Faucher, D. V. Plant, E. Laprise, P.-O. Laprise, and J. S. Ahearn, "Design and Verification of an OE-VLSI Chip with 1080 VCSELs and PDs Heterogeneously Integrated with CMOS," LEOS 2001 Annual Meeting, post-deadline paper PD-1.4.
- 3. D. L. Mathine, "The integration of III-V optoelectronics with silicon circuitry," IEEE Journal of Selected Topics in Quantum Electronics **3**, pp. 952-959, 1997.
- 4. T. K. Woodward and A. V. Krishnamoorthy, "1Gbit/s CMOS photoreceiver with integrated detector operating at 850nm," Electronics Letters **34** (12), pp. 1252-1253, 1998.
- 5. M. Châteauneuf, A. G. Kirk, D. V. Plant, T. Yamamoto, and J. S. Ahearn, "512-channel vertical-cavity surface-emitting laser based free-space optical link," submitted to Applied Optics, Jan. 2002.
- 6. T. Maj, "Interconnection of a 2D vertical-cavity surface-emitting laser array to a receiver array via a fiber image guide," M.Eng. dissertation, McGill University, Montreal, Canada, 1999.
- 7. M. B. Venditti, E. Laprise, and D. V. Plant, "On the design of large transmitter and receiver arrays for OE-VLSI applications," Proceedings of the International Topical Meeting on Optics in Computing 2002, April 2002, pp. 158-160.
- 8. A. Chandrakasan, W. J. Bowhill, and F. Fox, eds., "Design of high-performance microprocessor systems," IEEE Press, Piscataway, NJ, 2001, p. 40.
- 9. http://www.peregrine-semi.com/pdf\_utsi\_utsiformixedsignalics.pdf
- 10. D. A. Johns and K. Martin, "Analog Integrated Circuit Design," John Wiley and Sons, Inc., New York, pp. 273-276.
- 11. M. B. Ritter et al., "Circuit and system challenges in IR wireless communication," 1996 IEEE International Solid-State Circuits Conference, ISSCC96, paper SP 25.1, pp. 398-400.
- 12. T. Ruotsalainen et al., "A current-mode gain-control scheme with constant bandwidth and propagation delay for a transimpedance preamplifier," IEEE J. Solid-State Circuits **34**, no. 2, Feb. 1999, pp. 253-258.
- 13. A. J. C. Moreira et al., "Reducing the effects of artificial light interference in wireless infrared transmission systems," IEE Colloquium on Optical Free Space Communication Links, Feb. 1996, paper 5.
- 14. K. Phang and D. A. Johns, "A CMOS optical preamplifier for wireless infrared communications," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing **46**, no. 7, July 1999, pp. 852-859.
