### THE UNIVERSITY OF TEXAS AT DALLAS

ERIK JONSSON SCHOOL OF ENGINEERING AND COMPUTER SCIENCE



# **ANNUAL REPORT** 2022 – 2023



Semiconductor Research Corporation



## **TXACE MISSION**

The Texas Analog Center of Excellence seeks to create fundamental analog, mixed signal and RF design innovations in integrated circuits and systems that improve energy efficiency, health care, and public safety and security.

## **TXACE THRUSTS**

- Safety, Security and Health Care
- Energy Efficiency
- Fundamental Analog Circuits

## TxACE 2022–2023 ANNUAL REPORT

The Texas Analog Center of Excellence (TxACE), located at the University of Texas at Dallas, is the largest analog research center based in an academic institution. Analog and mixed signal integrated circuits engineering is both a major opportunity and a major challenge. Analog circuits are critical components of the majority of products for the \$550+ billion per year integrated circuits industry, providing sensing, actuation, communication, power management and other functions. Digital integrated circuits such as microprocessors, logic circuits and memories are now integrating analog functions such as input/output circuits, phase locked loops, temperature sensors and power management circuits. It is also common to find microcontrollers with multiple analog-to-digital and digital-to-analog converters. These circuits affect almost all aspects of modern life: safety, security, health care, transportation, energy, entertainment and others.

Creation of advanced analog and mixed signal circuits and systems depends on the availability of engineering talent for analog research and development. TxACE was established to help translate the opportunity into economic benefits by overcoming the challenge and meeting the need through a collaboration of the state of Texas, Texas Instruments, the Semiconductor Research Corporation, the University of Texas System, and The University of Texas at Dallas.

The research tasks are organized into three research thrust areas: Safety, Security and Health Care; Energy Efficiency; and Fundamental Analog. The scope of investigation extends from circuits operating at dc through terahertz, data converters that sample at a few samples/sec to tens of gigasamples/sec, AC-to-DC and DC-to-AC converters working at $\mu$W to watts, energy harvesting circuits, sensors and many more. Significant improvements to existing mixed signal systems and new applications have been made and continue to be anticipated. Students who have been exposed to hands-on innovative research are forming the leading edge of analog talent flow into the industry. Close collaboration with and responsiveness to industry needs provide focus to the educational experience.

### DIRECTOR'S MESSAGE



The Texas Analog Center of Excellence (TxACE) is leading analog research and education. Over the past year, TxACE researchers published 24 journal papers and 59 conference papers and made 10 invited presentations. We also filed 9 patent applications and 2 invention disclosures, and 3 patents were granted. In addition, 37 Ph.D., 8 M.S., and 2 B.S. students completed their degree programs.

Last year, the Center funded 82 research tasks led by 73 principal investigators at 28 institutions, including three international universities in India, Taiwan, and Canada. The Center supported 215 graduate and undergraduate students.

The Center continues to have an impact on industry and our way of life through its research accomplishments, which are too numerous to list in full. A partial list includes: a temperature- and aging-compensated RC oscillator that achieves an inaccuracy of ±1030 ppm over -40°C to 85°C after 500 hours of accelerated aging at 125°C; a ~10% improvement in the efficiency of power delivery from high-voltage buses to scaled-CMOS-compatible voltages (<1V), employing a vertically and heterogeneously integrated architecture that leverages hybrid and switched-capacitor DC-DC converters; STEM images of p-GaN E-mode HEMTs captured with in-situ bias for electrical stress; a hybrid ADC in 28-nm CMOS that combines a VCO-based continuous-time delta-sigma modulator with a noise-shaping successive approximation register quantizer, achieving an 84.2-dB signal-to-noise-and-distortion ratio and an 86.8-dB dynamic range while consuming 1.62 mW at 100 MS/s; and a programmable DNN accelerator in 28-nm CMOS for edge applications that achieves 3.1 µJ per inference with 90.2% accuracy on the CIFAR-10 benchmark.

The TxACE laboratory is continuing to help advance integrated circuit research by making its instruments and expertise available to researchers and our industrial partners all over the world.

I am pleased to share the news that TxACE has been renewed by SRC for another three years with total support of \$19.3 million. I would like to thank UT Dallas, the University of Texas System, TI, and SRC, as well as the many friends of TxACE all over the world, for their generous support. Lastly, I would like to thank the students, principal investigators and staff for their efforts, and I look forward to another year of working with the TxACE team to make our way of life better, safer, healthier and more energy efficient through our research, education and innovation.

Kenneth K. O, Director, TxACE, Texas Instruments Distinguished University Chair Professor, The University of Texas at Dallas

### **BACKGROUND & VISION**

The \$550+ billion per year integrated circuits industry is evolving into an analog/digital mixed signal industry. Analog circuits are providing or supporting critical functions such as sensing, actuation, communication, power management and others. These circuits affect almost all aspects of modern life, including safety, security, health care, transportation, energy, and entertainment. To lead this change, in particular to lead analog and mixed signal technology education, research, commercialization, manufacturing, and job creation, the Texas Analog Center of Excellence was announced by Texas Governor Rick Perry in October 2008 as a collaboration of the Semiconductor Research Corporation, the state of Texas through its Texas Emerging Technology Fund, Texas Instruments Inc., the University of Texas System and the University of Texas at Dallas. The Center seeks to accomplish these objectives by creating fundamental analog, mixed signal and RF design innovations in integrated circuits and systems that improve energy efficiency, health care, and public safety and security, as well as by improving the research and educational infrastructure.



Figure 1. TxACE organization relative to the sponsoring collaboration.

### **CENTER ORGANIZATION**

The Texas Analog Center of Excellence is guided by agreements established with the Center sponsors. Members of the industrial advisory boards identify the research needs and select research tasks in consultation with the Center leadership. Figure 1 diagrams the relationship of TxACE to the members of the sponsoring collaboration.

The internal organization of the Center is structured to flexibly perform the research mission while fully embracing the educational missions of the Universities.

Figure 2 shows the center management structure. The TxACE Director is Professor Kenneth O. The research is arranged into three thrusts that comply with the center mission: Safety, Security and Health Care; Energy Efficiency; and Fundamental Analog Research. The third thrust consists of vital research that cuts across the first two research thrusts. The thrust leaders are Prof. Yiorgos Makris of the University of Texas at Dallas for safety, security and health care, and Prof. Ali Niknejad of the University of California, Berkeley for energy efficiency. The leader for fundamental analog is Prof. Pavan Hanumolu of the University of Illinois, Urbana-Champaign. The thrust leaders, along with Professor Dongsheng Ma of The University of Texas at Dallas, form the executive committee. The committee, along with the director, forms the leadership team that works to improve research productivity by increasing collaboration, better leveraging the diverse capabilities of the principal investigators of the Center, and lowering research barriers. The leadership team also identifies new research opportunities for consideration by the Industrial Advisory Boards.



Figure 2. TxACE organization for management of research

### SAFETY, SECURITY & HEALTH CARE

(Thrust leader: Yiorgos Makris, University of Texas at Dallas)

The efforts in the Safety, Security, and Health Care thrust focus on improving safety by mitigating various reliability threats in analog/RF devices, including manufacturing defects, process variation, ESD and thermal degradation, as well as by developing effective machine learning-based design, verification and self-test solutions for mixed-signal automotive ICs. Particular emphasis has been placed on characterizing circuit aging, predicting failures and increasing the lifetime of nano-scale CMOS circuits. An array of approaches is being developed for in-field detection and localization of both hard and soft analog faults, as well as for low-cost design, test and calibration of RF MIMO systems. Machine learning-assisted Design-for-Test solutions, as well as inductive fault analysis schemes for statistically characterizing the effectiveness of test suites through analog test metrics, are also being developed for analog/RF ICs, including DACs. Additionally, this thrust investigates G-Band CMOS mmWave imagers and sensors for biomedical applications and IR gas sensors, as well as methods for analyzing reliability and monitoring the condition of GaN HEMTs. Lastly, this thrust investigates methods for motor health monitoring, laser systems for creating single-event effects that can be used to study radiation tolerance, and efficient temperature sensors for thermal performance characterization in power ICs.



Figure 3. (Top left) Automatically generated layout of 12-nm LVT odometer (C. Kim, University of Minnesota). (Top right) Setup for error measurements in RF MIMO systems (S. Ozev, Arizona State University). (Middle left) PLL incorporating a 4-GHz VCO, array of cross-coupled NMOS transistors and circuits for on-chip phase noise measurements (K. O, University of Texas at Dallas). (Middle center) TDDB ageing sensor (D. Chen, Iowa State University). (Middle right) Anomaly detection framework for automotive ICs (K. Basu, University of Texas at Dallas). (Bottom left) Axicon lens creating Bessel laser beam mimicking heavy ion event distribution (R. Baumann, University of Texas at Dallas). (Bottom center) Array of 200 SRR-based pixels for 2D biological sample imaging (A. Niknejad, University of California, Berkeley). (Bottom right) Setup for transient reliability and condition monitoring of GaN HEMTs (B. Akin, University of Texas at Dallas).

### **ENERGY EFFICIENCY**

(Thrust leader: Ali Niknejad, UC Berkeley)



Figure 4. The TxACE Energy Efficiency thrust has diverse tasks ranging from advanced power management to digital and analog mixed-signal AI/ML for sensors and IoT systems, spanning a power range all the way from automotive kilowatts to always-on sensor systems consuming microwatts. (Top left) Pipelined machine learning processor architecture for bottleneck layers employed in a 28-nm CMOS prototype achieving 0.83/2.7 µJ/Frame for 86%/90.6% accuracy on CIFAR-10 (B. Murmann, Stanford University). (Top right) The two-stage vertical power delivery architecture for 20V-to-1V conversion, VPD-1, including an HVRM and an iPD + SCVR. The heterogeneous integrated PD system prototype achieves ~10% higher peak system efficiency than the state of the art (H. Le, University of California, San Diego). (Bottom left) Architecture of a test chip that uses the voltage predicted by a machine learning core from the CPU operating conditions to modify the PWM control signal of a buck converter for regulating supply droop (J. Gu, Northwestern University). (Bottom right) Active EMI filtering uses closed-loop control to sink/source a copy of the power converter ripple current. The amplifier is typically a linear amplifier, which incurs high loss if it must handle large ripple currents; here, a switch-mode amplifier is employed instead. A prototype demonstrates more than 40 dB of ripple attenuation at 1/8 the volume of a conventional LC filter (A. Hanson, University of Texas, Austin).

The TxACE Energy Efficiency thrust encompasses cross-cutting research on energy efficiency in electronic systems, spanning from advanced power management to the emerging fields of low-power machine learning/AI for edge computing and applications to IoT sensor nodes. The power management research forms the foundation of the center and tackles important efficiency issues in complex system applications, for example digital multi-core systems that use single-inductor multiple-output (SIMO) DC-DC converters. It addresses modeling, simulation, and optimization of performance (transient response, EMI, security) using non-linear computational control, mixed-signal techniques, digital signal processing, adaptive algorithms and design automation. This thrust investigates non-conventional hybrid architectures and integration strategies for applications in computing, large-ratio conversion from 48V down to 1V and below, and charging applications. Many of the solutions employ mixed-signal techniques, exploiting advanced CMOS digital nodes alongside GaN power devices, and utilize novel scaling-friendly analog architectures to improve the control and expand the flexibility of the overall system.

### FUNDAMENTAL ANALOG CIRCUITS RESEARCH

(Thrust leader: Pavan Hanumolu, U. of Illinois Urbana-Champaign)

The research in this thrust focuses on cross-cutting areas in analog and mixed-signal circuits, which impact all TxACE application areas (Energy Efficiency, Public Safety, Security, and Health Care). The research includes the design of various analog-to-digital converters, communication links, low-power crystal oscillators, on-chip frequency references, I/O circuits, noise reduction techniques, and new amplifier topologies suitable for use in nano-scale CMOS, as well as the development of CAD tools for automatic design, layout generation, and testing of integrated circuits.



Figure 5. (Left) Noise-shaping SAR ADC (M. Flynn, University of Michigan). (Left middle) Aging-compensated RC-based frequency reference (P. Hanumolu, University of Illinois Urbana-Champaign). (Top, right middle) DNN-based exploration engine used in layout-aware analog synthesis (D. Pan, University of Texas, Austin). (Top, right) Multi-carrier DAC-based transmitter (S. Palermo, Texas A&M University). (Bottom, right middle) Placement and routing for an 8-bit split DAC (S. Sapatnekar, University of Minnesota). (Bottom, right) Bi-directional PA-LNA (H. Wang, National Taiwan University).

### **TXACE ANALOG RESEARCH FACILITY**

The centralized group of laboratories of the Texas Analog Center of Excellence dedicated to analog engineering research and training occupies an ~8000-ft<sup>2</sup> area on the 3rd floor of the Engineering and Computer Science North building (Figure 6). The facility includes RF and THz, Integrated System Design, Embedded Signal Processing, and Analog & Mixed Signal laboratories, as well as a CAD/Design laboratory structured to promote collaborative research. The unique instrumentation capability includes network analyses and linearity measurements up to 325 GHz, spectrum analysis up to 120 THz, and cryo-measurements down to 2 K. The Center also added a pulsed multiple harmonic load and source pull measurement setup (up to 60 GHz for the third harmonic) and a 325-GHz antenna measurement setup. The close proximity of researchers in an open layout enables natural interaction and encourages sharing of knowledge and instrumentation among the students and faculty. The TxACE analog research facility is one of the best equipped electronics laboratories. The laboratory is available for use by TxACE researchers and industrial partners all over the world.



Figure 6. TxACE Analog Research Facility

### **RESEARCH PROJECTS AND INVESTIGATORS**

The Texas Analog Center of Excellence (TxACE) is the largest university analog technology center in the world. Table 1 lists the current principal investigators of the 82 tasks from 28 academic institutions funded by TxACE. Four universities (SMU, Texas A&M, UT Austin, UT Dallas) are in the state of Texas, and 24 are outside of Texas, including three outside of the US (Indian Institute of Tech. Kharagpur, National Taiwan University, and University of Toronto) (Figure 7). Of the 73 investigators, 25 are from Texas. During the past year, the Center supported 164 Ph.D., 29 M.S., and 22 B.S. students. In total, 37 Ph.D., 8 M.S., and 2 B.S. degrees were awarded to TxACE students.

### Table 1. Principal Investigators (May 2022 through April 2023)

| Investigator  | Institution                            | Investigator    | Institution                            | Investigator     | Institution      |
|---------------|----------------------------------------|-----------------|----------------------------------------|------------------|------------------|
| B. Akin       | UT Dallas                              | A. Hazra        | Indian Institute of<br>Tech. Kharagpur | B. Murmann       | Stanford         |
| N. Al-Dhahir  | UT Dallas                              | R. Henderson    | UT Dallas                              | F. Najm          | U Toronto        |
| D. Allstot    | Oregon State                           | D. Heo          | Washington State                       | A. Niknejad      | UC Berkeley      |
| A. Babakhani  | UCLA                                   | S. Hoyos        | Texas A&M                              | K. O             | UT Dallas        |
| B. Bakkaloglu | Arizona State                          | C. Huang        | Iowa State                             | S. Ozev          | Arizona State    |
| K. Basu       | UT Dallas                              | T. Huang        | National Taiwan                        | S. Palermo       | Texas A&M        |
| R. Baumann    | UT Dallas                              | Y. Jia          | UT Austin                              | D. Pan           | UT Austin        |
| D. Blaauw     | U Michigan                             | Y. Kaneda       | U Arizona                              | P. Pande         | Washington State |
| A. Chatterjee | Georgia Tech                           | C. Kim          | U Minnesota                            | M. Quevedo-Lopez | UT Dallas        |
| D. Chen       | Iowa State                             | M. Kim          | UT Dallas                              | G. Rincón-Mora   | Georgia Tech     |
| S. Chen       | USC                                    | J. Kulkarni     | UT Austin                              | R. Rohrer        | SMU              |
| Y. Chiu       | UT Dallas                              | H. Le           | UCSD                                   | E. Rosenbaum     | UIUC             |
| P. Dasgupta   | Indian Institute of<br>Tech. Kharagpur | H. Lee          | UT Dallas                              | S. Sapatnekar    | U Minnesota      |
| J. Doppa      | Washington State                       | M. Lee          | UT Dallas                              | V. Sathe         | U Washington     |
| M. Flynn      | U Michigan                             | T. Levi         | USC                                    | M. Seok          | Columbia         |
| J. Friedman   | UT Dallas                              | P. Li           | UCSB                                   | H. Shichijo      | UT Dallas        |
| H. Fu         | Iowa State                             | K. Lin          | National Taiwan                        | J. Stauth        | Dartmouth        |
| I. Galton     | UCSD                                   | J. Liu          | UT Dallas                              | D. Sylvester     | U Michigan       |
| R. Geiger     | Iowa State                             | H. Lu           | UT Dallas                              | Y. Takashima     | U Arizona        |
| S. Gómez-Díaz | UC Davis                               | D. Ma           | UT Dallas                              | M. Torlak        | UT Dallas        |
| J. Gu         | Northwestern                           | N. Maghari      | U Florida                              | G. Trichopoulos  | Arizona State    |
| S. Gupta      | USC                                    | Y. Makris       | UT Dallas                              | H. Wang          | National Taiwan  |
| A. Hanson     | UT Austin                              | P. Mercier      | UCSD                                   | D. Wentzloff     | U Michigan       |
| P. Hanumolu   | UIUC                                   | U. Moon         | Oregon State                           |                  |                  |
| R. Harjani    | U Minnesota                            | S. Mukhopadhyay | Georgia Tech                           |                  |                  |


Figure 7. Member Institutions of Texas Analog Center of Excellence

### SUMMARY OF RESEARCH PROJECTS

The 82 research projects funded through TxACE during 2022-2023 are listed in Table 2 below by the Semiconductor Research Corporation task identification number.

## Table 2: Funded research projects at TxACE by SRC task identification number (FA: Fundamental Analog, EE: Energy Efficiency, SS: Safety, Security and Health Care)

| No. | Task                      | Thrust | Title                                                                                                                   | Task Leader                       | Institution                               |
|----|---------------------------|--------|-------------------------------------------------------------------------------------------------------------------------|-----------------------------------|-------------------------------------------|
| 1  | 2810.019                  | FA     | Design Automation for Coverage Management in<br>Analog and Mixed-Signal SOCs                                            | Dasgupta, Pallab<br>Hazra, Aritra | Indian Institute<br>of Tech.<br>Kharagpur |
| 2  | 2810.028                  | FA     | Robust ATE Multi-Site HW Design to Enable<br>Effective Analog Performance Testing in Analog-<br>Mixed-Signal (AMS) SoCs | Chen, Degang                      | Iowa State                                |
| 3  | 2810.029                  | FA     | 170GHz – 260GHz Wideband PA and LNA Design in<br>Silicon                                                                | Babakhani, Aydin                  | UC Los Angeles                            |
| 4  | 2810.030                  | FA     | Neural Network Recognition & On-Chip Online<br>Learning with STT-MRAM                                                   | Friedman, Joseph                  | UT Dallas                                 |
| 5  | 2810.032<br>&<br>3160.025 | EE     | DRIVR: A Digital, Re-configurable, Unified Clock-<br>Power (UNICAP) Fabric                                              | Sathe, Visvesh                    | Univ. of<br>Washington &<br>Georgia Tech  |
| 6  | 2810.033                  | FA     | Interleaved Noise-Shaping SAR ADCs for High-<br>Speed and High-Resolution                                               | Flynn, Michael                    | Univ. of<br>Michigan                      |
| 7  | 2810.034                  | EE     | Always-on Keyword Spotting based on Analog-<br>Mixed-Signal Computing Hardware                                          | Seok, Mingoo                      | Columbia                                  |
| 8  | 2810.035<br>&<br>3160.026 | EE     | Computationally Controlled Integrated Voltage<br>Regulators                                                             | Sathe, Visvesh                    | Univ. of<br>Washington &<br>Georgia Tech  |
| 9  | 2810.036                  | FA     | Highly Stable Integrated Frequency References                                                                           | Hanumolu, Pavan                   | UIUC                                      |
| 10 | 2810.037                  | FA     | High-performance Ringamp-based ADCs                                                                                     | Moon, Un-Ku                       | Oregon State                              |
| 11 | 2810.038                  | SS     | Extreme Temperature Digital, Analog, and Mixed-<br>Signal Circuits (ET-DAMS)                                            | Kim, Chris                        | Univ. of<br>Minnesota                     |
| 12 | 2810.039 | EE     | Development of Compact and Low Cost Fully<br>Integrated DC-DC Converter with Resonant Gate<br>Drive and Intelligent Transient Response                     | Gu, Jie               | Northwestern          |
| 13 | 2810.040 | EE     | Hybrid/Resonant SC Converters with Integrated LC<br>Resonator for High-Density Monolithic Power<br>Delivery                                                | Stauth, Jason         | Dartmouth<br>College  |
| 14 | 2810.041 | SS     | ESD Protection for IO Operating at 56 Gb/s and<br>Beyond                                                                                                   | Rosenbaum, Elyse      | UIUC                  |
| 15 | 2810.042 | EE     | Digitally Enhanced High Efficiency, Fast Settling<br>Augmented DCDC Converters                                                                             | Bakkaloglu, Bertan    | Arizona State         |
| 16 | 2810.043 | FA     | Analog Optimization Hybridizing Designer's Intent<br>and Machine Learning                                                                                  | Li, Peng              | UC Santa<br>Barbara   |
| 17 | 2810.044 | FA     | Hierarchical Characterization and Calibration of RF/Analog Circuits Using Lightweight Built-in Sensors                                                     | Ozev, Sule            | Arizona State         |
| 18 | 2810.046 | SS     | Generating Current Constraints for<br>Electromigration Safety                                                                                              | Najm, Farid           | Univ. of Toronto      |
| 19 | 2810.047 | SS     | Architecture and DfT methods for improving life<br>time reliability and functional safety of electronic<br>circuits and systems out of application context | Chen, Degang          | Iowa State            |
| 20 | 2810.048 | SS     | Characterization and Mitigation of<br>Electromigration Effects in Advanced Technology<br>Nodes                                                             | Kim, Chris            | Univ. of<br>Minnesota |
| 21 | 2810.049 | EE     | 1-W Battery-Charging CMOS Buck Regulator                                                                                                                   | Rincón-Mora, Gabriel  | Georgia Tech          |
| 22 | 2810.050 | SS     | Integrating Metasurfaces and MEMS for Gas<br>Sensing                                                                                                       | Gómez-Díaz, Sebastian | UC Davis              |
| 23 | 2810.052 | FA     | TI PLM as Hologram Generator for HUD and AR                                                                                                                | Kaneda, Yushi         | Univ. of Arizona      |
| 24 | 2810.053 | FA     | TI PLM to Advanced Lidar and Display Systems                                                                                                               | Takashima, Yuzuru     | Univ. of Arizona      |
| 25 | 2810.054 | SS     | Reconfigurable AC Power Cycling Setup and Plug-<br>in Condition Monitoring Tools for High Power IGBT<br>and SiC Modules                                    | Akin, Bilal           | UT Dallas             |
| 26 | 2810.055 | SS/EE  | EMI-Regulated Secure Automotive Power ICs                                                                                                                         | Ma, D. Brian                                         | UT Dallas            |
| 27 | 2810.056 | FA     | Millimeter Wave Packaging Research - Antenna in<br>Package                                                                                                        | Henderson, Rashaunda<br>Lee, Mark<br>Lu, Hongbing    | UT Dallas            |
| 28 | 2810.057 | SS     | Reliability Study of E-mode GaN HEMT Devices by<br>AC TDDB and High Resolution TEM                                                                                | Kim, Moon<br>Shichijo, Hisashi                       | UT Dallas            |
| 29 | 2810.058 | SS/FA  | Machine Learning-Based Overkill/Underkill<br>Reduction in Analog/RF IC Testing                                                                                    | Makris, Yiorgos                                      | UT Dallas            |
| 30 | 2810.059 | SS/EE  | Ultra-Low-Power Robust SAR ADC for PMCW<br>Automotive RADAR                                                                                                       | Chiu, Yun                                            | UT Dallas            |
| 31 | 2810.060 | FA     | Intelligent, Learning ADCs for the Post Figure-of-<br>Merit World                                                                                                 | Flynn, Michael                                       | Univ. of<br>Michigan |
| 32 | 2810.061 | EE     | Two-Stage Vertical Power Delivery and<br>Management for Efficient High-Performance<br>Computing                                                                   | Le, Hanh-Phuc<br>Mercier, Patrick                    | UC San Diego         |
| 33 | 2810.062 | FA     | Multi-Carrier DAC-Based Transmitter<br>Architectures for 100+Gb/s Serial Links                                                                                    | Palermo, Samuel<br>Hoyos, Sebastian                  | Texas A&M            |
| 34 | 2810.063 | FA     | Analog and Digital Assist Techniques to Improve<br>Mixed-Signal Performance                                                                                       | Sylvester, Dennis<br>Blaauw, David                   | Univ. of<br>Michigan |
| 35 | 2810.064 | SS     | Characterization and Tolerance of Ageing in<br>Integrated Voltage Regulators                                                                                      | Mukhopadhyay, Saibal                                 | Georgia Tech         |
| 36 | 2810.065 | EE/SS  | Power-Efficient and Reliable 48-V DC-DC<br>Converter with Direct Signal-to-Feature Extraction<br>and DNN-Assisted Multi-Input Multiple-Output<br>Feedback Control | Seok, Mingoo                                         | Columbia             |
| 37 | 2810.066 | SS     | Demonstrably Generalizable Compact Models of<br>ESD Devices                                                                                                       | Rosenbaum, Elyse                                     | UIUC                 |
| 38 | 2810.067 | EE     | Highly Efficient Extreme-Conversion-Ratio Buck<br>Hybrid Converters                                                                                               | Pande, Partha<br>Heo, Deukhyoun<br>Doppa, Janardhan  | Washington<br>State  |
| 39 | 2810.068 | EE     | Active EMI Filtering with Switch-Mode Amplifier<br>for High Efficiency                                                                                            | Hanson, Alex                                         | UT Austin            |

|    | Task                      | Thrust | Title                                                                                                                                                  | Task Leader                           | Institution                 |
|----|---------------------------|--------|--------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------|-----------------------------|
| 40 | 2810.070                  | SS     | Early and Late Life Failure Prediction Methods for<br>Analog and Mixed-Signal Circuits                                                                 | Kim, Chris                            | Univ. of<br>Minnesota       |
| 41 | 2810.071                  | FA     | Accurate Compact Temperature Sensors for<br>Thermal Management of High Performance<br>Computing Platforms                                              | Geiger, Randall<br>Chen, Degang       | Iowa State                  |
| 42 | 2810.072<br>&<br>2810.073 | EE     | AI/ML Edge Hardware for Ultra-reliable Wireless<br>Networks                                                                                            | Allstot, David<br>Makris, Yiorgos     | Oregon State<br>& UT Dallas |
| 43 | 2810.074                  | SS     | Thermal Performance Characterization and<br>Degradation Monitoring of LDMOS based<br>Integrated Power IC with On-Die Temperature<br>Sensors            | Akin, Bilal                           | UT Dallas                   |
| 44 | 2810.075                  | EE     | Hybrid Step-Down DC-DC Converters with Large<br>Conversion Ratios for 48V Automotive<br>Applications                                                   | Lee, Hoi<br>Liu, Jin                  | UT Dallas                   |
| 45 | 2810.076                  | FA     | High Precision Positioning Techniques Based on<br>Multiple Technologies and Frequency Bands                                                            | Al-Dhahir, N<br>Torlak, Mu            | UT Dallas                   |
| 46 | 2810.077                  | SS     | Increasing Lifetime of Nano-Scale CMOS Circuits                                                                                                        | O, Kenneth                            | UT Dallas                   |
| 47 | 2810.078                  | EE     | Programmable Mixed-Signal Accelerator for DNNs with Depthwise Separable Convolution Layers                                                             | Murmann, Boris                        | Stanford                    |
| 48 | 2810.079                  | EE     | High-Power-Density In-Package SIMO Converters<br>for Next-Generation Microprocessors                                                                   | Huang, Cheng                          | Iowa State                  |
| 49 | 2810.080                  | EE     | Efficient and High-Density Fully In-Package GaN-<br>Based High-Ratio DC-DC Converters                                                                  | Huang, Cheng<br>Fu, Houqiang          | Iowa State                  |
| 50 | 2810.081                  | FA     | Development of 70-95 GHz Terabit Beamformer                                                                                                            | Wang, H<br>Huang, Tian<br>Lin, Kun-   | National Taiwan<br>Univ.    |
| 51 | 2810.082                  | FA     | Adaptive Digital Cancellation of Dynamic Error<br>from Clock Skew, Component Mismatches, and ISI<br>in High-Resolution RF DACs                         | Galton, Ian                           | UC San Diego                |
| 52 | 2810.083                  | FA     | Automated Layout of Analog Arrays in Advanced<br>Technology Nodes                                                                                      | Sapatnekar, Sachin<br>Harjani, Ramesh | Univ. of<br>Minnesota       |
| 53 | 2810.084                  | SS     | Soft and hard analog fault detection, injection,<br>coverage, diagnosis, and localization strategies<br>suitable for production test and in-field test | Chen, Degang                          | Iowa State                  |

|    | Task     | Thrust | Title                                                                                                       | Task Leader                                    | Institution           |
|----|----------|--------|-------------------------------------------------------------------------------------------------------------|------------------------------------------------|-----------------------|
| 54 | 2810.085 | FA     | Applications of Circuit Transient Sensitivity<br>Simulation to Semiconductor Circuit Analysis and<br>Design | Rohrer, Ronald                                 | SMU                   |
| 55 | 2810.086 | SS     | Machine Learning-based Functional Safety<br>Improvement of AMS components in Automotive<br>SoCs             | Basu, Kanad                                    | UT Dallas             |
| 56 | 2810.087 | EE     | Grid Optimization and Silicon Validation for Chip<br>Robustness                                             | Najm, Farid                                    | Univ. of Toronto      |
| 57 | 2810.088 | EE     | Grid Optimization and Silicon Validation for Chip<br>Robustness                                             | Kim, Chris                                     | Univ. of<br>Minnesota |
| 58 | 2810.089 | SS     | Techniques for Low-cost Design, Test, and<br>Calibration of RF MIMO Systems                                 | Ozev, Sule<br>Trichopoulos, Georgios           | Arizona State         |
| 59 | 2810.090 | SS     | Motor Health Monitoring                                                                                     | Akin, Bilal                                    | UT Dallas             |
| 60 | 2810.091 | SS     | Development of Two-Photon Absorption Laser<br>System for Creating Single Event Effects                      | Baumann, Robert &<br>Quevedo-Lopez,<br>Manuel  | UT Dallas             |
| 61 | 2810.092 | EE     | Battery-Charging CMOS Voltage Regulator for<br>Resistive Low-Voltage DC Sources                             | Rincón-Mora, Gabriel                           | Georgia Tech          |
| 62 | 3160.002 | EE     | tinyASR: Self-Supervised, Sub-10μW Automatic<br>Speech Recognition Hardware for IoT Devices                 | Seok, Mingoo                                   | Columbia              |
| 63 | 3160.003 | SS     | Techniques for Online Ageing Detection and In-<br>field Characterization of Aging Phenomena                 | Chen, Degang                                   | Iowa State            |
| 64 | 3160.004 | SS     | Inductive Fault Analysis for Determining Statistical<br>Analog Test Metrics                                 | Ozev, Sule                                     | Arizona State         |
| 65 | 3160.005 | SS     | ML-Assisted Scalable DfT and BIST of AMS Systems                                                            | Chatterjee, Abhijit                            | Georgia Tech          |
| 66 | 3160.006 | FA     | Machine-Learning Based Analog Mixed-signal<br>Design Tool                                                   | Chen, Shuo-Wei<br>Gupta, Sandeep<br>Levi, Tony | USC                   |
| 67 | 3160.007 | FA     | AI-Assisted and Layout-Aware Analog Synthesis<br>and Optimization with Design Intent                        | Pan, David Z.<br>Jia, Yaoyao                   | UT Austin             |

|    | Task     | Thrust | Title                                                                                                  | Task Leader                                    | Institution           |
|----|----------|--------|--------------------------------------------------------------------------------------------------------|------------------------------------------------|-----------------------|
| 68 | 3160.008 | FA     | High-Speed DAC with High Output Power and<br>Linearity                                                 | Chen, Shuo-Wei                                 | USC                   |
| 69 | 3160.009 | FA     | 100+GS/s Time-Domain Analog-to-Digital<br>Converters                                                   | Palermo, Samuel<br>Hoyos, Sebastian            | Texas A&M             |
| 70 | 3160.010 | FA     | Design Automation of Low Phase Noise PLL                                                               | Chen, Shuo-Wei<br>Gupta, Sandeep<br>Levi, Tony | USC                   |
| 71 | 3160.011 | SS     | G-Band CMOS mm-Wave Imager and Sensor for<br>Biomedical Applications                                   | Niknejad, Ali                                  | UC Berkeley           |
| 72 | 3160.012 | SS     | Small-area Low-power DAC Designs with In-field<br>Digital Calibration Ensuring Lifetime High Linearity | Chen, Degang                                   | Iowa State            |
| 73 | 3160.013 | EE     | Energy-Efficient Circuits and Architectures for<br>Cryogenic Operation                                 | Kim, Chris                                     | Univ. of<br>Minnesota |
| 74 | 3160.015 | EE     | ULP Receivers                                                                                          | Wentzloff, David                               | Univ. of<br>Michigan  |
| 75 | 3160.016 | EE     | MODO: Hybrid SIMO-DLDO DC-DC Converter for<br>Multi-Core Microprocessors and System-on-Chips           | Seok, Mingoo                                   | Columbia              |
| 76 | 3160.017 | EE/FA  | Multi-phase Sub-100fs Jitter Ring-oscillator-based<br>Clock Multipliers for Beyond 100Gb/s Links       | Hanumolu, Pavan                                | UIUC                  |
| 77 | 3160.018 | FA     | Pseudo-Static Storage Circuits for Extreme Low<br>Voltage Cryo-CMOS Applications                       | Kulkarni, Jaydeep                              | UT Austin             |
| 78 | 3160.019 | FA     | Mixed-Domain High-Performance CT-ΔΣ ADCs                                                               | Maghari, Nima                                  | Univ. of Florida      |
| 79 | 3160.020 | SS     | Transient Reliability and Condition Monitoring of<br>GaN HEMTs                                         | Akin, Bilal                                    | UT Dallas             |
| 80 | 3160.021 | FA/EE  | Automated Generation of Comprehensive<br>Voltage/Frequency Domains - Logic+PLL+Voltage<br>Regulation   | Sathe, Visvesh                                 | Georgia Tech          |
| 81 | 3160.022 | EE     | Domain-Voltage Regulator Co-design for Enhanced<br>SoC Energy Efficiency                               | Sathe, Visvesh                                 | Georgia Tech          |

|    | Task     | Thrust | Title                                                                     | Task Leader          | Institution  |
|----|----------|--------|---------------------------------------------------------------------------|----------------------|--------------|
| 82 | 3160.024 | EE     | On-the-Go Battery Charging/Battery Monitoring<br>SIMIMO Voltage Regulator | Rincón-Mora, Gabriel | Georgia Tech |

### ACCOMPLISHMENTS

In the past year, TxACE has made significant research progress. Table 3 summarizes the number of publications and inventions resulting from TxACE research from May 2022 through April 2023, while Table 4 lists the major research accomplishments of the Center during the same period. TxACE researchers published 59 conference papers and 24 journal papers and gave 10 invited presentations. The team also made 2 invention disclosures and filed 9 patent applications, and 3 patents were granted. The list of publications is included as Appendix I. Following the tabulation, brief summaries of each project are provided.

| Conference Papers | Journal Papers | Invited Presentations | Invention Disclosures | Patents Filed | Patents Granted |
|-------------------|----------------|-----------------------|-----------------------|---------------|-----------------|
| 59                | 24             | 10                    | 2                     | 9             | 3               |

### Table 3. TxACE number of publications (May 2022 through April 2023)

### Table 4. Major TxACE Research Accomplishments (May 2022 through April 2023)

| Category                           | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Energy<br>Efficiency<br>(Systems)  | A search for a middle ground between in-memory computation with conventional digital techniques and programmable accelerators for deep neural networks (DNNs), particularly at the edge in applications such as IoT, has led to the demonstration of a 28-nm CMOS prototype achieving 3.1 µJ per inference (with 90.2% accuracy) on the CIFAR-10 benchmark, as well as commensurate energy savings on all standard tinyML application benchmarks. The key is enabling fully unrolled and pipelined operation, which reduces the activation memory needed, eliminates hardware, and reduces activation accesses. (2810.078, B. Murmann, Stanford University) |
| Energy<br>Efficiency<br>(Circuits) | Efficiency of power delivery from high-voltage busses to scaled-CMOS-compatible voltages (<1V) is improved by ~10% by employing a vertically and heterogeneously integrated architecture that leverages hybrid and switched-capacitor DC-DC converters. This architecture also reduces the number of power pins by at least 2x. The SCVR was implemented in a 65-nm CMOS process, and the HVRM in a 180-nm BCD process. (2810.061, H. Le, UCSD) |
| Energy<br>Efficiency<br>(Circuits) | A linear regression machine learning model is trained on simulations of both the digital core and the power management circuitry, capturing the relationship among the CPU's power consumption, supply droops, and the CPU's internal signals, e.g., the opcode. A machine learning core generates a prediction of the CPU current consumption 2 to 3 cycles early, which is sent to the buck converter and combined with the real-time measured supply voltage to deliver "feedforward" regulation against the incoming current surge. This proactive scheme achieves 6% to 10% improvements in CPU frequency or converter efficiency. (2810.039, J. Gu, Northwestern Univ.) |

| Category                                             | Accomplishment |
|------------------------------------------------------|----------------|
| Fundamental<br>Analog<br>(Circuits)                  | A new 50Gb/s multi-carrier transmitter (TX) has been developed, utilizing carrier orthogonality to allow band overlap. It features three 5GS/s bands with BB PAM4, MB, and HB 16-state complex modulation on carriers at 5 and 10GHz. The TX, fabricated in a 22-nm FinFET process, operates at 50Gb/s by activating these three bands simultaneously. The TX was used to demonstrate BER<10<sup>-4</sup> over a channel with 5-dB loss at 12.5GHz, thanks to TX FIR equalization and 4-tap ICI cancellation. (2810.062, S. Palermo, Texas A&M University) |
| Fundamental<br>Analog<br>(Circuits)                  | A new hybrid ADC architecture combines a VCO-based continuous-time delta-sigma modulator (DSM) with a noise-shaping successive approximation register (SAR) quantizer. The key innovation is an anti-aliasing filter (AAF) that connects the VCO front-end with the NS-SAR quantizer, allowing direct sampling of time-domain information as voltage-domain information. A 28-nm CMOS prototype achieves an 84.2-dB signal-to-noise-distortion ratio and an 86.8-dB dynamic range within a 1-MHz bandwidth while consuming 1.62mW at 100MS/s. The Schreier SNDR figure of merit is 172.1 dB. (2810.033, M. Flynn, U. Michigan) |
| Fundamental<br>Analog<br>(Circuits)                  | The first temperature- and aging-compensated RC Oscillator (TACO) maintains<br>long-term stability by periodically synchronizing its frequency with a less-aged<br>reference oscillator. To enhance its stability, TACO incorporates resistors with<br>higher activation energy ( $E_a$ ), employs switched dual RC branches to reduce stress<br>induced by DC currents, and applies duty cycling to slow down the aging of the<br>reference oscillator. A prototype 100-MHz oscillator built using a 65-nm CMOS<br>process achieves an inaccuracy of ±1030 ppm over -40°C to 85°C after 500 hours<br>of accelerated aging at 125°C. (2810.036, P. Hanumolu, University of Illinois) |
| Safety,<br>Security and<br>Health Care<br>(CADT)     | A cost-effective DfT (Design for Testing) method for improving LRFS (Lifetime Reliability and Functional Safety) of analog and mixed-signal circuits has been demonstrated. A concurrent sampling strategy was developed for simultaneous multi-analog-node online measurements. The method was incorporated with a digital-like DfT method and implemented with a PCB demo design. Low-cost SAR ADC defect-oriented test achieved 100% defect coverage by re-using existing digital circuitry. (2810.047, D. Chen, Iowa State) |
| Safety,<br>Security and<br>Health Care<br>(Circuits) | Miniaturized mid-infrared (IR) sensors operating at room temperature, based on ultrathin meta-surfaces integrated within a piezoelectric nanomechanical resonator system (MEMS) and optimized for gas sensing, are demonstrated. An outstanding NEP (noise equivalent power) of 80 pW/√Hz at room temperature, drastically outperforming the state of the art, is achieved. (2810.050, S. Gomez Diaz, UC Davis) |
| Safety,<br>Security and<br>Health Care<br>(Circuits) | STEM images of commercially available p-GaN E-mode HEMTs are captured with<br>in-situ bias for electrical stress. The samples with a gate-drain structure were<br>prepared using FIB and e-chip. Leakage current was reduced using plasma and FIB<br>oxide deposition on the samples. This demonstration suggests a potential path<br>for identification of location of initial device failure, and understanding the gate's<br>multi-stage breakdown paths and the current evolution stages. (2810.057, M.<br>Kim, UT Dallas)                                                                                                                                                       |
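Two of the figures quoted in Table 4 can be cross-checked with simple arithmetic. The sketch below is not taken from the TxACE tasks. It assumes the usual Schreier definition (SNDR in dB plus 10·log10 of bandwidth over power) and assumes that each 5-GS/s band carries log2(states) bits per symbol; the function names are illustrative only.

```python
import math


def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """Schreier figure of merit: SNDR (dB) + 10*log10(bandwidth / power)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


def aggregate_rate_gbps(bands):
    """Sum symbol_rate (GS/s) * bits_per_symbol over (rate, states) pairs.

    Assumption: one symbol per sample and log2(states) bits per symbol.
    """
    return sum(rate * math.log2(states) for rate, states in bands)


# Inputs reported for task 2810.033: 84.2 dB SNDR, 1 MHz bandwidth, 1.62 mW.
fom = schreier_fom_db(84.2, 1e6, 1.62e-3)

# Inputs reported for task 2810.062: three 5-GS/s bands,
# baseband PAM4 (4 states) plus two 16-state complex-modulated bands.
rate = aggregate_rate_gbps([(5, 4), (5, 16), (5, 16)])

print(round(fom, 1), rate)
```

Under these assumptions the reported inputs reproduce the quoted 172.1-dB figure of merit, and the three bands add up to the stated 50 Gb/s.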

## Safety, Security and Health Care Thrust



| Category                                             | Accomplishment |
|------------------------------------------------------|----------------|
| Safety,<br>Security and<br>Health Care<br>(CADT)     | A cost-effective DfT (Design for Testing) method for improving LRFS (Lifetime Reliability and Functional Safety) of analog and mixed-signal circuits has been demonstrated. A concurrent sampling strategy was developed for simultaneous multi-analog-node online measurements. The method was incorporated with a digital-like DfT method and implemented with a PCB demo design. Low-cost SAR ADC defect-oriented test achieved 100% defect coverage by re-using existing digital circuitry. (2810.047, D. Chen, Iowa State) |
| Safety,<br>Security and<br>Health Care<br>(Systems)  | Miniaturized mid-infrared (IR) sensors operating at room temperature, based on ultrathin meta-surfaces integrated within a piezoelectric nanomechanical resonator system (MEMS) and optimized for gas sensing, are demonstrated. Thousands of such sensors were tested to characterize their responsivity and noise. An outstanding NEP (noise equivalent power) of 80 pW/√Hz at room temperature, drastically outperforming the state of the art, is achieved. (2810.050, S. Gomez Diaz, UC Davis) |
| Safety,<br>Security and<br>Health Care<br>(Circuits) | STEM images of commercially available p-GaN E-mode HEMTs are captured with in-situ bias for electrical stress. The samples with a gate-drain structure were prepared using FIB and e-chip. Leakage current was reduced using plasma and FIB oxide deposition on the samples. This demonstration suggests a potential path for identification of location of initial device failure, and understanding the gate's multi-stage breakdown paths and the current evolution stages. (2810.057, M. Kim, UT Dallas) |





### TASK 2810.038, EXTREME TEMPERATURE DIGITAL, ANALOG, AND MIXED-SIGNAL CIRCUITS (ET-DAMS)

CHRIS H. KIM, UNIVERSITY OF MINNESOTA, CHRISKIM@UMN.EDU

### SIGNIFICANCE AND OBJECTIVES

We have been focusing on testing the array-based, densely populated transistor characterization circuit in a 0.35-µm process. Using an on-chip heater and an M1 metal-based temperature sensor, we were able to collect statistical I-V curve data from a large number of devices at temperatures from cryogenic conditions (liquid nitrogen, -196°C) to above 200°C, covering a wide temperature range.

### **TECHNICAL APPROACH**

Testing software was developed for an array-based characterization circuit to efficiently characterize transistor I-V behavior under extreme temperatures. We have implemented a small, metal-based on-chip heater that uses Joule heating to reach the target temperature for measurements above 200°C. We have also developed a robust cryogenic measurement system with enhanced connections for measurements at 77K. In the past year, we have successfully collected I-V characteristics from 6 test chips. Each chip has over 12,000 transistors. The measurements were performed at both extremely high and extremely low temperatures.

### SUMMARY OF RESULTS

Fig. 1 shows the major results from the past year. Fig. 1(a) shows the full chip layout of the 0.35-µm test chip with PMOS/NMOS array, on-chip heaters, and scan-chain control circuitry. The rest of the top figure shows the extreme temperature measurement setup. For cryogenic measurements, we submerge the testing PCB inside a dewar of liquid nitrogen. All the connection points have been enhanced accordingly to prevent connection failure under cryogenic conditions. Fig. 1(b) shows the temperature map for high-temperature measurements, obtained from the threshold voltage measurements of each transistor in the device array. Figs. 1(c) and 1(d) show the characterization results of the transistor array at different temperatures, including extremely high temperatures, room temperature, and 77K. The sigma/mu values of the subthreshold slope and threshold voltage distributions at 77K were 5x and 0.9x, respectively, of their values at 25°C. Fig. 1(d) shows the distribution of transistor $V_{TH}$ and subthreshold slope for temperatures ranging from 25°C to 216°C.
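The sigma/mu comparison above can be illustrated with a minimal sketch. The sample values below are hypothetical placeholders, not measured data from the test chips, and the function names are assumptions made for illustration; the population standard deviation is used, although a sample standard deviation would serve equally well for large arrays.

```python
import statistics


def sigma_over_mu(samples):
    """Relative spread of a parameter: population std. dev. over |mean|."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    return sigma / abs(mu)


def variation_ratio(cold_samples, room_samples):
    """Ratio of sigma/mu at the cold temperature to sigma/mu at room temperature."""
    return sigma_over_mu(cold_samples) / sigma_over_mu(room_samples)


# Hypothetical threshold voltages in volts (illustrative only).
vth_room = [0.58, 0.60, 0.62]
vth_cold = [0.78, 0.80, 0.82]

ratio = variation_ratio(vth_cold, vth_room)
print(f"sigma/mu ratio (cold vs. room): {ratio:.2f}")
```

A ratio below 1 means the relative spread shrinks at the cold temperature, and a ratio above 1 means it grows, which is how the 0.9x and 5x figures in the text can be read.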



Figure 1. (a) Full chip layout of 0.35-µm device characterization array and measurement setup. (b) Device-under-test temperature underneath the heater showing spatial gradient. (c) Variation measurements of NMOS transistors at 77K. (d) Variation measurements of NMOS and PMOS transistors at temperatures ranging from 146°C to 216°C.

**Keywords:** Device characterization array, on-chip heater, high temperature operation, cryogenic, fully automated testing

### INDUSTRY INTERACTIONS

Intel, Texas Instruments

### MAJOR PAPERS/PATENTS

[1] H. Yu, Y. Yi, N. Pande, and C.H. Kim, "On-chip Heater Design and Control Methodology for Reliability Testing Applications Requiring over 300°C Local Temperatures," IEEE Trans. Device and Materials Reliability (TDMR), 2023.

### TASK 2810.041, ESD PROTECTION FOR IO OPERATING AT 56 GB/S AND BEYOND

ELYSE ROSENBAUM, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN, ELYSE@ILLINOIS.EDU

### SIGNIFICANCE AND OBJECTIVES

The project objective is to develop a co-design methodology for ESD-protected high-speed receiver front-end circuits. This will enable a designer to achieve adequate component-level ESD protection while also meeting return loss specifications.

### **TECHNICAL APPROACH**

Front-end circuits are co-designed for performance and ESD reliability. First, ESD hazards arising from a variety of bandwidth extension techniques are identified. Then, we identify the value  $V_{MAX}$ , which is the maximum tolerable voltage at the IO pin under ESD conditions. Next, the available protection devices are characterized using the metrics  $I_{fail}$ /Capacitance and  $R_{on}$ ·Capacitance. Finally, the performance and reliability of each candidate circuit are optimized through co-design.
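The device characterization step can be sketched as a small ranking routine. This is an illustration under stated assumptions, not the task's actual flow: the device names and numerical values are hypothetical, and the sketch assumes that a higher $I_{fail}$/Capacitance and a lower $R_{on}$·Capacitance are preferable.

```python
from dataclasses import dataclass


@dataclass
class ProtectionDevice:
    name: str
    i_fail_a: float   # failure current in amperes
    r_on_ohm: float   # on-resistance in ohms
    cap_f: float      # parasitic capacitance in farads

    @property
    def ifail_per_cap(self):
        """I_fail / Capacitance: current handling per unit of loading."""
        return self.i_fail_a / self.cap_f

    @property
    def ron_cap(self):
        """R_on * Capacitance: clamping resistance weighed against loading."""
        return self.r_on_ohm * self.cap_f


def rank_devices(devices):
    """Sort by highest I_fail/C first, breaking ties by lowest R_on*C."""
    return sorted(devices, key=lambda d: (-d.ifail_per_cap, d.ron_cap))


# Hypothetical candidates (values are placeholders, not measured data).
candidates = [
    ProtectionDevice("diode_a", i_fail_a=2.0, r_on_ohm=1.0, cap_f=100e-15),
    ProtectionDevice("diode_b", i_fail_a=1.5, r_on_ohm=1.5, cap_f=60e-15),
    ProtectionDevice("diode_c", i_fail_a=1.0, r_on_ohm=2.0, cap_f=80e-15),
]

ranked = rank_devices(candidates)
for dev in ranked:
    print(dev.name, f"{dev.ifail_per_cap:.3g} A/F", f"{dev.ron_cap:.3g} ohm*F")
```

Normalizing both metrics by capacitance lets devices of different sizes be compared for the same loading on the high-speed pin.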

### SUMMARY OF RESULTS

As reported last year, we developed distributed ESD protection by utilizing the parasitic resistance of a bandwidth extension circuit composed of 3 coupled inductors. The objective is to reduce the ESD-induced gate voltage applied to a wireline receiver's input transistors. The 3-inductor bandwidth extension circuit was named a tri-coil, to distinguish it from the pi-coil circuits that are placed at output pins. Recently, a 65-nm test chip was fabricated and tested. The test chip contains 3 tri-coils with integrated distributed ESD protection. The ESD diodes integrated into tri-coil 1 have a total width of 140 µm, those in tri-coil 2 have a total width of 116 µm, and those in tri-coil 3 have a total width of 93 µm. For benchmarking purposes, the test chip also contains a T-coil with non-distributed ESD protection. The width of the ESD diode in the T-coil circuit is 140 µm, the same as for tri-coil 1. Measurement results in Fig. 1 confirm that the tri-coil with distributed protection provides better voltage clamping than the conventional input circuit using a T-coil. The return loss and bandwidth of the tri-coil and T-coil based input circuits are similar [1], demonstrating that improved ESD reliability can be obtained without sacrificing performance.

Figure 1. Pulse I-V of (a) tri-coil 1 and T-coil, and (b) tri-coil 2 and tri-coil 3. 2.5-ns pulse width, 100-ps rise time. BVOX is the foundry-reported oxide breakdown voltage on a 1-ns time scale.

An ESD protection circuit is designed to sink HBM current without suffering thermal damage and to limit the CDM-induced voltage below the breakdown voltage of the circuit being protected. It is desired that the protection device sizes, and the associated loading, are the smallest that can meet the protection targets. Dielectric breakdown voltage increases with decreasing stress time. Therefore, to optimally size the ESD protection, it is necessary to know the transistor gate dielectric breakdown voltage on the CDM time scale. During CDM, the voltage is near its maximum value for just a few hundred ps. MOS test structures with integrated single-shot sub-1-ns pulse generators have been designed and are undergoing fabrication. In simulation, the pulses are near-ideal square pulses, unlike those produced by a TLP tester (Fig. 2). Our team recognized that a high-power pulse generator is not needed to induce oxide breakdown. Consequently, we use on-chip logic circuits to generate the square pulse. The chip is powered on only briefly, to avoid prolonged stress on the logic circuits.



Figure 2. Post layout simulation of pulses applied to MOS victim devices in 65-nm CMOS. Pulses of progressively higher amplitude are applied until the victim breaks down.

**Keywords:** ESD, CDM, impedance matching circuit, time-dependent dielectric breakdown

### INDUSTRY INTERACTIONS

AMD, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. Drallmeier, et al., "Distributed protection for...," to appear in 2023 EOS/ESD Symposium Proceedings.

### SIGNIFICANCE AND OBJECTIVES

We focus on electromigration (EM) failures in the on-chip power grid and are developing tools to guarantee chip robustness in the face of EM degradation. Our goal is to provide techniques by which one can ensure EM reliability-by-design. The key advantage is improved accuracy and reduced conservatism.

### **TECHNICAL APPROACH**

It is difficult to achieve EM sign-off on modern chip designs, due to the limitations of traditional empirical models that are built into existing tools. Modern physical EM models allow one to overcome these limitations, but are expensive to use, especially when doing EM simulation on large chip power grids. Instead of simulation, we will develop an "inverse approach": generate design-aware constraints on the circuit currents which, if guaranteed during chip design, would ensure EM safety for the desired lifetime. Since these constraints correspond to the specific design, this has the potential to improve accuracy and reduce pessimism.

#### SUMMARY OF RESULTS

Under previous tasks, using the Korhonen stress-based model for EM (1993), we developed a linear (LTI) system model that describes the time-evolution of the stress vector as a function of line currents. We also used this to build a simulation engine for tracking the evolution of EM over time - the first practical *electromigration simulator*.

In 2021, we discovered that the relations between stress and flux in every interconnect tree metal line are identical to those between voltage and current in a specially designed electrical circuit (an RC network), as shown in Fig. 1. We call this an *equivalent circuit*, and it can be easily and automatically built for any given metal interconnect tree. If we solve the voltage-current problem for the equivalent circuit, then we have automatically solved the stress-flux problem for the interconnect tree. Node voltages in the equivalent circuit give the stresses in the metal network.



Figure 1. The equivalent circuit for a metal line.

We have used this equivalent circuit model to derive a closed-form analytical expression for the stress at all points in the interconnect tree at any given time $t$, in terms of the source currents applied to the tree:

$$\sigma(t) = e^{At}\sigma_0 + (I - e^{At})G^+ HMDM^+ i.$$

This allows us to express the set (space) of safe DC current vectors that may be applied to the tree without violating the desired lifetime $T$ for the tree, as captured by

$$(I - e^{AT})G^+ HMDM^+ i \le \sigma_{crit} - e^{AT}\sigma_0.$$

The matrix that multiplies $i$ on the left-hand side of this inequality is what we call the *K* matrix. Computing the matrix exponential $e^{AT}$ is very expensive, and the eigenvalues of the system matrix $A$ are key to computing it efficiently. Empirical tests indicate that finding and using only the smallest eigenvalues is sufficient for accuracy. We have adapted the well-known Arnoldi algorithm, along with a shift-and-invert strategy, to generate approximations to the *k* smallest eigenvalues of the system matrix. The full system matrix is then projected onto a smaller subspace based on these *k* smallest eigenvalues. Effectively, we are making use of an approximation by projection:

$$e^{At} \approx \tilde{Q} e^{\tilde{H}t} \tilde{Q}^*,$$

where $\tilde{Q}$ and $\tilde{H}$ are matrices generated by the algorithm, and $\tilde{Q}^*$ is the complex conjugate transpose of $\tilde{Q}$. The $\tilde{H}$ matrix is of much smaller size, and its exponential can be found more efficiently. Results show good accuracy, as shown by the normalized errors for the K matrix:

Table 1. Error results for a few test cases.

| # Junctions | RMS Error | Max Error | Average Error |
|-------------|-----------|-----------|---------------|
| 300         | 0.0577 %  | 0.3736 %  | 0.0410 %      |
| 700         | 0.3138 %  | 4.1548 %  | 0.1866 %      |
| 1,500       | 1.1444 %  | 27.459 %  | 0.5182 %      |

As for the runtimes, while MATLAB cannot compute the exponential for more than 1500 junctions, we are able to compute it for a grid with 5500 junctions, in 19 hours.
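The projection idea behind the approximation of $e^{At}$ can be illustrated on a toy problem. The sketch below is not the Arnoldi-based method of this task: it assumes a small symmetric test matrix (a 1-D Dirichlet Laplacian, chosen only because its eigenpairs are known in closed form) and keeps the *k* eigenvalues of smallest magnitude. All names and sizes are hypothetical.

```python
import math

def dirichlet_laplacian_modes(n):
    """Closed-form eigenpairs of A = -tridiag(-1, 2, -1) (n x n),
    returned in order of increasing eigenvalue magnitude."""
    scale = math.sqrt(2.0 / (n + 1))
    modes = []
    for j in range(1, n + 1):
        theta = j * math.pi / (n + 1)
        eigenvalue = -(2.0 - 2.0 * math.cos(theta))
        vector = [scale * math.sin(i * theta) for i in range(1, n + 1)]
        modes.append((eigenvalue, vector))
    return modes

def expm_from_modes(modes, t, k):
    """Sum of exp(lambda * t) * v v^T over the first k modes.
    With k = n this is the exact exponential of the symmetric test matrix."""
    n = len(modes[0][1])
    result = [[0.0] * n for _ in range(n)]
    for eigenvalue, vector in modes[:k]:
        weight = math.exp(eigenvalue * t)
        for r in range(n):
            for c in range(n):
                result[r][c] += weight * vector[r] * vector[c]
    return result

def max_abs_difference(a, b):
    """Largest entry-wise difference between two square matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

n, k, t = 12, 4, 5.0  # hypothetical problem size, kept modes, and time
modes = dirichlet_laplacian_modes(n)
exact = expm_from_modes(modes, t, n)
approx = expm_from_modes(modes, t, k)
error = max_abs_difference(exact, approx)
# The dropped modes are weighted by at most exp(lambda_{k+1} * t).
bound = math.exp(modes[k][0] * t)
print(f"max entry error = {error:.2e}, bound = {bound:.2e}")
```

For a stable system the discarded modes decay as $e^{\lambda t}$ with large negative $\lambda$, which is why the slowest modes dominate at long times in this toy case.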

**Keywords:** integrated circuits, electromigration, stress, reliability, current constraints

#### INDUSTRY INTERACTIONS

NXP, Intel, Texas Instruments, Siemens

#### MAJOR PAPERS/PATENTS

[1] Shahriari & Najm, "Fast Sim...", in ISQED-2023.

### TASK 2810.047, ARCHITECTURE AND DFT METHODS FOR IMPROVING LIFETIME RELIABILITY AND FUNCTIONAL SAFETY

DEGANG CHEN, IOWA STATE UNIVERSITY, DJCHEN@IASTATE.EDU

### SIGNIFICANCE AND OBJECTIVES

A growing number of ICs are deployed in mission-critical applications to improve performance, reduce accidents, and save lives. Stringent requirements on lifetime reliability and functional safety (LRFS) are imposed, but methodologies for analog circuits lag significantly behind. This project develops cost-effective DfT methods for greatly improving LRFS for analog and mixed-signal circuits.

### **TECHNICAL APPROACH**

We will develop a DfT (Design for Test) architecture and multilevel monitoring and healing solutions to ensure lifetime reliability and functional safety. Digital DfT is assumed available to check our circuits. At power-on, we will use digital-like controls and detectors to verify all analog connectivity and topological correctness, which ensures the functionality of basic analog components. With intrinsic process matching, we then perform accurate AMS BIST and calibration. After that, various health and aging monitors will go online. A concurrent sampling strategy will enable simultaneous measurements of many health and safety conditions and will trigger recalibration and/or safety actions as necessary.

### SUMMARY OF RESULTS

Recently, we began testing the wide-range temperature to digital converter and the built-in NBTI monitor. We demonstrated the proposed DfT methods through PCB measurements and simulations. We demonstrated the proposed simultaneous self-test and self-calibration of both ADC and DAC.

Over the course of this research, we applied the DfT method to defect detection in a standard LDO. A wide-range temperature-to-digital converter design not requiring device model details was presented at ISCAS. SRC filed a patent for the design. We developed a digital strategy for checking all component connectivity inside a CDAC-based SAR ADC, achieving structural test, defect detection, and defect localization. We developed a fast-sensing method for multi-device NBTI aging detection and completed a test-chip design and fabrication.

The high-resolution concurrent sampling strategy for AMS reliability and safety improvement is shown in Fig. 1. The strategy was demonstrated via PCB design and measurement as shown in Fig. 2.



Figure 1. High-resolution concurrent sampling method.



Figure 2. Analog test PCB and FPGA board for injecting analog faults and validating BISTs.

**Keywords:** lifetime reliability and functional safety, power-on AMS BIST, concurrent sampling

### INDUSTRY INTERACTIONS

Intel, MediaTek, NXP, Siemens, Texas Instruments

### MAJOR PAPERS/PATENTS

- [1] Kushagra Bhatheja, et al., 2022 ITC
- [2] Kushagra Bhatheja, et al., 2022 IEEE ISCAS
- [3] Mona Ganji, et al., 2022 IEEE ISCAS
- [4] Marampally Saikiran, et al., 2022 IEEE ISCAS
- [5] Kwabena Banahene, et al., 2022 IEEE ISCAS
- [6] Marampally Saikiran, et al., 2022 SBCCI
- [7] Marampally Saikiran, et al., 2022 SBCCI
- [8] Mona Ganji, et al., US Patent App. No. 18/095,853

### TASK 2810.048, CHARACTERIZATION AND MITIGATION OF ELECTROMIGRATION EFFECTS IN ADVANCED TECHNOLOGY NODES

CHRIS H. KIM, UNIVERSITY OF MINNESOTA, CHRISKIM@UMN.EDU

### SIGNIFICANCE AND OBJECTIVES

Even though accurate electromigration (EM) models are important for designing reliable chips, only limited silicon data has been reported because well-calibrated experiments are difficult to perform. In this project, EM failure trends and statistics will be collected from a dedicated power grid EM test chip using an automated testing setup.

### **TECHNICAL APPROACH**

Our test will focus on experiments on four different power grid structures fabricated on a single die leveraging circuit-based data collection. The four EM lifetimes and IR drop aggravation trends will be analyzed and compared with EM simulation models.

### SUMMARY OF RESULTS

The experiments will be done on 28-nm power grid test chips. Voltage scanning circuits that can measure 1024 internal voltages in the grid allow $V_{\text{DD}}$ and GND changes after EM stress to be recorded for failure analysis. As shown in Fig. 1, four different DUTs are designed to compare the failure behaviors. Each DUT is a realistic power grid with equivalent quasi-loads, and the DUT temperatures are accurately controlled using a dedicated temperature sensor. Fig. 2 and Fig. 3 show the rail width, cell via count, and rail density differences between the grids. Since we have various test structures and voltage tapping capabilities, we expect to find how the failure time and location depend on the grid design. The statistics from this silicon data will also be used to calibrate the physics-based EM simulation models.

For accurate data collection in EM experiments, controlling the DUT temperature for consistent stress conditions is critical. To this end, we developed a multithreaded test program that collects data only while the DUT temperature is stable (Fig. 4 (left)). With this approach, EM failure statistics, which are sensitive to temperature, can be analyzed under the same stress conditions (Fig. 4 (right)). This is a significant improvement over previous experiments: thanks to the dedicated sensor and the test software, the heater control is no longer affected by Joule heating of the DUT or by the chip's process variation.
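The idea of collecting data only under a stable DUT temperature can be sketched as a simple gating rule. This is a single-threaded illustration of the gating logic only, not the multithreaded test program itself; the function name, the setpoint, the tolerance, and the settling count are all hypothetical.

```python
def stable_measurements(samples, setpoint_c, tolerance_c, settle_count):
    """Keep measurements taken once the temperature has stayed within
    setpoint_c +/- tolerance_c for at least settle_count consecutive readings.

    samples: iterable of (temperature_c, measurement) pairs in time order.
    """
    kept = []
    in_band_run = 0
    for temperature_c, measurement in samples:
        if abs(temperature_c - setpoint_c) <= tolerance_c:
            in_band_run += 1
        else:
            in_band_run = 0  # any excursion restarts the settling window
        if in_band_run >= settle_count:
            kept.append(measurement)
    return kept

# Hypothetical log: (sensor temperature in deg C, label of the voltage scan taken then)
log = [
    (296.0, "m0"), (324.2, "m1"), (325.1, "m2"), (324.9, "m3"), (325.0, "m4"),
    (327.5, "m5"), (325.2, "m6"), (324.8, "m7"), (325.1, "m8"),
]
kept = stable_measurements(log, setpoint_c=325.0, tolerance_c=0.5, settle_count=3)
print(kept)
```

In a threaded version, one thread would typically update the temperature status while another performs the scans; the acceptance rule itself would stay the same.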



Figure 1. EM test chip overview for realistic power grid structures.







Figure 3. Cell via count, power rail width, and power rail density differences between the power grids.



Figure 4. Multi-threaded test flow (left) and on-chip heater control log for maintaining accurate DUT temperature (right).

**Keywords:** Electromigration, power grid, lifetime, characterization, test structure

### INDUSTRY INTERACTIONS

Intel, NXP, Siemens, Texas Instruments

### TASK 2810.050, INTEGRATING METASURFACES AND MEMS FOR GAS SENSING

SEBASTIAN GÓMEZ DÍAZ, UNIVERSITY OF CALIFORNIA AT DAVIS, JSGOMEZ@UCDAVIS.EDU

### SIGNIFICANCE AND OBJECTIVES

The goal is to demonstrate infrared (IR) sensors operating at room temperature based on metasurfaces (MTSs) integrated within a nanomechanical resonator system (NMEMS) with application in gas sensing. Experimental data demonstrate a noise equivalent power (NEP) of ~80 pW/ $\sqrt{\text{Hz}}$ , which is state-of-the-art for spectrally selective IR sensing.

### **TECHNICAL APPROACH**

By merging tailored electromagnetic and electromechanical resonances, miniaturized and fast IR detectors operating at room temperature have been demonstrated. The selectivity of the IR absorption is improved by optimizing the nanoresonators that compose the metasurface decorating the MEMS, showing absorption with a full-width at half maximum below 0.2 µm. The MEMS devices are based on lateral contour modes and exhibit quality factors above 2500 at ~200 MHz. Next, dozens of devices will be integrated within a chip to demonstrate a complete gas sensing system. To do this, each NMEMS will be tailored to an IR spectral fingerprint of the targeted gas.
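As a quick consistency check on the resonator figures quoted above, the standard relation between quality factor and 3-dB bandwidth, $Q = f_0/\Delta f$, gives the linewidth implied by $Q \approx 2500$ at $f_0 \approx 200$ MHz. Both values are taken here as round numbers, so the result is only indicative:

$$\Delta f = \frac{f_0}{Q} \approx \frac{200\ \text{MHz}}{2500} = 80\ \text{kHz}.$$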

### SUMMARY OF RESULTS

In collaboration with Texas Instruments, we have designed, fabricated, and tested thousands of spectrally selective IR sensors based on NMEMS decorated with ultrathin metasurfaces (Fig. 1). We also developed an automated electrical-infrared test bench to characterize the response of hundreds of sensors in terms of responsivity and noise (Fig. 2). The system is composed of a blackbody radiator, chopper, lenses, motorized vacuum choke stage, microscope, and probe station, all automated through dedicated MATLAB software.

The measured data permit correlating device geometry with performance and, in turn, developing design guidelines. The best detectors exhibit an NEP of ~80 pW/$\sqrt{\text{Hz}}$ at room temperature, greatly surpassing the state of the art for this technology and opening exciting possibilities in the field of IR sensing.



Figure 1. (Top) Example of a spectrally-selective IR sensor. Top inset details the MEMS anchors. Central inset shows the metasurface pattern. (Bottom left) RF response of the device, demonstrating quality factors over 2700. (Bottom right) IR absorption of MEMS devices loaded with different metasurfaces. Inset shows SEM image of the fabricated device (left) and metasurfaces (right).



Figure 2. Experimental set-up to measure the responsivity and noise equivalent power of the proposed IR detector.

**Keywords:** NMEMS, ultrathin metasurfaces, IR sensors, gas sensing, AlN resonators

### INDUSTRY INTERACTIONS

Texas Instruments

### TASK 2810.054, RECONFIGURABLE AC POWER CYCLING SETUP AND PLUG-IN CONDITION MONITORING TOOLS FOR HIGH POWER IGBT AND SIC MODULES

BILAL AKIN, UNIVERSITY OF TEXAS AT DALLAS, BILAL.AKIN@UTDALLAS.EDU

### SIGNIFICANCE AND OBJECTIVES

The long-term reliability of SiC MOSFETs is a concern that limits their wide application. This study investigates how the performance of SiC MOSFETs changes with aging under AC power cycling. The precursors are identified, and condition monitoring circuits, lifetime models, and remaining useful lifetime estimation tools are developed accordingly.

### **TECHNICAL APPROACH**

A smart gate driver board with condition monitoring (CM) circuits is designed. The CM circuits can cover all three primary aging mechanisms. Several devices from different vendors are aged with the AC power cycling test setup and aging precursor shift patterns are captured. The devices are subjected to different aging conditions. At the end of the device's lifetime, various failure analyses are conducted on the aged device. Also, a toolbox for estimating the remaining useful lifetime of devices has been developed in MATLAB, using the aging data gathered through the experiments.

### SUMMARY OF RESULTS

In this research, the AC power cycling test setup was subjected to full-power testing. The power modules and discrete devices were aged under controlled conditions, and their parameter changes were monitored using specialized condition monitoring circuits. As shown in Figure 2, the monitoring circuits recorded online current and voltage data for on-resistance monitoring.



Figure 1. AC power cycling test setup.

Three main precursors for SiC MOSFETs have been identified. The on-resistance shift over aging is presented in Figure 3 as an example. Finally, a MATLAB toolbox was developed and trained using real data from the aging test setup to estimate the remaining useful lifetime of the device.
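To make the idea of estimating remaining useful lifetime from a precursor concrete, the sketch below fits a straight line to on-resistance readings and extrapolates to a failure threshold. This is a deliberately simple stand-in, not the model used in the MATLAB toolbox; the readings, the units, and the threshold (20% above the initial value) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def remaining_useful_cycles(cycles, r_on_mohm, failure_threshold_mohm):
    """Extrapolate the fitted drift to the threshold and subtract the cycles already used."""
    slope, intercept = fit_line(cycles, r_on_mohm)
    if slope <= 0:
        return None  # no upward drift observed, so no estimate is made
    cycles_at_failure = (failure_threshold_mohm - intercept) / slope
    return max(0.0, cycles_at_failure - cycles[-1])

# Hypothetical aging log: power cycles completed and measured on-resistance in milliohms
cycles = [0, 10_000, 20_000, 30_000, 40_000]
r_on_mohm = [80.0, 81.0, 82.0, 83.0, 84.0]
rul = remaining_useful_cycles(cycles, r_on_mohm, failure_threshold_mohm=96.0)
print(f"estimated remaining cycles: {rul:.0f}")
```

Real precursor shifts are often nonlinear and temperature dependent, so a production estimator would likely need a richer model and compensation.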



Figure 2. Online R<sub>ds,on</sub> measurement.



Figure 3. Precursor shift over aging.

**Keywords:** SiC MOSFETs, AC power cycling, performance degradation, condition monitoring, reliability

#### INDUSTRY INTERACTIONS

Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] "Gate-Oxide Degradation Monitoring of SiC MOSFETs Based on Transfer Characteristic with Temperature Compensation," in IEEE Transactions on Transportation Electrification, 2023.

[2] "A Practical Switch Condition Monitoring Solution for SiC Traction Inverters," in IEEE Journal of Emerging and Selected Topics in Power Electronics, vol. 11, no. 2, pp. 2190-2202, April 2023.

[3] "Investigation and On-Board Detection of Gate-Open Failure in SiC MOSFETs," in IEEE Transactions on Power Electronics, vol. 37, no. 4, pp. 4658-4671, April 2022.

[4] "AC Power Cycling Test Setup and Condition Monitoring Tools for High Power SiC Modules," in IEEE Transactions on Vehicular Technology, 2023, in press.

### SIGNIFICANCE AND OBJECTIVES

Electromagnetic side-channel attacks (EM-SCA) exploit electromagnetic interference (EMI) traces in modern electronics to breach encrypted data security. Existing countermeasures face significant trade-offs between power conversion performance and side-channel security. The proposed solution mitigates these trade-offs and achieves both efficient power conversion and enhanced EM side-channel security.

### **TECHNICAL APPROACH**

To prevent potential security breaches by SCA, both the EM and power traces of the data encryption system must be statistically de-correlated from the data being encrypted. Prior art such as power masking and randomized converter operation achieves such de-correlation, but often at the expense of extra power loss and degraded output regulation performance. Alternatively, power-balanced encryption circuits achieve constant power consumption regardless of the encrypted data, but they incur power penalties and large area overheads due to complex digital circuitry. We aim to eliminate the trade-offs between security and performance and find a well-balanced solution.

### SUMMARY OF RESULTS



Figure 1. System architecture of the proposed solution [1].

The proposed power stage architecture shown in Fig. 1 mitigates extra power loss by combining power injection with charge recycling. Both the power and EM traces are de-correlated from the encryption core activity by randomized power injection and recycling of the injected power. The proposed power stage consists of a conventional buck converter and a switched-capacitor stage in parallel. The encryption interface triggers random power injections from the input to charge the parallel capacitor. Once the capacitor voltage $V_{CS}$ is sufficiently high, charge recycling is activated: the main power input is temporarily disconnected, and the capacitor delivers the load current until it is discharged to a threshold. Thus, charge recycling achieves further randomization through random input pulse skipping, as well as energy saving.
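The injection and recycling sequence described above can be sketched as a two-state controller. This is a behavioral toy model under stated assumptions, not the circuit of [1]: the voltage thresholds, step sizes, and injection probability are hypothetical, and voltages are kept in integer millivolts so the threshold comparisons are exact.

```python
import random

def simulate_charge_recycling(steps, seed=0, v_high_mv=1200, v_low_mv=800,
                              inject_prob=0.3, inject_step_mv=100, discharge_step_mv=50):
    """Return a list of (input_connected, v_cs_mv) pairs, one per time step."""
    rng = random.Random(seed)
    v_cs_mv = v_low_mv
    recycling = False
    trace = []
    for _ in range(steps):
        if recycling:
            v_cs_mv -= discharge_step_mv       # capacitor alone supplies the load
            if v_cs_mv <= v_low_mv:
                recycling = False              # reconnect the main input
        else:
            if rng.random() < inject_prob:     # random power injection from the input
                v_cs_mv += inject_step_mv
            if v_cs_mv >= v_high_mv:
                recycling = True               # disconnect the input and recycle the charge
        trace.append((not recycling, v_cs_mv))
    return trace

trace = simulate_charge_recycling(steps=500)
skipped = sum(1 for connected, _ in trace if not connected)
print(f"input disconnected in {skipped} of {len(trace)} steps")
```

Because the time at which the capacitor reaches the upper threshold depends on the random injections, the intervals in which the input is skipped are themselves randomized in this model.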

The conducted EMI measurement in Fig. 2 demonstrates effective EM trace randomization by the proposed design. Without the proposed techniques, the EMI peaks in the spectrum are highly correlated with the operating condition of the converter, whereas the proposed technique suppresses the EMI peaks and continuously randomizes the spectrum to prevent statistical analysis by EM-SCA. Fig. 3 shows the efficiencies with and without the proposed techniques. Thanks to the efficient energy use of the charge recycling technique, the proposed encrypted power supply achieves a peak efficiency of 90.5% and a maximum power overhead of only 4.9%, without any degradation in output regulation performance.



Figure 2. Conducted EMI measurement at I<sub>CORE</sub> of 200 mA (a) without, and (b) with power injection and charge recycling.



Figure 3. Measured efficiency without and with proposed random parallel power injection and charge recycling.

**Keywords:** Charge recycling, EM-SCA, SCA Countermeasure

### INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

### MAJOR PAPERS/PATENTS

[1] K. Wei, J. W. Kwak and D. B. Ma, "An Encrypted On-Chip Power Supply With Random Parallel Power Injection and Charge Recycling Against Power/EM Side-Channel Attacks," in IEEE Transactions on Power Electronics, vol. 38, no. 1, pp. 500-509, Jan. 2023.

### TASK 2810.057, RELIABILITY STUDY OF E-MODE GAN HEMT DEVICES BY AC TDDB AND HIGH RESOLUTION TEM

MOON KIM, UNIVERSITY OF TEXAS AT DALLAS, MOONKIM@UTDALLAS.EDU

HISASHI SHICHIJO, UNIVERSITY OF TEXAS AT DALLAS

### SIGNIFICANCE AND OBJECTIVES

The p-GaN gate of E-mode GaN HEMTs acts as a switch for 2-DEG formation in the AlGaN/GaN channel. Gate reliability is therefore critical to expanding their use in power applications. This work describes in-situ electrical biasing STEM results that probe the initiation sites of the failure.

### **TECHNICAL APPROACH**

Commercially available p-GaN E-mode GaN HEMTs were characterized by their electrical and physical characteristics. In-situ electrical biasing samples with a gate-drain structure are made using FIB and e-chip. We worked on reducing the leakage current using plasma and FIB oxide deposition on the in-situ samples. The in-situ sample was electrically biased and imaged in real-time in STEM.

### SUMMARY OF RESULTS

The in-situ samples prepared using FIB were treated to limit the current flowing through the sample. This reduces surface leakage and suppresses the high current-density activated phenomenon. The samples are cut using the FIB to isolate the substrate and device structure. By making isolation cuts, we reduced the current conduction areas and thus limited the excess current flowing through the devices. Fig. 1 shows that the leakage current conduction (at a gate-to-drain voltage of 0 V) is reduced by up to about 6 orders of magnitude by the combined isolation cuts and plasma/oxygen treatments.



Figure 1. I-V characteristics of the in-situ samples prepared for electrical biasing experiments, showing the reduction in the leakage current conduction in the samples with various isolation cuts and plasma/oxygen treatments.

The I-V characteristics of the electrically biased in-situ samples are shown in Fig. 2(a). The gate-to-drain junction was forward biased from 0 to 7 V and observed in real time in STEM mode during the biasing. The sample showed no apparent change from 0 to 4 V during the bias sweep.

During the 0-7V sweep, we observed some globules/precipitation and cracks on the side of the sample near the gate edge, as shown in Figs. 2(b) and 2(c). We also found structural and elemental defects across the channel and concentrated in the active region.



Figure 2. (a) I-V characteristics of in-situ electrically biased e-mode GaN HEMT. STEM images showing (b) localized elemental diffusion across the interfaces and (c) cracks formed.

Moreover, we found extensive deformation on the p-GaN/AlGaN/GaN layers. The applied bias increases the stress in the pre-strained heterostructure, leading to defect formation across the interfaces. As a result, we see partial diffusion, localized damage in layers, and elemental diffusion in the AlGaN layer, ultimately responsible for the gate breakdown, as shown in Figure 3. The metal/p-GaN contact is the failure initiation site, propagating down to the heterostructure interfaces.



Figure 3. STEM images showing (a) the damaged p-GaN/AlGaN/GaN interface, including the roughened interface and diffusion path created in the AlGaN, and (b) the overall damages in the p-GaN gate structure in the in-situ electrically biased e-mode GaN HEMT.

**Keywords:** E-mode GaN HEMT device, Reliability, in-situ electrical biasing STEM

#### INDUSTRY INTERACTIONS

Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] A. Mehta et al., "Characterization of GaN E-mode HEMT Devices by In-situ STEM Electrical Biasing," Microsc. Microanal. 28(S1), 2276 (2022).

### SIGNIFICANCE AND OBJECTIVES

To optimize analog/RF IC testing, machine learning-based solutions are used, specifically to enhance yield management. As semiconductor devices become more complex, testing procedures have become intricate and time-consuming. Our solution aims to enhance testing efficiency by minimizing the risk of discarding functional devices (Overkill) or shipping defective chips (Underkill).

### TECHNICAL APPROACH

To minimize Overkill, our proposed approach includes three steps: predicting auxiliary test values with multivariate regression models, clustering predicted and actual outcomes, and combining them using a proximity-based metric to determine recoverable devices. For Underkill, we employ Gaussian Mixture Model (GMM) clustering on probe test measurements from multiple insertions. We isolate devices with a higher probability of on-site failure and utilize adaptive multivariate outlier detection to identify potential customer return devices.

### SUMMARY OF RESULTS

In our efforts to reduce Overkill, we performed our experiments on an industrial dataset from Texas Instruments that consisted of 66 specification tests and 241 auxiliary tests performed on 92,022 devices. Of these devices, we focus on the 8,840 (9.6%) that pass the specification tests but fail the auxiliary tests.

First, we perform regression modeling and limit-agnostic clustering independently, and then combine their outcomes. This yields three buckets of devices; the two outcomes diverge for one bucket, for which we use a two-class classifier to decide. Finally, using the two-class classifier in addition to our regression and clustering, we recovered 81.6% (highlighted in green) of the devices in our focus group, as shown in Table 1.

Table 1. Device Classification using a Two-class Classifier.

| Auxiliary Tests | Specification Tests: Pass | Specification Tests: Fail |
|-----------------|---------------------------|---------------------------|
| Pass            | 80,261 + 7,217            | 1,623                     |
| Fail            | 726                       | 2,195                     |
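The percentages quoted above can be checked against the counts in Table 1. The check below assumes that the 7,217 devices added in the Pass/Pass cell are the recovered ones and that the 1,623 devices are the remainder of the focus group; this reading is inferred from the totals and is not stated explicitly in the text.

$$7{,}217 + 1{,}623 = 8{,}840, \qquad \frac{8{,}840}{92{,}022} \approx 9.6\%, \qquad \frac{7{,}217}{8{,}840} \approx 81.6\%.$$

The five counts in the table also sum to the 92,022 devices in the dataset.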

In our efforts to reduce Underkill, we proposed a three-step approach: feature selection, clustering using GMM, and adaptive outlier detection. We performed our experiments on an industrial dataset from Texas Instruments consisting of devices from 19 wafers, with a recorded customer return on each wafer. First, we perform feature space selection, followed by unsupervised clustering using GMM, and note that we are able to isolate known customer returns effectively, as shown in Fig. 1.



Figure 1. Clustering using Gaussian Mixture Modeling.

We use a modified formulation of cluster-based local outlier factor scores to learn the multivariate outlier boundary. With the proposed methodology, we achieved coverage (correctly identified customer returns) of 89% to 100%. The outlier detection model incurs an additional yield loss that goes from 3.48% to 1.8% as the training set grows from 10 wafers to 18 wafers.

**Keywords:** yield recovery, machine learning, adaptive testing

### INDUSTRY INTERACTIONS

Texas Instruments

### MAJOR PAPERS/PATENTS

[1] D. Neethirajan et al., "Machine Learning-Based Overkill Reduction through Inter-Test Correlation," IEEE VLSI Test Symposium (VTS), 2022.

[2] V. A. Niranjan et al., "Machine Learning-Based Adaptive Outlier Detection for Underkill Reduction in Analog/RF IC Testing," IEEE VLSI Test Symposium (VTS), 2023.

### TASK 2810.059, ULTRA-LOW-POWER ROBUST SAR ADC FOR PMCW AUTOMOTIVE RADAR

YUN CHIU, UNIVERSITY OF TEXAS AT DALLAS, CHIU.YUN@UTDALLAS.EDU

### SIGNIFICANCE AND OBJECTIVES

Work on the transistor-level design of a two-step SAR ADC is presented. We find that, at a given summing-node switch size, distortion performance is improved by the proposed soft summing node (SSN) bottom-plate sampling. Additionally, a comparison-time flash TDC is employed to increase the first-stage resolution.

### **TECHNICAL APPROACH**

The efficacy of the proposed SSN in a sample-and-hold (S/H) circuit was verified with transistor switches. The SAR ADC and comparison-time flash TDC were also simulated at the transistor level. Because of the nonlinearity and the voltage/temperature variation of the TDC, background calibration was implemented. To verify the calibration algorithm without excessively long simulations, the TDC characteristics were extracted from transistor-level simulation and fed into a behavioral simulation of the background calibration.

### SUMMARY OF RESULTS

A schematic of the first stage SAR ADC is shown in Fig. 1 below.



Figure 1. First stage SAR ADC with passive offset cancellation to correct summing-node swing in bottom-plate sampling circuit and flash TDC to assist first stage resolution.

The summing node of the bottom-plate sampling circuit, node 2, is a soft summing node. This means that $M_2$ is small, resulting in smaller nonlinear parasitics on the summing node at the expense of increased swing, and therefore distortion, on that node. However, since the capacitor $C_x$ also samples the summing node voltage at the sampling edge, the swing and distortion on that node are captured and are not seen by the comparator or residue amplifier. Fig. 2 shows, in a transistor-level simulation of an S/H circuit, that soft summing node cancellation improves distortion performance at a given $M_2$ size or, equivalently, allows a smaller $M_2$ to achieve the same distortion performance.



Figure 2. Bottom plate S/H distortion comparison from transistor level simulation.

To assist in the first stage resolution so that the design of the residue amplifier and second stage are relaxed, a flash TDC is used to measure the time taken by the comparator during the last bit cycle of the first stage SAR ADC. This comparison time is logarithmically related to the magnitude of the voltage residue seen during the last bit cycle, so time amplification is needed to ensure sufficient resolution and calibration is needed to undo the inherent non-linearity of the comparison time. The convergence of the background calibration can be seen in the ENOB of the first stage in Fig. 3 below.



Figure 3. Convergence of TDC background calibration in behavior simulation with TDC characteristics extracted from transistor level simulation.
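The logarithmic relation between comparison time and residue magnitude can be illustrated with the textbook first-order model of a regenerative comparator, $t = \tau \ln(V_{logic}/|V_{res}|)$. The model and both parameter values below are assumptions for illustration, not values extracted from this design.

```python
import math

TAU_PS = 10.0   # hypothetical regeneration time constant, in ps
V_LOGIC = 1.0   # hypothetical output level that ends the decision, in V

def comparison_time_ps(v_residue):
    """Decision time of an ideal regenerative latch for a given input residue."""
    return TAU_PS * math.log(V_LOGIC / abs(v_residue))

def residue_magnitude_from_time(t_ps):
    """Inverse mapping: the ideal linearization a calibration would approximate."""
    return V_LOGIC * math.exp(-t_ps / TAU_PS)

residues_v = [8e-3, 4e-3, 2e-3, 1e-3]
times_ps = [comparison_time_ps(v) for v in residues_v]
steps_ps = [later - earlier for earlier, later in zip(times_ps, times_ps[1:])]
recovered_v = [residue_magnitude_from_time(t) for t in times_ps]

for v, t in zip(residues_v, times_ps):
    print(f"|V_res| = {v * 1e3:.0f} mV -> t = {t:.2f} ps")
```

In this model each halving of the residue adds only $\tau \ln 2$ to the comparison time, which is consistent with the need for time amplification and for a calibration step that undoes the logarithmic compression.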

In the following year, the plan is to tape out a prototype chip in a 22-nm process and measure the results in silicon.

**Keywords:** soft summing node (SSN), summing-node swing, summing-node distortion, flash TDC, background calibration

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

### TASK 2810.064, CHARACTERIZATION AND TOLERANCE OF AGEING IN INTEGRATED VOLTAGE REGULATORS

SAIBAL MUKHOPADHYAY, GEORGIA TECH UNIVERSITY, SAIBAL@ECE.GATECH.EDU

### SIGNIFICANCE AND OBJECTIVES

The project will develop circuit techniques and design methodologies to model, characterize, and tolerate ageing in integrated voltage regulators, including on-chip inductive buck and digital low dropout regulators, used in modern SoCs.

### TECHNICAL APPROACH

The objectives of this proposal are to (1) analyze the effects of ageing in IVRs, (2) design test circuits to efficiently characterize ageing in IVRs, (3) estimate the ageing-induced design margin for IVRs and develop circuit techniques to tolerate ageing, and (4) explore on-line tuning for tolerating ageing in IVRs. We will focus on high-frequency inductive buck regulators and DLDOs, both with digital voltage-mode control topologies.

### SUMMARY OF RESULTS

We developed a simulation methodology to analyze the effects of HCI on the transient performance and efficiency of a digitally controlled IVR designed in 65-nm CMOS (Fig. 1). Following the development of the simulation framework, an IVR with an on-chip reliability monitor for evaluating Negative Bias Temperature Instability (NBTI) and Hot Carrier Injection (HCI) was taped out in 65-nm CMOS (Fig. 2). The chip incorporates a power stage, an Analog-to-Digital Converter (ADC), a PID compensator, and a Digital Pulse Width Modulator (DPWM) as the primary components that ensure IVR functionality and are subject to stress. Any performance degradation due to ageing effects can be detected with the incorporated measurement circuits. Relevant tolerance circuits have also been designed.

The bond wires of the package connecting the $V_{sw}$ and $V_{out}$ pads serve as the output inductance. On-chip MIM and MOS capacitance serve as the output capacitance, with MOS capacitance placed in any available space. Thick-oxide MOSFETs are employed for power gating, facilitating rapid on-chip switching of the power source voltage and eliminating ageing effects on unrelated circuits. The chip and its individual components generally have four modes: Operation mode for regular functioning, Stress mode for high-voltage supply stress, Measurement mode for ageing degradation detection, and Stop mode, which halts operation and floats the power supply. The full-chip layout is shown in Fig. 3.



Figure 1. IVR Ageing simulation framework.



Figure 2. Circuit diagram of the test chip.



Figure 3. The layout of the test chip.

Keywords: Integrated voltage regulator, ageing

### INDUSTRY INTERACTIONS

IBM, Intel, NXP

### MAJOR PAPERS/PATENTS

[1] S. Zhang et al., "Analysis of the Effect of Hot Carrier Injection in An Integrated Inductive Voltage Regulator," IEEE/ACM ISLPED 2022.

### TASK 2810.065, POWER-EFFICIENT AND RELIABLE 48-V DC-DC CONVERTER WITH DIRECT SIGNAL-TO-FEATURE EXTRACTION AND DNN-ASSISTED MULTI-INPUT MULTIPLE-OUTPUT FEEDBACK CONTROL

MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

### SIGNIFICANCE AND OBJECTIVES

The goal of this project is to extract critical features of the large step-down ratio DC-DC converter and perform optimization. The proposed feature extraction and optimization functions would improve the converter's efficiency and enhance the converter's reliability.

#### **TECHNICAL APPROACH**

We designed a high-efficiency 24V-to-1V buck converter featuring two techniques to improve efficiency and reliability. First, we propose a fast *in-situ* efficiency tracking (FIT) technique, which helps to maintain high power efficiency across load conditions and process, voltage, and temperature variations. Second, we propose a power-FET code roaming technique to avoid excessive aging on specific segments.

#### SUMMARY OF RESULTS

We designed, fabricated, and tested a 24V-to-1V buck converter prototype in a 180-nm BCD process. In doing so, we proposed and verified two optimization techniques for a 24V-to-1V DC-DC converter. In the following year, we will design a second prototype chip focusing on in-situ health monitoring and reliability enhancement of a 48V-to-1V converter.



Figure 1. (a) Efficiency curves across different numbers of active segments ( $N_{act}$ ). (b) Efficiency curves across three different efficiency tracking methods.

The prototype implements two novel optimization techniques. First, we proposed a fast *in-situ* efficiency tracking technique. Second, we implemented a power-FET code roaming technique that avoids the excessive aging of specific power FET segments, slowing down on-resistance ( $R_{on}$ ) degradation.

Fig. 1 illustrates the efficiency tracking test results. Fig. 1(a) shows the efficiency curves across different numbers of active segments,  $N_{act}$ . Fig. 1(b) compares the efficiency curves for three tracking cases: case (1) without efficiency tracking, case (2) with LUT-based initial guess only, and case (3) with the FIT method. With the proposed FIT method, the converter achieves a peak efficiency of 93.89% at 405mA. The FIT improves efficiency by 34.74% compared to case (1) and by 1.67% compared to case (2). It also enables >85% efficiency across a wide range of load currents from 65mA to 5A (76.92×).
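The report does not describe the internals of the FIT algorithm. As a rough illustration of the idea of starting from a LUT-based guess and then refining the number of active segments, the following Python sketch uses a toy loss model (conduction loss falling with $N_{act}$, switching loss rising with it) and a discrete hill climb standing in for the gradient-descent search named in the keywords. The loss coefficients, the LUT contents, and all function names are assumptions made for illustration only.

```python
def efficiency(n_act, i_load, v_out=1.0, r_seg=0.4, p_sw_seg=0.004, p_fixed=0.002):
    """Toy model (assumed): conduction loss ~ 1/n_act, switching loss ~ n_act."""
    p_out = v_out * i_load
    p_cond = i_load ** 2 * r_seg / n_act
    p_sw = p_sw_seg * n_act
    return p_out / (p_out + p_cond + p_sw + p_fixed)


def lut_initial_guess(i_load, lut):
    """Return the N_act stored for the nearest tabulated load current."""
    nearest = min(lut, key=lambda i_ref: abs(i_ref - i_load))
    return lut[nearest]


def track_efficiency(i_load, lut, n_min=1, n_max=16):
    """Start from the LUT guess, then step N_act by +/-1 while efficiency improves."""
    n = lut_initial_guess(i_load, lut)
    best = efficiency(n, i_load)
    while True:
        candidates = [m for m in (n - 1, n + 1) if n_min <= m <= n_max]
        m_best = max(candidates, key=lambda m: efficiency(m, i_load))
        if efficiency(m_best, i_load) <= best:
            return n, best
        n, best = m_best, efficiency(m_best, i_load)


# Hypothetical LUT: load current (A) -> initial N_act
lut = {0.1: 1, 0.5: 5, 1.0: 10, 2.0: 16}
n_opt, eff = track_efficiency(0.4, lut)
print(n_opt, round(eff, 4))
```

In this toy model the LUT alone would pick five segments at 0.4 A, while the refinement step settles on four, which mirrors the gap between case (2) and case (3) only qualitatively.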



Figure 2. (a) The proposed code roaming technique slows  $R_{on}$  degradation by 5.8×. (b) It also improves efficiency.

Fig. 2 exhibits the test results of power FET code roaming. Fig. 2(a) shows the aging measurement on ML2's  $R_{on}$ . We accelerate the aging process by operating the converter at 125°C. The measurement results show that the code roaming technique can slow the  $R_{on}$  degradation by 5.8×. Fig. 2(b) shows the efficiency improvement measurement with the code roaming. We estimate that code roaming mitigates local hotspot development and thereby improves efficiency. The efficiency improvement increases with  $I_{load}$  and achieves 0.18% at  $I_{load}$ =900mA,  $N_{act}$ =4.
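The roaming schedule used on the chip is not given in this summary. A minimal sketch of the underlying idea, assuming a simple round-robin rotation of the active segment window, shows how roaming spreads stress evenly across segments instead of concentrating it on the same few. The segment counts and function names are hypothetical.

```python
def roaming_codes(n_segments, n_act, n_steps):
    """Yield the active segment indices per step, rotating the start index (assumed policy)."""
    for step in range(n_steps):
        start = step % n_segments
        yield [(start + k) % n_segments for k in range(n_act)]


def stress_counts(n_segments, n_act, n_steps, roam=True):
    """Count how many steps each segment spends conducting."""
    counts = [0] * n_segments
    if roam:
        for active in roaming_codes(n_segments, n_act, n_steps):
            for idx in active:
                counts[idx] += 1
    else:
        # Fixed code: the same first n_act segments are always on.
        for idx in range(n_act):
            counts[idx] += n_steps
    return counts


print(stress_counts(8, 4, 16, roam=False))
print(stress_counts(8, 4, 16, roam=True))
```

With a fixed code, four of the eight segments absorb all of the stress. With rotation, every segment sees half of it.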

**Keywords:** efficiency tracking, LUT-based control, gradient descent, code roaming, reliability enhancement

### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

### MAJOR PAPERS/PATENTS

[1] Z. Wang et al., "93.89% Peak Efficiency 24V-to-1V DC-DC Converter with Fast In-Situ Efficiency Tracking and Power-FET Code Roaming," 2023 IEEE ESSCIRC, September 2023.

### TASK 2810.066, DEMONSTRABLY GENERALIZABLE COMPACT MODELS OF ESD DEVICES

ELYSE ROSENBAUM, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN, ELYSE@ILLINOIS.EDU

### SIGNIFICANCE AND OBJECTIVES

This project will develop methodologies for creating charge-based compact models of ESD protection devices and verifying that the model correctly represents the device's response to arbitrary stimuli. This will allow designers to use simulation to create protection circuits that can protect I/O pins in advanced nodes without compromising signal integrity.

### **TECHNICAL APPROACH**

Compact models suitable for transient and AC analysis are in development. MOSFETs are used both in the functional circuits and the on-chip protection networks; we are developing an ESD wrapper model that does not compromise the accuracy of AC and transient simulations of normal operating conditions. Transient models of ESD devices are validated using waveforms that are different from those used for parameter extraction. Interconnect models suitable for CDM-ESD simulations are derived.

### SUMMARY OF RESULTS

The ESD diode models reported previously were used to design an integrated distributed protection and impedance-matching network in 65-nm CMOS.

The voltage response of the ESD protection network to a 2.5-ns long 3-A current pulse with 100-ps risetime was measured; it was also simulated using the compact models from this project. The voltage at the input of the active receiver is shown in Fig. 1. Only simulation Model #1 (in yellow) matches the measurement results (in blue). Model #1 includes S-parameter models of the top two power routing metals; those models were extracted from EM simulation using ADS Momentum. Models #2 and #3 include RC-extracted models of the power and ground buses. Using Model #1, we learn that the (detrimental) voltage overshoot at the receiver input does not reflect the characteristics of the protection devices. Instead, it results from parasitic inductance in the current return path, and it can be reduced using an ESD-aware floor plan and routing. Simulation reveals that the ESD diodes' forward recovery transient is compensated by the on-chip decoupling capacitance. That capacitance reduces the voltage drop across the rail clamp at the same instant that forward recovery increases the voltage across the diodes; the diodes and rail clamp are in series on the discharge path.



Figure 1. Measured and simulated voltage transients at the output of the integrated matching and ESD protection circuit.



Figure 2. Response of N-well ESD diode to a decaying sinusoidal pulse. A circuit simulation using a non-quasi-static model matches the measurement results.

Other major activities focused on test structure design and the testing methodology. It was observed that only a non-quasi-static (NQS) ESD diode model matches the measured response to oscillatory TLP, demonstrating the utility of that measurement technique for model validation. New test structures were designed for parameter extraction of an ESD MOSFET model that is scalable to the layout dimensions and number of fingers. An on-chip probe to measure CDM waveforms at the chip level was designed and is being fabricated.

Keywords: ESD, CDM, compact models, circuit simulation

### INDUSTRY INTERACTIONS

AMD, Intel, NXP, Texas Instruments

### MAJOR PAPERS/PATENTS

[1] S. Huang et al., "Physics-based compact model of N-Well ESD diodes," to appear in 2023 EOS/ESD Symposium Proceedings.

[2] M. Drallmeier et al., "Distributed protection for high-speed wireline receivers," to appear in 2023 EOS/ESD Symposium Proceedings.

The main goal of this task is to develop a synthesizable odometer IP for effectively monitoring long-term aging in a real IC product. In collaboration with our industry sponsors, we have designed a synthesized version of the silicon odometer aging sensor in 12 nm, modifying it according to the results from our 65-nm test chip data.

#### TECHNICAL APPROACH

Our previous silicon odometer circuit, based on the beat frequency (BF) detection technique, was implemented fully in structural and behavioral Verilog. The synthesizable design involves no manual layout and gives the flexibility to port the design to any technology. The design includes product-friendly features such as a calibration-free power supply, a Verilog module that enables the circuit to come out of the dead zone, and a technique to remove glitches in the ring oscillators during startup.

#### SUMMARY OF RESULTS

Our original prototype odometer had calibration properties so that the ring oscillator frequency could be tuned to avoid the dead zone in our test setup. We removed the calibration setup and added a dead-zone removal technique. Any design that contains a ring oscillator can generate glitches at startup (these usually cannot be seen in pre-layout simulation but can be caught in post-layout simulation). Our current design includes a glitch removal circuit for the ring oscillator. The design includes ring oscillators with 4 different threshold voltage (V<sub>th</sub>) flavors (RVT, LVT, SLVT, and HVT) and 3 different standard cells (Inverter, NAND, and NOR). The stress ring oscillators have 101 stages and the reference ring oscillators have 103 stages of the specific standard cell. We have 22 odometers for each  $V_{th}$; a total of 88 odometers are connected in a daisy chain.
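The readout equations are not given in this summary. Assuming the commonly described beat-frequency formulation, in which the sensor counts reference periods within one beat period, $N = f_{ref} / |f_{stress} - f_{ref}|$, the following sketch shows why a small frequency shift produces a large change in count and why equal frequencies create a dead zone. The per-stage delay and the 1% slowdown are hypothetical values.

```python
def ro_frequency(n_stages, stage_delay_s):
    """Ideal ring oscillator frequency: one period is two passes around the ring."""
    return 1.0 / (2.0 * n_stages * stage_delay_s)


def beat_count(f_stress, f_ref):
    """Reference periods per beat period (assumed readout formulation)."""
    diff = abs(f_stress - f_ref)
    if diff == 0.0:
        raise ValueError("dead zone: stress and reference frequencies coincide")
    return f_ref / diff


t_d = 10e-12  # hypothetical per-stage delay
f_ref = ro_frequency(103, t_d)
fresh = beat_count(ro_frequency(101, t_d), f_ref)
aged = beat_count(ro_frequency(101, t_d * 1.01), f_ref)  # stress RO 1% slower
print(round(fresh, 2), round(aged, 2))
```

In this idealized model the 101-stage and 103-stage pair starts at a count of about 50.5, and a 1% slowdown of the stressed oscillator roughly doubles it.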

We used automatic synthesis and place and route (PnR) tools for the layout of the synthesizable register-transfer-level (RTL) design. In our 12-nm design, we chose a flat PnR structure instead of a hierarchical one (generating the ring oscillator (RO) block first, then using it as a macro in the top-level layout), which made the layout more compact and the runtime shorter. The power-supplying inverters, which have been split into four 8x inverters, are placed close to the ring oscillators so that the connecting wires are shorter. The connections from the inverters to the RO power rails are made wider to reduce the resistance. Fig. 1 shows the current layout structure of the 12-nm LVT odometer.



Figure 1. Layout of 12-nm LVT Odometer, 2 sets of ring oscillators of Inverter, NAND and NOR are placed on the left side, separated from the control circuit on the right.

**Keywords:** Silicon odometer, synthesizable, Verilog, automatic place and route, 12nm test chip

#### INDUSTRY INTERACTIONS

Intel, NXP

# MAJOR PAPERS/PATENTS

[1] T. Islam, J. Kim, D. Tipple, M. Nelson, R. Jin, A. Jarrar, and C.H. Kim, "A Calibration-Free Synthesizable Odometer Featuring Automatic Frequency Dead Zone Escape and Start-up Glitch Removal," International Reliability Physics Symposium (IRPS), 2022.

# TASK 2810.074, THERMAL PERFORMANCE CHARACTERIZATION AND DEGRADATION MONITORING OF LDMOS BASED INTEGRATED POWER IC WITH ON-DIE TEMPERATURE SENSORS

BILAL AKIN, UNIVERSITY OF TEXAS AT DALLAS, BILAL.AKIN@UTDALLAS.EDU

# SIGNIFICANCE AND OBJECTIVES

The Lateral Diffused MOS (LDMOS) device is extensively employed in power integration applications. As current density and switching speed undergo an exceptional rise, the stress induced by voltage overshoot intensifies. Therefore, it is necessary to investigate the robustness of LDMOS devices under the dynamic reliability test.

# **TECHNICAL APPROACH**

To expedite testing and obtain reliable results, a large-scale test setup is employed, enabling the simultaneous application of stress to multiple devices. The system uses a modular design to facilitate straightforward expansion of stress and measurement capabilities. The symmetrical layout of the motherboard ensures that the pulse widths among the distributed modules are nearly the same. The customized isolated LDMOS devices are designed to explore the impact of the connection and width of the isolation well on device resilience under dynamic reliability tests.

#### SUMMARY OF RESULTS

Two types of switches, both rated for 20 V, have been designed with the connections and parameters outlined in Table 1. Six switches of each type are evaluated. The double pulse test (DPT) is a commonly used technique for assessing the dynamic performance of power devices. The avalanche test is conducted to determine the avalanche current, and the results show that the desired load current for the DPT is 2A. Therefore, a repetitive DPT with a 2A load current is applied to age the devices. The key electrical parameters, including the threshold voltage  $(V_{th})$, the body diode forward voltage ( $V_f$ ), the on-state resistance ( $R_{ds,on}$ ), and the drain-source leakage current ( $I_{DSS}$ ), are periodically measured over the cycles with the Keysight B1506A curve tracer. The test started with  $R_g=5\Omega$, and after 48 billion cycles  $R_g$  was changed to 20 $\Omega$. Increasing the gate resistance increases the overlap area so that the hot carrier effect can be observed. The test also started with a cooling system that kept the temperature at 25°C. After 114 billion cycles had been applied to the devices, the fans were turned off to age the devices faster. The junction temperature saturated at 42°C after the fans were turned off, and the shift in device parameters was negligible. To age the devices faster, the bus voltage was increased to 35V. V<sub>th</sub> remained stable during the test, which indicates that the hot carrier effect is not significant. The results demonstrate that the only electrical parameter that has shifted is the drain leakage current ( $I_{DSS}$ ). Fig. 1 shows the  $I_{DSS}$  for both types of devices. Of the switches tested, one device from the SW6 group and two from the SW7 group failed after 225 billion cycles. The failures are mainly associated with the increase in drain leakage current and the breakdown of the nbl\_iso in the device.

Table 1. Specification of LDMOS devices.







**Keywords:** LDMOS, reliability, dynamic reliability test, degradation mechanisms, aging precursor

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] C. Xu et al., "A Reconfigurable AC Power Cycling Test Setup for Comprehensive Reliability Evaluation," in IEEE Transactions on Industry Applications.

This research seeks to increase lifetime and enable circuit operation closer to the reliability limit to improve performance. To reduce complexity and cost, approaches to estimate noise degradation using surrogate sensors will be investigated. This research may provide a step toward a framework for predicting time to failure.

# **TECHNICAL APPROACH**

Noting that the aging of nano-scale transistors is highly variable and that noise is one of the parameters most sensitive to transistor aging, this research will investigate the feasibility of increasing the lifetime of circuits by monitoring noise performance degradation and intelligently reconfiguring the circuit. More specifically, PLLs and downconverters will be built with arrays of near-minimum-size transistors from which a subset can be selected after fabrication to reduce noise. The feasibility of recovering circuit noise performance by replacing transistors whose noise has increased due to aging with fresh, lower-noise transistors will be evaluated.

# SUMMARY OF RESULTS

This research effort, involving collaboration with Prof. Y. Makris of UT Dallas and Prof. C. Kim of U. of Minnesota, is experimentally evaluating the initial feasibility of increasing lifetime by replacing degraded devices with low-noise devices that are not aged. The PLL is fabricated in 65-nm CMOS. Arbitrary combinations of 64 pairs can be selected for stressing and for VCO operation. The PLL includes an on-chip phase noise measurement circuit that can be used for the on-chip selection of combinations of pairs with low phase noise after fabrication and after stress. This year, the performance of the on-chip phase noise measurement circuit was evaluated.



Figure 1. PLL with an on-chip phase noise measurement circuit that will be used for initial stress and healing experiments to investigate the feasibility of improving lifetime.

Fig. 2 shows the phase noise measured using the on-chip measurement circuit along with an external ADC versus the phase noise measured using an E5052B Signal Source Analyzer at 500-kHz, 1-MHz, and 2-MHz offsets from a carrier near 4 GHz. The deviations between the two techniques are less than ~1.2 dB. The lowest phase noise measured using the on-chip circuit is -119 and -128 dBc/Hz at 1-MHz and 2-MHz offset, respectively. Compared to state-of-the-art on-chip phase noise measurement circuits, the minimum phase noise measured with this circuit is 2 to 3 dB lower than their minimum measured phase noise normalized to the same offset and carrier frequency. The phase noise measurement circuit in this work does use an off-chip SAW filter.

The minimum phase noise that can be measured with the measurement circuit is limited by the input-referred noise of the amplifier chain and the frequency divide ratio of PLL. For the measured input referred current noise of 1.7 pA/ $\sqrt{\text{Hz}}$ , the minimum phase noise at 1-MHz offset frequency that this circuit with the 125-MHz frequency reference can measure should be ~-152 dBc/Hz from a 4-GHz carrier or ~-164 dBc/Hz from a 1-GHz carrier.
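The two figures quoted above are related by the standard carrier-frequency scaling of phase noise at a fixed offset, $20\log_{10}(f_2/f_1)$. A short sketch of that normalization follows; the function name is illustrative.

```python
import math


def normalize_phase_noise(l_dbc_hz, f_carrier_hz, f_carrier_ref_hz):
    """Scale phase noise at a fixed offset from one carrier frequency to another."""
    return l_dbc_hz + 20.0 * math.log10(f_carrier_ref_hz / f_carrier_hz)


# -152 dBc/Hz at a 4-GHz carrier, referred to a 1-GHz carrier
print(round(normalize_phase_noise(-152.0, 4e9, 1e9), 2))
```

Dividing the carrier by four lowers the phase noise by about 12 dB, consistent with the ~-152 dBc/Hz and ~-164 dBc/Hz values given for the 4-GHz and 1-GHz carriers.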



Figure 2. PLL phase noise measured with the on-chip phase noise measurement circuit versus that measured with an E5052B at 500-kHz, 1-MHz and 2-MHz offset frequencies.

**Keywords:** PLL, downconverter, noise measurements, post-aging selection, lifetime

# INDUSTRY INTERACTIONS

Intel, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] P. Yelleswarapu, A. Jha, R. Willis, Y. Makris, and K. K. O, "Phase Noise Reduction in LC VCO's Using an Array of Cross-Coupled Nano-Scale MOSFETs and Intelligent Post-Fabrication Selection," IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, vol. 70, no. 6, pp. 3244-3256, June 2022.

# TASK 2810.084, SOFT AND HARD ANALOG FAULT DETECTION, INJECTION, COVERAGE, DIAGNOSIS, AND LOCALIZATION STRATEGIES SUITABLE FOR PRODUCTION TEST AND IN-FIELD TEST

DEGANG CHEN, IOWA STATE UNIVERSITY, DJCHEN@IASTATE.EDU

# SIGNIFICANCE AND OBJECTIVES

Mission critical applications demand extreme reliability, mandating high electronic fault coverage. Analog circuits account for a tiny portion of transistors but a major portion of failures. This project develops cost-effective strategies for detecting soft and hard analog faults for both production and in-field testing, targeting significantly enhanced reliability and robustness.

#### **TECHNICAL APPROACH**

We will work with industry liaisons to identify the most relevant AMS-IP blocks as research vehicles. Digital-like detectors and injectors will be inserted to boost structural observability and controllability so that high defect coverage becomes possible from the digital detectors, making specification simulation unnecessary. Instead, a DC parametric sweep is used for coverage evaluation, dramatically reducing simulation time and allowing all single-defect injections to be evaluated for accurate coverage. Soft defects are modeled as device parametric changes or as opens/shorts in sub-unit components. The detectors' output codes and the injectors' input control codes will be analyzed to determine the defect location and type for diagnosis.

#### SUMMARY OF RESULTS

We developed a cost-effective defect detection method using digital detectors and digital injectors, and a defect localization and diagnosis method. We applied the methods to multiple op amp and LDO structures and validated their superior performance. We have also developed a time-efficient analog and mixed-signal defect simulation framework. We tested the framework using multiple AMS circuits such as op amps, a SAR ADC, and an FVF LDO, and across multiple defect detection methods such as IOI and OTM.

Fig. 1 illustrates the proposed analog defect detection method. The circuit under test is a fast-transient LDO, consisting of a transient-enhanced power stage, an error amplifier with slew rate enhancement, and a bias block with reference generation. The pink gates are the digital detectors. The two blue transistors and switches in the error amplifier are an example of digital injectors. The area overhead is <4% and the insertion impact is negligible. With the proposed method, no specification simulation is needed. Better than 95% true coverage was achieved.



Figure 1. Analog defect detection and control injection for LDO.





Fig. 2 shows the flow chart of the proposed framework for defect coverage simulation, which greatly speeds up the simulation. We modify the netlist once for all defects and simulate the circuit sequentially. This avoids the repetitive work of generating a netlist for each defect and reduces the time overhead of interfacing with the data/file-handling system, which translates to much faster simulation.
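The framework itself is not reproduced in this summary. The following toy sketch only illustrates the flow described above: a single parameter set is built once, each defect is switched in and out in sequence, and a digital-like window detector decides whether it is caught. The resistive divider, the defect list, the detector window, and all names are hypothetical stand-ins for a real netlist and simulator.

```python
NOMINAL = {"r_top": 10e3, "r_bot": 10e3}

# (name, affected parameter, defective value): shorts, opens, and one soft defect
DEFECTS = [
    ("r_top_short", "r_top", 1e-3),
    ("r_top_open", "r_top", 1e9),
    ("r_bot_short", "r_bot", 1e-3),
    ("r_bot_open", "r_bot", 1e9),
    ("r_bot_soft", "r_bot", 10.5e3),
]


def simulate(params, v_in=1.0):
    """Stand-in for a DC simulation: output of a resistive divider."""
    return v_in * params["r_bot"] / (params["r_top"] + params["r_bot"])


def detector(v_out, low=0.45, high=0.55):
    """Digital-like detector: returns 1 when the node leaves its window."""
    return int(not (low <= v_out <= high))


def defect_coverage(nominal, defects):
    params = dict(nominal)  # the "netlist" is prepared once
    detected = []
    for name, key, value in defects:
        params[key] = value          # activate one defect
        if detector(simulate(params)):
            detected.append(name)
        params[key] = nominal[key]   # restore before the next defect
    return len(detected) / len(defects), detected


coverage, caught = defect_coverage(NOMINAL, DEFECTS)
print(coverage, caught)
```

In this toy case the four hard defects are detected and the 5% soft defect escapes the window, giving 80% coverage.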

**Keywords:** Hard and soft defect detection, analog defect coverage, defect diagnosis and localization, in-field test

# INDUSTRY INTERACTIONS

IBM, Intel, NXP, Richtek, Siemens, Texas Instruments

# MAJOR PAPERS/PATENTS

- [1] M. Ganji et al., ISCAS, May 2022.
- [2] M. Saikiran et al., ISCAS, May 2022.
- [3] M. Saikiran et al., SBCCI, Aug 2022.
- [4] M. Saikiran et al., SBCCI, Aug 2022.
- [5] M. Sekyere et al., IOLTS, Sep 2022.

# TASK 2810.086, MACHINE LEARNING-BASED FUNCTIONAL SAFETY IMPROVEMENT OF AMS COMPONENTS IN AUTOMOTIVE SOCS

KANAD BASU, THE UNIVERSITY OF TEXAS AT DALLAS, KANAD.BASU@UTDALLAS.EDU


# SIGNIFICANCE AND OBJECTIVES

In this project, we propose a data-driven anomaly detection framework, along with a signal selection technique, tailored to AMS Functional Safety (FuSa) for automotive systems. To this end, we have developed a novel explainable, unsupervised learning-based early anomaly detection framework for automotive AMS circuits that includes signal and feature selection.

# **TECHNICAL APPROACH**

The extensive adoption of safety-critical applications in the automotive domain has emphasized upholding the FuSa of the associated electrical and electronic systems. In this work, we augment our existing anomaly detection framework (ITC 2022) by proposing a genetic algorithm-based feature selection approach and a novel signal selection algorithm to maximize detection accuracy while reducing the associated latency. The proposed explainable AI framework provides insights into the workings of our existing anomaly detection model and improves user interpretability and transparency, which can be provided as feedback to the designer during circuit design and validation.

#### SUMMARY OF RESULTS

The overview of the proposed FuSa violation detection framework is illustrated in Fig. 1. In this work, we have evaluated our solution using a bandgap voltage reference circuit and an operational amplifier circuit, both of which are prevalent in modern automotive systems-on-chips. Additionally, we consider the following unsupervised learning algorithms as our anomaly detection model: (1) Gaussian Mixture Model (GMM); (2) Agglomerative clustering; (3) k-means; and (4) Affinity propagation. The principal reasons for selecting these algorithms are attributed to their proficiency in outlier detection, robustness, and ability to scale with data dimensionality.
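As a minimal illustration of how one of the listed unsupervised algorithms can be used for anomaly detection, the sketch below clusters fault-free samples with plain k-means and flags any new sample that lies farther than a threshold from every centroid. It is not the framework evaluated in this task. The two-dimensional features, the starting centroids, the threshold, and the function names are all assumptions.

```python
import math


def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd iterations from the given starting centroids."""
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return centroids


def anomaly_flags(samples, centroids, threshold):
    """Flag samples whose distance to the nearest centroid exceeds the threshold."""
    return [min(math.dist(s, c) for c in centroids) > threshold for s in samples]


# Hypothetical fault-free features from two operating conditions
golden = [(1.20, 0.10), (1.21, 0.11), (1.19, 0.09),
          (0.60, 0.50), (0.61, 0.49), (0.59, 0.51)]
centers = kmeans(golden, [(1.0, 0.0), (0.5, 0.5)])
print(anomaly_flags([(1.205, 0.10), (0.90, 0.30)], centers, threshold=0.05))
```

The first test sample sits inside a learned cluster and is accepted, while the second falls between clusters and is flagged.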

From our analysis, we can infer that the proposed feature selection approach furnishes up to 100% anomaly detection accuracy using a 5-dimensional feature space and GMM algorithm. This is a 7.2% improvement in detection performance compared to our existing anomaly detection model. Our signal selection algorithm identifies the best intermediate circuit signal, which when used as an observation signal produces up to 98% detection accuracy and a 2.3× reduction in detection latency. Additionally, the XAI framework identifies the feature split value for efficient clustering and furnishes feature importance maps that can be used to perform classification of anomalous behavior.



Figure 1. Overview of the proposed FuSa violation detection framework.

We have identified the following action items as our deliverables for next year:

• Perform abstraction from the component level to the SoC level.

**Keywords:** Functional Safety, AMS Circuits, Feature Selection, Signal Selection, Explainable AI

# INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

# TASK 2810.089, TECHNIQUES FOR LOW-COST DESIGN, TEST, AND CALIBRATION OF RF MIMO SYSTEMS

SULE OZEV, ARIZONA STATE UNIVERSITY, SULE.OZEV@ASU.EDU

GEORGE TRICHOPOULOS, ARIZONA STATE UNIVERSITY

# SIGNIFICANCE AND OBJECTIVES

This project aims to achieve the goal of lowering the overall production cost of RF MIMO systems by developing techniques for antenna/radome design, design and judicious insertion of built-in self-test (BIST) and calibration components, and a system-level test development without requiring far-field testing.

#### **TECHNICAL APPROACH**

The proposed approach includes two phases of optimization. In Phase 1, we will develop a system-level model that includes imperfections of the RF front-end as well as the antenna and radome. This model will be used to determine what information can be extracted by using only mission-mode signals and what information is needed to achieve the calibration levels that satisfy system-level requirements. In Phase 2, we will explore the design space of antenna/radome and BIST components to help determine the best solution for the target determined in Phase 1.

### SUMMARY OF RESULTS

We propose to turn an undesirable aspect of multi-path RF systems, mutual coupling between TX and RX antennas, into an advantage from a built-in measurement perspective (see Fig. 1). The proposed method includes antenna mismatches and uses only signal steering elements and simple power monitors. Since radars transmit and receive at the same time, the transmit signal inherently couples to the receiver via antenna propagation. While mutual coupling is minimized during design, it is unavoidable and significant coupled RF power exists in the RX path. In the normal mode of operation, the received signal from mutual coupling is removed in the receiver IF path through high-pass filters. However, this signal is still present up to this high-pass filter, and certainly through the RF front-end. Thus, the mutually coupled signal power can be detected via simple power detectors. We use the mutually coupled signal from TX to RX to determine gain and phase mismatches between antenna elements. We developed an algorithm for test application and gain and phase mismatch calculations. We demonstrated and analyzed the proposed technique in hardware, using a cascaded mm-Wave radar device from Texas Instruments.

Figure 1. OTA measurement of RX or TX mismatches.

We use the TIDEP-01012 automotive radar board with 86 virtual antennas for our hardware experiments. The TX power is set to its higher value of 13dBm. Commercial devices typically include power detectors for BIST. However, the proposed technique requires the measurement of the combined power as well as the received power at each receiver element. Thus, for hardware demonstration, we cannot rely on the existing BIST infrastructure. To circumvent this problem, we use the reflected signal measurements at the end of the IF chain. Fig. 2 shows the hardware measurement results. The proposed method provides the desired accuracy for SNR levels above 8dB and ADC resolution above 8 bits. Both requirements are easy to meet for ICs.
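The mismatch-calculation algorithm is not given in this summary. As a minimal illustration of why element powers plus a combined power are enough to carry phase information, the sketch below assumes two coherent signals summed as voltages, so that $P_{comb} = P_1 + P_2 + 2\sqrt{P_1 P_2}\cos(\Delta\phi)$. Under that assumption, power readings alone yield the gain mismatch and the magnitude of the phase mismatch (the sign remains ambiguous). Function names and the example values are hypothetical.

```python
import math


def gain_mismatch_db(p1, p2):
    """Gain mismatch of element 2 relative to element 1, from power readings."""
    return 10.0 * math.log10(p2 / p1)


def phase_mismatch_deg(p1, p2, p_combined):
    """Magnitude of the phase difference from individual and combined powers."""
    cos_phi = (p_combined - p1 - p2) / (2.0 * math.sqrt(p1 * p2))
    cos_phi = max(-1.0, min(1.0, cos_phi))  # guard against measurement noise
    return math.degrees(math.acos(cos_phi))


# Synthetic check: amplitudes 1.0 and 0.9, 30 degrees apart
p1, p2 = 1.0, 0.81
p_comb = p1 + p2 + 2.0 * math.sqrt(p1 * p2) * math.cos(math.radians(30.0))
print(round(gain_mismatch_db(p1, p2), 3), round(phase_mismatch_deg(p1, p2, p_comb), 3))
```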



Figure 2. Hardware results.

Keywords: mm-wave, radar, 5G, BIST

# INDUSTRY INTERACTIONS

**Texas Instruments** 

# MAJOR PAPERS/PATENTS

[1] Ataman, Ferhat Can, Mohammad Aladsani, Georgios Trichopoulos, Chethan Kumar YB, and Sule Ozev, "Mismatch Measurement for MIMO mm-Wave Radars via Simple Power Monitors," in 2023 IEEE European Test Symposium (ETS), pp. 1-6. IEEE, 2023.

Bearing fault detection in fan motors via current signal analysis enables early diagnosis and predictive maintenance. This method reduces costs, enhances safety, and extends motor life, promoting reliable, efficient, and cost-effective motor operation without interruption.

# **TECHNICAL APPROACH**

This study utilizes motor current analysis to identify bearing faults in cooling fan motors, aiming to improve system reliability and prevent costly shutdowns or safety risks. It circumvents issues with traditional vibration-based diagnostics that can cause damage and require sensor installation. Instead, we employ Motor Current Signature Analysis (MCSA), a noninvasive method, to detect these faults early, focusing particularly on mechanical defects.
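The detection algorithms developed in this task are not detailed here. As background, a frequently cited MCSA relationship places bearing-fault signatures in the current spectrum at $|f_s \pm k f_v|$, where $f_s$ is the electrical fundamental and $f_v$ is a characteristic bearing frequency such as the outer-race or inner-race ball-pass frequency. The sketch below computes those frequencies from textbook formulas. The bearing geometry and operating point are hypothetical, and whether this relationship is what the project's algorithms use is an assumption.

```python
import math


def bearing_fault_frequencies(f_rot_hz, n_balls, d_ball, d_pitch, contact_angle_deg=0.0):
    """Outer-race (BPFO) and inner-race (BPFI) ball-pass frequencies."""
    ratio = (d_ball / d_pitch) * math.cos(math.radians(contact_angle_deg))
    bpfo = 0.5 * n_balls * f_rot_hz * (1.0 - ratio)
    bpfi = 0.5 * n_balls * f_rot_hz * (1.0 + ratio)
    return bpfo, bpfi


def current_sidebands(f_supply_hz, f_fault_hz, max_order=2):
    """Expected current-spectrum components at |f_s +/- k * f_fault|."""
    freqs = []
    for k in range(1, max_order + 1):
        freqs.append(abs(f_supply_hz - k * f_fault_hz))
        freqs.append(f_supply_hz + k * f_fault_hz)
    return sorted(freqs)


# Hypothetical fan: 3000 rpm shaft, 8 balls, ball/pitch diameter ratio of 0.2
bpfo, bpfi = bearing_fault_frequencies(50.0, 8, 2.0, 10.0)
print(bpfo, bpfi, current_sidebands(100.0, bpfo, max_order=1))
```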

# SUMMARY OF RESULTS

This project was initiated in the previous year. Tasks 1 and 2 have been completed on schedule, and we are presently focusing on Task 3. Detailed outlines of each task are provided below:

Task 1 & 2: In the project's initial phase, modular test benches (Fig. 1) were devised to analyze fan motor parameters, replicating common bearing faults for comparison with healthy counterparts. This work produced a comprehensive training database and a detailed report, and it informed the development of detection algorithms. After thorough analysis, the most promising time-domain, frequency-domain, and machine-learning algorithms were incorporated into two diagnostic frameworks. Both a simple low-computation flowchart and a complex machine-learning model were created, evaluated, and documented to ensure a robust understanding of their diagnostic capabilities. Fig. 2 provides the confusion matrix and the accuracy obtained from the machine-learning model.

Task 3: In this phase, we aim to fine-tune and optimize the chosen algorithms, focusing on reducing sensitivity to external disturbances and false alarms. Our goal is to ensure their versatility, making them suitable for all small fan motors regardless of power or speed. Upon readiness, we will cooperate with the DRV team to test new products or prototypes, providing support for control or design-related tasks. A detailed report summarizing rigorous testing, optimization, and the universal applicability of the solutions will be provided at the end of this stage.








Figure 2. Confusion matrix for the trained Machine Learning Model.

**Keywords:** Fan motors, bearing fault, time domain analysis, frequency domain analysis, machine learning

# INDUSTRY INTERACTIONS

**Texas Instruments** 

# MAJOR PAPERS/PATENTS

[1] Chen Li, Mojtaba Afshar, and Bilal Akin, "Fault Detection in Small Fan Motors Using MCSA," in IEEE, IEMDC, 2023.

# TASK 2810.091, DEVELOPMENT OF TWO-PHOTON ABSORPTION LASER SYSTEM FOR CREATING SINGLE EVENT EFFECTS

ROBERT BAUMANN, UNIVERSITY OF TEXAS AT DALLAS, ROBERT.BAUMANN@UTDALLAS.EDU

MANUEL QUEVEDO-LOPEZ, UNIVERSITY OF TEXAS AT DALLAS

# SIGNIFICANCE AND OBJECTIVES

Create a two-photon-absorption (TPA) laser system enabling characterization of integrated circuit (IC) heavy-ion susceptibility (an issue for spacecraft electronics) without the need for a cyclotron. Optimize the TPA laser pulse system to mimic the transient charge disturbance produced by the passage of a heavy ion and create a sensitivity mapping capability for IC design debugging.

# TECHNICAL APPROACH

Design and build a TPA laser single-event effects (SEE) characterization system based on readily available off-the-shelf lasers, optics, and motion control hardware, together with customized machine-vision and control software. Use axicon optics to transform the ellipsoid Gaussian beam profile into a long cylindrical Bessel beam profile to better emulate heavy-ion charge tracks. Work with TI on designing/obtaining process technology monitors (diodes with the deepest implant/diffusion structures) and correlate TPA laser pulse energy to equivalent heavy ion linear energy transfer (LET) values.

#### SUMMARY OF RESULTS

The PI has spent the last 6 months reviewing the literature on various laser injection techniques. The PI has been communicating with the team that has the best TPA implementation in the world (USNRL) and has optimized the design parameters based on that interaction. Since January, the PI has been obtaining quotes and competitive bids or single-source justifications for purchases of the laser, optics, and mechanical parts needed to build the TPA system.

The excitation source is a Ti/Yb:sapphire femtosecond (< 380 fs) pulse laser system that provides two beams: an IR beam (1040 nm) and a visible beam (second harmonic, 520 nm). The laser system has an integral pulse picker that allows a single shot to be produced from an electrical trigger pulse. The IR beam is input into an optical parametric amplifier (OPA) that provides beam conditioning and allows conversion and selection of the output beam wavelength. For silicon devices, we plan to operate at 1100-1200 nm, which allows the laser pulse to travel through the silicon substrate with almost no attenuation (the TPA pulse is injected through the backside of the IC). At the focus of this IR beam, the photon intensity is high enough that a large amount of two-photon absorption occurs in the active device layers of the silicon IC.

To mimic heavy-ion events, the pulse shape and energy must be controlled. This is achieved with in-line pulse metrology. Dedicated process test structures from Texas Instruments will be characterized at the TAMU Cyclotron with actual heavy ions of various linear energy transfer (LET) values to provide a reference response of the IC. These results will be correlated with TPA laser pulse injections spanning a range of injected pulse energies. The rest of 2023 is focused on obtaining the needed hardware and assembling and optimizing the prototype in the Center for Harsh Environment Semiconductors and Systems (CHESS).
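The wavelength choice described above can be illustrated with a short photon-energy calculation. The sketch below is not part of the project software; the silicon bandgap value (about 1.12 eV near room temperature) and the function names are assumptions introduced for illustration. It classifies a wavelength by whether one photon, or only two photons together, can bridge the bandgap.

```python
# Photon-energy check for the two-photon absorption (TPA) wavelength window.
# SI_BANDGAP_EV is an assumed room-temperature value, not a number from the report.

HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm
SI_BANDGAP_EV = 1.12    # assumed silicon bandgap near 300 K


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the energy of a single photon in eV for a wavelength in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def absorption_regime(wavelength_nm: float,
                      bandgap_ev: float = SI_BANDGAP_EV) -> str:
    """Classify by the smallest number of photons needed to exceed the bandgap."""
    energy = photon_energy_ev(wavelength_nm)
    if energy >= bandgap_ev:
        return "single-photon"
    if 2 * energy >= bandgap_ev:
        return "two-photon"
    return "higher-order"


if __name__ == "__main__":
    for wavelength in (520.0, 1040.0, 1200.0):
        print(wavelength,
              round(photon_energy_ev(wavelength), 3),
              absorption_regime(wavelength))
```

This simple threshold ignores the weak band-edge absorption of an indirect-gap material, so wavelengths close to 1100 nm sit near the boundary between the two regimes.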

The main activity for 2024 will be focused on the correlation study and development of customized DUT positioning and feature identification software. In 2025 we plan to focus on throughput enhancement to enable faster evaluations. A simplified block diagram of the system is shown below.



Figure 1. Simplified block diagram of the TPA laser injection system for SEE Characterization.

**Keywords:** Harsh environments, Radiation effects, Single-event effects, Two-photon absorption (TPA) laser, Spacecraft reliability

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

Safety-critical, high-volume, long-lifetime applications such as automotive have propelled strong interest in online ageing monitoring. Conventional methods that rely on sufficient design margins lead to a large area and performance overhead, require complex tradeoffs for PT robustness, and are difficult to migrate across technologies. On-chip ageing monitoring can mitigate these problems and extend IC lifetime.

#### **TECHNICAL APPROACH**

Ageing is a random process, and any single ageing monitor inevitably produces an inaccurate assessment of chip health. Different ageing mechanisms affect different circuit parameters differently. We propose to develop integrated ageing sensors that measure device parameters directly degraded by ageing in their operating environment after deployment. The proposed ageing sensors will measure a large population of devices under test, thus enabling the study of the random nature of ageing. They also provide a means for tracking the evolution of ageing effects' probability distributions over different use/stress cases. The quantitative correlation of ageing sensor measurements to the product's event log offers useful lifecycle information.
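The idea of tracking a distribution over a population of devices, rather than a single monitor reading, can be sketched as follows. This is a toy model only: the power-law form of the threshold-voltage shift, the exponent, the prefactor, and the lognormal device-to-device spread are all assumptions chosen for illustration and are not taken from the project.

```python
import random
import statistics


def simulate_vth_shift_mv(n_devices, stress_hours, exponent=0.2,
                          median_prefactor_mv=5.0, sigma=0.3, seed=0):
    """Draw one hypothetical power-law Vth shift per device: A * t**exponent."""
    rng = random.Random(seed)  # fixed seed keeps the device population repeatable
    shifts = []
    for _ in range(n_devices):
        prefactor = median_prefactor_mv * rng.lognormvariate(0.0, sigma)
        shifts.append(prefactor * stress_hours ** exponent)
    return shifts


def summarize(shifts):
    """Return mean, sample standard deviation, and an approximate 95th percentile."""
    ordered = sorted(shifts)
    p95_index = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return {
        "mean": statistics.fmean(ordered),
        "stdev": statistics.stdev(ordered),
        "p95": ordered[p95_index],
    }


if __name__ == "__main__":
    for hours in (10.0, 1000.0):
        print(hours, summarize(simulate_vth_shift_mv(200, hours)))
```

Reporting a spread and an upper percentile alongside the mean is what a population of on-chip sensors makes possible, and what a single monitor cannot provide.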

#### SUMMARY OF RESULTS

During the first 5 months, we investigated various methods of modeling and simulating ageing in devices and circuits. We also started the design of low-cost and fast-detecting circuits for measuring device parameters directly suffering from ageing (TDDB and NBTI monitors).

Fig. 1 shows simulation results from a 20-year deterministic NBTI ageing simulation in RelXpert, showing the output of a proposed NBTI sensor. DUT is stressed with a constant DC voltage of 1.2V at  $27^{\circ}$ C. Simulation shows that the sensor can capture information about the NBTI-induced degradation of the threshold voltage, V<sub>th</sub>, even in the presence of slight device ageing for transistors in the sensor. The core idea of the NBTI sensor is shown to the right of the ageing curve.

Fig. 2 shows the basic schematic of a TDDB sensor that is designed to measure the stress-induced leakage current (SILC) through the gate dielectric. The benefits of this SILC sensor include fast operation (each measurement takes 1-2 clock cycles), large dynamic range (SILC from 200pA to 1uA), PT (process and temperature) robustness, and Constant Voltage Stress (CVS) during measurement. The transfer curve below the schematic shows a large dynamic range and a log-linear relationship.
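A log-linear transfer curve means that the output code grows linearly with the logarithm of the leakage current, so each decade of current occupies roughly the same number of codes. The sketch below illustrates this over the 200 pA to 1 uA range quoted above. The 8-bit resolution and the function names are hypothetical; the actual sensor's code width and transfer coefficients are not stated in the text.

```python
import math

I_MIN_A = 200e-12  # lower end of the SILC range quoted in the text
I_MAX_A = 1e-6     # upper end of the SILC range quoted in the text
N_BITS = 8         # hypothetical output resolution


def silc_to_code(current_a, n_bits=N_BITS):
    """Map a leakage current to a code that is linear in log10(current)."""
    clipped = min(max(current_a, I_MIN_A), I_MAX_A)
    span_decades = math.log10(I_MAX_A / I_MIN_A)
    fraction = math.log10(clipped / I_MIN_A) / span_decades
    return round(fraction * (2 ** n_bits - 1))


def code_to_silc(code, n_bits=N_BITS):
    """Invert the mapping: recover an estimate of the current from a code."""
    fraction = code / (2 ** n_bits - 1)
    return I_MIN_A * (I_MAX_A / I_MIN_A) ** fraction


if __name__ == "__main__":
    for current in (200e-12, 1e-9, 1e-8, 1e-7, 1e-6):
        print(current, silc_to_code(current))
```

With this kind of mapping, a modest code width covers several decades of current at a roughly constant relative resolution.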



Figure 1. 20-year ageing simulation of DUT's  $V_{th}$  change (red) and the measured  $V_{th}$  change (blue). Green is  $V_{th}$  ageing of a transistor in the NBTI sensor shown to the right.



Figure 2. Schematic of a TDDB sensor for fast measurement of stress-induced gate leakage current (SILC). The SILC sensor's transfer curve shows a large dynamic range and a log-linear relationship to the digital output code.

**Keywords:** TDDB, NBTI/PBTI, HCI, online ageing monitor, random ageing

#### INDUSTRY INTERACTIONS

IBM, Intel, NXP, Siemens, Texas Instruments

The goal of this project is to develop a comprehensive framework for fault analysis of analog circuits starting from the design time schematic. This framework will include a comprehensive analysis of fault probability and severity, and enable feasibility by simplifying fault simulations through divide-and-conquer approaches.

# TECHNICAL APPROACH

The framework that will be developed in this project is based on inductive fault analysis principles that determine the defect probability from defect densities and layout information. To reduce fault simulation complexity, this project will rely on two concepts, namely fault pruning and fault response modeling, to enable accurate high-level simulations. Simulating the entire circuit for many faults is still not feasible after fault pruning. A divide-and-conquer approach is therefore used: the fault behavior of subcircuits is captured, and a non-linear model is determined for the behavior of groups of faults, which further reduces the number of simulations.

# SUMMARY OF RESULTS

While analog circuits generally contain fewer transistors compared to digital circuits, they need to be simulated in multiple modes (e.g., transient, harmonic balance, AC). Moreover, some circuits, such as phase-locked loops and mixers, contain both low-frequency and high-frequency signals, requiring very long simulation times. Fault responses need to be analyzed at the transistor level for accuracy. However, simulating the entire circuit at the transistor level is not feasible for many faults. To analyze the fault response in a detailed fashion, the circuit needs to be partitioned in such a way that a parametric response can be evaluated.

Faults that only alter the system-level parameters are easy to incorporate into the system-level model. However, a large percentage of faults will result in manifestations that are not included in the system-level models, such as MATLAB/Simulink, ADS, or Verilog-A. For these faults, we need to alter the system-level model by adding components that represent the fault responses. We will achieve this goal by generating several mathematical templates (pre-defined parametrizable mathematical functions) and matching these mathematical templates with the difference between the faulty and fault-free circuit response. For each fault that needs to be simulated, these differences will be matched to an existing template and the parameters of that template will be determined. The corresponding high-level model to that template will be inserted into the system-level model to enable propagating fault response to the system-level. This concept has been illustrated on a charge-pump PLL circuit, which contains both high frequency and low frequency signals resulting in notoriously long simulation times.
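The template-matching step can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the project's implementation: it uses only two hypothetical templates (a constant offset and a gain error proportional to the fault-free response), fits each by least squares to the difference between the faulty and fault-free responses, and selects the template with the smallest residual.

```python
def fit_offset(fault_free, delta):
    """Template 'offset': delta is modeled as a constant c."""
    c = sum(delta) / len(delta)
    sse = sum((d - c) ** 2 for d in delta)
    return {"template": "offset", "param": c, "sse": sse}


def fit_gain(fault_free, delta):
    """Template 'gain': delta is modeled as g times the fault-free response."""
    denom = sum(r * r for r in fault_free)
    g = sum(r * d for r, d in zip(fault_free, delta)) / denom if denom else 0.0
    sse = sum((d - g * r) ** 2 for r, d in zip(fault_free, delta))
    return {"template": "gain", "param": g, "sse": sse}


def match_template(fault_free, faulty):
    """Fit every template to (faulty - fault_free) and keep the best fit."""
    if len(fault_free) != len(faulty) or not fault_free:
        raise ValueError("responses must be non-empty and of equal length")
    delta = [f - r for f, r in zip(faulty, fault_free)]
    fits = [fit(fault_free, delta) for fit in (fit_offset, fit_gain)]
    return min(fits, key=lambda result: result["sse"])


if __name__ == "__main__":
    reference = [0.0, 0.5, 1.0, 1.5]
    print(match_template(reference, [x + 0.2 for x in reference]))
    print(match_template(reference, [0.8 * x for x in reference]))
```

Once a fault is reduced to a template name and a parameter, the corresponding block can be added to the system-level model without rerunning a transistor-level simulation of the whole circuit.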

The divide ratio of the PLL is kept low (8) to be able to simulate faults at the transistor level for the entire circuit. For PLLs with higher divide ratio, transistor-level fault simulations are simply not feasible. Faults in functional blocks, such as voltage-controlled oscillator or phase/frequency detector are modeled using mathematical templates that include simple addition of constant, non-linear transformation of input or output signals, or adding time to the MATLAB model. Overall, this approach could reduce fault simulation time by 1000x even for this PLL with an extremely low divide ratio.

**Keywords:** analog fault modeling, fault simulation, test metrics evaluation

# INDUSTRY INTERACTIONS

**Texas Instruments** 

# TASK 3160.005, ML-ASSISTED SCALABLE DFT AND BIST OF AMS SYSTEMS

ABHIJIT CHATTERJEE, GEORGIA TECH UNIVERSITY, CHAT@ECE.GATECH.EDU

#### SIGNIFICANCE AND OBJECTIVES

The research will develop efficient machine learning-assisted design-for-testability and built-in self-test mechanisms for AMS circuits and systems that are scalable across diverse circuit types and device specifications and can be implemented without the need to incorporate oscilloscopes and other complex test instruments on-chip.

#### **TECHNICAL APPROACH**

We propose to develop a design-for-test and built-in self-test methodology for AMS systems that: (1) scales across diverse types of AMS circuit components allowing testing of multiple design specifications using a single test acquisition, (2) can perform autonomous BIST of AMS systems and components without the need to incorporate expensive test instruments on-chip, (3) guarantees detection of random as well as process variation induced defects and in-field performance degradation with high sensitivity and coverage, and (4) is low-cost/low-overhead, minimally invasive, easy to implement on-chip with maximal use of on-chip hardware resources and minimal additional hardware.

#### SUMMARY OF RESULTS

Prevalent specification-based AMS testing techniques require the use of complex test circuits or regressors that are difficult to implement on-chip and suffer from coverage loss when devices are under-specified. Complementary defect-based testing techniques require the simulation of explosively large defect sets under assumed failure mechanisms. We overcome these limitations in our proposed approach, OATT, an Outlier-oriented Alternative Testing and Tuning methodology. OATT maximizes the number and magnitude of the statistical principal components (PCA) of the time-domain DUT test response vectors across diverse manufacturing process corners. This allows the construction of a multidimensional Gaussian probability density model that characterizes the distribution of DUT responses in the principal-component domain. Outliers of this probability density model are classified as defective devices using calibrated confidence ellipses, implicitly detecting devices with parametric as well as hard defects. Enabled by the PCA-based approach, the embedded DUT response is acquired using coherent under-sampling and does not require explicit signal reconstruction. Post-manufacture tuning is performed to minimize statistical distance from the nominal Gaussian model using multi-arm bandit reinforcement learning. Simulation results demonstrate the viability and promise of the proposed approach.
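The outlier test against a Gaussian model with a confidence ellipse can be sketched as below. This is an illustration under simplifying assumptions, not the OATT implementation: it assumes the DUT responses have already been projected onto two principal components, fits a two-dimensional Gaussian to nominal devices, and flags a device when its squared Mahalanobis distance exceeds the chi-square quantile for two degrees of freedom. All function names are hypothetical.

```python
import math


def fit_gaussian_2d(scores):
    """Estimate the mean and covariance of 2-D principal-component scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two nominal devices")
    mx = sum(x for x, _ in scores) / n
    my = sum(y for _, y in scores) / n
    sxx = sum((x - mx) ** 2 for x, _ in scores) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in scores) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in scores) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    (mx, my), (sxx, sxy, syy) = mean, cov
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def ellipse_threshold(confidence):
    """Chi-square quantile for 2 degrees of freedom: -2 * ln(1 - p)."""
    return -2.0 * math.log(1.0 - confidence)


def is_outlier(point, mean, cov, confidence=0.99):
    """Flag a device whose score falls outside the confidence ellipse."""
    return mahalanobis_sq(point, mean, cov) > ellipse_threshold(confidence)


if __name__ == "__main__":
    nominal = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1)]
    mean, cov = fit_gaussian_2d(nominal)
    print(is_outlier((0.5, 0.5), mean, cov), is_outlier((5, -5), mean, cov))
```

Because the decision depends only on distance from the nominal distribution, no per-specification regression model is needed, which is the property the text attributes to the outlier-oriented approach.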

We consider a low noise amplifier that operates at 3-8 GHz, a low dropout regulator and a line driver as test vehicles for our proposed approach. These circuits are simulated using Advanced Design System (ADS) and Cadence Virtuoso. Multi-variable process variations and open and short defects with varied defect resistance values are injected for this study. Fig. 1 shows a test waveform generated for the LNA using multitones.



Figure 1. Optimized test waveform generated for LNA.

Fig. 2 shows coverage of short defects in the LNA for devices selected across diverse process corners after test optimization (this is improved to 95% from less than 60% coverage prior to test optimization).



Figure 2. Short defect coverage after test optimization.

**Keywords:** analog mixed-signal, defects, test generation, design-for-testability, post-manufacture tuning

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

We aim to build a near-field two-dimensional sensor for imaging biological samples. This imager, by utilizing an array of subTHz split-ring resonators (SRR), will be used to construct a high-resolution image of the sample based on its permittivity. This imager will be ultimately used for studying the cellular components of a heterogeneous tissue and real-time monitoring of cell growth.

# **TECHNICAL APPROACH**

A 200-GHz signal is generated using a compact single-stage frequency quadrupler. This signal subsequently excites a single-ended transmission line loaded with 2x10 SRRs. All SRRs are tuned to the same frequency as the subTHz signal and therefore need to be switchable if they are to be measured individually. When the sample is introduced to the SRRs, their resonance frequency shifts, which changes the amplitude and phase of the signal received at the end of the t-line. A phase detection scheme is used to measure the changes in each SRR response because of its superior SNR compared with amplitude detection.

# SUMMARY OF RESULTS

A 20x10=200 array was designed in a 28-nm CMOS technology. The top-level block diagram of this imager is illustrated in Fig. 1. Each row has its own subTHz source, the output of which is split into two paths, LO and RF, through a pair of coupled transmission lines. The LO signal is routed directly to the phase detector, while the RF signal travels through the sensor array before driving the phase detector RF port. The differential outputs of the phase detectors go to a 10:1 multiplexer, which allows the use of a single amplifier for processing the baseband signal. Moreover, periodic switching of the resonators between a sensing mode and a reference mode creates the opportunity to employ correlated double sampling to suppress the low-frequency noise of the phase detector and the subsequent stages in addition to other environmental drifts.
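The benefit of correlated double sampling can be shown with a small numerical sketch. The drift model, its coefficients, and the sampling times below are invented for illustration; only the principle (subtracting a reference-mode sample from a nearby sensing-mode sample) comes from the text.

```python
def correlated_double_sample(sense, reference):
    """Subtract each reference-mode sample from its paired sensing-mode sample."""
    if len(sense) != len(reference):
        raise ValueError("sense and reference must have the same length")
    return [s - r for s, r in zip(sense, reference)]


def drifting_offset_mv(t):
    """Hypothetical slow offset drift of the baseband chain, in mV."""
    return 5.0 + 0.01 * t


def acquire(signal_mv, n_pairs):
    """Alternate sensing (even time steps) and reference (odd time steps)."""
    sense = [signal_mv + drifting_offset_mv(2 * k) for k in range(n_pairs)]
    reference = [drifting_offset_mv(2 * k + 1) for k in range(n_pairs)]
    return sense, reference


if __name__ == "__main__":
    sense, reference = acquire(signal_mv=1.0, n_pairs=4)
    print(sense)
    print(correlated_double_sample(sense, reference))
```

In this toy case a 5 mV offset that would swamp a 1 mV signal is reduced to a residual set only by how much the drift changes between the two samples of a pair.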

Performing phase detection requires the LO and RF signals to be in quadrature, to avoid any DC offsets and the saturation of the baseband. The quadrature generation is achieved through the proper sizing of the coupler t-lines and any inaccuracies are corrected by using digitally controlled t-lines. Here, multiple metal stripes are placed underneath the t-line carrying the RF signal. These metal stripes can be switched to ground, which alters the propagation constant of the t-line and, as a result, the phase of the RF signal can be tuned.

Fig. 2 shows the HFSS EM model that was used to design and simulate the sensor. Here, the sample is modeled via hemispheres scattered around the imaging area. Using this model, the sensitivity of each SRR can be tested by sweeping the permittivity ( $\epsilon$ ) of the sample, turning ON only the desired SRR, and measuring the output. The result of a post-layout simulation is depicted in the inset of Fig. 2. This result, together with the total integrated noise of the sensor (up to 1-kHz BW), suggests a minimum detectable  $\epsilon$  of 0.5% with no averaging.



Figure 1. Imager top level diagram for a 20x10 array.



Figure 2. HFSS EM model and sensor output for various source frequencies.

**Keywords:** subTHz, permittivity sensor, near-field imager, biosensor, imager

# INDUSTRY INTERACTIONS

Intel, Texas Instruments

# TASK 3160.012, SMALL-AREA LOW-POWER DAC DESIGNS WITH IN-FIELD DIGITAL CALIBRATION ENSURING LIFETIME HIGH LINEARITY

DEGANG CHEN, IOWA STATE UNIVERSITY, DJCHEN@IASTATE.EDU

# SIGNIFICANCE AND OBJECTIVES

The semiconductor industry has entered the age of AI and IoT, with forecasts soon exceeding 25 billion such devices. Software-based AI systems are speed-limited and power-hungry, while analog hardware accelerators offer great potential. IoT/AI-HW accelerators all require many embedded, reliable data converters, propelling strong needs for ultra-small, low-power and self-healing DACs.

# **TECHNICAL APPROACH**

We investigate segmented DAC architectures that are intrinsically suitable for ultra-small area design with low power consumption. We introduce suitable redundancy to ensure the non-existence of un-calibratable errors with sufficient confidence/yield. We then bootstrap on practical low-cost BIST methods for accurate identification of DAC mismatch errors and other nonlinearities. Low-cost on-chip calibration will be used to reduce all recoverable DAC errors to noise levels. MC and PVT studies will be used to evaluate yield and robustness. A fabricated test chip will be measured to demonstrate the performance density potential of the proposed concepts.

#### SUMMARY OF RESULTS

Thus far, we have investigated various novel segmented DAC architectures that use redundancy to guarantee the absence of non-recoverable errors, such as large gaps in the output voltage (positive jumps in the transfer curve), which ensures that the DAC is calibratable while using small area and low power. Monte Carlo simulations show that our preliminary designs for the various proposed architectures have nearly no unrecoverable errors and could provide a high yield after calibration. Next, we will develop a memory- and computation-efficient methodology for practical but accurate DAC test and calibration, and functionality to enable in-field calibration.

Fig. 1 shows the schematic of one of the 4 DAC architectures under study. It is a 14-bit current DAC and consists of a main binary-weighted R2R structure with 2 redundant bits, using MOS transistors only, leading to an ultra-small area. Fig. 2 shows the INL plot of the uncalibrated DACs in Monte Carlo simulation, exhibiting +/-250 LSB worst-case INL errors. For the worst MC case, we applied the proposed BIST and calibration method. The after-calibration INL plot is shown in Fig. 3, showing that the INL(k) errors at all 2<sup>14</sup> input codes k are within the desired +/-0.5 LSB range.
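For reference, the INL(k) figure of merit used above can be computed from a measured transfer curve as sketched below. The endpoint-fit definition (a straight line through the first and last codes) is an assumption, since the text does not say which INL definition is used, and the function names are hypothetical.

```python
def endpoint_inl_lsb(outputs):
    """Return INL(k) in LSB relative to a line through the first and last codes."""
    n = len(outputs)
    if n < 2:
        raise ValueError("need at least two output levels")
    lsb = (outputs[-1] - outputs[0]) / (n - 1)  # average step size
    if lsb == 0:
        raise ValueError("transfer curve has zero span")
    return [(value - (outputs[0] + k * lsb)) / lsb
            for k, value in enumerate(outputs)]


def within_spec(inl, limit_lsb=0.5):
    """Check that the worst-case INL magnitude is within the limit."""
    return max(abs(error) for error in inl) <= limit_lsb


if __name__ == "__main__":
    ideal = [0.0, 1.0, 2.0, 3.0]
    distorted = [0.0, 1.0, 2.6, 3.0]
    print(endpoint_inl_lsb(ideal), within_spec(endpoint_inl_lsb(ideal)))
    print(endpoint_inl_lsb(distorted), within_spec(endpoint_inl_lsb(distorted)))
```

For a 14-bit DAC the same function would be applied to all 2^14 measured output levels, and the calibration target corresponds to `within_spec` returning true with a 0.5 LSB limit.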



Figure 1. Example DAC structure with an ultra-small area.



Figure 2. MC simulation shows +/- 250 LSB INL before calibration.



Figure 3. For the worst MC case, calibration reduces INL to +/- 0.5 LSB.

**Keywords:** ultra small area and low power DAC, performance density, BIST, on-chip self-calibration

# INDUSTRY INTERACTIONS

NXP, Texas Instruments

# TASK 3160.020, TRANSIENT RELIABILITY AND CONDITION MONITORING OF GAN HEMTS

BILAL AKIN, UNIVERSITY OF TEXAS AT DALLAS, BILAL.AKIN@UTDALLAS.EDU

### SIGNIFICANCE AND OBJECTIVES

GaN power semiconductor devices power high-density, high-efficiency applications such as consumer electronics and data centers. However, the latest JEDEC JEP 180 standard demands further research on the device's capability to withstand repetitive transients. It is also of interest to develop condition monitoring techniques to predict incipient device degradation.

#### **TECHNICAL APPROACH**

The initial phase of the research involves the development of a highly scalable dynamic high-temperature operating life (DHTOL) test setup. This setup would facilitate reliability evaluation on a large number of GaN devices. The intended test bench will incorporate a heating technique that creates localized high-temperature testing conditions for the GaN devices, so that failures unrelated to the DUTs are avoided. The test bench is also planned to include on-board dynamic resistance characterization circuits. Subsequent stages of the research will use this test bench to characterize various commercially available GaN devices, with the aim of evaluating and comparing their reliability.

#### SUMMARY OF RESULTS

A prototype of the highly scalable DHTOL test bench is currently undergoing testing which specifically targets a product-level power rating of 50-100 watts. The architecture of the test bench adopts a motherboard and a daughter-card arrangement. The motherboard serves the purpose of injecting power into each daughter card, which forms the product-level test vehicle. To enable the re-circulation of power, feedback cards are utilized in conjunction with each daughter card.

Two different product test vehicles have been developed and tested within the test bench architecture. One test vehicle is based on a synchronous buck converter, while the other utilizes a QR Flyback topology. Each daughter card within the test bench houses a C2000 controller and discrete ADCs, enabling rapid sensing of current and voltage across the device under test (DUT). To prevent high voltage from appearing across the DUT during off-state, a clamp circuit has also been incorporated. To maintain accuracy, both ADCs for voltage and current measurement are synchronized by the C2000 controller.



Figure 1. Daughter cards: a) Synchronous buck converter and b) QR Flyback converter.

The ADC outputs, representing the dynamic current and voltage values, are sampled and fed into a discrete DAC stage. The resulting analog value can be easily captured using an external data acquisition system or a multimeter, allowing for continuous monitoring of the dynamic on-state resistance value throughout the entire test cycle. Notably, the test setup demonstrates the capability to capture the dynamic on-state resistance within 200 ns of the DUT's turn-on. The daughter cards also incorporate resistive heaters, which are used for localized closed-loop heating of the DUTs. This allows for precise and controlled temperature regulation as per test requirements.
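The dynamic on-state resistance follows from the synchronized voltage and current samples as a simple ratio. The sketch below shows that calculation under assumptions: it treats the clamped voltage as the on-state drain-source voltage, skips samples where the current is too small for a meaningful ratio, and uses hypothetical names and a hypothetical current threshold.

```python
def dynamic_rds_on(v_ds_on, i_drain, min_current_a=0.1):
    """Return per-sample on-state resistance in ohms, or None at low current."""
    if len(v_ds_on) != len(i_drain):
        raise ValueError("voltage and current samples must be paired")
    result = []
    for voltage, current in zip(v_ds_on, i_drain):
        # Samples are assumed to be taken at the same instant by both ADCs.
        result.append(voltage / current if current >= min_current_a else None)
    return result


if __name__ == "__main__":
    volts = [0.50, 0.60, 0.0]
    amps = [5.0, 5.0, 0.0]
    print(dynamic_rds_on(volts, amps))
```

Synchronizing the two ADCs, as described above, matters because the ratio is only meaningful when both quantities are sampled at the same point in the switching cycle.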



Figure 2. Experimental results of the clamped voltage across DUT and the sensed drain current.

**Keywords:** GaN HEMT, reliability, DHTOL, dynamic resistance, device degradation

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

# **Energy Efficiency Thrust**



| Category                           | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Energy<br>Efficiency<br>(Systems) | The search for a middle ground between in-memory computation with conventional digital techniques and programmable accelerators for deep neural networks (DNN), particularly at the edge such as in IoT applications, has led to the demonstration of a 28-nm CMOS prototype achieving 3.1 µJ per inference (with 90.2% accuracy) on the CIFAR-10 benchmark, as well as commensurate energy savings on all standard tinyML application benchmarks. The key is a processing element (PE) array that intersperses partial product accumulation circuitry with local SRAM kernel storage and digital multipliers, allowing fully unrolled and pipelined operation, which reduces the activation memory needed, eliminates hardware, and reduces activation accesses. (2810.078, B. Murmann, Stanford University) |
| Energy<br>Efficiency<br>(Circuits) | Efficiency of power delivery from high-voltage busses to scaled-CMOS-<br>compatible voltages (<1V) is improved by ~10% employing a vertically and<br>heterogeneously integrated architecture leveraging hybrid and switched-<br>capacitor DC-DC converters. This architecture also reduces the number of power<br>pins by at least 2x and reduces thermal dissipation. The SCVR was implemented<br>in a 65-nm CMOS process, while the HVRM in a 180-nm BCD process. (2810.061,<br>H. Le, University of California, San Diego)                                                                                                                                                                                                                                                                                                 |
| Energy<br>Efficiency<br>(Circuits) | A linear regression machine learning model is trained based on simulation of<br>both digital core and power management circuitry capturing the relationship<br>among CPU's power consumption, supply droops, and CPU's internal signals, e.g.<br>Opcode. A machine learning core generates a prediction of the CPU current<br>consumption 2~3 cycles early. The prediction is sent to the buck converter and<br>combined with real-time measured supply voltage to deliver "feedforward"<br>regulation to the incoming current surge. To deal with the inaccuracy of<br>machine learning prediction, an event-based guard-band circuit is also included<br>as the "safety net". This proactive scheme achieves 6%~10% improvements on<br>CPU frequency or converter efficiency. (2810.039, J. Gu, Northwestern<br>University) |






Integrated Voltage Regulation (IVR) remains a critical technology for driving sustained efficiency in SoCs, offering buck converter efficiencies without additional bulky components. The objective of this effort is to devise a domain-scalable, run-time programmable IVR fabric that drives and leverages advances in adaptive clocking and SIMO design (Fig. 1).

#### **TECHNICAL APPROACH**

The design effort is organized into four thrusts: (1) analyzing the effectiveness of UniCaP in designs with large insertion delay; (2) demonstrating a domain-scalable SIMO implementation, critical to providing the flexibility required for a dynamically programmable IVR fabric; (3) investigating the optimal design-time allocation of cross-bar switches, which connect modules to domains based on SoC usage profiles; and (4) designing a tileable buck architecture that can be configured at run-time as a single buck, a phase in a multi-phase buck, or a SIMO converter. The goal of the effort is to implement a test chip demonstration that incorporates all four thrusts.

#### SUMMARY OF RESULTS



Figure 1. Proposed DRIVR system consisting of multiple buck tiles and domains connected through a partial power crossbar. We have prototyped a 2-tile, 8-domain DRIVR. Domains include a RISC-V processor, CORDIC, FIR, and FFT accelerators, and dummy domains containing synthetic loads to emulate a multi-domain system.

Goals 1 and 2 of our technical approach have been accomplished: we recently implemented a SIMO-regulated SoC that avoids the prohibitively large Vdd margins required in prior implementations (Fig. 2). We have also developed mechanisms to enable "hot-swapping" load modules across voltage domains. We are currently on track to complete the tape-out of the targeted prototype of the proposed DRIVR fabric. The design features two configurable buck tiles and 8 load domains, each of which can be connected to the first, second, or both buck tiles depending on a domain-tile assignment determined by the power management control unit.



Figure 2. Simulation waveforms showing the seamless, feedback-controlled handoff of a voltage domain from one buck tile to another without incurring a dead-time, or a short-circuit event.

The tiles can be configured to operate either as two buck voltage domains, a single two-phase buck domain, or a buck-SIMO regulator pair. Similarly, load modules can be configured to be connected to each domain using a configurable power switch. The power switch can be configured as a low-resistance header switch, a digital LDO, or the access device of a SIMO design.

The key challenges facing the DRIVR architecture are (1) transitioning domains from one tile to another at runtime; (2) establishing stable control of a cascaded system of one buck VR driving multiple LDOs; (3) autonomously determining  $V_{dd}$  requirements for the buck based on the target  $f_{clk}$  of assigned domains; and (4) enabling effective and efficient LDO operation despite use of a synchronous digital LDO architecture. The proposed implementation addresses each of these challenges effectively. Silicon is anticipated in November, at which time, the operation and effectiveness of the proposed system will be established.
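Challenge (3) above, deriving the buck output voltage from the clock targets of the domains assigned to a tile, can be sketched as a simple selection rule. Everything numeric here is hypothetical: the frequency-to-voltage table, the LDO dropout headroom, and the domain names are placeholders, and the actual DRIVR controller may determine these values quite differently.

```python
# Hypothetical frequency-to-voltage table: (max f_clk in MHz, required Vdd in V).
VDD_TABLE = [(100, 0.60), (200, 0.70), (400, 0.85), (600, 1.00)]


def required_vdd(f_clk_mhz):
    """Look up the lowest tabulated Vdd that supports the target frequency."""
    for f_max_mhz, vdd in VDD_TABLE:
        if f_clk_mhz <= f_max_mhz:
            return vdd
    raise ValueError("target frequency exceeds the table range")


def buck_setpoint(domain_f_clk_mhz, ldo_dropout_v=0.05):
    """Set the buck output for the most demanding domain plus LDO headroom."""
    if not domain_f_clk_mhz:
        raise ValueError("at least one domain must be assigned to the tile")
    worst_case = max(required_vdd(f) for f in domain_f_clk_mhz.values())
    return worst_case + ldo_dropout_v


if __name__ == "__main__":
    assigned = {"riscv": 400, "fft": 150}
    print(buck_setpoint(assigned))
```

The less demanding domains would then be regulated down from this shared rail by their own power switches operating as LDOs, which is why the most demanding domain sets the buck voltage.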

**Keywords:** UniCaP, SIMO, Configurable Voltage Regulation

#### INDUSTRY INTERACTIONS

Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] Huang, C-H et al., "A Single-Inductor 4-Output SoC with Dynamic Droop Allocation and Adaptive Clocking for Enhanced Performance and Energy Efficiency in 65nm CMOS," ISSCC 2021.

[2] Sun, X. et al., "UniCaP-2: Phase-Locked Adaptive Clocking with Rapid Clock Cycle Recovery in 65nm CMOS," VLSI Symposium 2020.

The primary goal of the project is to create hardware-design knowledge and techniques on analog-mixed-signal (AMS) hardware for artificial intelligence (AI) and machine learning (ML) related computing. Specifically, we will create AMS hardware that can provide significant benefits in scaling power consumption in acoustic signal classification tasks.

# **TECHNICAL APPROACH**

We will conduct the planned research as follows. (1) We will design *new feature extraction AMS hardware that uses non-linear circuits*. (2) We will develop the compact model of the non-linear AMS hardware, and by using the developed compact model, we will develop *a training model that can effectively tolerate the variability of AMS computing hardware*. (3) We will design a matching back-end *classifier based on a deep neural network (DNN)*. We will develop the digital circuits to map the model at minimal leakage/power consumption. By combining the developed architecture/techniques, we will prototype one test chip for the AMS front-end and another for the end-to-end multi-keyword recognition systems.

# SUMMARY OF RESULTS

We have designed, prototyped, and tested ultra-low-power automatic speech recognition (ASR) hardware and submitted our work to [1]. The key task was to design and tape out the new chip titled microASR based on a bio-inspired neuron model and digital compute-in-memory (DCIM) hardware.

On-chip ASR is attractive for battery-powered edge devices as it can largely reduce the data upload bandwidth. It also improves latency, security, and privacy and can enable many new applications such as speech-based anomaly detection and surveillance and voice-based user interface in a tiny device.

Recently several on-chip ASR chips have been prototyped. They typically employ recurrent neural networks (RNN) such as LSTM to detect multiple phonemes and construct a word using phonemes. Phonemes have different lengths and complex interactions with each other, making a feedforward model less effective. The state-of-the-art low-power on-chip ASR hardware reported a power consumption of 1.85 mW at a 20.6% phoneme error rate (PER) on the TIMIT dataset and 141  $\mu$ W at 25.0%.

In our microASR chip, we leveraged a bio-inspired neuron model which reduces the model size by 4X compared to LSTM and maintains similar accuracy (Figure 1). We also designed a DCIM hardware featuring time-sharing arithmetic hardware (Figure 2) for better area and energy efficiency. MicroASR successfully scaled the power consumption down to 32  $\mu$ W while still achieving the state-of-the-art PER of 20.8%.



Figure 1. An NN model with the proposed bio-inspired neuron model (left) achieves 4X model size reduction while maintaining a similar PER (right).



Figure 2. Architecture of the proposed DCIM hardware.

We also designed the neuron core to emulate 256 neurons in the same layer and optimized the computation sequence so that the fully pipelined hardware can process the features arriving every 16 ms at a clock frequency of 8.5 kHz. This low clock frequency enables deep supply-voltage scaling and state-of-the-art power consumption.
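The cycle budget implied by the 16-ms frame period and the 8.5-kHz clock can be checked with a few lines of arithmetic. This is a minimal sketch; the function name and the per-frame view of the budget are illustrative assumptions, not part of the reported design.

```python
# Cycle budget per incoming feature frame, using the numbers quoted in the text.
FRAME_PERIOD_S = 16e-3     # one feature arrives every 16 ms
CLOCK_HZ = 8.5e3           # neuron-core clock frequency
NEURONS_PER_LAYER = 256    # neurons emulated by the core in one layer


def cycles_per_frame(frame_period_s: float, clock_hz: float) -> int:
    """Number of clock cycles available between two consecutive features."""
    return round(frame_period_s * clock_hz)


budget = cycles_per_frame(FRAME_PERIOD_S, CLOCK_HZ)
print(f"cycles available per frame: {budget}")
print(f"neurons per available cycle: {NEURONS_PER_LAYER / budget:.2f}")
```

Under these numbers the budget is smaller than the number of neurons in a layer, which is consistent with the emphasis on an optimized, pipelined computation sequence; the actual schedule is not described in this summary.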

**Keywords:** low-power automatic speech recognition, bio-inspired neuron model, digital compute-in-memory, in-memory computing

# INDUSTRY INTERACTIONS

#### Intel

#### MAJOR PAPERS/PATENTS

[1] D. Wang, et al., "microASR: 32-μW Real-Time Automatic Speech...," ESSCIRC, 2023.

The enhanced spatio-temporal control afforded by Integrated Voltage Regulation (IVR) in modern SoCs is critical to achieving efficiency, provided they can maintain or improve voltage droop despite reduced available decap. This effort examines computationally intensive IVR control strategies to improve the robustness and settling times of buck and LDO designs.

# **TECHNICAL APPROACH**

The design effort is organized into two key thrusts: (1) using "computational control" to achieve time-optimal transient response to random switching load current profiles typical of SoCs, and (2) addressing a key weakness of digital LDOs, their prohibitively poor Power Supply Rejection performance, to advance the state of the art in the transient response of digital LDOs. The main approach toward Thrust 1 will be to evaluate the use of Model Predictive Control (MPC) for rapid transient response. This effort seeks to demonstrate the effectiveness and the limits of more advanced control strategies on regulator design using test chips in 65-nm CMOS.

#### SUMMARY OF RESULTS



Figure 1. Autonomous "Regenerative Braking" SoC, which computationally determines the optimum amount of energy recovery from load domain (C<sub>0</sub>).

We have already realized our objective of demonstrating the effectiveness of computational control under tight latency constraints by demonstrating MPC for a buck regulator in 2021. In 2022, in consultation with our liaisons, we transitioned to computationally enabled system power minimization through "regenerative braking" (Fig. 1), effectively recycling energy from power-gated domains before they enter sleep mode.

In the third and final year of the project, we have furthered our investigation into energy minimization. Our initial effort realized total system (VR + load) minimization of active-mode systems. Our second effort showed how Sleep mode energy could be reduced. For our third effort, we are looking to minimize **total application energy**, minimizing power draw from a battery, not only in Active or Sleep, but across the entire PVT dependent Sleep/Wake/Active/Recycle operation cycle. In addition, our proposed effort will demonstrate multi- $V_{dd}$  domain energy optimization all while using a single VR.



Figure 2. Block diagram of the proposed single VR, multiple  $V_{dd}$  domain system which will rely on a runtime optimizer to control the PVT dependent optimal  $V_{dd}$  of each domain in a manner that is optimal for its operation across the entire Sleep-Wake-Run operating cycle.

Fig. 2 shows a block diagram of our proposed system, which relies on a SIMO regulator to control multiple voltage domains. The Run-time Minimum Energy Point (MEP) tracker minimizes energy dissipation of the overall system through multi-domain  $V_{dd}$  optimization across Sleep-Wake-Run modes. The design consists of 3 voltage domains – a RISC-V core, a memory domain consisting of Instruction and Data memory, and an accelerator domain which contains multiple accelerators.
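The idea of a per-domain minimum energy point can be illustrated with a toy sweep. The sketch below assumes a textbook-style normalized energy model (dynamic energy proportional to $V_{dd}^2$ plus leakage integrated over an alpha-power-law delay); the model, all parameter values, and the domain names are hypothetical and do not describe the tracker on the proposed chip.

```python
# Toy minimum-energy-point (MEP) search over Vdd for several voltage domains.
# All model parameters below are hypothetical, normalized values.

def energy_per_op(vdd, c_eff=1.0, i_leak=0.02, vt=0.35, alpha=1.5):
    """Normalized energy per operation at supply voltage vdd (requires vdd > vt)."""
    delay = vdd / (vdd - vt) ** alpha    # alpha-power-law gate delay
    dynamic = c_eff * vdd ** 2           # switching energy
    leakage = i_leak * vdd * delay       # leakage power times operation time
    return dynamic + leakage


def find_mep(params, v_min=0.40, v_max=1.20, step=0.01):
    """Return the Vdd on a uniform grid that minimizes energy per operation."""
    n = int(round((v_max - v_min) / step)) + 1
    grid = [v_min + i * step for i in range(n)]
    return min(grid, key=lambda v: energy_per_op(v, **params))


# Hypothetical leakage levels for the three domains named in the text.
DOMAINS = {
    "riscv_core": {"i_leak": 0.02},
    "memory": {"i_leak": 0.05},
    "accelerator": {"i_leak": 0.01},
}

mep = {name: find_mep(p) for name, p in DOMAINS.items()}
for name, v in mep.items():
    print(f"{name}: MEP near {v:.2f} V (normalized model)")
```

In this model a leakier domain has a higher minimum-energy voltage, which is one reason a per-domain optimum can differ from a shared supply setting. A run-time tracker would have to find this point from measurements rather than from a known model.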

We are slated for tapeout of the proposed design.

**Keywords:** Model-predictive control, Voltage Regulation

# INDUSTRY INTERACTIONS

Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] Huang, C-H et al., "Energy Minimization of Duty-Cycled Systems Through Optimal Stored-Energy Recycling from Idle Domains," ISSCC, vol. 65, pp. 222-224. IEEE, 2022.

# TASK 2810.039, DEVELOPMENT OF COMPACT AND LOW COST FULLY INTEGRATED DC-DC CONVERTER WITH RESONANT GATE DRIVE AND INTELLIGENT TRANSIENT RESPONSE

JIE GU, NORTHWESTERN UNIVERSITY, JGU@NORTHWESTERN.EDU

# SIGNIFICANCE AND OBJECTIVES

Fast and intelligent transient response is crucial for modern SoCs with integrated voltage regulators. In this period of the project, we develop a state-of-the-art droop mitigation technique for a fully integrated DC-DC converter using novel proactive power management techniques with real-time machine learning hardware for power regulation of microprocessors.

# TECHNICAL APPROACH

We utilize machine learning techniques to proactively regulate the supply droops from processors through integrated buck converters. Two major techniques are developed. First, machine learning models are developed to regulate load transients on the chip. The model is trained based on real-time operational features, e.g., Opcode from CPU, and simulated CPU's transient current profile. Second, to overcome the false prediction from the models, an event-based droop control is developed as a "safety net" to guarantee the output voltage within the minimum or maximum voltage limits of the CPU. A test chip has been built to demonstrate the proposed techniques.
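The interplay between the prediction path and the event-based "safety net" can be sketched as a single control step. The thresholds, the gain, and the choice to force the duty cycle to its extremes are hypothetical choices for illustration; the actual guardband circuit is not specified at this level in the summary.

```python
# One control step: prediction-driven feedforward with an event-based override.
# v_min, v_max and gain are hypothetical values chosen for illustration only.

def regulate_step(v_out, predicted_droop, duty,
                  v_min=0.55, v_max=1.25, gain=0.5):
    """Return (new_duty, reason) for one regulation step.

    The guardband events take priority over the machine learning prediction,
    so a wrong prediction cannot push the output outside [v_min, v_max].
    """
    if v_out < v_min:
        return 1.0, "undervoltage_event"   # force maximum energy delivery
    if v_out > v_max:
        return 0.0, "overvoltage_event"    # stop delivering energy
    # Normal case: pre-compensate the droop that the model predicts.
    new_duty = min(1.0, max(0.0, duty + gain * predicted_droop))
    return new_duty, "predicted"


print(regulate_step(v_out=0.90, predicted_droop=0.05, duty=0.50))
print(regulate_step(v_out=0.50, predicted_droop=0.00, duty=0.50))
```

The ordering of the checks is the point of the sketch: the event path is evaluated first, and the predicted correction only applies while the output stays inside the limits.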

# SUMMARY OF RESULTS

In this period of the project, we developed a real-time machine learning technique to regulate supply droop based on the CPU's operating condition. The predicted voltage from the machine learning core is used to modify the PWM control signal of the buck converter. A 65-nm test chip was taped out in May 2022 and successfully tested in August 2022 as a demonstration of the technique. Fig. 1(a) shows the architecture of the test chip. Fig. 1(b) shows the die photo of the test chip, which measures 1.9 mm by 2.0 mm. A RISC-V CPU is implemented as the current load and prediction target of our machine learning controller. A smaller version of a previously developed buck converter is implemented to provide adequate regulation of the output voltage from 0.6 V to 1.2 V from a 1.8-V input voltage. A linear regression machine learning model is trained on simulations of both the digital core and the power management circuitry, capturing the relationship among the CPU's power consumption, supply droops, and the CPU's internal signals, e.g., Opcode. The machine learning core proactively generates a prediction of the CPU's current consumption 2 to 3 cycles early. The prediction result is sent back to the buck converter and combined with the real-time measured supply voltage to deliver "feedforward" regulation against the incoming current surge.
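A minimal sketch of opcode-based current prediction is given below. It assumes the only feature is a one-hot encoding of the current opcode, in which case the least-squares linear regression reduces to the mean of the observed future current per opcode. The trace values, the opcode names, and the single-feature model are hypothetical; the model on the test chip uses the CPU's internal signals as described above.

```python
from collections import defaultdict

# Linear regression on one-hot opcode features, predicting current `lead` cycles ahead.
# With one-hot features and no intercept, the least-squares weights are per-opcode means.

def fit_opcode_model(opcodes, currents, lead=2):
    """Fit weights mapping the opcode at cycle t to the current at cycle t + lead."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t in range(len(opcodes) - lead):
        sums[opcodes[t]] += currents[t + lead]
        counts[opcodes[t]] += 1
    return {op: sums[op] / counts[op] for op in sums}


def predict(model, opcode, default=0.0):
    """Predicted current `lead` cycles after `opcode` is observed."""
    return model.get(opcode, default)


# Hypothetical simulated trace: opcode per cycle and load current in mA.
trace_ops = ["add", "mul", "load", "add", "mul", "load", "add", "mul"]
trace_ma = [10, 10, 12, 30, 20, 12, 30, 20]

model = fit_opcode_model(trace_ops, trace_ma, lead=2)
print(model)
print(predict(model, "mul"))
```

The predicted value would then be combined with the measured supply voltage in the converter's control path; that combination is outside the scope of this sketch.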



Figure 1. (a) Architecture of the test chip; (b) Die photo; (c) Measured improvements in efficiency and frequency.

To deal with the inaccuracy of machine learning prediction, an event-based guardband circuit is also developed, serving as the "safety net" for the Vmin and Vmax requirements of the CPU. As shown in Fig. 1(c), the proactive management scheme achieves 6% to 10% improvements in CPU frequency or converter efficiency.

**Keywords:** fast transient response of regulator, machine learning, droop mitigation, RISC-V CPU

# INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] X. Chen et al., "A 65nm Fully-integrated Fast-switching Buck Converter with Resonant Gate Drive and Automatic Tracking," CICC, April 2023.

[2] X. Chen et al., "Proactive Power Regulation with Real-time Prediction and Fast Response Guardband for Fine-grained Dynamic Voltage Droop Mitigation on Digital SoCs," Symposium on VLSI, June 2023.

[3] U.S. Patent, "Machine Learning Assisted Supply Regulation for Integrated Buck Converters with Microprocessors," patent has been filed.

# TASK 2810.040, HYBRID/RESONANT SC CONVERTERS WITH INTEGRATED LC RESONATOR FOR HIGH-DENSITY MONOLITHIC POWER DELIVERY

JASON T. STAUTH, DARTMOUTH COLLEGE, JASON.T.STAUTH@DARTMOUTH.EDU

# SIGNIFICANCE AND OBJECTIVES

Fully integrated power management is important for a variety of computing and communication applications but is difficult due to the limitations of on-chip passive components. This project explores a new direction using distributed LC-resonators in hybrid switched capacitor architectures to mitigate high-frequency losses and improve efficiency and power density.

# **TECHNICAL APPROACH**

This work involves the co-optimization of high-frequency DC-DC converters based on hybrid-resonant switched capacitor architectures and distributed planar-spiral LC resonators. A variety of analytical and numerical methods are used to model skin- and proximity-effect losses in planar magnetics which use capacitive dielectrics to ballast and homogenize current density. Several circuit topologies were studied, including nominal 2:1 resonant converters as well as higher conversion ratios using Series-Parallel and Dickson architectures. The project involved integrated circuit design with tapeouts completed in Cadence, and electromagnetic design using Sonnet and Maxwell. Key deliverables include circuit architectures and blocks, electromagnetic models, and component optimization methods.

#### SUMMARY OF RESULTS

This work builds on past efforts in fully integrated voltage regulation (FIVR), with specific aims to (1) improve passive component utilization through the use of electromagnetic ballasting, (2) leverage new architectures that benefit from high energy-density on-chip capacitors, and (3) explore new on-chip control, regulation, and timing optimization such as autotuned zero-current and voltage switching. A 3:1 Dickson converter (Fig. 1) used the merged-LC concept, demonstrating a fully integrated resonator and operating at ~30MHz. The design achieved peak efficiency of >78% with voltage conversion from 3.7V to 1.2V with no off-chip components [1].



Figure 1. 3:1 Dickson converter, ISSCC 2022 [1].



Figure 2. COMPEL 2022 trace ballasting theory validation.

A theoretical analysis presented at IEEE COMPEL 2022 [2] outlines design tradeoffs and optimization procedures, verified by EM simulation and measured data. As shown in Fig. 2, test structures were built to verify the theory. One takeaway is that a larger number of finer-pitch traces can extend the impact of ballasting to higher frequencies. For example, when extending to GHz frequencies, techniques that leverage ballasting may be critical to get around the inherent limitations of on-chip magnetics, which would otherwise limit performance and viability.
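The frequency dependence behind this takeaway can be illustrated with the standard skin-depth formula, $\delta = \sqrt{\rho / (\pi f \mu)}$. The sketch below assumes bulk copper resistivity and a non-magnetic conductor; on-chip metal stacks differ, so the numbers are only indicative and are not taken from the referenced work.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space, H/m
RHO_CU = 1.68e-8        # bulk copper resistivity in ohm*m (assumed; on-chip metals differ)


def skin_depth_m(freq_hz, resistivity=RHO_CU, mu_r=1.0):
    """Skin depth in meters for a good conductor at the given frequency."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0 * mu_r))


for f in (30e6, 1e9):
    print(f"{f / 1e6:7.0f} MHz: skin depth is about {skin_depth_m(f) * 1e6:.1f} um")
```

Because the skin depth shrinks with the square root of frequency, traces that carry uniform current at tens of MHz can become crowded at GHz frequencies, which is where finer traces and ballasting become more relevant.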

Overall, this work demonstrated many key aspects of capacitive current ballasting for future high-frequency DC-DC converters leveraging spiral magnetics. The benefits of ballasting were shown both theoretically and in practice in integrated circuit design and other hardware examples. Prospects and scaling trends of EM ballasted structures were highlighted to show which process nodes are most advantageous and also other considerations such as the number of traces and capacitance segmentation profiles.

**Keywords:** Power Management, DC-DC Converters

# INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] P. H. McLaughlin et al., "A Monolithic 3:1 Resonant Dickson Converter with Variable Regulation and Magnetic-Based Zero-Current Detection and Autotuning," ISSCC 2022.

[2] K. Datta et al., "High-Frequency Resonant Switched-Capacitor Converters with Multi-Winding Current Ballasting: Analysis and Optimization," IEEE COMPEL 2022.

[3] P. McLaughlin, et al., "Modeling and Design of Planar-Spiral Merged-LC Resonators in a Standard CMOS Process," IEEE COMPEL 2020.

[4] P. McLaughlin, et al., "A Fully Integrated Resonant Switched-Capacitor Converter with 85.5% Efficiency at 0.47W Using On-Chip Dual-Phase Merged-LC Resonator," ISSCC 2020.

# TASK 2810.042, DIGITALLY ENHANCED HIGH EFFICIENCY, FAST SETTLING AUGMENTED DCDC CONVERTERS

BERTAN BAKKALOGLU, ARIZONA STATE UNIVERSITY, BERTAN.BAKKALOGLU@ASU.EDU

### SIGNIFICANCE AND OBJECTIVES

Advanced digital loads present formidable demands on power-supply regulators, requiring the rapid delivery of transient currents. To address this challenge, we have implemented auxiliary circuits employing a novel nonlinear control scheme known for its superior efficiency and design flexibility. Our primary objective is to overcome the traditional trade-off between efficiency, dynamic response, and settling time observed in standard DC-DC converters. To achieve this, we are developing high-speed, digitally controlled augmentation stages with precise load current management and accelerated settling times.

### **TECHNICAL APPROACH**

In our proposed power converter design, an auxiliary stage incorporating a small external inductor operates at a higher switching rate during load-transient responses and shares the load capacitor with the main stage, which employs a larger external inductor. While the main stage prioritizes high efficiency during steady-state operation, the auxiliary stage ensures swift settling during rapid load transients. To facilitate effective control of the auxiliary stage, we will develop an observer-based load capacitor current estimator, enabling it to function as a Current-Controlled-Current-Source (CCCS). This innovative approach results in a fully integrated solution that significantly reduces the required external capacitor size (approximately 3~4 times smaller).
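The role of the capacitor-current estimate can be illustrated with a much simpler stand-in than the observer described above: a finite-difference estimate $i_C \approx C \, \Delta v / \Delta t$ from sampled output voltage. The estimator, the sign convention, the gain, and the component values below are assumptions for illustration and are not the project's observer.

```python
# Finite-difference capacitor-current estimate (a stand-in, not the proposed observer).
# Positive i_cap means current flowing into the load capacitor.

def capacitor_current(v_samples, c_out, dt):
    """Estimate capacitor current between consecutive voltage samples."""
    return [c_out * (v1 - v0) / dt for v0, v1 in zip(v_samples, v_samples[1:])]


def aux_current_command(i_cap, gain=1.0):
    """Current the auxiliary stage should source to cancel the capacitor discharge."""
    return [-gain * i for i in i_cap]


# Hypothetical load step: the output droops 10 mV per microsecond on a 10-uF capacitor.
v_out = [0.800, 0.790, 0.780]
i_cap = capacitor_current(v_out, c_out=10e-6, dt=1e-6)
print(i_cap)
print(aux_current_command(i_cap))
```

With the auxiliary stage sourcing the current that the capacitor would otherwise supply, the capacitor sees less charge removed during a load step, which is the intuition behind needing a smaller external capacitor.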

#### SUMMARY OF RESULTS

Our approach introduces a novel nonlinear control scheme. Notably, the timing for digital control is computed through analog methods to optimize cost and energy efficiency. Following this, we progressed to the transistor-level schematic design and layout. The final GDS (Graphic Data System) file successfully passed LVS (Layout vs. Schematic) and DRC (Design Rule Check) verification. Post-layout simulations were performed, resulting in the successful tape-out of our design in June 2022. The micrograph of the IC, featuring section marks, is displayed below, with the die size measuring 3mm x 2.5mm.



Figure 1. Micrograph of the proposed IC.

Our PCB setup for characterization is illustrated in Fig. 2(a). Fig. 2(b) presents the operational waveforms of the critical signals. Specifically, the waveforms for high-side turn-on (HS\_ON) and load current (I<sub>L</sub>) confirm that the main stage, coupled with the PLL, operates at the anticipated frequency of 500 kHz. The waveform for V<sub>out</sub> demonstrates the effectiveness of the hysteretic control loop in maintaining the output voltage at the specified 0.8 V, following the  $V_{ref}$  reference. The V<sub>ic\_sense</sub> waveform reflects the transient detector's ability to generate a precise representation of the capacitor current, which manifests as a reversed ripple signal of the inductor current. We are continuing to characterize and investigate the improvements in load current transient response achieved with the auxiliary stage.



Figure 2. (a) PCB for characterization and (b) operation waveforms of critical signals (V<sub>ic\_sense</sub>, V<sub>out</sub>, I<sub>L</sub>, HS\_ON).

**Keywords:** augmented converter, fast transient detector, fast settling, high efficiency, digital nonlinear control

#### INDUSTRY INTERACTIONS

IBM, Intel, NXP, Texas Instruments

Charging batteries (in cell phones) with other batteries (battery banks) is the new normal. Volume and charge life are critical in this space. The objective of this research is to develop a small and power-efficient single-inductor voltage regulator that also charges and conditions a battery for maximum energy.

# **TECHNICAL APPROACH**

Charging dynamics in Lithium-Ion batteries are largely unaccounted for by conventional chargers, so they do not charge them maximally. The first step in this research is to study, test, and develop charging methods that expand usable energy storage. The second step is to develop an efficient power supply that charges a battery and supplies a load with one inductor. To preserve energy, power flow in all directions should be efficient: input-output, input-battery, and battery-output.

# SUMMARY OF RESULTS

The battery-charging power supply in Figs. 1 and 2 was developed and simulated in 2022. The features of the power stage are: one switched inductor  $L_X$  (compact) charges the battery  $v_B$  and supplies the output  $v_O$  (power efficient) with input  $v_{IN}$  and/or battery  $v_B$  power, and the inductor always drains into the output  $v_O$  (accurate/fast).



Figure 1. Proposed battery-charging power supply.



Figure 2. CMOS implementation.

With this configuration,  $L_X$  can energize with  $v_{IN}$  and/or  $v_B$  and drain with  $v_B$  and/or  $v_O$ . This way,  $v_{IN}$  can charge  $v_B$  while supplying  $v_O$  and  $v_B$  can supply  $v_O$  when  $v_{IN}$  is absent –  $v_{IN}$  is a USB source that is not always present. The key to keeping  $v_O$  supplied, even when charging  $v_B$ , is  $v_B$ 's series connection to  $v_O$ .

Since  $v_{IN}$  is not always present and  $v_B$  connects to switching nodes  $v_{SWX}$  and  $v_{SWB}$  that connect to  $v_O$  and ground,  $v_{IN}$  is not always the highest potential and ground is not always the lowest potential. The drivers of the switches that connect to  $v_{IN}$  and the switching nodes that connect to  $v_{IN}$  and  $v_B$  must therefore account for these extreme variations. The design includes  $v_{MIN}$  and  $v_{MAX}$  blocks for this purpose: to sense and determine the highest and lowest potentials that these drivers can use to open and close their switches.
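The function of the  $v_{MIN}$  and  $v_{MAX}$  blocks can be pictured as a selection of the highest and lowest of several node potentials. The sketch below is purely behavioral; the set of monitored nodes and all voltage values are hypothetical and are not taken from the design.

```python
# Behavioral view of the v_MAX / v_MIN selection with hypothetical node voltages.

def select_rails(nodes):
    """Return the names of the highest- and lowest-potential nodes."""
    highest = max(nodes, key=nodes.get)
    lowest = min(nodes, key=nodes.get)
    return highest, lowest


# Hypothetical snapshot with the USB input present.
usb_present = {"v_IN": 5.0, "v_B": 3.7, "v_SWB": 1.8, "gnd": 0.0}
# Hypothetical snapshot with the USB input absent and a switching node below ground.
usb_absent = {"v_IN": 0.0, "v_B": 3.7, "v_SWB": -0.3, "gnd": 0.0}

print(select_rails(usb_present))
print(select_rails(usb_absent))
```

The second snapshot shows the situation the text describes: the input is no longer the highest potential and ground is no longer the lowest, so the drivers need the selected rails rather than fixed ones.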

**Keywords:** Li-ion charger, voltage regulator, efficiency

# INDUSTRY INTERACTIONS

Intel, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] Q. Zhi, G.A. Rincón-Mora, and P. Gu, "Autonomous and Programmable 12-W 10-kHz Single-Cell Li-Ion Battery Tester," IEEE Trans. on Instrumentation & Measurement, vol. 71, Apr. 2022.

[2] L. Cui, Q. Zhi, and G.A. Rincón-Mora, "Compact, accurate, and efficient battery-charging CMOS voltage regulator," U.S. patent pending.

Electromagnetic side-channel attack (EM-SCA) exploits electromagnetic interference (EMI) traces in modern electronics to breach encrypted data security. Existing countermeasures face significant trade-offs between power conversion performance and side-channel security. The proposed solution mitigates these trade-offs and achieves both efficient power conversion and enhanced EM side-channel security.

# **TECHNICAL APPROACH**

To prevent potential security breaches by SCA, both the EM and power traces of the data encryption system must be statistically de-correlated from the data being encrypted. Prior art such as power masking and operation randomization of the converter achieves such de-correlation, but these approaches often come at the expense of extra power loss and degraded output regulation performance. Alternatively, power-balanced encryption circuits achieve constant power consumption regardless of the encrypted data, but they induce power penalties and large area overheads due to complex digital circuitry. We aim to eliminate the trade-offs between security and performance and find a well-balanced solution.

# SUMMARY OF RESULTS



Figure 1. System architecture of the proposed solution [1].

The proposed power stage architecture shown in Fig. 1 mitigates extra power loss by achieving both power injection and charge recycling. Both the power and EM traces are de-correlated from the encryption core activity by randomized power injection and recycling of the injected power. The proposed power stage consists of a conventional buck converter and a switched capacitor stage in parallel. The encryption interface triggers random power injections from the input to charge the capacitor in parallel. Once the capacitor voltage  $V_{CS}$  is sufficiently high, charge recycling is activated: the main power input is temporarily disconnected, and the capacitor delivers the load current until it is discharged to a threshold. Thus, charge recycling both saves energy and adds further randomization through random input pulse skipping.
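The charge-then-recycle behavior described above can be sketched as a two-state machine driven by a random number generator. The thresholds, step sizes, and injection probability are hypothetical, the capacitor is modeled with fixed voltage increments rather than real charge transfer, and the buck converter is not modeled at all.

```python
import random

# Two-state behavioral model: random injection until v_high, then recycling until v_low.
# All numeric parameters are hypothetical, normalized values.

def simulate_power_stage(steps=2000, v_low=0.6, v_high=1.0,
                         dv_inject=0.05, dv_load=0.01,
                         p_inject=0.3, seed=1):
    """Return a list of (v_cs, input_connected) tuples, one per time step."""
    rng = random.Random(seed)
    v_cs = v_low
    recycling = False
    trace = []
    for _ in range(steps):
        if recycling:
            v_cs -= dv_load                  # capacitor alone supplies the load
            if v_cs <= v_low:
                recycling = False            # reconnect the main input
        else:
            if rng.random() < p_inject:
                v_cs += dv_inject            # random power injection from the input
            if v_cs >= v_high:
                recycling = True             # disconnect the input, start recycling
        trace.append((v_cs, not recycling))
    return trace


trace = simulate_power_stage()
skipped = sum(1 for _, connected in trace if not connected)
print(f"input disconnected in {skipped} of {len(trace)} steps")
```

Because the time needed to reach the upper threshold depends on the random injections, the instants at which the input is skipped vary from cycle to cycle, which is the randomization effect the text attributes to charge recycling.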

The conducted EMI measurement in Fig. 2 demonstrates the effective EM trace randomization by the proposed design. Without the proposed techniques, the EMI peaks in the spectrum are highly correlated to the operating condition of the converter while the proposed technique suppresses EMI peaks and continuously randomizes the spectrum to prevent statistical analysis by EM-SCA. Fig. 3 shows the efficiencies with and without the proposed techniques. Thanks to the efficient energy use by the charge recycling technique, the proposed encrypted power supply achieves the peak efficiency of 90.5% and the maximum power overhead of only 4.9% without any degradation in the output regulation performance.



Figure 2. Conducted EMI measurement at I<sub>CORE</sub> of 200mA (a) without, and (b) with power injection and charge recycling.



Figure 3. Measured efficiency without and with proposed random parallel power injection and charge recycling.

**Keywords:** Charge recycling, EM-SCA, SCA Countermeasure

# INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] K. Wei, J. W. Kwak and D. B. Ma, "An Encrypted On-Chip Power Supply With Random Parallel Power Injection and Charge Recycling Against Power/EM Side-Channel Attacks," in IEEE Transactions on Power Electronics, vol. 38, no. 1, pp. 500-509, Jan. 2023.

# TASK 2810.059, ULTRA-LOW-POWER ROBUST SAR ADC FOR PMCW AUTOMOTIVE RADAR

YUN CHIU, UNIVERSITY OF TEXAS AT DALLAS, CHIU.YUN@UTDALLAS.EDU

# SIGNIFICANCE AND OBJECTIVES

Work on transistor level design of a two-step SAR ADC is presented. We find that distortion performance is improved at a given summing node switch sizing using the proposed soft summing node (SSN) bottom plate sampling. Additionally, a comparison time flash TDC is employed to increase the first stage resolution.

#### **TECHNICAL APPROACH**

The efficacy of the proposed SSN in a sample and hold (S/H) circuit was verified with transistor switches. The SAR ADC and comparison time flash TDC were also simulated at a transistor level. Due to the nonlinearity and voltage/temperature variation in the TDC, background calibration was implemented. To verify the calibration algorithm function without excessively long simulation, the TDC characteristics were extracted from transistor simulation and input into behavior simulation of the background calibration.

#### SUMMARY OF RESULTS

A schematic of the first stage SAR ADC is shown in Fig. 1 below.



Figure 1. First stage SAR ADC with passive offset cancellation to correct summing-node swing in bottom-plate sampling circuit and flash TDC to assist first stage resolution.

The summing node of the bottom-plate sampling circuit, node 2, is a soft summing node. This means that  $M_2$  is small, resulting in smaller nonlinear parasitics on the summing node at the expense of increased swing, and therefore distortion, on that node. However, since the capacitor  $C_x$  also samples the summing node voltage at the sampling edge, the swing and distortion on that node are captured and are not seen by the comparator or residue amplifier. Fig. 2 shows, in a transistor-level simulation of an S/H circuit, that soft summing node cancellation improves distortion performance at a given  $M_2$  size or, equivalently, that a smaller  $M_2$  can achieve the same distortion performance.



Figure 2. Bottom plate S/H distortion comparison from transistor level simulation.

To assist in the first stage resolution so that the design of the residue amplifier and second stage are relaxed, a flash TDC is used to measure the time taken by the comparator during the last bit cycle of the first stage SAR ADC. This comparison time is logarithmically related to the magnitude of the voltage residue seen during the last bit cycle, so time amplification is needed to ensure sufficient resolution and calibration is needed to undo the inherent non-linearity of the comparison time. The convergence of the background calibration can be seen in the ENOB of the first stage in Fig. 3 below.
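The logarithmic relation between comparison time and residue magnitude can be illustrated with the common first-order regeneration model of a latch, $t = \tau \ln(V_{logic} / |v_{res}|)$. The time constant and the logic-level voltage below are hypothetical, and the model ignores the nonlinearity and voltage/temperature variation that the background calibration is meant to correct.

```python
import math

# First-order latch regeneration model; tau and v_logic are hypothetical values.

def comparison_time(v_residue, tau=20e-12, v_logic=0.8):
    """Time for the latch to regenerate a nonzero input residue to a logic level."""
    return tau * math.log(v_logic / abs(v_residue))


def residue_magnitude(t_cmp, tau=20e-12, v_logic=0.8):
    """Invert the model: recover |v_residue| from a measured comparison time."""
    return v_logic * math.exp(-t_cmp / tau)


for v in (10e-3, 5e-3, 1e-3):
    t = comparison_time(v)
    print(f"|v_res| = {v * 1e3:4.1f} mV -> t_cmp = {t * 1e12:5.1f} ps")
```

In this model, halving the residue adds only a fixed increment of about $0.69\tau$ to the comparison time, which illustrates why time amplification and a nonlinear mapping are needed to turn time into usable extra resolution.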



Figure 3. Convergence of TDC background calibration in behavior simulation with TDC characteristics extracted from transistor level simulation.

In the following year, the plan is to tape out a prototype chip in a 22-nm process and measure the results in silicon.

**Keywords:** soft summing node (SSN), summing-node swing, summing-node distortion, flash TDC, background calibration

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

# TASK 2810.061, TWO-STAGE VERTICAL POWER DELIVERY AND MANAGEMENT FOR EFFICIENT HIGH-PERFORMANCE COMPUTING

HANH-PHUC LE, UNIVERSITY OF CALIFORNIA AT SAN DIEGO, HANHPHUC@UCSD.EDU

PATRICK MERCIER, UNIVERSITY OF CALIFORNIA AT SAN DIEGO

# SIGNIFICANCE AND OBJECTIVES

The project seeks to significantly improve the efficiency of power delivery from high-voltage busses to scaled-CMOS-compatible voltages (<1V) in a vertically and heterogeneously integrated architecture leveraging hybrid and switched-capacitor DC-DC converters. If successfully deployed, it can reduce the number of power pins by at least 2x and reduce thermal dissipation.

# **TECHNICAL APPROACH**

The new approach utilizes a 2-stage vertical PDM architecture with an optimal tapered current distribution, combining an integrated 4V-to-1V switched-capacitor voltage regulator (SCVR) stage located within the package substrate, underneath the processing die, with a 20V/48V-to-4V hybrid voltage regulator module (HVRM) stage on the PCB. The SCVR is co-packaged with deep-trench capacitors, and both the SCVR and the integrated capacitor interposer dies can be thinned to fit within the C4 bump height. The vertical power tree architecture enables a ~2x reduction in package PDM pins with a 4x interconnect loss reduction, resulting in a ~1.5x increase in available data IO pins.

# SUMMARY OF RESULTS

The Gen 1 versions of the two converter chips for the proposed system were designed and measured (Figs. 1-3). Multiple new circuit, control, and startup techniques have been implemented to improve their performance. The SCVR was implemented in a 65-nm process and the HVRM in a 180-nm process, with flip-chip bumping planned to reduce parasitic resistances and inductances. The full HVRM design achieved 94.2% efficiency, while the SCVR achieved 92.5% (Fig. 3). For the SCVR design, the team worked with our industry partner, Murata, to co-design a package substrate with new integrated passive devices (iPDs) that are isolated from each other to implement flying capacitors with 1.3 µF/mm<sup>2</sup> capacitance density.

In comparison with the state-of-the-art works with similar voltage conversion ratios, this work achieves ~10% higher peak system efficiency, equivalent to ~44% maximum loss reduction, and demonstrates scalability for future heterogeneously integrated systems that are not achievable with the single stage architecture.
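The relation between an efficiency gain and a loss reduction can be made concrete with a short calculation. The sketch below assumes that the two stage efficiencies quoted above can simply be multiplied, which is only an estimate because the stage peaks need not occur at the same load, and it interprets "~10% higher" as 10 percentage points. Both are assumptions, and the measured system efficiency may differ from this estimate.

```python
# Cascaded efficiency and relative loss reduction, under the assumptions stated above.

def cascade_efficiency(*stage_effs):
    """Overall efficiency of stages connected in series."""
    total = 1.0
    for eff in stage_effs:
        total *= eff
    return total


def loss_reduction(eff_new, eff_old):
    """Fractional reduction in loss (as a share of input power) from eff_old to eff_new."""
    return 1.0 - (1.0 - eff_new) / (1.0 - eff_old)


system = cascade_efficiency(0.942, 0.925)   # HVRM and SCVR values from the text
baseline = system - 0.10                    # hypothetical prior work, 10 points lower
print(f"estimated system efficiency: {system:.1%}")
print(f"loss reduction vs. baseline: {loss_reduction(system, baseline):.1%}")
```

Under these assumptions the loss reduction comes out close to the ~44% figure quoted in the text, which suggests the two statements are mutually consistent.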



Figure 1. Two-stage vertical power delivery architecture with Gen.1 20V-to-1V conversion, implemented in 65-nm (SCVR) and 180-nm (HVRM) processes.



Figure 2. Measured load step response with 1 HVRM + 3 SCVRs.



Figure 3. Efficiency vs. load current for 1 HVRM + 6 SCVRs.

**Keywords:** vertical power delivery, vertical power tree, DC-DC converter, switched-capacitor, hybrid converter.

# INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] C. Hardy et al., "11.1 A Scalable Heterogeneous Integrated Two-Stage Vertical Power-Delivery Architecture for High-Performance Computing," ISSCC 2023, pp. 182-184.

# TASK 2810.065, POWER-EFFICIENT AND RELIABLE 48-V DC-DC CONVERTER WITH DIRECT SIGNAL-TO-FEATURE EXTRACTION AND DNN-ASSISTED MULTI-INPUT MULTIPLE-OUTPUT FEEDBACK CONTROL

MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

#### SIGNIFICANCE AND OBJECTIVES

The goal of this project is to extract critical features of the large step-down ratio DC-DC converter and perform optimization. The proposed feature extraction and optimization functions would improve the converter's efficiency and enhance the converter's reliability.

#### **TECHNICAL APPROACH**

We designed a high-efficiency 24V-to-1V buck converter featuring two techniques to improve efficiency and reliability. First, we propose a fast *in-situ* efficiency tracking (FIT) technique, which helps to maintain high power efficiency across load conditions and process, voltage, and temperature variations. Second, we propose a power-FET code roaming technique to avoid excessive aging on specific segments.

#### SUMMARY OF RESULTS

We designed, fabricated, and tested a 24V-to-1V buck converter prototype in a 180-nm BCD process. Through the process, we proposed and verified two optimization techniques for a 24V-to-1V DC-DC converter. In the following year, we will design a second prototype chip focusing on in-situ health monitoring and reliability enhancement of a 48V-to-1V converter.



Figure 1. (a) Efficiency curves across different numbers of active segments ( $N_{act}$ ). (b) Efficiency curves across three different efficiency tracking methods.

The prototype implements two novel optimization techniques. First, we proposed a fast *in-situ* efficiency tracking technique. Second, we implemented a power-FET code roaming technique that avoids the excessive aging of specific power FET segments, slowing down on-resistance ( $R_{on}$ ) degradation.

Fig. 1 illustrates the efficiency tracking test results. Fig. 1(a) shows the efficiency curves across different numbers of active segments,  $N_{act}$ . Fig. 1(b) compares the efficiency curves for three tracking cases: case (1) without efficiency tracking, case (2) with LUT-based initial guess only, and case (3) with the FIT method. With the proposed FIT method, the converter achieves a peak efficiency of 93.89% at 405mA. The FIT improves efficiency by 34.74% compared to case (1) and by 1.67% compared to case (2). It also enables >85% efficiency across a wide range of load currents from 65mA to 5A (76.92×).
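The difference between case (2) and case (3) can be illustrated with a toy model in which a look-up table gives a starting number of active segments and a local search then refines it. The loss model (conduction loss falling with  $N_{act}$  and switching loss rising with it), the table entries, and the hill-climbing search are hypothetical stand-ins; they are not the FIT algorithm itself.

```python
# LUT initial guess followed by a local search over the number of active segments.
# The loss model and every parameter below are hypothetical.

def efficiency(n_act, i_load, v_out=1.0, r_seg=0.4, p_sw_seg=0.01):
    """Toy efficiency: conduction loss shrinks with n_act, switching loss grows."""
    p_out = v_out * i_load
    p_cond = i_load ** 2 * r_seg / n_act
    p_sw = p_sw_seg * n_act
    return p_out / (p_out + p_cond + p_sw)


def lut_guess(i_load, lut=((0.1, 1), (1.0, 4), (5.0, 8))):
    """Coarse starting point: first table row whose current limit covers i_load."""
    for limit, n_act in lut:
        if i_load <= limit:
            return n_act
    return lut[-1][1]


def track(i_load, n_max=16):
    """Hill-climb from the LUT guess until no neighboring setting is better."""
    n = lut_guess(i_load)
    while True:
        neighbors = [m for m in (n - 1, n, n + 1) if 1 <= m <= n_max]
        best = max(neighbors, key=lambda m: efficiency(m, i_load))
        if best == n:
            return n
        n = best


for i_load in (0.405, 2.0):
    n0, n1 = lut_guess(i_load), track(i_load)
    print(f"I_load={i_load} A: LUT N_act={n0}, tracked N_act={n1}")
```

In this toy model the efficiency has a single peak over  $N_{act}$ , so the local search reaches the best setting from any starting point; a good LUT guess mainly shortens the search.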



Figure 2. (a) The proposed code roaming technique slows  $R_{on}$  degradation by 5.8×. (b) It also improves efficiency.

Fig. 2 exhibits the test results of power FET code roaming. Fig. 2(a) shows the aging measurement on ML2's  $R_{on}$ . We accelerate the aging process by operating the converter at 125°C. The measurement results show that the code roaming technique can slow the  $R_{on}$  degradation by 5.8×. Fig. 2(b) shows the efficiency improvement measurement with the code roaming. We estimate that code roaming mitigates local hotspot development and thereby improves efficiency. The efficiency improvement increases with  $I_{load}$  and achieves 0.18% at  $I_{load}$ =900mA,  $N_{act}$ =4.
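The wear-leveling idea behind code roaming can be sketched as a rotating activation window over the power-FET segments. The rotation pattern, the segment count, and the one-step rotation granularity are hypothetical; the summary does not describe how the codes on the chip are sequenced.

```python
# Rotating activation window: the same number of segments is on at every step,
# but which segments are on changes, so stress is spread evenly.

def roaming_codes(n_segments, n_act):
    """Yield one on/off mask per roaming step, rotating the active window by one."""
    for start in range(n_segments):
        yield [((i - start) % n_segments) < n_act for i in range(n_segments)]


N_SEGMENTS, N_ACT = 8, 4   # hypothetical segment count and active count
usage = [0] * N_SEGMENTS
for mask in roaming_codes(N_SEGMENTS, N_ACT):
    for i, on in enumerate(mask):
        usage[i] += on

print(usage)
```

Without roaming, the same  $N_{act}$  segments would carry the load in every step while the rest stay idle; with the rotation, each segment is active for the same share of the time.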

**Keywords:** efficiency tracking, LUT-based control, gradient descent, code roaming, reliability enhancement

#### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Z. Wang et al., "93.89% Peak Efficiency 24V-to-1V DC-DC Converter with Fast In-Situ Efficiency Tracking and Power-FET Code Roaming," 2023 IEEE ESSCIRC, September 2023.

# TASK 2810.067, HIGHLY EFFICIENT EXTREME-CONVERSION-RATIO BUCK HYBRID CONVERTERS

PARTHA PANDE, WASHINGTON STATE UNIVERSITY, PANDE@WSU.EDU

DEUKHYOUN HEO AND JANA DOPPA, WASHINGTON STATE UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

The objective of this project is to design and demonstrate a high-performance, energy-efficient voltage regulator with an exceptionally high voltage conversion ratio (VCR). Furthermore, we intend to introduce a new optimization framework enabled by machine learning (ML) technologies.

#### **TECHNICAL APPROACH**

The focus of this research is threefold: (1) to develop a novel high-efficiency, single-input-single-output (SISO) HCR hybrid buck converter, (2) to design a single-input-multi-output (SIMO) HCR hybrid converter, offering simultaneous operation with minimal cross-regulation and rapid response, and (3) to establish a machine learning (ML)-based framework dedicated to optimizing the circuit parameters of SISO and SIMO HCR hybrid converters.

#### SUMMARY OF RESULTS

We have conducted an extensive investigation of a SIMO HCR inductor-first hybrid (IFH) converter. The SIMO converter comprises a power inductor and switched-capacitor power stages (SCPSs), sharing similarities with our previous year's single-output IFH converter design, along with a voltage regulation controller for multiple outputs. The proposed converter has demonstrated its capability to achieve a maximum step-down VCR of 32 while maintaining high efficiency.

One of our goals is to speed up the transient response, and we have explored a technique to do so. The response of the hybrid converter is limited by both the switched-inductor component and the switched-capacitor component. With the adoption of an inductor-first structure, the necessary inductor current change is reduced by up to 30 times compared to conventional and other hybrid structures. The second impediment is the recovery time of the flying capacitor voltage, which is improved by using a higher switching frequency ($f_{sw}$) during the transient, as shown in Fig. 1.

Figure 1. Proposed technique to enhance transient response of the high-conversion-ratio (HCR) inductor-first hybrid (IFH) converter: (a) response enhancement by reducing inductor current (red), (b) recovery time enhancement of switched-cap stage by using higher $f_{sw}$ (red) during transient time.

Figure 2. Fabricated SIMO converter chip photo.

Moreover, we have formulated a "Reversed-D Power Loss Reduction" technique. Under high duty cycle (D) conditions, the power stage is reconfigured so that the output is controlled with "1-D" instead of "D". This technique enables the proposed converter to significantly reduce the hard-discharge power loss as well as the voltage drop across the power switches.

The proposed HCR converter has been fabricated using a 180nm CMOS process with an area of $2\times2.5$ mm<sup>2</sup> including test pads. A die photo of the fabricated chip is shown in Fig. 2. We employed a 1.2-nH off-chip inductor with an f<sub>SW</sub> of 2.5MHz. Post-layout simulation of the proposed SIMO converter shows that it can produce three outputs, of 0.49V to 0.9V, 0.98V to 1.8V, and 1.39V to 2.7V, from a 12V input voltage.

**Keywords:** extremely high conversion ratio, buck converter, hybrid topology, efficiency enhancement

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

# TASK 2810.068, ACTIVE EMI FILTERING WITH SWITCH-MODE AMPLIFIER FOR HIGH EFFICIENCY ALEX J. HANSON, THE UNIVERSITY OF TEXAS AT AUSTIN, AJHANSON@UTEXAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Propose and develop a switch-mode active EMI filter to replace a conventional LC filter for a smaller size and higher efficiency.

#### TECHNICAL APPROACH

The proposed active EMI filter (AEF) consists of a class-D switching amplifier (synchronous buck converter) using wide-bandgap (WBG) GaN devices switching at a VHF frequency of 31MHz to keep its own EMI out of the regulated EMI range. We also introduce a new gate driver combining an LVDS-output comparator, a high-speed LVDS-input-output digital isolator, and a gate driver IC with ns-range propagation delay to reduce the total loop delay from 22ns to 9ns and thereby achieve higher current attenuation.

#### SUMMARY OF RESULTS

The proposed AEF achieved 30dB current attenuation from the active circuit at the fundamental frequency of 150kHz without fully accounting for the effects of the closed-loop input impedance of the main boost converter and the output impedances of the high-frequency LC filter at the input and of the line impedance stabilization network (LISN), as shown in Fig. 1. A full closed-loop compensation for the proposed switch-mode AEF has been developed theoretically and demonstrated in simulation, considering all impedance interference with the closed-loop AEF circuit.



Figure 1. Impedance interference from the HF LC filter, an LISN, and from the input of the main converter to the AEF.

The current attenuation of 30dB is limited by the maximum achievable bandwidth (~1.8MHz) of the AEF's compensator, which in turn is limited by the large 22ns loop delay of the compensator circuit. We implemented a new gate driver, built from a comparator with low-voltage differential signaling (LVDS) output, a high-speed LVDS-input-output digital isolator, and a gate driver IC with ns-range propagation delay, to achieve a minimum loop delay of less than 9ns. This allowed us to increase the compensator bandwidth up to 5MHz and achieve a much higher attenuation of 41.4 dB.
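The link between loop delay and usable bandwidth can be illustrated with a rough calculation. The sketch below treats the loop delay as a pure time delay, which adds a phase lag of 360° × f × t_d at frequency f; this model and the function name are illustrative assumptions, not the compensator analysis used in the project.

```python
def delay_phase_lag_deg(freq_hz: float, delay_s: float) -> float:
    """Phase lag, in degrees, that a pure time delay adds at a given frequency."""
    return 360.0 * freq_hz * delay_s


# Values quoted in the text: (compensator bandwidth in Hz, loop delay in s).
before = delay_phase_lag_deg(1.8e6, 22e-9)
after = delay_phase_lag_deg(5.0e6, 9e-9)

print(f"22 ns delay at 1.8 MHz: {before:.1f} deg")
print(f"9 ns delay at 5 MHz:  {after:.1f} deg")
```

Under this simplified model, the delay costs about 14° of phase at 1.8MHz with 22ns and at most about 16° at 5MHz with 9ns, so the shorter delay supports the wider bandwidth for a similar phase budget.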



Figure 2. Higher current attenuation is achieved with a larger compensator bandwidth by reducing the loop delay of the compensator from 22ns to less than 9ns.



Figure 3. A new prototype of the proposed AEF.

A new prototype of the proposed AEF has been built, operating under full closed-loop conditions. It is expected to achieve over 40dB current attenuation at 150kHz, to occupy less than ~0.4 in<sup>3</sup> (1/8 of the volume of the size-optimized LC filter), and to dissipate less than 2W for 300W output power.

**Keywords:** Active EMI Filter, Synchronous Active EMI Filter, Active Filter with GaN Devices, Fractional-Order Filter, VHF class-D Converter

#### INDUSTRY INTERACTIONS

Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] D. T. Nguyen, C. Deng, E. Macias, and A. J. Hanson, "Synchronously Switched Active EMI Filter," 2022 IEEE ECCE, Detroit, MI, USA, 2022, pp. 1-8.

[2] D. T. Nguyen, E. Macias and A. J. Hanson, "Active EMI Filter with Switch-Mode Amplifier for High Efficiency," 2022 IEEE APEC, Houston, TX, USA, 2022, pp. 443-450.

[3] A. J. Hanson, D. T. Nguyen, "Synchronous Switch-mode Active Electromagnetic Interference Cancellation Circuit and Method", US Patent App, 18,165,887.

# TASK 2810.072 / 2810.073, AI/ML EDGE HARDWARE FOR ULTRA-RELIABLE WIRELESS NETWORKS DAVID J. ALLSTOT, OREGON STATE UNIVERSITY, ALLSTOTD@OREGONSTATE.EDU YIORGOS MAKRIS, THE UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

The overall objective of this project is to develop area- and power-efficient on-chip real-time digital predistortion techniques for state-of-the-art RF transmitters using energy-efficient switched-capacitor power amplifiers.

#### **TECHNICAL APPROACH**

These goals are being addressed using a real-valued time-delay neural network that comprises a fully connected input layer, hidden layers, and an output layer. The hidden layers comprise 4 fully connected layers (40 neurons) with activation functions; there are 6 total layers. Good results have been achieved using a non-quantized NN wherein both AM-AM and AM-PM are trained together. Simulations of the DPD (Digital Pre-Distortion) are applied to measured results from the novel power amplifier in Figs. 1 and 2.
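The layer structure described above can be sketched as a plain forward pass. This is a minimal illustration only: the weights are random and untrained, and the memory depth, the tanh activation, the I/Q input format, and the reading of "40 neurons" as 40 per hidden layer are all assumptions that the text does not specify.

```python
import math
import random

MEMORY_DEPTH = 3    # assumed number of past samples fed to the network
HIDDEN_WIDTH = 40   # assumed: 40 neurons in each hidden layer
NUM_HIDDEN = 4      # four fully connected hidden layers, as in the text


def make_layer(n_in, n_out, rng):
    """Create random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation=None):
    """Apply one fully connected layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out


def build_network(seed=0):
    """Input layer, four hidden layers, output layer: six layers in total."""
    rng = random.Random(seed)
    n_in = 2 * (MEMORY_DEPTH + 1)   # I and Q of the current and past samples
    sizes = [n_in] + [HIDDEN_WIDTH] * NUM_HIDDEN + [2]
    return [make_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predistort(iq_history, network):
    """Map (I, Q) tuples, newest first, to one predistorted (I, Q) pair."""
    x = [component for sample in iq_history for component in sample]
    for layer in network[:-1]:
        x = dense(x, layer, math.tanh)
    return tuple(dense(x, network[-1]))


network = build_network()
print(predistort([(0.1, -0.2)] * (MEMORY_DEPTH + 1), network))
```

Keeping the inputs and outputs real-valued (separate I and Q terms) lets a single network learn the amplitude and phase corrections jointly, which is in the spirit of training AM-AM and AM-PM together.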

#### SUMMARY OF RESULTS

A novel 8-core SCPA has been designed, implemented, and tested with and without DPD as indicated below. The preliminary measured results in Figs. 3 and 4 are encouraging.



Figure 1. Architecture/circuits of the 8-way power combiner.



Figure 2. Layout of the 8-way power combiner.



Figure 3. Test flow for dynamic measurements plus DPD NN.



Figure 4. Measured (a) spectrum, (b) constellation, (c) EVM, and (d) ACLR of the modulated signals before and after DPD.

Future work will consider quantized neural networks. We will also study classifying each output sample as the input symbol represented by the nearest point in the constellation diagram, which may be a more practical alternative.

**Keywords:** switched-capacitor power amplifier, class-G PA, backoff efficiency, digital pre-distortion techniques

#### INDUSTRY INTERACTIONS

Intel, Qualcomm, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] B. Qiao, et al., "An eight-core class-G switched-capacitor power amplifier with eight power backoff efficiency peaks," *IEEE RFIC Conf.*, 2022, pp. 1-4.

[2] N. Najim, et al., "Machine learning techniques for digital pre-distortion in CMOS switched-capacitor power amplifiers," SRC Techcon, 2022.

# TASK 2810.075, HYBRID STEP-DOWN DC-DC CONVERTER WITH LARGE CONVERSION RATIOS FOR 48V AUTOMOTIVE APPLICATIONS HOI LEE, UNIVERSITY OF TEXAS AT DALLAS, HOILEE@UTDALLAS.EDU JIN LIU, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

This research aims to develop innovative capacitor-assisted hybrid DC-DC converters to provide high power efficiency under large input-to-output voltage conversions in 48V automotive applications. A systematic approach will be developed to realize hybrid converters with a minimal number of low-voltage power FETs and passive components to improve the converter power density.

#### **TECHNICAL APPROACH**

We investigated both flying-capacitor multi-level and switched-capacitor-assisted converter topologies to evaluate operation flexibility in different conditions; the requirements of voltage balancing and pre-charging of flying capacitors; the capability of providing high power density; and different power losses. We developed a new technique, capacitor-assisted dual-inductor filtering, which is combined with common SC architectures to reduce the output-to-input conversion ratio. The minimum on-time of the power switches can be expanded such that the converter can operate in the MHz range with high efficiency. None of the flying capacitors requires voltage balancing in the steady state, which reduces the controller design complexity.

#### SUMMARY OF RESULTS

The proposed 4:1 Dickson-based capacitor-assisted dual-inductor (CADI) converter is shown in Fig. 1. It consists of two inductors, nine power switches, and four flying capacitors. The output-to-input conversion ratio (CR) is reduced by a factor of 6, which helps to expand the minimum on-time of the power switches by 12 times compared with the conventional buck converter. None of the flying capacitors in the converter requires balancing in the steady state. Both the switching-node voltage swings and the flying capacitor voltages are reduced considerably, which reduces the converter switching loss and improves the converter reliability, respectively.

Figure 1. Architecture of the proposed 4:1 Dickson-based CADI converter [1].

Figure 2. Measured power efficiency of the proposed converter at 1MHz.

The proposed converter prototype was implemented to directly generate an output voltage of 1V to 2V from an input voltage of 40V to 65V at a switching frequency of 1MHz [1]. All power switches are discrete MOSFETs from Infineon. The converter delivers a load current I<sub>0</sub> of up to 80A. Fig. 2 presents the measured power efficiency of the proposed converter. It achieves over 82% for load currents from 2A to 80A. The peak power efficiency reaches 94.3% at 1MHz for 48V-to-1V conversion. To the best of our knowledge, the proposed single-stage single-step converter supports the highest output current of 80A, provides the highest power density of 636W/in<sup>3</sup>, and achieves the highest peak power efficiency of 94.3% compared with state-of-the-art designs.
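To see why on-time expansion matters at this conversion ratio, the sketch below estimates the switch on-time of an ideal (lossless) conventional buck converter and applies the 12-times expansion reported above. The ideal duty-cycle relation D = V<sub>out</sub>/V<sub>in</sub> and the function name are simplifying assumptions for illustration; they are not taken from the prototype's measurements.

```python
def buck_on_time_ns(v_in: float, v_out: float, f_sw_hz: float) -> float:
    """On-time per cycle of an ideal buck converter, in nanoseconds."""
    duty = v_out / v_in          # ideal buck duty cycle
    return duty / f_sw_hz * 1e9  # on-time = D * T_sw


# 48V-to-1V conversion at the 1MHz switching frequency used in the prototype.
conventional_ns = buck_on_time_ns(48.0, 1.0, 1e6)
expanded_ns = 12 * conventional_ns  # 12x expansion reported for the CADI converter

print(f"conventional buck on-time: {conventional_ns:.1f} ns")
print(f"with 12x expansion:        {expanded_ns:.1f} ns")
```

Under these assumptions the on-time grows from roughly 21ns to about 250ns, a far easier target for gate drivers and control logic at MHz switching frequencies.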

Based on these promising results from the proposed converter, we are further exploring control techniques to regulate the output voltage.

**Keywords:** DC-DC converter, capacitor-assisted dual inductor filtering, high-conversion-ratio step-down converter, hybrid converter, non-isolated converter

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] W. Han et al., "An 80A 48V-input capacitor-assisted dual-inductor hybrid Dickson converter for large-conversion ratio applications," in *Proc. IEEE ECCE*, Oct. 2022, pp. 1-5.

# TASK 2810.078, PROGRAMMABLE MIXED-SIGNAL ACCELERATOR FOR DNNS WITH DEPTHWISE SEPARABLE CONVOLUTION LAYERS BORIS MURMANN, STANFORD UNIVERSITY, MURMANN@STANFORD.EDU

#### SIGNIFICANCE AND OBJECTIVES

Deep neural networks (DNNs) require massively parallel and energy-efficient multiply-accumulate (MAC) circuitry. In-memory computing (IMC) has shown potential. This work has looked for a middle ground between IMC and standard digital processing by investigating both mixed-signal compute arrays and fully digital approaches. Most recently, we have developed a digital IMC processor that is superior to the traditional analog approach.

#### **TECHNICAL APPROACH**

Our hardware is built to efficiently run bottleneck layers, which are used in modern DNNs like MobilenetV2 (MBNetV2) that target Tiny-ML applications. The key ingredient of our approach is a processing element (PE) array that intersperses partial product accumulation circuitry with local SRAM kernel storage and digital multipliers. The density of these circuits allows us to fully unroll and pipeline the operations of a bottleneck layer to (1) reduce the activation memory needed, (2) eliminate accumulation buffers and (3) eliminate repeat weight and activation accesses (see Fig. 1).

#### SUMMARY OF RESULTS

Quantization experiments on a small MBNetV2 (<60kB) for several tinyML applications have shown that 8-bit quantization results in better accuracy than 4-bit quantization for the same network size, motivating the use of digital computing in the final stage of this project. We have thus designed a kernel for a fully digital IMC approach that uses the same type of local memories as the mixed-signal (MS) approach but uses 8-bit bit-serial multiplications with a digital adder tree and accumulator.

Using the IMC kernel, we developed an end-to-end optimized processor for low-energy operation with tinyML inference workloads. We taped out and measured a prototype in 28-nm CMOS (see Fig. 2). It achieves $3.1\,\mu$J per inference (with 90.2% accuracy) on the CIFAR-10 benchmark, as well as commensurate energy savings on all standard tinyML application benchmarks.

As shown in the figure, each compute memory slice contains custom latch array (CLA) memory and bit-serial arithmetic that performs 8b×8b multiplications over a clock cycle. The chip runs at a clock frequency of 10 MHz and achieves a throughput of 400 frames per second. The merits of using CLA-based memory are discussed in [3].
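The idea of a bit-serial multiplication feeding an adder tree and an accumulator can be sketched in software as follows. This is a behavioral illustration only, assuming unsigned 8-bit activations processed one bit at a time, least significant bit first, with signed integer weights; the function names are hypothetical and the code does not model the chip's actual datapath or timing.

```python
def adder_tree(values):
    """Sum values pairwise, level by level, in the manner of an adder tree."""
    level = list(values)
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-sized levels with a zero operand
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0] if level else 0


def bit_serial_dot(weights, activations, bits=8):
    """Dot product computed one activation bit-plane at a time."""
    acc = 0
    for b in range(bits):
        # Each weight is gated by bit b of its activation (a 1-bit multiply).
        partial = adder_tree(w * ((x >> b) & 1)
                             for w, x in zip(weights, activations))
        acc += partial << b  # weight the bit-plane sum by 2**b and accumulate
    return acc


weights = [3, -5, 7, -128]
activations = [255, 0, 17, 200]
assert bit_serial_dot(weights, activations) == sum(
    w * x for w, x in zip(weights, activations))
```

Because each step multiplies by a single bit, the multipliers reduce to simple gating logic, and the shift-and-accumulate loop recovers the full-precision result.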



Figure 1. Pipelined machine learning processor architecture for bottleneck layers.



Figure 2. Prototype IC in 28-nm CMOS (4.09 mm<sup>2</sup> core area).

**Keywords:** Deep neural networks, hardware accelerators, in-memory computing, mixed-signal integrated circuits

#### INDUSTRY INTERACTIONS

IBM, NXP, Qualcomm, Samsung, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] W.-H. Yu et al., "A 4-bit Mixed-Signal MAC Array with Swing Enhancement and Local Kernel Memory," 2021 IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), 2021.

# TASK 2810.079, HIGH-POWER-DENSITY IN-PACKAGE SIMO CONVERTERS FOR NEXT-GENERATION MICROPROCESSORS CHENG HUANG, IOWA STATE UNIVERSITY, CHENGH@IASTATE.EDU

#### SIGNIFICANCE AND OBJECTIVES

A single-inductor multiple-output (SIMO) converter allows multiple rails to share one inductor, reducing cost and form factor. However, it suffers from significant self- and cross-regulation limitations, which restrict its capability to handle fast and large current steps. This project aims to address these limitations with a significant improvement in load-transient handling capability.

#### **TECHNICAL APPROACH**

This project is developing a SIMO converter utilizing voltage-mode hysteretic control and digital LDOs to eliminate both small-signal and large-signal limitations during large and fast transients. In addition, freewheel-free power stage control and bootstrapping for the output switches are implemented, which improve both efficiency and output load range. The project is also developing a 3-level SIMO converter for a higher conversion ratio. Closed-loop control is used for both flying capacitor voltage calibration and output voltage regulation, so that fluctuations of the flying capacitor voltage and the output voltage are minimized during fast and large load transients.

#### SUMMARY OF RESULTS

Both designs have been verified in simulation with a TSMC 180-nm process. A SIMO converter utilizing voltage-mode hysteretic control and digital LDOs has been fabricated and verified in measurement, as shown in Fig. 1. The power inductor used is a 1-$\mu$H 0806 inductor. While the inductor charge/discharge control is synchronized at a higher frequency, the output power distribution control is synchronized at 1/8 of the clock and rotated between the two outputs. The measurement results verify that the converter can operate properly without the free-wheel switch, which improves the power efficiency.

Fig. 2 shows the chip layout of a 3-level SIMO converter for a higher conversion ratio. 1.8-V core devices are used to achieve voltage conversion from 3.3V to around 1V or sub-1V. Due to the unique topology and controller design, a free-wheel transistor is used in the converter so that, when both outputs are overpowered, redundant current can be dumped in time without causing voltage fluctuation. Discharge-time control logic regulates the freewheel current to improve total efficiency. Thanks to the operating principle of the 3-level power stage, a much smaller 220-nH 0402 inductor can be used in this converter while achieving similar voltage ripple and a faster large-signal response. The design was submitted for fabrication in May 2023.



Figure 1. Chip photo of a SIMO converter utilizing voltage-mode hysteretic control and digital LDOs.



Figure 2. Chip layout of a 3-level SIMO converter for a higher conversion ratio.

**Keywords:** single-inductor multiple-output, voltage-mode, digital LDO, bootstrapping, 3-level converter

#### INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

# TASK 2810.080, EFFICIENT AND HIGH-DENSITY FULLY IN-PACKAGE GAN-BASED HIGH-RATIO DC-DC CONVERTERS CHENG HUANG, IOWA STATE UNIVERSITY, CHENGH@IASTATE.EDU HOUQIANG FU, ARIZONA STATE UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

Three-level and double step-down (DSD) converters are two of the most popular state-of-the-art topologies for high-ratio step-down conversion. The current objective is to design a topology better than both topologies in terms of efficiency and power density.

#### **TECHNICAL APPROACH**

A 4-phase switched-capacitor (4PSC) topology is proposed (Fig. 1) and compared with DSD and 3-level topologies for 48V to 1V conversion. The proposed topology has only one inductor, just like 3-level converters. Converter size and BoM are dominated by the inductor, which limits the power density of DSD converters because they require an extra inductor. Therefore, the proposed topology is based on a single inductor for lower cost and higher density. A larger switch is added to provide lower-resistance current paths for inductor discharge because it occupies most of the switching period.

#### SUMMARY OF RESULTS

The proposed 4PSC topology has an effective frequency of 4X the switching frequency, which means that for the same-ripple condition it can be switched at a 4X lower frequency than DSD, reducing switching loss. Moreover, for each topology, low-voltage (LV) rated CMOS devices outperform available 200V GaN devices, making LV devices in low-cost BCD processes a more suitable choice for the design. The proposed topology can utilize 12-V devices (Fig. 1), whereas DSD and 3-level topologies require devices rated for at least 24V. Simulations have been carried out in Cadence with a 180-nm BCD process to verify the analysis. A chip will be fabricated in TSMC 180-nm BCD with a newer and improved version of the topology.

Simulations are performed to verify the efficiency comparison of the three topologies at different loading conditions and conversion ratios, as shown in Fig. 2. The optimized efficiency of the proposed topology is higher than the DSD (~3% at 1V/5A) and much higher than the 3-level topology under the same-ripple condition when using LV devices for each topology. In addition, LV devices are also observed to show superior efficiency performance compared to 55-V devices at all the loading conditions as shown in Fig. 2, primarily due to lower switching loss.

In addition, a new 48-V to 1-3.3-V direct-down-conversion topology has been developed at the PCB level (Fig. 3). The functionality of the board has been verified and efficiency measurements have been obtained.



Figure 1. Proposed single-inductor 4PSC topology indicating voltage rating of the devices in a 180-nm BCD process.



Figure 2. 48V-1V efficiency results versus (Left) total transistor area for each topology, and (Right) loading current for 4PSC.



Figure 3. Proposed board-level converter design.

**Keywords:** high-ratio, hybrid, 48V, DC-DC converter, point-of-load converter

#### INDUSTRY INTERACTIONS

IBM, Intel, Richtek, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. R. Khan, K. Wei, X. Zhang, and C. Huang, "A Single-Inductor 4-Phase Hybrid Switched-Capacitor Topology for Integrated 48V-to-1V DC-DC Converters," IEEE International Symposium on Circuits and Systems (ISCAS) 2023.

# TASK 2810.087, GRID OPTIMIZATION AND SILICON VALIDATION FOR CHIP ROBUSTNESS

FARID N. NAJM, UNIVERSITY OF TORONTO, F.NAJM@UTORONTO.CA CHRIS H. KIM, UNIVERSITY OF MINNESOTA

#### SIGNIFICANCE AND OBJECTIVES

On-chip power grid design and sizing can have a direct impact on the supply voltage, circuit timing, and functionality. As the chip ages, even a well-designed grid degrades over time and causes voltage drop violations. We develop methods for grid optimization and design that guarantee grid robustness under electromigration (EM).

#### TECHNICAL APPROACH

Our previous work on grid optimization [1] provided a method for grid fixing by varying the widths of metal lines, to be used once an EM-induced voltage drop violation has been found. In this work, we will build on [1] and address its lack of a void model and its main runtime bottleneck, namely that the objective function computation is slow. We do this by including a realistic physical void model and introducing a novel model reduction in the lines and the trees to speed up the simulation and tackle large grids.

#### SUMMARY OF RESULTS

Metal lines fail when voids form in them, due to high tensile stress in the metal, arising from current-density-induced atomic movement, called electromigration (EM). In previous work (2016-18), we developed a simulator (called EKM) for the stress over time, which predicts when voids will appear and the time to failure (TTF). This simulation engine, which lacks a true void growth model, was used to compute the objective function, i.e., the TTF, in our optimization engine [1].

In 2020, we developed electrical circuit models using RC metal lines that serve as a proxy for the time-evolution of electromigration, so that regular circuit simulation can be used to predict the TTF. In 2021, we developed a reduced model for these RC lines, to reduce the simulation runtime.

In the first year of this project (last year), we finalized a new version of the stress simulator that has three key features: (1) it uses our equivalent circuit models as the basis for formulating the simulation problem, (2) it uses the reduced order model for speeding up the simulation, and (3) it includes a realistic physical void growth model and corresponding simulation model. This was recently published in [2]. We also rewrote our optimization engine [1] to make use of the new simulation capability so that its objective is more realistic due to the realistic void growth and its runtime is faster due to the novel reduced line model. Various algorithmic and heuristic changes were required to combine the two improved capabilities (simulation and optimization). The simulation engine had to be augmented with ways to compute the sensitivities of the conductance matrix, which are needed for the linear program (LP) optimizer. In addition, the time when the LP constraints are enforced has required some heuristic modifications to avoid increased simulation time.

Table 1 shows results on eight test grids, using the reduced model. The "Orig MTF" is the original MTF (mean time to failure) of the grid, which is deemed failed if it is less than 12 years. The "New MTF" is the MTF after optimization, which is at least 12 years in all cases. The runtimes and the speedups relative to using the full model are shown, along with the area overheads.

Table 1. Overall speedup results using the reduced line model.

| Grid size | Orig MTF (y) | New MTF (y) | #LP | Runtime | Speedup | Area inc. |
|-----------|--------------|-------------|-----|---------|---------|-----------|
| 12k       | 11           | 14          | 1   | 2 min   | 5X      | 1.5%      |
| 62k       | 10           | 12          | 4   | 38 min  | 2.2X    | 6%        |
| 37k       | 7            | 12          | 2   | 6 min   | 2.8X    | 3%        |
| 146k      | 8            | 13          | 2   | 2.9 hr  | 3.5X    | 3%        |
| 560k      | 8            | 14          | 1   | 58 min  | 3.3X    | 1.5%      |
| 1.2 M     | 10           | 14          | 1   | 9.8 hr  | 1.9X    | 1.5%      |
| 1.7 M     | 9.6          | 14          | 1   | 4.5 hr  | 2.7X    | 1.5%      |
| 1.6 M     | 5            | 14          | 2   | 9.2 hr  | 2.3X    | 3%        |
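As a quick consistency check, the pass/fail criterion described above can be applied to the tabulated values. The sketch below transcribes the MTF and speedup columns of Table 1 by hand and treats an MTF of exactly 12 years as passing, in line with the "less than 12 years" failure criterion; the data structure and function name are illustrative only.

```python
# (grid size, original MTF in years, new MTF in years, speedup vs. full model)
TABLE_1 = [
    ("12k", 11, 14, 5.0),
    ("62k", 10, 12, 2.2),
    ("37k", 7, 12, 2.8),
    ("146k", 8, 13, 3.5),
    ("560k", 8, 14, 3.3),
    ("1.2M", 10, 14, 1.9),
    ("1.7M", 9.6, 14, 2.7),
    ("1.6M", 5, 14, 2.3),
]
MTF_TARGET_YEARS = 12  # a grid is deemed failed below this MTF


def summarize(rows, target=MTF_TARGET_YEARS):
    """Count grids failing the target before and after, and average the speedup."""
    speedups = [row[3] for row in rows]
    return {
        "failing_before": [row[0] for row in rows if row[1] < target],
        "failing_after": [row[0] for row in rows if row[2] < target],
        "mean_speedup": sum(speedups) / len(speedups),
    }


summary = summarize(TABLE_1)
print(summary)
```

With these values, all eight grids fail the 12-year target before optimization and none fail afterward, with an average speedup of roughly 3X from the reduced line model.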

Going forward, we are testing a new heuristic for reducing the cost of the LP and speeding up the overall optimization flow. We will also explore allowing the optimizer to vary both the line widths and source currents.

**Keywords:** integrated circuits, electromigration, stress, reliability, optimization

#### INDUSTRY INTERACTIONS

Intel, NXP, Siemens, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Z. Moudallal, V. Sukharev, and F. N. Najm, "Power grid fixing for electromigration-induced voltage failures," International Conf. on Computer-Aided Design, Nov 2019.

[2] B. Shahriari and F. N. Najm, "Fast electromigration simulation for chip power grids," IEEE International Symposium on Quality Electronic Design, April 5-7, 2023.

# TASK 2810.088, GRID OPTIMIZATION AND SILICON VALIDATION FOR CHIP ROBUSTNESS

CHRIS H. KIM, UNIVERSITY OF MINNESOTA, CHRISKIM@UMN.EDU FARID NAJM, UNIVERSITY OF TORONTO

#### SIGNIFICANCE AND OBJECTIVES

Electromigration (EM) in a power grid is a significant reliability concern. However, characterizing power grid EM behavior is non-trivial since an EM experiment normally requires accurate die temperature control and monitoring of local voltages. In this work, we analyzed early EM silicon data from a circuit-based test vehicle.

#### **TECHNICAL APPROACH**

EM test chips are designed to collect IR drop trends under EM stress and analyze time-to-failure (TTF) statistics under different conditions. Experiments are done with multi-threaded automated test software that our group has developed to perform accurate and stable testing on four power grid structures under multiple temperatures and current stress conditions.

#### SUMMARY OF RESULTS

Fig. 1 shows an overview of the EM test vehicle (left) and the die photo of the four DUTs (right). The voltage scanning circuit can continuously measure the 1024 internal voltages, which allows IR drop changes to be monitored over time. Leveraging this feature, we have collected IR drop data for more than 60 DUTs. Since we have stable test flows that can guarantee the DUT temperatures, we are focusing on collecting failure data under various stress conditions. One approach is to plot IR drops over time, as illustrated in Fig. 2. For example, Fig. 2 (bottom) shows that EM gradually worsens the voltage delivery to each cell. Using this IR drop data, we can define and analyze power grid failure on four different DUT structures, which will potentially help to calibrate EM simulator models.



Figure 1. (Left) Proposed 28nm EM test-vehicle including quasi power-grid, on-chip heater, and voltage scanning circuits. (Right) Full-chip layout with four different DUT structures.

Also, by changing the temperature and the stress current/voltage (Table 1), such voltage traces and TTF data will help validate the impact of power grid redundancy on EM lifetime as well as provide various scenarios that can help understand the EM physical models.



Figure 2. (Top) Fresh state IR drop profile. (Bottom) IR drop after EM stress.

Table 1. Time to failure data under different stress conditions. The TTF failure criterion is a 10mV voltage shift from the fresh state.

| Stress mode | Grid | Temperature control | Average TTF (hour) | # of chips |
|-------------|------|---------------------|--------------------|------------|
| 100mA       | DUT1 | 400°C Heater        | 2.4                | 12         |
|             |      | 350°C Sensor        | 1.9                | 10         |
|             | DUT2 | 350°C Sensor        | 2.2                | 13         |
|             |      | 325°C Sensor        | 15.2               | 5          |
|             |      | 300°C Sensor        | 5.2                | 5          |
|             | DUT3 | 350°C Sensor        | 0.8                | 5          |
| 80mA        | DUT2 | 350°C Sensor        | 3.8                | 2          |
| 120mA       | DUT2 | 350°C Sensor        | 1.7                | 1          |
| 10mA        | DUT4 | 350°C Sensor        | 2.9                | 4          |
| 1.2V        | DUT2 | 350°C Sensor        | 3.0                | 6          |
|             | DUT3 | 350°C Sensor        | 1.5                | 4          |

**Keywords:** Power grid, IR noise, electromigration lifetime, silicon validation, physical design

#### INDUSTRY INTERACTIONS

Intel, NXP, Siemens, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Y. Yi, C. Zhou, A. Kteyan, V. Sukharev and C.H. Kim, "Studying the Impact of Temperature Gradient on Electromigration Lifetime Using a Power Grid Test Structure with On-Chip Heaters," International Reliability Physics Symposium (IRPS), 2023.

[2] A. Kteyan, et al, "Novel Methodology for Temperature-Aware Electromigration Assessment in On-chip Power Grid: Simulations and Experimental Validation," International Reliability Physics Symposium (IRPS), 2022.

[3] V. Sukharev, et al., "Experimental Validation of a Novel Methodology for Electromigration Assessment in On-chip Power Grids," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 2021.

# TASK 2810.092, BATTERY-CHARGING CMOS VOLTAGE REGULATOR FOR RESISTIVE LOW-VOLTAGE DC SOURCES GABRIEL A. RINCÓN-MORA, GEORGIA TECH UNIVERSITY, RINCON-MORA@GATECH.EDU

#### SIGNIFICANCE AND OBJECTIVES

Small DC sources are resistive, so the power they supply is limited. This research aims to explore how the maximum power point (MPP) shifts, how this MPP can be tracked, and how a CMOS voltage regulator can track this MPP, supply a load, and recharge a battery.

#### **TECHNICAL APPROACH**

The first objective is to determine how power losses in the system shift the MPP. Understanding this will reveal what should and should not be done when tracking the MPP. Designing and integrating a low-power controller with a power-efficient stage that switches an off-chip inductor is next. The system must track the MPP, regulate the output, draw battery assistance when input power is deficient, and recharge the battery when excess input power is available. The CMOS system should incorporate amplifiers, comparators, references, and other circuit blocks that require low power to operate and short (duty-cycled) periods to perform their functions.

#### SUMMARY OF RESULTS

Small thermoelectric generators (TEGs), glucose fuel cells (FCs), and supercapacitors are resistive. TEG and FC voltages are also usually low, between 50 and 500 mV, and sensitive to ambient operating conditions. The battery-charging voltage regulator in Fig. 1 should therefore track operating conditions so the system is always at the MPP.



Figure 1. Battery-charging CMOS voltage-regulator system.

Since the source is resistive, the system cannot afford to sacrifice much input power. This is why a switched inductor is used: it outputs most of the power it receives. Still, ohmic losses, which a combined resistance $R_{\rm H}$ models, and voltage losses, which a combined voltage $v_{\rm H}$ models, increase the resistance and decrease the voltage that supplies the load, which reduces output power.

The MPP therefore shifts with  $R_H$  and  $v_H$ . But when  $R_H$  and  $v_H$  are small fractions of the source resistance  $R_S$  and voltage  $v_S$ , this shift is subtle. So the loss that results in Fig. 2, when the system does not account for  $R_H$  and  $v_H$ , is less than 2.5% when  $R_H$  and  $v_H$  are less than 10% of  $R_S$  and  $v_S$ .



Figure 2. Maximum power-point loss.

Since $R_S$ is high for resistive sources and $v_H$ is typically low in switched-inductor power supplies, the MPP is sensitive to $R_S$ and $v_S$ and largely insensitive to $R_H$ and $v_H$. This means that the system can track the input's MPP instead of the output's, which is much less involved. In other words, output power $P_O$ is near its MPP when input power $P_{IN}$ maxes, which happens when input voltage $v_{IN}$ is half $v_S$.
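The half-voltage condition follows from the resistive source model. The sketch below is a minimal numerical check, assuming an ideal source $v_S$ behind $R_S$ and ignoring $R_H$ and $v_H$; the function names and the 300 mV / 100 Ω values are illustrative assumptions, not measurements from this task.

```python
def input_power(v_in, v_s, r_s):
    """Power drawn from a source (v_s behind r_s) whose terminal is held at v_in."""
    i_in = (v_s - v_in) / r_s          # current through the source resistance
    return v_in * i_in


def sweep_mpp(v_s, r_s, steps=1000):
    """Brute-force sweep of v_in from 0 to v_s; returns (v_in, p_in) at the maximum."""
    candidates = [v_s * k / steps for k in range(steps + 1)]
    best_v = max(candidates, key=lambda v: input_power(v, v_s, r_s))
    return best_v, input_power(best_v, v_s, r_s)


# Hypothetical example values: 300 mV source with 100 ohm of source resistance.
V_S, R_S = 0.3, 100.0
v_mpp, p_mpp = sweep_mpp(V_S, R_S)
i_mpp = (V_S - v_mpp) / R_S

print(f"v_IN at MPP = {v_mpp * 1e3:.1f} mV, v_S/2 = {V_S / 2 * 1e3:.1f} mV")
print(f"i_IN at MPP = {i_mpp * 1e3:.2f} mA, v_S/(2 R_S) = {V_S / (2 * R_S) * 1e3:.2f} mA")
print(f"P_IN at MPP = {p_mpp * 1e6:.1f} uW, v_S^2/(4 R_S) = {V_S**2 / (4 * R_S) * 1e6:.1f} uW")
```

The sweep lands on $v_{IN} = v_S/2$, where the drawn current is $v_S/2R_S$ and the input power is $v_S^2/4R_S$. A real tracker would also have to follow changes in $v_S$ and $R_S$, which this static sweep does not model.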

With this conclusion drawn, the goal for next year is to design the CMOS power stage and develop the CMOS controller that switches an off-chip inductor. No more than one inductor should be used to keep the system compact. The CMOS system should consume very little power to ensure $P_O$ is close to maximum when $P_{IN}$ maxes.

Since the effects of $R_H$ and $v_H$ on the MPP are minimal, the system should draw the current that keeps $v_{IN}$ near half $v_S$; this current is close to $v_S/2R_S$. The CMOS system should be optimized to draw this current level, and this optimum should shift and track the MPP when $v_S$ and $R_S$ change.

**Keywords:** MPP tracker, charger, regulator, harvester

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

The primary anticipated result is to develop compact and low-power ASR hardware for intelligent IoT devices. Our preliminary studies suggest that the proposed bio-inspired neural network model and low-power digital in-memory-computing hardware have a strong potential to improve power efficiency while achieving high inference accuracy.

#### **TECHNICAL APPROACH**

We will conduct the planned research as follows. (1) We will design a *compact ASR model based on advanced and diverse neuron models*. (2) We will develop a *digital IMC-based tinyASR accelerator* featuring an aggressive time-sharing architecture to improve area efficiency and adopt bit-parallel input approximate hardware to improve throughput and reduce approximation error. (3) We will create the model and hardware to perform post-deployment partial retraining to improve robustness against environmental noise and sensor hardware wearout. By combining the developed architecture and techniques, we will prototype one test chip for the tinyASR accelerator and another to upgrade the ASR model with the post-deployment partial retraining capability.

#### SUMMARY OF RESULTS

This year, we designed, prototyped, and tested ultra-low-power automatic speech recognition (ASR) hardware, and our work is published in [1]. The key task was to design and tape out the new chip, ultra-low-power ASR hardware based on a bio-inspired neuron model and digital compute-in-memory (DCIM) hardware.

On-chip ASR is a very attractive function for battery-powered edge devices as it can largely reduce the data upload bandwidth. It also improves latency, security, and privacy and can enable many new applications such as speech-based anomaly detection and surveillance and voice-based user interfaces in a tiny device.

Recently, several on-chip ASR chips have been prototyped. They typically employ recurrent neural networks (RNNs) such as LSTM to detect multiple phonemes and construct a word from those phonemes. We have tested various RNN models on the TIMIT dataset, and the experiment shows that our proposed model based on bio-inspired neurons achieves the smallest parameter size and the lowest phoneme error rate (PER) (Fig. 1).

In our ultra-low-power ASR chip, we designed DCIM hardware featuring time-sharing arithmetic hardware for better area and energy efficiency. We also designed the neuron core with 16 processing elements and three register files to emulate 256 neurons in a layer. To achieve fully pipelined computation between the DCIM hardware and the neuron core, we explored two time-sharing architectures for the DCIM hardware (Fig. 2). Our proposed design can process a feature arriving every 16 ms at a clock frequency of 8.5 kHz. This low clock frequency enables deep supply voltage scaling and achieves state-of-the-art power consumption.
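The numbers quoted above imply a fixed cycle budget per feature frame. The short sketch below only works out that arithmetic from the stated figures; how the chip actually schedules its layers within the budget is not described here, and the variable names are illustrative.

```python
FRAME_PERIOD_S = 16e-3    # a new feature arrives every 16 ms (from the text)
CLOCK_HZ = 8.5e3          # clock frequency (from the text)
NEURONS_PER_LAYER = 256   # neurons emulated per layer (from the text)
NUM_PES = 16              # processing elements in the neuron core (from the text)

# Clock cycles available before the next feature arrives.
cycles_per_frame = round(FRAME_PERIOD_S * CLOCK_HZ)

# Neurons that each processing element must emulate through time-sharing.
neurons_per_pe = NEURONS_PER_LAYER // NUM_PES

print(f"cycle budget per frame: {cycles_per_frame}")
print(f"neurons time-shared per PE: {neurons_per_pe}")
```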



Figure 1. Parameter count and PER comparison of the proposed bio-inspired model and conventional RNN models.



Figure 2. DCIM with Row-based (left) and column-based (right) time-sharing architecture. Row-based architecture completes the computation of 16 partial products (PP) in 8 cycles and allows neuron core to consume the data immediately.

**Keywords:** low-power automatic speech recognition, bio-inspired neuron model, digital compute-in-memory, in-memory-computing

#### INDUSTRY INTERACTIONS

Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] D. Wang, et al., "microASR: 32-μW Real-Time Automatic Speech Recognition...," ESSCIRC, 2023.

CHRIS KIM, UNIVERSITY OF MINNESOTA, CHRISKIM@UMN.EDU

#### SIGNIFICANCE AND OBJECTIVES

We have been focusing on extracting the performance potential of CPUs at cryogenic temperatures (i.e., 77 K). Low temperature provides a steeper subthreshold slope, higher current, and lower interconnect resistance. All these benefits are independent of technology scaling, making cryogenic operation a good candidate for performance improvement in the post-Moore's-law era.

#### **TECHNICAL APPROACH**

We plan to utilize the self-heating effect during normal CPU operation to boost performance at cryogenic temperature (i.e., 77 K). At room temperature, this heat must be removed from the CPU to maintain a functional operating temperature. At cryogenic temperatures, however, the same heat could be beneficial: it can elevate the junction temperature and thereby reduce the threshold voltage for even better performance. A ring-oscillator-based structure is proposed to mimic real CPU self-heating behavior. Post-layout simulations were performed, and we are making progress toward designing a test chip based on this concept.

#### SUMMARY OF RESULTS

Fig. 1 summarizes the results from the past year. The top figure shows the layout dimensions of a previously taped-out chip in a 28-nm CMOS process. We used that chip to measure actual power consumption at room temperature. The highest-voltage, highest-frequency configuration generates a power density of 3.54 W/mm<sup>2</sup> under full load. This suggests that this RISC-V core is too simple to generate enough self-heating power to elevate itself out of the cryogenic temperature region. Additionally, power simulation was performed on a custom taped-out RISC-V CPU core after synthesis and PnR. The simulations confirm that the generated power is not enough for our purpose. The middle figure shows a proposed ring-oscillator (ROSC) structure with three inverter stages and an enable signal, together with its post-layout power simulation. The power density of the block is 97.6 W/mm<sup>2</sup>. Because the ROSC has a very high activity factor, we propose to use it to mimic the self-heating behavior of an actual CPU. The bottom figure shows the proposed power control methodology. To stabilize the junction temperature at a desired target, a fully automated negative-feedback control system will be incorporated. The power level will be set with two levels of adjustment: coarse control by changing the number of activated ROSC cells in a given unit area, and fine control by directly adjusting the supply voltage of the given ROSC array.



Figure 1. (Top) A screenshot of the RISC-V core layout from our previous taped-out chip, and its power measurement results for three voltage configurations and three frequency configurations. (Middle) A proposed ring-oscillator structure and its post-layout power simulation result. (Bottom) Power control methodology with fine (supply voltage) and coarse (number of activated cells) control.

**Keywords:** Cryogenic, ring-oscillator array, CPU performance improvement, on-chip-heater, self-heating-effect

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

There is a large gap between the data rate, interference rejection, and sensitivity of ultra-low power (ULP) receivers, and the radio requirements to meet today's wireless standards. This project will demonstrate new ULP receivers that support higher-order modulation, and higher data rate, at ULP levels.

#### **TECHNICAL APPROACH**

We are evaluating new ULP receiver architectures and signaling that are co-designed to achieve higher-order modulation index, and thus higher data rates. We are initially focusing on QAM and OFDM signaling since these are the most common in wireless standards. We are also initially focusing on energy-detection receivers because these result in the lowest power. After selecting and simulating an architecture and corresponding detected signal energy, we will fabricate a test chip to verify the techniques.

#### SUMMARY OF RESULTS

We have evaluated publications of ULP receivers since 2005 and found that only 9 out of 214 (4%) published ULP receivers support QAM or OFDM. All others are designed for simpler modulation, and 72% of them are either OOK or FSK. As a result, the plot in Fig. 1 highlights that no reported ULP receiver supports data rates above 10 Mb/s at power below 1 mW.



Figure 1. Simulation results of the final output of the energy detector after receiving an OFDM input with a pilot tone.

Most receivers with power consumption below 1 mW, and all below 100 μW, use an energy detector (ED) in the receive path to down-convert the received signal to baseband. This is the lowest-power method known for down-conversion. Conventionally, phase and frequency information is lost through an ED, and only amplitude information is left. This is why OOK, and FSK using two ED paths with offset bandpass filters, are the most common architectures for ULP receivers.

During this reporting period, we have discovered that phase information can be preserved through an ED, provided there are multiple tones in the received signal. An ED can be modeled as a squaring block, and the intermodulation of tones will be preserved through the ED. By introducing a pilot tone, and transmitting that in addition to an OFDM or QAM signal, we can recover the OFDM in an ED receiver. A block diagram of this receiver is shown in Fig. 2.
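The squaring-block argument can be checked numerically. The sketch below is a minimal illustration, assuming an ideal squaring detector, one pilot tone, and a single data subcarrier; the sample rate, tone frequencies, amplitudes, and function names are hypothetical and are not taken from the simulations reported here.

```python
import cmath
import math

FS = 1000.0                 # sample rate in Hz (hypothetical)
N = 1000                    # 1 s of samples, so every tone sits on an integer bin
F_PILOT = 200.0             # pilot tone frequency (hypothetical)
F_DATA = 230.0              # one data subcarrier (hypothetical)
PHASE_DATA = math.pi / 3    # phase carried by the subcarrier


def received(n):
    """Pilot tone plus one phase-modulated subcarrier at sample index n."""
    t = n / FS
    pilot = math.cos(2 * math.pi * F_PILOT * t)
    data = 0.5 * math.cos(2 * math.pi * F_DATA * t + PHASE_DATA)
    return pilot + data


# Energy detector modeled as an ideal squaring block.
ed_out = [received(n) ** 2 for n in range(N)]


def tone(samples, freq):
    """Single-bin DFT: complex amplitude of the component at `freq`."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, x in enumerate(samples))
    return 2 * acc / len(samples)


# Squaring creates an intermodulation product at the pilot/subcarrier spacing.
beat = tone(ed_out, F_DATA - F_PILOT)
print(f"beat amplitude = {abs(beat):.3f}")
print(f"beat phase = {cmath.phase(beat):.4f} rad, transmitted = {PHASE_DATA:.4f} rad")
```

The product term of the square places a tone at the pilot-to-subcarrier spacing whose phase equals the subcarrier phase relative to the pilot, which is why the pilot lets the phase survive the detector. Without the pilot, the square of a single tone contains only DC and a double-frequency term, and the phase reference is lost.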



Figure 2. Preliminary architecture for an OFDM receiver with an energy-detector for down-conversion.

We have performed simulations of this architecture receiving an OFDM signal along with a pilot tone. The OFDM signal has 12 subcarriers, each subcarrier modulated with QAM. As shown in Fig. 3, we can see the 12 subcarriers after the ED (marked *Data lower sideband* in the figure). We have also simulated a higher-order QAM signal, with a pilot tone, and shown information is preserved. These results are promising for achieving much higher modulation indexes, and thus higher data rates in ED-receivers.



Figure 3. Simulation results of the final output of the energy detector after receiving an OFDM input with a pilot tone.

In the next reporting period, we will report on the design and simulation of a test chip in TSMC 65LP to demonstrate this new signaling.

**Keywords:** ULP, wakeup receivers, OFDM, QAM

#### INDUSTRY INTERACTIONS

Intel, MediaTek, NXP, Texas Instruments

## TASK 3160.016, MODO: HYBRID SIMO-DLDO DC-DC CONVERTER FOR MULTI-CORE MICROPROCESSORS AND SYSTEM-ON-CHIPS MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

#### SIGNIFICANCE AND OBJECTIVES

We aim to develop a hybrid SIMO-DLDO (MODO) power management architecture that achieves high power conversion efficiency (PCE) by implementing event-driven control to improve dynamic load regulation performance, incorporating dynamic optimal PCE tracking based on load current estimates, and investigating the HV support feasibility.

#### **TECHNICAL APPROACH**

We will conduct the planned research as follows. (1) We will design the first MODO chip using an event-driven technique, focusing on improving the transient response performance. (2) We will tape out the second MODO chip, focusing on dynamic PCE optimization and HV support.

#### SUMMARY OF RESULTS

We are pursuing two tasks. First, we propose analytical models of the maximum load current of DLDOs employing feedback and feedforward control laws. The developed models shed light on the impact of various technology and design parameters on the maximum load current of a DLDO, and DLDO circuit designers can use them to navigate the design space quickly (Figs. 1 and 2). The following equation gives the maximum load current of a DLDO that implements integral and feedforward control.

$$U_{max} = 2^{N_f} \sqrt{(h_f K_I \bar{I_u} (2^{N_f} - 1))^2 + 2\Delta V C_{out} \bar{I_u} f_{clk} K_I} - h_f K_I \bar{I_u} (2^{N_f} - 1) 2^{N_f} + \frac{1}{1\%} \cdot K_I I_u (t_1) \tag{25}$$

Second, we designed and taped out a DLDO featuring the load-dependent exponential control law (Fig. 3). Most existing DLDOs employ a constant unit current for the power FETs across load current levels. The proposed load-dependent exponential control law modulates the unit current of the power FETs exponentially based on the level of the load current. In addition, we incorporated non-linear feedforward control and an asynchronous-switching dual-clock scheme to provide a fast and accurate transient response. The prototype achieves 1.94-pC-FoM and 2478X dynamic load range with the worst ripple voltage of $(2\% V_{out})$.



Figure 1. (a) Impact of $C_{out}$ on $\Delta I$ across different unit current ($I_u$) values. (b) Impact of $f_{clk}$ on $\Delta I$ across three different $C_{out}$ values.



Figure 2. Model prediction of (a) $I_u$ - $\Delta I$ with different $C_{out}$ and (b) $N_f$ - $\Delta I$ with different $h_f$.



Figure 3. Proposed DLDO featuring the load-dependent exponential control law.

**Keywords:** digital LDO, feedforward control, LUT, transient response, dynamic load range

#### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

Sub-rate serial link transceivers address bandwidth limitations but require routing a high-frequency clock signal exceeding 14 GHz. Our objective is to overcome this drawback by designing a fractional frequency multiplier and multi-phase generator that can achieve low jitter (< 100 fs<sub>rms</sub>) as well as tight phase matching (< 200 fs<sub>pk-to-pk</sub>).

#### **TECHNICAL APPROACH**

We introduce an innovative method for achieving phase-locking using high-gain sampling detectors and low-noise/spur ring oscillators (RO) accompanied by a supply regulation mechanism. This approach ensures reliable operation at frequencies exceeding 14 GHz, even under varying process, voltage, and temperature (PVT) conditions. The proposed PLL incorporates a wide bandwidth for suppressing RO noise and employs precise digital quantization error cancellation techniques to achieve exceptional jitter performance. Additionally, the PLL offers accurate eight-phase output through phase-spacing error correction methods, enhancing its overall performance.

#### SUMMARY OF RESULTS

Fig. 1 illustrates a simplified block diagram of the proposed type-III integer-N multi-phase generator. The key components include a high-gain sampling phase detector (PD), a loop filter, a low noise voltage-controlled ring oscillator (VCO), and a feedback frequency divider. This design offers several advantages. The high-gain PD ensures a remarkably low in-band noise level (< -135 dBc) and extends the PLL bandwidth to effectively suppress the noise generated by the ring oscillator (RO).



Figure 1. Proposed phase-locked loop.

The proportional path within the system introduces a zero to enhance the stability of the loop. Simultaneously, the integral path biases the PD with a predetermined voltage, $V_{REF1}$, significantly improving the reference spur and deterministic jitter performance. To compensate for significant variations in the VCO output frequency caused by process, voltage, and temperature (PVT) effects, the double-integral path biases the output of the first integrator at $V_{REF2}$.

We have developed a behavioral model for the proposed Phase-Locked Loop (PLL) to assess its noise characteristics. Initially, we conducted transistor-level design for various components, such as the sampling PD and the VCO. This allowed us to obtain Power Spectral Densities (PSDs) for different noise sources. Subsequently, these PSD values were integrated into the behavioral model to evaluate the PLL's phase noise performance. The outcomes of these simulations are depicted in Fig. 2.

The results indicate that the proposed PLL has the potential to achieve jitter below 100 fs at 14 GHz output frequency, even when utilizing a ring VCO. However, a low-noise, high-frequency clock with a frequency of approximately 500 MHz is required to achieve such performance. We believe this clock signal is readily accessible in a high-speed serializer-deserializer system.
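For readers relating a phase-noise plot to a jitter figure, the standard conversion integrates the single-sideband phase noise over offset frequency and scales by the carrier frequency. The sketch below is a minimal illustration of that conversion only; the flat -130 dBc/Hz profile and the 10 kHz to 100 MHz integration band are hypothetical placeholders, not the simulated profile of Fig. 2.

```python
import math


def rms_jitter(offsets_hz, noise_dbc_hz, f_out_hz):
    """Integrate SSB phase noise L(f) in dBc/Hz over offset frequency with the
    trapezoidal rule and convert the rms phase error to rms jitter in seconds."""
    linear = [10 ** (level / 10) for level in noise_dbc_hz]
    area = 0.0
    for k in range(len(offsets_hz) - 1):
        df = offsets_hz[k + 1] - offsets_hz[k]
        area += 0.5 * (linear[k] + linear[k + 1]) * df
    phase_rms_rad = math.sqrt(2 * area)       # factor 2 accounts for both sidebands
    return phase_rms_rad / (2 * math.pi * f_out_hz)


# Hypothetical flat profile between 10 kHz and 100 MHz offset at a 14-GHz output.
offsets = [1e4, 1e8]
noise = [-130.0, -130.0]
jitter = rms_jitter(offsets, noise, 14e9)
print(f"integrated rms jitter = {jitter * 1e15:.1f} fs")
```

A real profile would be sampled at many offsets; with widely spaced points the linear trapezoidal rule becomes coarse, so denser sampling or log-domain interpolation would be the more careful choice.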



Figure 2. Simulated PLL phase noise plot at 14-GHz output frequency.

**Keywords:** High-speed serial links, low-jitter ring-based sampling phase-locked loops, multi-phase generators

#### INDUSTRY INTERACTIONS

Intel

Synthesis, Auto-Place, and Route (SAPnR) dominate SoC methodology, allowing designers to focus on optimizations. However, the automated generation of IVR domains, with VRs and FLL/PLLs, is critically missing. We seek to design a *domain compiler* – an automated framework to produce a comprehensive domain, including FLL/PLLs and VRs.

#### **TECHNICAL APPROACH**

The effort is organized into two thrusts. **Thrust 1**, Synthesizable UniCaP: We will build upon our V<sub>dd</sub>-droop-tolerant and fast-response UniCaP-2 construction (Fig. 1(b)) to explore and develop a framework that automates the construction of robust, larger, all-digital domains. User-provided constraints (Fig. 1(a)) are used to develop a unified system. **Thrust 2**, Autonomous, all-digital run-time VR loop-gain tuning: We will ensure optimal transient response across PVT conditions, thereby overcoming the problem of poor performance due to margining for worst-case PVT conditions. In the context of UniCaP, improved VR response minimizes performance loss from FIFO saturation and margins due to memory V<sub>min</sub> constraints.

#### SUMMARY OF RESULTS



Figure 1. (a) Overview of proposed Domain Compiler (b) Simplified schematic of the proposed architecture consisting of integrated LDO/PLL modules in addition to the load domain.

The focus of our effort in Year 1 is the design of an autonomous loop-gain tracker to enable auto-tuning of the LDO loop gain. Progress on this front has been on track. Fig. 2 shows the schematic of the proposed replica-based I<sub>LSB</sub> sensing architecture. The technique relies on copying the drain and source terminal voltages of the LDO header to a replica device. A digital comparator plus charge-pump construction provides the high loop gain needed to minimize tracking error. The current flowing through the replica is reflected into a current-controlled oscillator (CCO), which integrates the load current into a count value for inference of loop gain. Mismatch errors between mirror devices and within the comparator are addressed using chopping and 2-pass operation, respectively.



Figure 2. Block diagram of the proposed loop-gain detector. The approach relies on the observation that loop-gain variation is dominated by process, $V_{in}$, $V_{out}$, and temperature changes in $I_{LSB}$. The detector uses the sensed $I_{LSB}$ to approximate the total LDO loop gain and adjusts the $K_I$ and $K_P$ parameters for stable, rapid LDO response.



Figure 3. Simulation waveforms showing the 2-pass approach to overcome offset errors in the comparator. The input terminals of the comparator are swapped before the second measurement pass, so $V_{dd,replica}$ (blue) settles on either side of the actual $V_{dd}$ (red), displaced by the comparator offset. The average of the two measurements delivers $V_{dd}$ with adequate precision.

Fig. 3 shows simulation results of our proposed sensing architecture. Monte-Carlo simulations have demonstrated an ability to track the actual $I_{LSB}$ to within 3% (3$\sigma$), adequate for our application. Our upcoming effort will focus on functional classification of the LDO header, TRO and TDC circuits.
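The 2-pass idea can be summarized with a small idealized model. This sketch assumes a static comparator offset and a loop with enough gain that the replica node settles exactly one offset away from the true value; the function names and the 0.8 V / 12 mV values are hypothetical and do not come from the reported simulations.

```python
def replica_settles_to(v_dd, offset, swapped):
    """Idealized high-gain loop: the replica node settles where the comparator
    sees zero difference, one offset away from v_dd. Swapping the comparator
    inputs flips the sign of the error."""
    return v_dd - offset if swapped else v_dd + offset


def two_pass_estimate(v_dd, offset):
    """Average the two passes so a static offset cancels."""
    first = replica_settles_to(v_dd, offset, swapped=False)
    second = replica_settles_to(v_dd, offset, swapped=True)
    return 0.5 * (first + second)


V_DD = 0.80      # hypothetical supply voltage in volts
OFFSET = 0.012   # hypothetical 12 mV comparator offset

single_pass = replica_settles_to(V_DD, OFFSET, swapped=False)
averaged = two_pass_estimate(V_DD, OFFSET)

print(f"single-pass error = {(single_pass - V_DD) * 1e3:.1f} mV")
print(f"two-pass error    = {(averaged - V_DD) * 1e3:.3f} mV")
```

The cancellation in this model is exact only because the offset is the same in both passes; noise or offset drift between passes would leave a residual error.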

**Keywords:** Model-predictive control, Voltage Regulation

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

Improving transient voltage regulator (VR) performance improves SoC energy efficiency. However, digital VRs, in particular, are fundamentally limited when reacting to load current ($i_{load}$) transients. Meanwhile, the effort to predict current or upcoming load transients as part of VR action has been limited.

#### **TECHNICAL APPROACH**

This task involves two thrusts. **Thrust 1**, stable V<sub>dd</sub> control with data-driven digital  $i_{load}$  sensing and frequency throttling: For rapid LDO response, transitions in key system state bits will be used to infer  $i_{load}$  in the present cycle (Fig. 1(b)). In addition to estimating  $i_{load}$  as a leading indicator of V<sub>dd</sub>, the proposed VR controls both LDO current *supply* ( $i_{LDO}$ ) and  $i_{load}$  *demand* through careful pre-emptive throttling of the clock frequency. **Thrust 2**: We will explore stable control of transient-driven LDO operation, and how  $i_{load}$  sensing accuracy properties will determine the overall effectiveness of the combined VR-load system.

#### SUMMARY OF RESULTS



Figure 1. (a) The proposed VR-domain co-design effort uses  $i_{load}$  estimation obtained from the load to not only provide a feed-forward signal for LDO operation but also allow cycle skipping to pre-emptively reduce load current to become in line with current delivery capabilities. (b) Simulated performance of the proposed architecture, allowing the feed-forward LDO to act early to minimize voltage droop.

This effort is currently in its first year. Our focus has been on evaluating the potential upside of feed-forward control on LDOs to provide stable, rapid transient response, and of proactive, controlled cycle skipping to avoid adaptive clocking events, which are accompanied by cycle loss due to periods of low-frequency operation. Note that the proposed architecture is not a replacement for adaptive clocking, because $i_{load}$ estimation cannot be relied upon to capture every load transient accurately. Adaptive clocking provides a critical backstop against mis-estimations of load current, while VR-domain co-design minimizes the likelihood of performance-degrading droop events. We will quantify the merits of each of these coupled threads independently.

Although we anticipate control state to be highly indicative of load-current draw in sophisticated processors, load current in easy-to-build low-complexity processors is relatively steady. This fact necessitates the use of a load emulator, which will provide cycle-accurate load current variation that mirrors that of a sophisticated processor (obtained from our industry liaisons).



Figure 2. Block diagram of the proposed load emulator. Low-complexity processors do not exhibit sufficient state-dependent power dissipation. We will implement an emulator that matches the load current draw of sophisticated processors.

**Keywords:** Feed-forward, Current-estimation

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Z. Xie, X. Xu, M. Walker, J. Knebel, K. Palaniswamy, N. Hebert, J. Hu, H. Yang, Y. Chen, and S. Das, "APOLLO: An Automated Power Modeling Framework for Runtime Power Introspection in High-Volume Commercial Microprocessors," in MICRO-54, Oct. 2021, pp. 1–14.

Charging and energizing small devices from portable USB sources is pervasive nowadays. Volume, power efficiency, and response time are critical in this space. The objective of this research is to develop a compact, efficient, single-inductor voltage regulator that charges and monitors a battery while supplying uninterrupted power to the load.

#### **TECHNICAL APPROACH**

The first phase is to design a CMOS power supply that switches one inductor so it charges a battery while supplying the load. The system is compact and efficient because one inductor transfers all the power. The system is also fast because the inductor always drains into the load, even while charging the battery. The second phase is to develop the CMOS controller that switches the inductor. This controller should stabilize the feedback loop so the response time of the system is short. The last phase is to explore electronic markers that reflect the health and state of the battery.

#### SUMMARY OF RESULTS

Power supplies that charge and supply a load from USB sources are bulky, inefficient, or slow. Efficient solutions normally use more than one off-chip inductor. Compact supplies are often inefficient because they charge the battery without an inductor. Compact and efficient supplies in literature time-multiplex one inductor into the battery and load, so the load does not always receive power.

The proposed solution in Fig. 1 switches one inductor  $L_X$ , so the system is compact.  $L_X$  charges the battery  $v_B$  and supplies the load, so the circuit is also efficient. And  $L_X$  always drains into the output  $v_0$ , so the system can respond quickly to load variations (and "dumps").



Figure 1. Battery-charging/monitoring voltage regulator.

The key to this technology is connecting  $v_B$  in series with the load. This way,  $L_X$  charges or drains  $v_B$  while feeding  $v_0$ . So the power that  $L_X$  supplies is always available to  $v_0$ .

MOSFET switches in Fig. 2 can energize  $L_X$  into ground or  $v_0$  with or without  $v_B$  in series. Switches can similarly drain  $L_X$  into  $v_0$  with or without  $v_B$ . When the battery is not full, the system drains  $L_X$  into  $v_0$  through  $v_B$ , so  $v_B$  charges. When fully charged,  $L_X$  drains into  $v_0$  directly (without  $v_B$ ).



Figure 2. Single-inductor battery-charging voltage regulator.

When a USB source is not available, the input  $v_l$  floats. In this case, the MOSFETs energize and drain from  $v_B$  into  $v_0$  (without using  $v_l$ ). This way,  $v_B$  supplies the load.
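The operating modes described above amount to a small decision table. The sketch below restates them in code for clarity; the enum, the function name, and the two boolean inputs are hypothetical simplifications and say nothing about how the actual CMOS controller senses or sequences these conditions.

```python
from enum import Enum


class Path(Enum):
    THROUGH_BATTERY = "inductor drains into the output through the battery (battery charges)"
    DIRECT = "inductor drains into the output directly (battery bypassed)"
    FROM_BATTERY = "inductor energizes and drains from the battery into the output"


def select_path(usb_present: bool, battery_full: bool) -> Path:
    """Pick the inductor path for the three cases described in the text."""
    if not usb_present:
        # Input floats: the battery alone supplies the load.
        return Path.FROM_BATTERY
    if battery_full:
        # Nothing left to charge: feed the output without the battery in series.
        return Path.DIRECT
    # Battery not full: the series connection charges it while feeding the output.
    return Path.THROUGH_BATTERY


for usb, full in [(True, False), (True, True), (False, False)]:
    print(f"usb={usb}, battery_full={full}: {select_path(usb, full).value}")
```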

A provisional patent application was submitted for the CMOS power stage, and the non-provisional application is currently being prepared. A CMOS prototype is currently being designed.

The next phase in this research is to design the CMOS controller that switches the power stage. It must keep the feedback loop stable across all the operating conditions. Amplifiers, comparators, drivers, and other circuit blocks in the loop should respond quickly without consuming too much power. The voltage reference and other circuit blocks should similarly dissipate little energy.

**Keywords:** Voltage regulator, charger, compact, efficient

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

## **Fundamental Analog Thrust**



| Category                            | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|-------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fundamental<br>Analog<br>(Circuits) | A new hybrid ADC architecture combines a voltage-controlled oscillator (VCO)-based continuous-time delta-sigma modulator (DSM) with a noise-shaping successive approximation register (SAR) quantizer. The key innovation is an anti-aliasing filter (AAF) that connects the VCO front-end with the NS-SAR quantizer, allowing direct sampling of time-domain information as voltage-domain information. A 28-nm CMOS prototype achieves an 84.2-dB signal-to-noise-distortion ratio and an 86.8-dB dynamic range within a 1-MHz bandwidth while consuming 1.62mW at 100MS/s. The Schreier SNDR figure of merit is 172.1 dB. (2810.033, M. Flynn, University of Michigan)                                                            |
| Fundamental<br>Analog<br>(Circuits) | The first temperature- and aging-compensated RC Oscillator (TACO) maintains long-<br>term stability by periodically synchronizing its frequency with a less-aged reference<br>oscillator. To enhance its stability, TACO incorporates resistors with higher<br>activation energy ( $E_a$ ), employs switched dual RC branches to reduce stress induced<br>by DC currents, and applies duty cycling to slow down the aging of the reference<br>oscillator. A prototype 100-MHz oscillator built using a 65-nm CMOS process<br>achieves an inaccuracy of ±1030 ppm over a wide temperature range (-40°C to 85°C)<br>after 500 hours of accelerated aging at 125°C. (2810.036, P. Hanumolu, University<br>of Illinois Urbana-Champaign) |
| Fundamental<br>Analog<br>(Circuits) | A new 50Gb/s multi-carrier transmitter (TX) has been developed, utilizing carrier orthogonality to allow band overlap. It features three 5GS/s bands with BB PAM4, MB, and HB 16-state complex modulation on carriers at 5 and 10GHz. The TX, fabricated in a 22-nm FinFET process, operates at 50Gb/s by activating these three bands simultaneously. The system's performance, evaluated through histogram and constellation results obtained via oscilloscope measurements, demonstrates BER<10 <sup>-4</sup> over a channel with 5-dB loss at 12.5GHz, thanks to TX FIR equalization and 4-tap ICI cancellation. (2810.062, S. Palermo, Texas A&M University) |






## TASK 2810.019, DESIGN AUTOMATION FOR COVERAGE MANAGEMENT IN ANALOG AND MIXED-SIGNAL SOCS PALLAB DASGUPTA, INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR, PALLAB@CSE.IITKGP.AC.IN ARITRA HAZRA, INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR

#### SIGNIFICANCE AND OBJECTIVES

The objective of this task is to design and implement CAD support for coverage management of AMS artifacts in mixed-signal SoCs. Specific objectives include designing metrics for analog coverage, instrumenting analog coverage collection on standard simulation environments, and reasoning over coverage from IP level to SoC level.

#### **TECHNICAL APPROACH**

Considering the large number of corners and the simulation cost of covering all the corners of a large design, it is desirable to identify a subset of the corners that can potentially expose corner case bugs. In an integrated analog coverage management framework, this choice may be influenced by those corners that take one or more component analog IPs close to their individual specification boundaries. We propose a novel methodology for selecting simulation runs from a set of statistical simulations at the IP level. We show that the selected set of simulation corners offers increased coverage at the module level.

#### SUMMARY OF RESULTS

Our methodology has been verified on an industrial LDO from Texas Instruments. Figure 1 shows the steps in our proposed verification flow. Table 1 shows the impact of the proposed approach in increasing the coverage of two signals at module-level simulations of the LDO.



Figure 1. A stepwise representation of the proposed flow.

As seen from the results, the ranges of the signals ldo\_1v8 and ldo\_3v0 across the 1300 IP-level simulations are [1.7205:1.8715] and [2.3985:3.0625], respectively. On analyzing the coverage from these IP-level simulations, our methodology identified 2 simulation corners (one for each extremum) as the best candidates for module-level simulations. Two simulations at these chosen corners gave ranges of [1.73:1.86] for ldo\_1v8 and [2.4:3.016] for ldo\_3v0. These ranges are wider than those obtained from the baseline module-level simulations (nominal, weak, strong, skewn, skewp), establishing the efficacy of our proposed methodology.

| Category | Signal | Min value | Max value |
|----------|--------|-----------|-----------|
| IP-level simulation | | | |
| IP-level Monte Carlo of 1300 simulations | ldo_1v8 | 1.7205 | 1.8715 |
| | ldo_3v0 | 2.3985 | 3.0625 |
| Module-level simulation | | | |
| Baseline simulations: nominal, weak, strong, skewn, skewp process corners | ldo_1v8 | 1.79 | 1.81 |
| | ldo_3v0 | 2.4 | 3.001 |
| 2 Monte Carlo corners identified by proposed methodology | ldo_1v8 | 1.73 | 1.86 |
| | ldo_3v0 | 2.4 | 3.016 |

Table 1. Hierarchical coverage analysis of LDO.
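The comparison in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `range_coverage` metric (fraction of the IP-level range spanned at module level) and the `pick_extreme_runs` helper are hypothetical names introduced here and are not the task's actual coverage metrics or selection algorithm. The numeric ranges are taken from Table 1.

```python
def range_coverage(observed, reference):
    """Fraction of the reference (min, max) range spanned by the observed range.

    Hypothetical metric for illustration; the observed range is clipped
    to the reference range before the widths are compared.
    """
    obs_lo, obs_hi = observed
    ref_lo, ref_hi = reference
    lo = max(obs_lo, ref_lo)
    hi = min(obs_hi, ref_hi)
    return max(0.0, hi - lo) / (ref_hi - ref_lo)


def pick_extreme_runs(samples):
    """Return the indices of the runs with the smallest and largest value."""
    indices = range(len(samples))
    lo = min(indices, key=samples.__getitem__)
    hi = max(indices, key=samples.__getitem__)
    return lo, hi


# Ranges copied from Table 1.
ip_level = {"ldo_1v8": (1.7205, 1.8715), "ldo_3v0": (2.3985, 3.0625)}
baseline = {"ldo_1v8": (1.79, 1.81), "ldo_3v0": (2.4, 3.001)}
selected = {"ldo_1v8": (1.73, 1.86), "ldo_3v0": (2.4, 3.016)}

for signal, reference in ip_level.items():
    base = range_coverage(baseline[signal], reference)
    sel = range_coverage(selected[signal], reference)
    print(f"{signal}: baseline {base:.3f}, selected corners {sel:.3f}")
```

Under this assumed metric, the two selected corners span a larger share of the IP-level range than the five baseline corners for both signals, with the larger gain on ldo_1v8.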

**Keywords:** Coverage management, Analog and mixed-signal, System-on-chip, Glitch and level, Periodic signal

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] S. Sanyal et al., "Analog Coverage-driven Selection of Simulation Corners for AMS Integrated Circuits," April 2023.

[2] S. Sanyal et al., "Metrics for Identifying Glitches and Levels in Mixed-Signal Waveforms," US Patent Application no. 18071917, filed November 30, 2022.

[3] A. Chakraborty et al., "Tracking Coverage Artifacts for Periodic Signals using Sequence-Based Abstractions," US Patent Application no. 18076547, filed December 7, 2022.

## TASK 2810.028, ROBUST ATE MULTI-SITE HW DESIGN TO ENABLE EFFECTIVE ANALOG PERFORMANCE TESTING IN ANALOG-MIXED-SIGNAL (AMS) SOCS

#### DEGANG CHEN, IOWA STATE UNIVERSITY, DJCHEN@IASTATE.EDU

#### SIGNIFICANCE AND OBJECTIVES

Mission-critical applications dictate demanding testing, significantly impacting time-to-market and manufacturing costs. Massively parallel multi-site testing offers great improvements in throughput and test cost. This project will develop tools for automatically flagging excessive site-to-site (s2s) variations, identifying and correcting the ATE hardware issues that cause s2s variations, and enhancing the robustness and effectiveness of multi-site ATE hardware.

#### TECHNICAL APPROACH

Existing volume probe/final-test data and ATE hardware files will be used to develop statistical signal processing and machine learning algorithms to automatically process the volume data and flag the issue sites and specifications. These results will enable the identification of sensitive components/nets in the ATE hardware. Targeted extraction/simulation will be run and additional GRR (gauge repeatability and reproducibility) measurements will be taken, to identify hardware root causes. Such learning will lead to improved test board hardware design which is robust to site-to-site variations. A framework for pre-fabrication verification, post-fabrication evaluation and site calibration, and adaptive test flow strategies will achieve robust and effective tests.

#### SUMMARY OF RESULTS

Publications continued with the research results obtained from the prior year. Separate measurement data, in addition to production probe data, from Texas Instruments were also received. These were used to validate the accuracy of the proposed methods for hardware systematic error identification and correction. Fig. 1 compares reference measurement (1<sup>st</sup> bar), probe test data (2<sup>nd</sup> bar), and corrected data with 4 calibration methods (3<sup>rd</sup> to 6<sup>th</sup> bars) for one good site (site 6) and two issue sites (16, 24). Clearly, both good and issue sites have after-correction distributions very similar to the reference distribution.

A technology transfer package was delivered to Texas Instruments in the summer of 2022. The package included code implementations of all issue site identification algorithms and all hardware systematic error identification and correction algorithms, in both Python and MATLAB. In addition, a video recording was provided as a tutorial on how to install and run the code, preprocess the data, and interpret the generated results. A code-summary presentation was also included.

Figure 1. Comparison of reference (1<sup>st</sup> bar), probe data (2<sup>nd</sup> bar), and corrected data with 4 calibration methods (3<sup>rd</sup> to 6<sup>th</sup> bars) for one good site (site 6) and two issue sites (16, 24).

Table 1 below shows the quantile-quantile metrics (QQM) quantifying the differences between the distributions at the 3 sites before/after the 4 calibration methods and the distribution of the reference site.

Table 1. QQM comparing the distribution at each test site in Data A with the reference distribution, before calibration (measured values) and after each of the 4 calibration methods (INT, LM, LS, PM).

| Site | Measured values | INT    | LM     | LS     | PM     | PM order |
|------|-----------------|--------|--------|--------|--------|----------|
| 6    | 1.5545          | 0.2186 | 0.2090 | 0.3098 | 0.2308 | 5        |
| 16   | 7.5441          | 0.3021 | 0.2979 | 0.4394 | 0.2784 | 6        |
| 24   | 7.4735          | 0.3459 | 0.3638 | 0.2009 | 0.1885 | 4        |
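The idea of scoring a site against the reference with a quantile-based distance, before and after correction, can be sketched as follows. This is a minimal illustration and not the task's implementation: the exact QQM definition is not given here, so `qq_distance` (mean absolute difference between matched quantiles) is an assumed stand-in, and `match_mean_std` is a simple mean and spread alignment that is not one of the four calibration methods in Table 1. The data are synthetic.

```python
import statistics


def qq_distance(sample, reference, n=20):
    """Mean absolute difference between matched quantiles (assumed metric)."""
    q_sample = statistics.quantiles(sample, n=n, method="inclusive")
    q_reference = statistics.quantiles(reference, n=n, method="inclusive")
    return sum(abs(a - b) for a, b in zip(q_sample, q_reference)) / len(q_sample)


def match_mean_std(sample, reference):
    """Shift and scale a sample so its mean and spread match the reference.

    Illustrative correction only; it assumes the site error is a pure
    gain-and-offset error.
    """
    mean_s, std_s = statistics.fmean(sample), statistics.pstdev(sample)
    mean_r, std_r = statistics.fmean(reference), statistics.pstdev(reference)
    return [mean_r + (x - mean_s) * std_r / std_s for x in sample]


# Synthetic data: an "issue site" with a gain and offset error.
reference = [i / 10 for i in range(100)]
issue_site = [2.0 * x + 5.0 for x in reference]
corrected = match_mean_std(issue_site, reference)

print("before correction:", qq_distance(issue_site, reference))
print("after correction: ", qq_distance(corrected, reference))
```

As in Table 1, a large distance before correction and a small one after correction indicate that the corrected distribution closely follows the reference.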

**Keywords:** Multi-site testing, test cost reduction, site-to-site variations, and ATE test hardware debug/design.

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] Farayola, et al, "Site-to-Site Variation in Analog Multisite Testing," IEEE Design & Test, 2023.

[2] Bruce, et al, "A Weighted-Bin Difference Method for Issue Site Identification," J. Electron. Test., 2023.

[3] Farayola, et al, "Cross-Correlation Approach to Detecting Issue Test Sites," 2022 DFT.

[4] Farayola, et al, "Hardware Systematic Error Identification and Correction," J Electron Test, 2022.

[5] Farayola, et al, "Optimal Order Polynomial Transformation...," 2022 ITC.

[6] Steenhoek, "Graph Theory Approach for Multi-site ATE Board Parameter Extraction," 2022 ETS.

## TASK 2810.029, 170-260 GHZ WIDEBAND PA AND LNA DESIGN IN SILICON

AYDIN BABAKHANI, UNIVERSITY OF CALIFORNIA, LOS ANGELES, AYDINBABAKHANI@UCLA.EDU

#### SIGNIFICANCE AND OBJECTIVES

We design a wideband power amplifier (PA) and a low-noise amplifier (LNA) operating in the 170-260 GHz band using a commercial silicon process. Both blocks are critical components for future high-speed wireless communication links and high-resolution 3D imaging radars.

#### **TECHNICAL APPROACH**

A single-ended LNA, taped out in a 130-nm SiGe process from IHP Microelectronics, was characterized with VNA measurements and Y-factor noise figure measurements. A single-ended PA, also taped out in a 130-nm SiGe process from IHP Microelectronics, was characterized with VNA and large-signal measurements.

#### SUMMARY OF RESULTS

Two LNAs were fabricated. LNA 1 was fabricated in a 130-nm SiGe BiCMOS process and consumes 46 mW of DC power. Fig. 1 shows the measured S-parameters. The LNA shows a peak measured gain of 15.5 dB at 207 GHz. The measured 3-dB bandwidth covers the entire WR5 frequency range of 140-220 GHz (a fractional bandwidth of 44%). Record noise figures of 6.9 dB, 6.1 dB, and 8.2 dB were measured at 150 GHz, 180 GHz, and 210 GHz, respectively.
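The two figures of merit quoted above follow from standard formulas, sketched below. The fractional bandwidth is the 3-dB bandwidth divided by the band center frequency. The Y-factor expression is the textbook form NF = ENR − 10·log10(Y − 1), which assumes the noise source's cold state is at the 290 K reference temperature; the ENR and power values in the example are hypothetical and are not measurements from this task.

```python
import math


def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """3-dB bandwidth divided by the center frequency of the band."""
    center = (f_low_ghz + f_high_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / center


def y_factor_noise_figure_db(enr_db, p_hot, p_cold):
    """Noise figure from a Y-factor measurement.

    Textbook form; assumes the cold state of the noise source is at the
    290 K reference temperature. Powers are linear and in the same unit.
    """
    y = p_hot / p_cold
    return enr_db - 10.0 * math.log10(y - 1.0)


# Band edges from the text: 140-220 GHz.
print(f"fractional bandwidth: {fractional_bandwidth(140.0, 220.0):.1%}")

# Hypothetical noise-source ENR and measured power ratio.
print(f"noise figure: {y_factor_noise_figure_db(15.0, 8.0, 1.0):.2f} dB")
```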



Figure 1. Measured small-signal S-parameters of the LNA. Measurement results show a 3-dB bandwidth of 80 GHz.

LNA 2 was fabricated in the GlobalFoundries 22-nm FDSOI process. It is a 13-stage LNA with a simulated DC power consumption of 22 mW that operates between 190 and 240 GHz. The simulated S-parameters of LNA 2 are given in Fig. 3. Measurement results over the WR3 and WR5 bands (Fig. 4), however, showed significantly lower gain, which is attributed to modelling errors that cascade over the 13 stages of the LNA.
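Why small modelling errors matter in a long cascade can be seen from a one-line calculation: gains in dB add, so a per-stage shortfall is multiplied by the number of stages. The sketch below assumes identical stages, and the per-stage error values are hypothetical, not extracted from the measurements.

```python
def cascaded_gain_shortfall_db(n_stages, per_stage_error_db):
    """Total gain shortfall when every stage misses its simulated gain equally."""
    return n_stages * per_stage_error_db


# Hypothetical per-stage modelling errors for a 13-stage amplifier.
for error_db in (0.5, 1.0, 1.5):
    total = cascaded_gain_shortfall_db(13, error_db)
    print(f"{error_db} dB per stage -> {total} dB total")
```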



Figure 2. Noise figure measurements, using the Y-factor technique.



Figure 3. Simulated small signal S-parameters of the LNA 2.



Figure 4. Measured small signal S-parameters of the LNA 2.

Keywords: LNA, PA, Wideband, Noise Figure, SiGe

#### INDUSTRY INTERACTIONS

Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Y. Mehta, S. Thomas and A. Babakhani, "A 140–220-GHz Low-Noise Amplifier With 6-dB Minimum Noise Figure and 80-GHz Bandwidth in 130-nm SiGe BiCMOS," in IEEE Microwave and Wireless Technology Letters, vol. 33, no. 2, pp. 200-203, Feb. 2023, doi: 10.1109/LMWC.2022.3208204

One of the primary hardware costs incurred by artificial neural networks is due to vector-matrix multiplication (VMM), which is generally performed with floating-point binary numbers and computed exactly with cumbersome multiplication operations whose count scales with the number of entries in the matrix, *i.e.*, $O(N^{2})$.

#### TECHNICAL APPROACH

Most research in this area has focused on crossbars composed of memristors and phase change memory (PCM). However, these devices suffer from several major challenges, including imprecise writing, weights drifting over time, limited endurance, and compatibility challenges with modern CMOS processes. Magnetic tunnel junctions (MTJs) provide solutions to the limitations of memristors and PCM but only exhibit binary resistance states. Importantly, it was recently proposed that stochastic MTJ switching enables analog neuromorphic behavior with MTJs.

#### SUMMARY OF RESULTS

Our experiment demonstrates a 4x2 neuromorphic VMM accelerator, with eight two-terminal MTJ synapses as shown in Fig. 1(a). The four input voltages are provided by four power supplies (PS) and the two output currents are read by the source measuring unit (SMU). Each MTJ needs two probes to connect the top and bottom electrodes individually, thereby requiring a total of 16 probes for this 4x2 neuromorphic network.

To evaluate the functionality of the proposed system, we tested the system on the two-pixel by two-pixel supervised image classification task. In this task, four input pixels are fed into the network, which is tasked with recognizing two target images. The network must calculate the Hamming distance between the input and target images, to which a threshold may be applied to identify images that are identical or sufficiently similar to the target image. The synapses in the network were trained offline to weights of either parallel (P) or antiparallel (AP) MTJ resistance states.

Output currents were normalized by the average AP state current. The output current variations resulted primarily from power supply voltage variation and differences in probe connectivity. Significantly decreased variation is expected for future neuromorphic networks with MTJ arrays directly integrated with peripheral CMOS.

By setting proper thresholds, the post-subtraction results can be categorized into five distinct bins. These post-subtraction currents show high fidelity to the expected VMM results for this four-pixel input task, demonstrating robustness to the device and testing imprecision.
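The link between a VMM and the Hamming distance can be illustrated with an idealized software model. It assumes pixels and binary weights are encoded as ±1, so that the dot product equals N − 2·(Hamming distance). This encoding is an assumption made for the sketch and is not necessarily how the voltages and P/AP states are mapped in the hardware experiment. For four pixels the distance takes one of five values (0 to 4), which corresponds to five bins.

```python
def hamming_via_vmm(pixels, targets):
    """Hamming distance from the input to each target, computed with dot products.

    pixels: list of 0/1 values; targets: list of 0/1 lists of the same length.
    Idealized model with +/-1 encoding (an assumption, not the measured circuit).
    """
    n = len(pixels)
    x = [2 * p - 1 for p in pixels]              # map 0/1 to -1/+1
    distances = []
    for target in targets:
        w = [2 * t - 1 for t in target]          # binary weights as -1/+1
        dot = sum(xi * wi for xi, wi in zip(x, w))
        distances.append((n - dot) // 2)         # dot = n - 2 * distance
    return distances


# Two hypothetical 2x2 target images, flattened to four pixels.
targets = [[1, 0, 1, 0], [0, 1, 0, 1]]
print(hamming_via_vmm([1, 0, 1, 0], targets))
print(hamming_via_vmm([1, 1, 1, 0], targets))
```

A threshold on the returned distance then decides whether the input is identical or sufficiently similar to a target image.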

For larger system demonstrations, a printed circuit board (PCB) has been designed. Various depictions of this PCB design are provided in Fig. 1.



Figure 1. (a) Custom PCB designed for wire bonding of MTJs on the die. (b)-(d) After wire bonding, more MTJs can be connected via the pads on the PCB without the probe station.

**Keywords:** STT-MRAM, MTJ, Binarized Neural Network, Vector-Matrix Multiplication, Image Recognition

#### INDUSTRY INTERACTIONS

Texas Instruments, Intel

#### MAJOR PAPERS/PATENTS

[1] P. Zhou et al., "Binarized Neuromorphic Inference Network with STT MTJ Synapses," *Conference on Magnetism and Magnetic Materials*, Oct.-Nov. 2022.

[2] P. Zhou et al., "Experimental Demonstration of Neuromorphic Network...," *IEEE The Magnetic Recording Conference*, Aug. 2022 (invited).

[3] P. Zhou et al., "Binarized Neuromorphic Inference Network with STT MTJ Synapses," *SPIE Spintronics*, Aug. 2022 (invited).

[4] P. Zhou et al., "Binarized Neuromorphic Computing with STT MTJ Synapses," *International Conference on Neuromorphic Systems*, July 2022.

This research will deliver an energy-efficient high-speed, high-resolution ADC architecture for high-performance and emerging applications, including medical imaging, 4G/5G infrastructure, radar, production test, and defense. We will expand the bandwidth of our time-interleaved (TI) noise-shaping (NS) SAR architecture by an order of magnitude to extend SAR-ADC efficiency to high speed and high resolution.

#### **TECHNICAL APPROACH**

Our new SAR-based architecture combines time-interleaving with noise-shaping to break the tradeoff between speed and accuracy and enable both high speed and high resolution. The target design space is unserved by state-of-the-art SAR ADCs. Unlike conventional SAR and interleaved SAR converters, our approach provides both high resolution and high bandwidth. We expand our interleaved noise-shaping SAR ADC architecture to deliver an order of magnitude more bandwidth, as well as improved energy efficiency.

#### SUMMARY OF RESULTS

The Noise-Shaping SAR (NS-SAR) is an emerging ADC architecture that offers both high resolution and high energy efficiency. State-of-the-art NS-SAR ADCs eliminate the need for op-amps, which relaxes design complexity and technology scaling issues. However, existing NS-SAR ADCs, with high FoM, are limited in bandwidth (typically in the MHz range). This makes NS-SAR ADCs unsuitable for applications that need bandwidths in the 10+MHz range, such as wireless communications.

Combining a continuous-time (CT) sigma-delta modulator (SDM) with an NS quantizer (QTZ) provides excellent efficiency, robust operation, and the benefits of a CT input. Nevertheless, existing CT ADCs with NS QTZs are limited in resolution and bandwidth. Conventional methods for increasing resolution and bandwidth are challenging: (1) the usable time for the QTZ is limited; (2) increasing QTZ resolution reduces the NS QTZ sampling rate; (3) the loading of the NS QTZ limits the performance of the CT front-end; and (4) increasing the CT loop-filter order can cause excessive out-of-band gain. Introducing a delay in the NS QTZ feedback loop enables complete parallelization of the NS QTZ operations (Figure 1). The extra time from parallelization greatly relaxes loop filtering and residue integration, and enables a high (6-bit) QTZ resolution. Furthermore, the extra time in the feedback loop permits Dynamic Weighted Averaging to eliminate the need for calibration. The prototype comprises a 2nd-order CT front-end and a fully interleaved 1st-order NS QTZ.

A new hybrid architecture (Figure 2) combines a voltage-controlled oscillator (VCO)-based continuous-time (CT) delta-sigma modulator (DSM) with a noise-shaping (NS) successive approximation register (SAR) quantizer. The key innovation is an anti-aliasing filter that bridges the VCO front-end with the NS-SAR quantizer, enabling the time-domain information to be directly sampled as voltage-domain information. The fabricated 28-nm CMOS prototype achieves an 84.2dB signal-to-noise-and-distortion ratio (SNDR) and 86.8dB dynamic range within a 1MHz bandwidth while consuming 1.62mW at 100MS/s. The corresponding Schreier SNDR figure of merit is 172.1 dB. The core circuit occupies only 0.024mm<sup>2</sup>.
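The Schreier figure of merit quoted above can be checked from the other reported numbers using its standard definition, FoM = SNDR + 10·log10(BW / P). The short sketch below plugs in the SNDR, bandwidth, and power from the text; the function name is introduced here for illustration.

```python
import math


def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """Schreier figure of merit: SNDR + 10*log10(bandwidth / power)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


# Values reported for the 28-nm prototype: 84.2 dB SNDR, 1 MHz, 1.62 mW.
fom = schreier_fom_db(84.2, 1e6, 1.62e-3)
print(f"Schreier FoM: {fom:.1f} dB")
```

With these inputs the result rounds to the 172.1 dB stated in the text.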



Figure 1. CT SDM with new TI NS QTZ merges ELD paths.



Figure 2. 3rd-Order VCO-Based CTDSM with NS-SAR Quantizer.

Keywords: Sigma Delta, ADC, sensor, SAR, IoT

#### INDUSTRY INTERACTIONS

Intel, Mediatek, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] H.W. Chen, S. Lee, and M. P. Flynn, "A 0.024mm2 84.2dB-SNDR 1MHz-BW 3rd-Order VCO-Based CTDSM with NS-SAR Quantizer (NSQ VCO CTDSM)," IEEE Symp. on VLSI Circuits, June 2023.

[2] S. Lee, T. Kang, S. Song, K. Kwon, and M. P. Flynn, "An 81.6dB SNDR 15.625MHz BW 3rd Order CT SDM with a True TI NS Quantizer," IEEE Symp. on VLSI Circuits, June 2022.

On-chip RC oscillators are becoming popular as clock references due to their low power, small area, and cost-efficiency. However, poor frequency accuracy has limited their usage to systems that tolerate ~1% inaccuracy. The objective is to achieve <±250ppm frequency accuracy at low power, making such oscillators suitable as real-time clock sources.

#### **TECHNICAL APPROACH**

We use a temperature- and aging-compensated RC oscillator (TACO) in which the long-term drift of the main oscillator is compensated by periodically locking its frequency to that of the *less-aged* reference oscillator. The main and reference oscillators are identical, except the reference oscillator is heavily duty-cycled to prevent it from aging. We employ (1) resistors with higher activation energy ( $E_a$ ), such as the n-poly or metal type, (2) switched dual RC branches to reduce the stress caused by DC-current induced electromigration (EM), and (3) duty-cycling to slow down the aging rate of the reference oscillator.

#### SUMMARY OF RESULTS



Figure 1. TACO architecture (top) and switched-RC-based oscillator topology (bottom).

To improve the long-term stability of RC oscillators, the temperature- and aging-compensated oscillator shown in Fig. 1 is employed [2]. It comprises a main temperature-compensated oscillator (TCO), a reference TCO, and aging compensation logic. The main TCO is always on, and its long-term frequency drift caused by aging is compensated by periodically locking it to the frequency of the less-aged reference oscillator.

A prototype TACO, with the flexibility to use p-poly, n-poly, and via resistors as the reference resistors, was fabricated in a 65-nm CMOS process and packaged in a plastic QFN package. The long-term frequency drift of the always-on n-poly-based TCO is significantly higher than when it is duty-cycled (0.1%), indicating that duty-cycling reduces the aging effect. Similar behavior was observed for via-resistor-based TCOs. Given these findings, R<sub>0</sub> and R<sub>1</sub> in the main and reference TCOs were implemented using n-poly and via resistors, respectively. An accelerated aging test was performed after a two-point trim at 85°C and -40°C. The results obtained from 11 samples are shown on the right in Fig. 2. The always-on main TCO is calibrated by locking its frequency to a 0.1% duty-cycled reference TCO at one-hour intervals. After 500 hours of aging at 125°C, the temperature stability of the uncalibrated main TCO degrades by 731 ppm. With calibration, the degradation is only 227 ppm, a 3.2x improvement. The frequency inaccuracy over -40°C to 85°C, including aging effects, is ±1029 ppm.
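The periodic-locking idea can be illustrated with a deliberately idealized model. It assumes that both oscillators age linearly with powered-on time at the same rate and that each lock copies the reference error onto the main oscillator exactly. Neither assumption comes from the measurements, which show a smaller improvement (3.2x) than this ideal model predicts, and the drift rate used below is hypothetical.

```python
def simulate_locking(hours, drift_ppm_per_hour, duty_cycle, lock_interval_hours):
    """Return the main-oscillator frequency error (ppm) after aging.

    Idealized assumptions: linear aging with on-time, identical aging
    rates, and perfect locking to the duty-cycled reference oscillator.
    """
    main_error = 0.0
    reference_error = 0.0
    for hour in range(1, hours + 1):
        main_error += drift_ppm_per_hour                    # always on
        reference_error += drift_ppm_per_hour * duty_cycle  # mostly off
        if hour % lock_interval_hours == 0:
            main_error = reference_error                    # periodic lock
    return main_error


# Hypothetical 1 ppm/hour drift, 0.1% duty cycle, locking every hour.
locked = simulate_locking(500, 1.0, 0.001, 1)
free_running = simulate_locking(500, 1.0, 0.001, 10_000)   # never locks
print(f"with locking: {locked:.3f} ppm, without: {free_running:.1f} ppm")

# Improvement reported in the text: 731 ppm without vs 227 ppm with calibration.
print(f"measured improvement: {731 / 227:.1f}x")
```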



Figure 2. Die photo (left) and frequency error after aging (right).

**Keywords:** Switched capacitor, switched resistor, Delta-sigma modulator, pulse density modulation, RC oscillator

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] K.-S. Park et al., "A second-order temperature compensated 1$\mu$W/MHz 100MHz RC oscillator with ±140ppm inaccuracy from -40°C to 95°C," in 2021 CICC, April 2021.

[2] K.-S. Park et al., "A 1.4$\mu$W/MHz 100MHz RC Oscillator with ±1030ppm Inaccuracy from -40°C to 85°C After Accelerated Aging for 500 Hours at 125°C," in 2023 ISSCC, San Francisco, CA, USA, 2023.

Our research goal has been to solidify ring amplifiers as a dominant means for closed loop precision amplification in mixed-mode circuits. We seek to demonstrate performance improvements over traditional amplifiers. High-performance ring amplifier-based ADCs in FinFET technology are explored, addressing the current need for high-speed and high-resolution applications in scalable CMOS.

#### **TECHNICAL APPROACH**

In high-speed ADCs using pipelined architectures [1], the use of a SAR as the sub-ADC and a ringamp-based residue amplifier greatly reduces the overall ADC power dissipation. A single-channel GS/s 12-bit high-speed pipelined-SAR ADC with a low-power three-stage structure is shown in Fig. 1. A ring-amplifier-based pipelined-SAR ADC in a 22-nm FinFET process is used as a demonstration vehicle to assess performance capabilities.

In another implementation, shown in Fig. 2, a multistage noise-shaping (MASH) structure with a noise-shaping SAR (NSSAR) ADC has been adopted [2]. The noise leakage that commonly results from analog/digital mismatch is found to be negligible in this new architecture. With 50MHz bandwidth (OSR=3, clocking at 300MHz), this topology has been shown in simulation to achieve 78dB SNDR.
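The bandwidth, clock rate, and oversampling ratio quoted above are tied together by the usual definition OSR = f_s / (2·BW). A small check, with function names introduced here for illustration:

```python
def signal_bandwidth_hz(sample_rate_hz, osr):
    """Signal bandwidth for a given sample rate and oversampling ratio."""
    return sample_rate_hz / (2.0 * osr)


def oversampling_ratio(sample_rate_hz, bandwidth_hz):
    """Oversampling ratio relative to the Nyquist rate of the signal band."""
    return sample_rate_hz / (2.0 * bandwidth_hz)


# Values from the text: 300 MHz clock and OSR = 3.
print(signal_bandwidth_hz(300e6, 3) / 1e6, "MHz")
print(oversampling_ratio(300e6, 50e6))
```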

#### SUMMARY OF RESULTS

The pipelined ADC in Fig. 1 has been redesigned and resubmitted for fabrication with a better layout strategy. The layout optimization incorporates multi-layer parallel metal connections and an improved signal flow and floorplan. Testing of the chip is ongoing, but we are currently seeing unexplained leakage current issues and alignment problems in the chip assembly.



Figure 1. Topology of pipelined ADC.

In the noise-shaping SAR ADC design, the intrinsically stable 4<sup>th</sup>-order MASH shown in Fig. 2 is utilized. Stage 1 uses the error feedback (EF) structure. In this architecture, variation of the precision amplifier's gain (32X) over process, voltage, and temperature (PVT) can greatly deteriorate the noise transfer function (NTF). A previous work used an open-loop dynamic opamp with complex digital calibration to maintain a constant, accurate gain under PVT variation. The proposed ring amplifier, operating in closed loop with 32X gain, can replace the open-loop opamp to provide PVT-stable operation. The 1/16 gain coefficient of stage 1 is generated by capacitor charge sharing and is thus sufficiently accurate. The stage 1 feedback opamp is reused to also provide the inter-stage gain. Quantization noise from the stage 1 SAR is amplified by the residue amplifier and sent to the second stage. Thanks to the large, precise inter-stage gain, which reaps the benefits of ringamp precision and efficiency, stage 2 can use noise-shaping circuits with reduced accuracy and low power consumption. In stage 2, the filter is realized by a low-gain but fast-settling ring amplifier with even lower power and a simpler structure.



Figure 2. Topology of proposed NSSAR ADC architecture.

The NSSAR ADC has been implemented in a 22-nm FinFET process. The simulation results show that even with ultra-low OSR (3), 78-dB SNDR can be achieved. The simulated total power consumption is 6.7mW (clocking at 300MHz).

**Keywords:** MASH, Noise shaping SAR, Low OSR

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] C. C. Lee and M. P. Flynn, "A SAR-assisted two-stage pipeline ADC," *IEEE J. Solid-State Circuits*, Apr. 2011.

[2] H. Hu, V. Vesely and U. Moon, "Ultra-low OSR calibration free MASH noise shaping SAR ADC," *IEEE Int. Symp. Circuits Syst.*, 2022.

This project was motivated by the need to address two analog design challenges: a widening productivity gap and the demand for fast turnaround time while guaranteeing design quality and robustness. Specifically, we developed a reinforcement learning (RL) algorithm and design tool for automated analog sizing optimization considering multiple competing design objectives.

#### TECHNICAL APPROACH

Analog circuit design productivity has been a key issue in modern chip design. In particular, analog design requires intensive human knowledge and involvement. While analog designs have found their way to a broader range of application domains, meeting the desired design turnaround time and design quality is becoming more difficult. To this end, we developed a reinforcement learning (RL) based multi-objective analog sizing method and tool. The proposed RL algorithm is shown in Fig. 1.

#### SUMMARY OF RESULTS



Figure 1. Proposed reinforcement learning algorithm for multi-objective analog design optimization.

As shown in Fig. 1, the proposed reinforcement learning (RL) approach for multi-objective analog optimization consists of a sample-efficient and easy-to-train multi-objective reinforcement learning (MORL) algorithm formulation that forms a well-approximated Pareto set of the analog circuit, where the actions of the RL agent are continuous-valued [1].

Once the RL design agent is trained, its predictive power can be leveraged to support automated optimization of user-defined design preferences among multiple objectives. Experimentally, Fig. 2 demonstrates the proposed RL approach by comparing it with several methods on optimization of the gain and bandwidth of a hysteresis comparator. We compare our method (RL-train) with a genetic algorithm (NSGA-II), Monte Carlo sampling, and Bayesian optimization (BO). As can be seen, RL-train produces the best overall Pareto front among these four methods. Future work will investigate sample-efficient RL-based analog design optimization while considering the impact of process variations.
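Comparing optimizers by their Pareto fronts relies on extracting the non-dominated designs from each method's results. The sketch below shows that filtering step for two objectives that are both maximized, such as gain and bandwidth. It is a generic illustration with made-up design points, not the task's RL algorithm or its data.

```python
def dominates(a, b):
    """True if design a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated points (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (gain, bandwidth) results from one optimizer, arbitrary units.
designs = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_front(designs))
```

A method whose front lies further toward high gain and high bandwidth offers better trade-offs for any user-defined preference between the two objectives.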



Figure 2. Optimization of a hysteresis comparator.

**Keywords:** analog optimization, reinforcement learning, multi-objective optimization, design productivity

#### INDUSTRY INTERACTIONS

Intel, NXP

## TASK 2810.044, HIERARCHICAL CHARACTERIZATION AND CALIBRATION OF RF/ANALOG CIRCUITS USING LIGHTWEIGHT BUILT-IN SENSORS

SULE OZEV, ARIZONA STATE UNIVERSITY, SULE.OZEV@ASU.EDU

#### SIGNIFICANCE AND OBJECTIVES

This project will explore a hierarchical calibration approach, including local calibration and system-level target setting, where information from built-in monitors is used to match the performance of individual blocks, as well as to guarantee that the entire system functions in cohesion even if constraints or operating conditions change dynamically.

#### TECHNICAL APPROACH

Simple built-in sensors that measure current, DC voltages, and signal amplitude are used to set local circuit parameters, such as bias or matching components, to optimize the performance to specifications. System-level calibration relies on a statistical model (e.g., machine learning), whereas circuit-level calibration can be conducted with simplified mathematical models.

#### SUMMARY OF RESULTS

Mismatch information can be measured indirectly using multiple low-overhead sensors such as RMS, peak, or power detectors. These detectors are strategically placed in different parts of the circuit to maximize measurement sensitivity. Existing methods aim at directly determining the matching parameters and require external calibration, which makes them unsuitable for BIST applications. Our work removes the need for calibration by referencing all measurements to a small set of parameters and conducting the measurements in two steps. In each step, the reflection coefficient of one load is measured. Since each measurement is referenced to the same parameters, these measurements can be used to assess the mismatch of the two loads and show how the loads compare to each other. Since we take ratios of measurements, the impedance in the power/RMS conversion cancels out. Thus, the proposed technique works with peak, RMS, or power detectors. We also fabricated and tested a design example. Our experimental results confirm that this method can be used for mismatch detection and return loss estimation.

Fig. 1 shows the 4-detector implementation of the impedance matching measurement system. The signal source, whose amplitude and impedance are not known, is applied from the two points that need to be matched. Power detectors are placed in the matching network, which is composed of multiple unit cells (in this example, a unit length of transmission line). Measurements from each side are divided by the measurement of the reference detector, removing unknowns related to the source as well as detector gains. The resulting ratios are used to determine the unit cell parameters and the matching between the two sides.
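Two pieces of this reasoning can be sketched numerically. First, dividing each detector reading by the reference detector reading cancels any scale factor common to all readings; the sketch assumes the unknown source amplitude and a shared detector gain enter as a single multiplicative factor, which is a simplification. Second, a load's reflection coefficient and return loss follow from the standard transmission-line relations. The 50 Ω reference impedance and the detector values are hypothetical.

```python
import math


def normalize_to_reference(readings, reference_reading):
    """Divide each detector reading by the reference detector reading."""
    return [r / reference_reading for r in readings]


def reflection_coefficient(z_load, z0=50.0):
    """Standard reflection coefficient of a load on a line of impedance z0."""
    return (z_load - z0) / (z_load + z0)


def return_loss_db(gamma):
    """Return loss in dB; infinite for a perfectly matched load."""
    magnitude = abs(gamma)
    return math.inf if magnitude == 0 else -20.0 * math.log10(magnitude)


# Hypothetical detector responses; the unknown common scale cancels in the ratio.
responses = [0.8, 1.2]
for scale in (2.0, 5.0):
    readings = [scale * r for r in responses]
    print(normalize_to_reference(readings, scale * 1.0))

print(f"{return_loss_db(reflection_coefficient(100.0)):.2f} dB")
```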



Figure 1. Implementation of the proposed system.

Hardware experiments were performed with the fabricated board (Fig. 2). An adjustable load is constructed with a phase shifter (HMC647A) and an open-ended lossy transmission line. In this embodiment, we used RMS voltage detectors (LTC5532) for the measurements. The created loads are measured with a VNA (Agilent E8361A) to provide baseline measurements. Using the load measurements, conjugate matching is calculated with respect to the VNA measurement plane. The BIST technique matches the VNA results within 1 degree in phase and within 0.1dB in magnitude.



Figure 2. Designed board with detectors.

Keywords: RF BIST, cascade, automotive radar, tuning

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Avci, Muslum Emir, Sule Ozev, and YB Chethan Kumar, "Mismatch Detection using Periodic Structures," United States Patent and Trademark Office Serial number 17/490,644, filed on Sep 2021.

[2] Avci, Muslum Emir, and Sule Ozev, "Low-Overhead RF Impedance Measurement Using Periodic Structures," IEEE Transactions on Microwave Theory and Techniques, 2023.

The goal of this effort is to use the Texas Instruments Phase Light Modulator (PLM) to create a holographic display for Augmented Reality (AR) and Head-up Display (HUD) applications. This use of the Texas Instruments PLM has the potential for high impact in both the scientific and consumer electronics worlds.

#### **TECHNICAL APPROACH**

This task seeks to use corrected holographic couplers to realize an image guide formed on curved surfaces, allowing real-world surfaces to serve as the image guide for a distortion-free AR head-up display. The holographic couplers are angularly multiplexed to increase the field of view (FOV) beyond what their dispersive nature and the broad spectral width of LEDs already provide. In a preliminary experiment, the PLM is used to add focal power to the diffracted wave, which adds depth perception to the reconstructed image. This enables a 3D see-through AR display by time multiplexing, in which different slices of a 3D object are imaged at different distances one after another.

#### SUMMARY OF RESULTS

The image guide, a plate that conveys the image to the front of the viewer's eyes, has conventionally been flat. Real-world surfaces such as eyeglasses or visors, however, are not flat. If such surfaces are used as the image guide, total internal reflection (TIR) adds focal power and the ray direction changes with each successive bounce. This causes "multiple vision" of the projected image, because the rays exiting the image guide at the extraction hologram are not parallel to each other. By adding focal power to the holographic coupler on a curved image guide, the reconstructed images become distortion-free and free of duplication, as shown in Figure 1. The results agree well with ray tracing simulations. The correction to the hologram was added to the reference beam while the holographic coupler was being fabricated. Cylindrical as well as spherical image guides were successfully fabricated.



Figure 1. See-through display with curved image guide.

The dispersive nature of the holographic coupler limits the angular bandwidth, and hence the FOV. The angular bandwidth of the constructed coupler was ~3.5 degrees for monochromatic (laser) light. Using an LED with a $1/e^2$ width of nearly 50nm broadens the FOV to ~18 degrees. The FOV is roughly doubled again by stacking two holographic couplers on top of each other, as shown in Figure 2, realizing an FOV of 35 degrees. These new results can be combined to realize wide-FOV, distortion-free, see-through AR display devices.



Figure 2. Expansion of FOV by angular multiplexing the holographic couplers.

The PLM manipulates the wavefront; by adding curvature (focal power) to the reconstructed wave, it provides depth perception to the viewer. An image of a 3-D object is reconstructed by producing images of slices of the object at different depths and imaging them at planes at different distances. Switching quickly between slices collectively reconstructs the image of the 3-D object with depth perception. Owing to its high-speed response, the PLM allows rapid switching of the images, so the 3-D image is reconstructed as if multiple slices were projected simultaneously.

These results pave the way to realistic, 3-D (depth perception enabled), see-through AR displays with high-quality images.

**Keywords:** AR display, image guide, holographic display, 3D display, phase light modulator

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

The Texas Instruments Phase Light Modulator (TI PLM) and Digital Micromirror Device (DMD) control light in a very fast and flexible manner. Used with advanced opto-electronics and cameras, they are key enablers of automotive and consumer optical devices such as lidar, head-up displays, headlights, and augmented reality (AR) displays.

#### **TECHNICAL APPROACH**

We have unlocked previously unexplored capabilities of the Texas Instruments PLM and DMD for lidar and AR display by illuminating the PLM and DMD with synchronized pulses. For lidar, we demonstrated a new solid-state optical architecture with two DMDs, one for the transmitter and one for the receiver, and a SiPM (silicon photomultiplier) detector array. The lidar system achieves (1) an all-MEMS, solid-state design with minimal to no moving components, (2) a field of view of 35 degrees, (3) a resolution of 0.15 degrees, (4) the largest aperture area (>140mm<sup>2</sup>) among solid-state transmitters and receivers, and (5) the highest Technology Readiness Level (TRL 9, commercialized), by using an automotive-certified DMD.

#### SUMMARY OF RESULTS

New ALL-MEMS and Solid-state lidar architecture with high TRL MEMS device: We have developed a lidar system with two TI-DMDs (Fig. 1). Lidar systems are categorized into (a) point-and-shoot lidar with mechanical scanning for long-range object recognition, and (b) flash lidar with flood illumination and 2-dimensional detector array for short range object recognition. The developed lidar system falls into a new category, scanning field of view (FOV) lidar. The system simultaneously achieves a large full FOV (35 deg) and high resolution (0.15 deg).

New concept development, scalable and arrayed TI-DMD, and TI-PLM for lidar and display: Over the past three years of research, we have learned that applications of PLMs and DMDs span a variety of areas, such as automotive (image generation for HUD, adaptive headlights, and lidar), consumer electronics (augmented reality display projector engines), manufacturing (additive 3D printers), and industrial instruments (spectrometers, high-speed cameras). New applications are also emerging, such as optical communication between high-speed drones, fast reconfigurable optical add-drop filters for telecommunications, imaging through fog for automotive use, and massively arrayed holographic projection.





Figure 1. (a) The optical architecture and (b) image captured by the system of the All-MEMS solid-state lidar with Texas Instruments DMD.

**Keywords:** lidar, MEMS, solid-state, SiPM, automotive

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] J. Chan et al., "Wide Field of View Real-time flash DMD-Lidar with 2D Multi-Pixel Photon Counter," Paper 12435-18, SPIE Photonics West (2023).

[2] X. Deng et al., "Beam tracking and image steering by TI PLM based on camera input for lidar and AR applications," Paper 12435-20 SPIE Photonics West (2023).

## TASK 2810.056, MILLIMETER WAVE PACKAGING RESEARCH – ANTENNA IN PACKAGE

RASHAUNDA HENDERSON, UNIVERSITY OF TEXAS AT DALLAS, RMH072000@UTDALLAS.EDU HONGBING LU AND MARK LEE, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

Antenna in package (AiP) is an enabling technology for RF front-ends. We focus on the design and characterization of broadband planar antennas integrated with flip-chip enhanced QFN packages operating from 90 to 220 GHz. Material characterization has been used to predict electrical performance under mechanical stress and high temperatures.

#### **TECHNICAL APPROACH**

Interconnect transitions and antenna designs have been optimized using Ansys HFSS. The effort employs broadband electrical properties of the packaging materials used as substrates for the antennas. To validate the simulations, two test vehicles have been created using a coplanar waveguide-fed slot bowtie antenna and a microstrip-fed E-shaped patch antenna. To avoid parasitic radiation that would inhibit the antenna patterns, novel backside transitions have been incorporated with waveguides to transition the input signal. Nanoindentation tools have been used to characterize temperature-dependent mechanical properties of the materials.

#### SUMMARY OF RESULTS

Fig. 1 shows the cross-sectional design of the enhanced QFN package assembled onto a printed circuit board. The encapsulation material covers the antenna that can be printed on the top metal layer. An external waveguide electromagnetically couples the signal through the package to a stripline to grounded coplanar waveguide (GCPW) transition that feeds the antenna under test. QFN pads are realized in Wall 1, perpendicular to the view. As opposed to equally spaced vias used to create substrate integrated waveguide (SIW), vias in this process can be stitched together to create a wall, which provides better isolation compared to traditional packaging.

[Figure 1 labels: encapsulation; antenna in Metal Layer 1; Metal Layer 2; Wall 2; Metal Layer 1; Wall 1 (QFN pads); PCB ground; PCB; external waveguide]

Figure 1. Cross-section of flip-chip enhanced QFN package using two metal layers.

By utilizing the waveguide feed from the backside of the packaged antenna, we can avoid the parasitic radiation caused by a traditional ground-signal-ground (GSG) probe that connects directly to the planar antenna and interferes with the radiation pattern. The planar antennas that have been studied include a slot bow-tie antenna and an E-patch. Table 1 lists the simulated 10-dB input match bandwidth (BW) for both designs in the WR8 90-140 GHz and WR5 140-220 GHz bands. The packaged antenna designs have been optimized and incorporated into a layout to create two test vehicles for performance validation.

Table 1. Antenna area and 10-dB BW in WR5 and WR8 bands.

| Antenna | WR8 area<br>(mm²) | WR8 BW<br>(GHz) | WR5 area<br>(mm²) | WR5 BW<br>(GHz) |
|---------|--------------|--------------|--------------|--------------|
| Bow-tie | 4.2          | 43           | 1.5          | 80           |
| E-patch | 6.2          | 50           | 1.88         | 48           |
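To put the bandwidths in Table 1 in perspective, the short sketch below expresses each one as a fraction of the band-center frequency. Normalizing to the waveguide band center (rather than to each antenna's own center frequency, which the report does not give) is an assumption made here for illustration, and the function name is hypothetical.

```python
# Simulated 10-dB match bandwidths from Table 1 (GHz).
BANDWIDTH_GHZ = {
    ("bow-tie", "WR8"): 43.0,
    ("E-patch", "WR8"): 50.0,
    ("bow-tie", "WR5"): 80.0,
    ("E-patch", "WR5"): 48.0,
}

# Band edges as quoted in the text (GHz).
BAND_EDGES_GHZ = {"WR8": (90.0, 140.0), "WR5": (140.0, 220.0)}


def fractional_bandwidth(bw_ghz: float, band: str) -> float:
    """Bandwidth divided by the band-center frequency (assumed reference)."""
    low, high = BAND_EDGES_GHZ[band]
    center = 0.5 * (low + high)
    return bw_ghz / center


for (antenna, band), bw in BANDWIDTH_GHZ.items():
    share = 100 * fractional_bandwidth(bw, band)
    print(f"{antenna:8s} {band}: {share:.1f}% of band center")
```

Note that the 50 GHz (WR8 E-patch) and 80 GHz (WR5 bow-tie) entries equal the full widths of their respective bands as quoted in the text.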

Fig. 2 shows the test vehicles that allow for GSG characterization and waveguide design. Novel assembly methods have been utilized to complete this effort.



Figure 2. The left image shows TV1 using WR5 band bow-tie antennas and the right image shows WR8 and WR5 E-patches.

**Keywords:** Enhanced QFN package, slot bow-tie, E-patch, transmission line, chip-package transitions

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] A. Jogalekar et al., "Methods to Characterize Radiation Patterns of WR5 Band Integrated Antennas in a Flip-Chip...," 2022 IEEE 31<sup>st</sup> EPEPS, Oct. 2022, San Jose, CA.

[2] A. Jogalekar et al., "Slot Bow-Tie Antenna Integration in Flip-Chip and Embedded Die Enhanced QFN Package for WR8 ...," 2022 IEEE 72<sup>nd</sup> ECTC, June 2022, San Diego, CA.

To optimize Analog/RF IC testing, machine learning-based solutions are used specifically to enhance yield management. As semiconductor devices become more complex, testing procedures have become intricate and time-consuming. Our solution aims to enhance testing efficiency by minimizing the risk of discarding functional devices (Overkill) or shipping out defective chips (Underkill).

#### TECHNICAL APPROACH

To minimize Overkill, our proposed approach includes three steps: predicting auxiliary test values with multivariate regression models, clustering predicted and actual outcomes, and combining them using a proximity-based metric to determine recoverable devices. For Underkill, we employ Gaussian Mixture Model (GMM) clustering on probe test measurements from multiple insertions. We isolate devices with a higher probability of on-site failure and utilize adaptive multivariate outlier detection to identify potential customer return devices.

#### SUMMARY OF RESULTS

In our efforts to reduce Overkill, we performed our experiments on an industrial dataset from Texas Instruments that consisted of 66 specification tests and 241 auxiliary tests performed on 92,022 devices. Of these devices, we focus on the 8,840 (9.6%) that pass the specification tests but fail the auxiliary tests.

First, we perform regression modeling and limit-agnostic clustering independently, and then combine their outcomes. This yields three buckets of devices; the two outcomes diverge for one bucket, and we use a two-class classifier to decide those cases. Finally, using the two-class classifier in addition to regression and clustering, we recovered 81.6% (highlighted in green) of the devices in our focus group, as shown in Table 1.

Table 1. Device Classification using a Two-class Classifier.

| Auxiliary Tests | Specification Tests: Pass | Specification Tests: Fail |
|-----------------|---------------------------|---------------------------|
| Pass            | 80,261 + 7,217            | 726                       |
| Fail            | 1,623                     | 2,195                     |
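The percentages quoted for the focus group can be cross-checked with a few lines of arithmetic. The sketch below takes the 7,217 entry of Table 1 to be the recovered count, which is an interpretation of the table rather than something the text states outright; the variable names are illustrative.

```python
total_devices = 92_022
focus_group = 8_840      # pass specification tests, fail auxiliary tests
recovered = 7_217        # focus-group devices recovered (read from Table 1)

focus_share = focus_group / total_devices
recovery_rate = recovered / focus_group
not_recovered = focus_group - recovered

print(f"focus group: {100 * focus_share:.1f}% of all devices")
print(f"recovered:   {100 * recovery_rate:.1f}% of the focus group")
print(f"remaining:   {not_recovered} devices")
```

Under that reading, both the 9.6% and 81.6% figures in the text are reproduced, and 1,623 focus-group devices remain unrecovered.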

In our efforts to reduce Underkill, we proposed a three-step approach: feature selection, clustering using GMM, and adaptive outlier detection. We performed our experiments on an industrial dataset from Texas Instruments consisting of devices from 19 wafers with a recorded customer return on each wafer. First, we perform feature space selection, followed by unsupervised clustering using GMM, and note that we are effectively able to isolate known customer returns, as shown in Fig. 1.
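As a rough illustration of the clustering step, the sketch below fits a two-component Gaussian mixture to one-dimensional synthetic data with plain expectation-maximization. It is a deliberately reduced stand-in: the task clusters multi-dimensional probe measurements, whereas the data, the single feature, the component count, and the function names here are all assumptions made for the example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    means = [min(data), max(data)]            # crude initialization at the extremes
    overall_mean = sum(data) / len(data)
    var0 = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300          # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, 1e-6)  # variance floor keeps the pdf finite
    return weights, means, variances


random.seed(0)
bulk = [random.gauss(0.0, 1.0) for _ in range(500)]   # synthetic main population
tail = [random.gauss(8.0, 0.5) for _ in range(20)]    # synthetic suspect cluster
weights, means, variances = fit_gmm_1d(bulk + tail)
print("weights:", [round(w, 3) for w in weights])
print("means:  ", [round(m, 2) for m in means])
```

In this toy setting the small, well-separated cluster ends up in its own component, which is the kind of isolation the text describes for known customer returns.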



Figure 1. Clustering using Gaussian Mixture Modeling.

We use a modified formulation of cluster-based local outlier factor scores to learn the multivariate outlier boundary. Applying the proposed methodology, we achieved 89% to 100% coverage (correctly identified customer returns). The outlier detection model incurs an additional yield loss that falls from 3.48% to 1.8% as the training set grows from 10 wafers to 18 wafers.

**Keywords:** yield recovery, machine learning, adaptive testing

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] D. Neethirajan et al., "Machine Learning-Based Overkill Reduction through Inter-Test Correlation," IEEE VLSI Test Symposium (VTS), 2022.

[2] V. A. Niranjan et al., "Machine Learning-Based Adaptive Outlier Detection for Underkill Reduction in Analog/RF IC Testing," IEEE VLSI Test Symposium (VTS), 2023.

Although there has been tremendous progress in ADC performance, the research community has focused on energy efficiency, and in particular energy Figure of Merit (FoM). This research will redefine ADCs as information extraction tools, dramatically increasing their capability and utility in communication and sensing systems.

#### **TECHNICAL APPROACH**

Existing sensor interfaces also create too much data and provide too little information. Machine learning approaches, such as feature extraction and classification, can overcome bandwidth and power limitations; however, traditional machine-learning methods are expensive. We propose a new class of intelligent and aware ADCs that directly extract information. We also use neural networks to correct and improve ADC performance.

#### SUMMARY OF RESULTS

Time-interleaving of SAR ADCs is an essential technique for high-speed analog-to-digital conversion. Traditional interleaved SAR ADCs require a large die area and have limited conversion speed due to the overhead of multiple switched-capacitor DACs. Additionally, difficulties with matching between interleaved ADC channels limit performance. We tackle these challenges with: (1) a time-interleaved charge-injected cell (CIC) SAR ADC which benefits from hardware sharing of CIC cells for a small area and (2) a hardware-friendly neural-network calibration scheme.

The 8x interleaved ADC is built with four units (Fig. 1) – each unit functions as two sub-ADCs. Each ADC channel has its own comparator, sampling capacitors, and SAR logic but two ADC channels share a single CIC DAC. The CIC DAC is implemented with 10 unit-charge injection cells, connected to the DAC lines of the two shared ADCs through NMOS switches. Each sub-ADC pair produces a complete output every four clock cycles. For example, in the overall ADC, sub-ADC 0 and sub-ADC 4 share the same DAC. One sub-ADC samples and generates the 3 MSBs while the other sub-ADC produces the 3 LSBs. The sharing of CIC hardware provides a significant reduction in area. Furthermore, hardware sharing helps reduce the mismatch between sub-ADCs.
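The successive-approximation search that each sub-ADC performs can be sketched behaviorally as follows. This is an idealized model only: the 6-bit resolution is inferred from the 3 MSB plus 3 LSB split described above, the reference voltage is arbitrary, and nothing here models charge injection, DAC sharing, or mismatch. The function names are hypothetical.

```python
def sar_convert(vin: float, vref: float = 1.0, n_bits: int = 6) -> int:
    """Ideal successive-approximation search: one bit decided per cycle, MSB first."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                  # tentatively set this bit
        dac_level = trial * vref / (1 << n_bits)   # ideal DAC output for the trial code
        if vin >= dac_level:                       # comparator decision
            code = trial                           # keep the bit
    return code


def split_msb_lsb(code: int, n_lsb: int = 3):
    """Split a code into the MSB and LSB groups resolved in the two phases."""
    return code >> n_lsb, code & ((1 << n_lsb) - 1)


code = sar_convert(0.40)
msbs, lsbs = split_msb_lsb(code)
print(code, format(msbs, "03b"), format(lsbs, "03b"))
```

Because a sub-ADC only needs the DAC during one three-bit phase at a time, two sub-ADCs in opposite phases can take turns on the same DAC, which is the sharing idea the paragraph describes.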

Neural networks can effectively learn non-linear functions between two sets of data. Their basic structure consists of an input layer, a series of hidden layers that perform data computations, and an output layer. Each hidden layer multiplies the input by a matrix of learned weights and then passes it through a non-linear activation function, such as a rectified linear unit (ReLU) or tanh function. This allows the network to learn complex nonlinear function mappings. In this work, we train a feedforward network with a 1D convolutional layer followed by two fully connected layers to map the output of the TI-ADC to an ideal quantized version of the signal. In this way, we can calibrate multiple error sources present in the ADC without having to first characterize them explicitly.
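A forward pass through a network of the shape described above (one 1D convolutional layer followed by two fully connected layers) might look like the sketch below. The layer sizes, the placement of the ReLU activations, and every parameter value are placeholders chosen to exercise the code path; they are not the trained calibration network.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def conv1d(signal, kernel, bias=0.0):
    """'Valid' 1-D convolution (cross-correlation, as in most ML libraries)."""
    n = len(signal) - len(kernel) + 1
    return [sum(k * signal[i + j] for j, k in enumerate(kernel)) + bias for i in range(n)]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def calibrate(window, params):
    """Conv layer -> ReLU -> dense -> ReLU -> dense (single corrected sample)."""
    hidden = relu(conv1d(window, params["kernel"], params["conv_bias"]))
    hidden = relu(dense(hidden, params["w1"], params["b1"]))
    return dense(hidden, params["w2"], params["b2"])[0]


# Hypothetical, untrained parameters chosen only to exercise the code path.
params = {
    "kernel": [0.25, 0.5, 0.25],
    "conv_bias": 0.0,
    "w1": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "b1": [0.0, 0.0],
    "w2": [[0.5, 0.5]],
    "b2": [0.0],
}
window = [1.0, 2.0, 3.0, 4.0, 5.0]   # five consecutive raw ADC samples (made up)
print(calibrate(window, params))
```

In a real calibration flow the parameters would be learned by minimizing the error between the network output and the ideal quantized signal; that training loop is omitted here.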



Figure 1. 8x interleaved CIC SAR ADC.



Figure 2. 6GS/s ADC with neural network calibration.

**Keywords:** ADC, bitstream, neural network, calibration

#### INDUSTRY INTERACTIONS

Intel, NXP, MediaTek

#### MAJOR PAPERS/PATENTS

[1] T. Kang, S. Lee, S. Song, M. R. Haghighat, M. P. Flynn, "A Multimode 157µW 4-Channel 80dBA-SNDR Speech-Recognition Frontend with Self-DOA Correction Adaptive Beamformer," ISSCC, Feb 2022.

[2] E. Ware, J. Correll, S. Lee, and M. P. Flynn, "6GS/s 8-channel CIC SAR TI-ADC with Neural Network Calibration," IEEE European Solid-State Circuits Conference (ESSCIRC), September 2022.

## TASK 2810.062, MULTI-CARRIER DAC-BASED TRANSMITTER ARCHITECTURES FOR 100+GB/S SERIAL LINKS

SAMUEL PALERMO, TEXAS A&M UNIVERSITY, SPALERMO@ECE.TAMU.EDU SEBASTIAN HOYOS, TEXAS A&M UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

Clock jitter places fundamental performance limitations on common wireline transmitters, necessitating clock generation and distribution circuitry that achieve rms jitter of a few hundred femtoseconds. The DAC-based transmitter design techniques in this project aim to significantly improve jitter robustness and reduce system equalization complexity.

#### **TECHNICAL APPROACH**

A new multi-carrier DAC-based transmitter architecture is in development that is capable of providing jitter robustness for baseband and coherent multi-tone modulation applications. The transmitter utilizes novel techniques to improve the wireline polar transmitter speed and efficiency, including a high-speed injection-locked oscillator-based digital phase modulator and DAC-based FIR filtering in the segmented output driver. Efficient digital FIR filtering and linearization techniques, including a look-up table equalizer and an output stage pre-distortion DAC, are also in development.

#### SUMMARY OF RESULTS

Figure 1 shows the proposed first-generation 50Gb/s multi-carrier TX that leverages carrier orthogonality to allow band overlap, with three 5GS/s bands of BB PAM4 and MB and HB 16-state complex modulation on respective 5 and 10GHz carriers [1]. The 312.5MHz DSP generates per-band 16 parallel 7b amplitude plus 2b predistortion codes and 16 parallel MB and HB 7b phase codes that then pass through 16:2 MUXes before the final 2:1 serialization in the parallel DAC-based output stages. This results in three independent 5GS/s signals that are then current-mode combined with the driver outputs to form the multicarrier signal. The BB segment utilizes a conventional CML-based DAC driver, while the MB and HB segments use polar CML DAC drivers to combine the symbol amplitude and phase values. The TX output network employs a $\pi$-coil network for bandwidth extension and bleeder circuits to maintain proper output common-mode level.

Figure 1. Multi-carrier DAC-based transmitter.

Figure 2. Multi-band 50Gb/s measurement results.

The proposed multicarrier TX was fabricated in a 22-nm FinFET process and occupies a 0.18mm<sup>2</sup> area. 50Gb/s operation is achieved by simultaneously activating the three bands. Fig. 2 shows the histogram and constellation obtained by continuously sampling the TX output with an oscilloscope. The sampled data is post-processed with a multicarrier RX system model that consists of mixers, integrators, quantizers, and inter-channel-interference (ICI) cancellers. All the bands achieve BER<10<sup>-4</sup> over a channel having 5dB loss at 12.5GHz with the TX FIR equalization and 4-tap ICI cancellation. Operating the TX at 40Gb/s avoids the oscilloscope bandwidth issue, with BER<10<sup>-4</sup> achieved with a channel having 17dB loss at 10GHz. The TX consumes 84mW from the 0.85V nominal and 1.5V bleeder supply voltages. A second-generation multicarrier TX is currently being designed in a 16-nm FinFET process that will increase the data rate to 128Gb/s.
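The rates quoted for the first-generation transmitter are mutually consistent, as the short check below illustrates. It assumes that every symbol carries log2(number of states) bits with no coding overhead; the resulting energy per bit is derived here from the reported power and data rate and is not a figure stated in the report.

```python
import math

DSP_CLOCK_HZ = 312.5e6
PARALLEL_LANES = 16
symbol_rate = DSP_CLOCK_HZ * PARALLEL_LANES          # samples per second per band

# Constellation sizes: PAM4 baseband, 16-state complex modulation on MB and HB.
states_per_symbol = {"BB": 4, "MB": 16, "HB": 16}
bits_per_symbol = {band: int(math.log2(m)) for band, m in states_per_symbol.items()}

band_rate = {band: symbol_rate * bits for band, bits in bits_per_symbol.items()}
total_rate = sum(band_rate.values())

POWER_W = 84e-3
energy_per_bit_pj = POWER_W / total_rate * 1e12

print(f"symbol rate per band: {symbol_rate / 1e9:.1f} GS/s")
print(f"aggregate data rate:  {total_rate / 1e9:.1f} Gb/s")
print(f"energy per bit:       {energy_per_bit_pj:.2f} pJ/b")
```

Sixteen parallel lanes at 312.5 MHz give the 5 GS/s per-band rate, and 2 + 4 + 4 bits per symbol across the three bands give the 50 Gb/s aggregate.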

**Keywords:** Digital-to-analog converter, frequency-interleaving, jitter, transmitter, serial link

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] I.-M. Yi et al., VLSI Sym., June 2023, Kyoto, Japan.

## TASK 2810.063, ANALOG AND DIGITAL ASSIST TECHNIQUES TO IMPROVE MIXED-SIGNAL PERFORMANCE

DENNIS SYLVESTER, UNIVERSITY OF MICHIGAN, DMCS@UMICH.EDU DAVID BLAAUW, UNIVERSITY OF MICHIGAN

#### SIGNIFICANCE AND OBJECTIVES

Subthreshold voltage references operate at nW levels while maintaining low TC and PSRR, but suffer from worse process variation and larger current/power variation across temperatures. We propose a duty-cycled subthreshold voltage reference with programmable output voltage for ultra-low power IoT applications.

#### **TECHNICAL APPROACH**

The reference voltage 'seed' is initially generated by a 2-transistor (2T) voltage reference, but we power-gate the 2T and make its duty cycle inversely proportional to temperature. As a result, power variation is significantly reduced for the proposed subthreshold reference. Moreover, we propose a programmable voltage converter that takes the voltage seed to generate a higher reference voltage while maintaining a low TC. This significantly improves the output voltage range of the 2T reference and can be used to compensate for process variation.

#### SUMMARY OF RESULTS

The proposed voltage reference is fabricated in 0.18-µm CMOS and occupies 0.026mm<sup>2</sup> (including on-chip flying capacitors and a 1pF output capacitor). Across 15 dies, the average value of V<sub>ref</sub> for the proposed circuit is 1.086V, ~2.7× that of the conventional 2T reference, while the standard deviation increases roughly proportionally, by 2.9×, indicating that the voltage programmer does not introduce added variation across chip samples. The circuit has a 461ppm/°C TC (using just one clock frequency for all temperatures), while the conventional 2T reference has a 106ppm/°C TC at the same $V_{dd}$. Despite the higher TC, the proposed circuit limits the range of current consumption across -20°C to 100°C to 1.5-5nA, which is a ~400× reduction in current spread compared to the conventional 2T reference. At 25°C, the proposed voltage reference consumes 4.6nW including timing sequence generation. Better temperature performance can be obtained using multiple clock frequencies across operating temperatures; we did not implement this in a closed-loop fashion on this test chip, but it is a possible extension. Measurements in the comparison table show the gain possible using this technique (in an open-loop manner).
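For readers unfamiliar with how a temperature coefficient in ppm/°C is obtained from measurements, the sketch below uses the common "box method". The report does not state which TC definition was used, so the formula is an assumption, and the voltage readings are invented purely for illustration.

```python
def box_method_tc(temps_c, vrefs):
    """Box-method temperature coefficient in ppm/degC."""
    v_span = max(vrefs) - min(vrefs)
    v_mean = sum(vrefs) / len(vrefs)
    t_span = max(temps_c) - min(temps_c)
    return v_span / (v_mean * t_span) * 1e6


# Made-up readings over the -20 degC to 100 degC range, for illustration only.
temps_c = [-20, 10, 40, 70, 100]
vrefs = [1.080, 1.084, 1.086, 1.090, 1.100]
tc = box_method_tc(temps_c, vrefs)
print(f"TC = {tc:.0f} ppm/degC")
```

The same function applies whether the readings come from a single clock frequency or from a scheme that switches frequency with temperature; only the measured voltages change.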



Figure 1. (a) 2T voltage reference; (b) Current variation of 2T reference across T; (c) proposed subthreshold voltage.

Table 1. Comparison table. <sup>b</sup> 4 clock frequencies are used.

|                        | This<br>work              | ESSCIRC<br>'18 | TCAS-<br>II '18 | ISCAS<br>'21        |
|------------------------|---------------------------|----------------|-----------------|---------------------|
| Process (nm)           | 180                       | 180            | 130             | 180                 |
| Area (mm²)             | 0.026                     | 0.0012         | 0.003           | 0.002               |
| Chip Samples           | 15                        | 27             | 45              | -                   |
| V <sub>dd</sub> (V)    | 2 - 4                     | 0.5 - 2.5      | 1.1 - 2.4       | 1.0                 |
| Power (nW)             | 4.6                       | 0.65           | 27.5            | 1.35                |
| σ/μ (%)                | 0.41                      | 0.30           | 5               | 2                   |
| Temp.<br>Range (°C)    | -20 -<br>100 <sup>b</sup> | -40 - 125      | -40 - 85        | -10 - 110           |
| TC (ppm/°C)            | 176 <sup>b</sup>          | 152.8          | 100             | 400                 |
| LS (%/V)               | 2.2                       | 0.031          | 2               | 0.7 - 2             |
| PSRR (dB)<br>@ 100Hz   | -38                       | -61.5          | N/R             | N/R                 |
| Programmable<br>Output | Yes<br>64 ×<br>0.11%      | No             | No              | Yes<br>16 ×<br>5.2% |

**Keywords:** Subthreshold voltage reference, Ultra-low power circuit, Temperature coefficient, PSRR, Voltage programmability

#### INDUSTRY INTERACTIONS

NXP

#### MAJOR PAPERS/PATENTS

[1] Y. Peng et al., "A 4.6nW subthreshold voltage reference with 400X current variation reduction and 64-step 0.11% output voltage programmability," European Solid-State Circuits Conference (ESSCIRC), 2023, Lisbon, Portugal.

## TASK 2810.071, ACCURATE COMPACT TEMPERATURE SENSORS FOR THERMAL MANAGEMENT OF HIGH PERFORMANCE COMPUTING PLATFORMS

RANDALL GEIGER, IOWA STATE UNIVERSITY, RLGEIGER@IASTATE.EDU DEGANG CHEN, IOWA STATE UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

The objective is to develop a strategy for designing very compact, densely distributed temperature sensors for real-time power-thermal management with the accuracy needed to reliably manage failure mechanisms inherent in silicon. The significance lies in providing the sensor output as a key input to a robust power/thermal management controller.

#### **TECHNICAL APPROACH**

Very compact temperature sensors that can be widely dispersed at critical locations throughout an integrated circuit will be designed. Tentatively these sensors will be a single small MOS transistor or pairs of MOS transistors where temperature is embedded in the I-V characteristic of these devices. Located at a less-critical location where area requirements are relaxed will be a Temperature Management Controller (TMC) that extracts temperature data from an array of temperature sensors. The interrelationship between the temperature of the TMC and the temperature at remote temperature sensor locations will be managed with an appropriate calibration algorithm.

#### SUMMARY OF RESULTS

The target performance of the compact temperature sensors is absolute accuracy of 100mK over the critical temperature window from 75°C to 95°C with accuracy reduced to 3°C at temperatures below 50°C. This should provide the temperature accuracy needed for managing the variation in the thermal-restricted mean time to failure (MTTF) of an integrated circuit to approximately 10% of a target MTTF value.

Two compact temperature sensors suitable for multisite temperature sensing were reported previously. A patent is pending on these sensors. Experimental results for the one shown in Fig. 1 were obtained. After a two-point calibration, the temperature error for 7 samples designed in a 180-nm process is shown in Fig. 2. The measured temperature error over the critical 75°C to 95°C range was less than ±90mK. Over the -20°C to 100°C range the measured error was less than 4.5°C.
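A two-point calibration of the kind mentioned above can be sketched as a linear map fixed by two reference measurements. The report does not give the calibration temperatures or the form of the sensor output, so the linear model, the raw readings, and the names below are all assumptions for illustration.

```python
def two_point_calibration(raw_lo, temp_lo, raw_hi, temp_hi):
    """Return a function mapping raw sensor output to temperature (linear model)."""
    gain = (temp_hi - temp_lo) / (raw_hi - raw_lo)
    offset = temp_lo - gain * raw_lo
    return lambda raw: gain * raw + offset


# Hypothetical raw readings (e.g. millivolts) at two reference temperatures.
to_celsius = two_point_calibration(raw_lo=512.0, temp_lo=75.0, raw_hi=480.0, temp_hi=95.0)
print(to_celsius(496.0))
```

Any residual error after such a calibration comes from the sensor's deviation from a straight line between the two points, which is why accuracy is typically best near the calibrated window.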

Simulation results for the second temperature sensor designed in a 180-nm process predict an accuracy of 60 mK over the same critical 75°C to 95°C window. The circuit has been fabricated and is being characterized.



Figure 1. Compact two-transistor temperature sensor array.



Figure 2. Measured results for 7 Samples from 180-nm CMOS.

A new compact, low-cost, all-MOS, low-power temperature sensor with digital output has been designed. An embedded successive approximation 7-bit calibrator provides digital output without the need for an explicit ADC. In this design, two critical temperature-sensing transistors are biased to operate in weak inversion. Simulations predict a temperature error bounded by $\pm 1.8^{\circ}$C over the -40°C to 125°C temperature range with a power dissipation of 15µW, a conversion time of less than 20µs, and an area of under 0.09mm<sup>2</sup> in a standard 180-nm CMOS process.

**Keywords:** temperature sensor, thermal mask, power/thermal management, reliability

#### INDUSTRY INTERACTIONS

**Texas Instruments**

#### MAJOR PAPERS/PATENTS

Patent Pending: Sept. 2022 "Compact Temperature Sensors for Power/Thermal Management."

## TASK 2810.076, HIGH PRECISION POSITIONING TECHNIQUES BASED ON MULTIPLE TECHNOLOGIES AND FREQUENCY BANDS

NAOFAL AL-DHAHIR, UNIVERSITY OF TEXAS AT DALLAS, ALDHAHIR@UTDALLAS.EDU MURAT TORLAK, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

WiFi ranging provides ubiquitous high-accuracy localization for IoT applications. It utilizes timestamps and channel frequency response measurements to achieve sub-sample ranging resolution. However, WiFi ranging accuracy is primarily limited by the operating bandwidth: WiFi IoT devices that can only operate on 20 MHz channels are limited to meter-level ranging accuracy.

#### **TECHNICAL APPROACH**

To overcome single-channel bandwidth limitations of WiFi measurements, devices can frequency hop across multiple channels to obtain measurements over larger bandwidths. However, these measurements cannot be directly stitched together due to changes in local oscillator phase offsets and time offsets per channel. To overcome these challenges, we propose a two-way CFR approach in which timestamp information is added to the CFR phase. We build a software-defined radio WiFi testbed and take real-world measurements in indoor environments to evaluate our methods. The MUSIC algorithm is employed for ranging with a complexity reduction strategy to accommodate large bandwidths with many subcarriers.
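To illustrate how a propagation delay is encoded in the CFR phase, the sketch below synthesizes a noise-free, single-path CFR and recovers the distance from the average phase step between adjacent subcarriers. This simple phase-slope estimator is a stand-in chosen for brevity; it is not MUSIC and not the two-way stitching method described above. The 78.125 kHz subcarrier spacing is taken from the Figure 2 caption, and everything else (single path, one-way distance, function names) is assumed.

```python
import cmath
import math

C = 299_792_458.0              # speed of light, m/s
SUBCARRIER_SPACING = 78.125e3  # Hz


def synth_cfr(distance_m, n_subcarriers):
    """Single-path, noise-free CFR: H[k] = exp(-j*2*pi*k*df*tau)."""
    tau = distance_m / C
    return [
        cmath.exp(-2j * math.pi * k * SUBCARRIER_SPACING * tau)
        for k in range(n_subcarriers)
    ]


def estimate_distance(cfr):
    """Estimate delay from the average phase step between adjacent subcarriers."""
    acc = sum(cfr[k + 1] * cfr[k].conjugate() for k in range(len(cfr) - 1))
    tau = -cmath.phase(acc) / (2.0 * math.pi * SUBCARRIER_SPACING)
    return tau * C


cfr = synth_cfr(7.5, 256)
print(f"estimated distance: {estimate_distance(cfr):.3f} m")
```

With multipath and noise, a single phase slope is no longer sufficient, which is one reason super-resolution methods such as MUSIC and wider stitched bandwidths are used in practice.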

#### SUMMARY OF RESULTS

Utilizing our proposed multi-channel CFR (Channel Frequency Response) stitching techniques, we demonstrate 90<sup>th</sup> percentile and root mean square errors (RMSE) of 9 cm and 9.2 cm, respectively, for stitched bandwidths of at least 320 MHz in Line of Sight (LOS) conditions. These results with comparisons for different total bandwidths are included in Fig. 1. Increasing the total stitched bandwidth to 745 MHz by including all 20 MHz WiFi channels available in the 5 GHz WiFi band, we further reduce the RMSE by 2 cm.

In terms of complexity, we demonstrate more than 3.7 orders of magnitude complexity reduction compared to traditional MUSIC as shown in Fig. 2 which plots the number of complex multiplications vs. the downsampling rate used for our complexity reduction strategy. The maximum downsampling rate possible without loss of accuracy scales with the bandwidth, and a downsampling rate of 28 is selected for the 320-MHz total bandwidth.



Figure 1. Absolute CDF and RMSE results for all measurements taken in LOS conditions using a  $4 \times 2$  antenna configuration and different single-channel and stitched bandwidths.



Figure 2. The complexity of our MUSIC-based distance estimation algorithm for different downsampling rates and total bandwidths using a frequency subcarrier spacing of 78.125 kHz.

**Keywords:** Centimeter-level, Internet of Things (IoT), localization, ranging, WiFi

#### INDUSTRY INTERACTIONS

NXP Semiconductors, MediaTek, Texas Instruments

## TASK 2810.081, DEVELOPMENT OF 70-95 GHZ TERABIT BEAMFORMER

HUEI WANG, NATIONAL TAIWAN UNIVERSITY, HUEIWANG@NTU.EDU.TW TIAN-WEI HUANG AND KUN-YOU LIN, NATIONAL TAIWAN UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

We propose to develop a 70-95 GHz beamformer for extremely high-speed communication applications, possibly for B5G/6G systems. The proposed beamformer can provide a 25-GHz bandwidth (30% fractional bandwidth), which covers three point-to-point bands at E-band/W-band.

#### **TECHNICAL APPROACH**

The key components of the planned 70-95 GHz beamformer have been taped out in a 65-nm CMOS process, including the bi-directional phase shifter, bi-directional PA-LNA, and up- and down-conversion mixers. The chips have been fabricated, and measurements are being carried out.

#### SUMMARY OF RESULTS

#### Switchless Bi-directional PA/LNA

Two versions of the PA-LNA are fabricated for packaging and on-wafer test, as shown in Fig.1. The measured peak gain in LNA mode is 14.6 dB at 83 GHz and the 3-dB bandwidth covers 75.8-95.0 GHz. The simulated minimum noise figure is 7.8 dB at 82 GHz. The measured peak gain in PA mode is 14.5 dB at 83 GHz and the 3-dB bandwidth is 75.1-95.0 GHz.



Figure 1. Layout of PA-LNA (a) for packaging (b) for on-wafer test.

#### **Bi-directional Phase Shifter**

The layouts of the two versions of the 5-bit bi-directional phase shifter are shown in Fig. 2. The proposed phase shifter provides 32 different phase states with maximum RMS phase errors of 13.5° and 10.46° for the two versions, respectively. The measured insertion loss varies from 12.7 to 27.3 dB and from 13.8 to 28.1 dB, respectively. The maximum amplitude error at 70 GHz in backward mode is 2.9 dB and 2.93 dB, respectively. The gain control range in the Tx mode is 5.54, 4.81, and 3.26 dB at 70, 82.5, and 95 GHz, respectively.
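The relationship between the bit count, the number of phase states, and an RMS phase error can be sketched as follows. The RMS definition used here (root mean square of the wrapped per-state deviation from the ideal phase) is one common convention and is assumed, since the report does not define it; the "measured" phases are invented.

```python
import math

N_BITS = 5
N_STATES = 2 ** N_BITS
STEP_DEG = 360.0 / N_STATES

ideal_phases = [state * STEP_DEG for state in range(N_STATES)]


def rms_phase_error(measured_deg, ideal_deg):
    """RMS of the wrapped phase errors, in degrees."""
    errors = [((m - i + 180.0) % 360.0) - 180.0 for m, i in zip(measured_deg, ideal_deg)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Made-up "measured" phases: every state off by alternating +/-3 degrees.
measured = [p + (3.0 if k % 2 == 0 else -3.0) for k, p in enumerate(ideal_phases)]
rms = rms_phase_error(measured, ideal_phases)
print(N_STATES, STEP_DEG, rms)
```

A 5-bit shifter therefore has an ideal step of 11.25°, which gives a sense of scale for the reported RMS errors.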





Figure 2. Layouts of taped-out bi-directional phase shifter (a) for packaging (b) for on-wafer test.

#### Sub-harmonic Up/Down Converter

The simulated conversion gain of the up-conversion mixer with a 5-stage PA is higher than 23 dB with 1.3dB variation and IRR is higher than 33.8 dB. The layout of the up-mixer+PA is shown in Fig. 3.



Figure 3. Layout of taped-out mixer+PA.

The conversion gain of the down-conversion mixer is higher than -0.5 dB in the frequency range of 80-95.1 GHz, and the IF bandwidth is 6.3 GHz. The IRR is better than 34 dB. Fig. 4 shows the circuit layout.



Figure 4. Layout of the taped-out down-conversion mixer.

#### Integrated TRx Design

We will also start to integrate a single-chip 1-way TRx after characterizing and trouble-shooting each individual component.

**Keywords:** Beamformer, Phased Array, Bi-directional, W-band, CMOS

#### INDUSTRY INTERACTIONS

MediaTek

## TASK 2810.082, ADAPTIVE DIGITAL CANCELLATION OF DYNAMIC ERROR FROM CLOCK SKEW, COMPONENT MISMATCHES, AND ISI IN HIGH-RESOLUTION RF DACS

IAN GALTON, UNIVERSITY OF CALIFORNIA AT SAN DIEGO, GALTON@UCSD.EDU

#### SIGNIFICANCE AND OBJECTIVES

This project is developing digital calibration techniques that adaptively measure and cancel both static and dynamic error from clock skew, component mismatches, and inter-symbol interference (ISI) in current-steering RF DACs. It will provide experimental validation via two 22nm CMOS DAC ICs that the techniques enable performance beyond the current state-of-the-art.

#### TECHNICAL APPROACH

This project consists of three parts. Part 1 is developing a high-speed current-steering 3-GHz DAC IC with a target worst-case Nyquist-band SNDR of 72 dB enabled by a recently developed subsampling mismatch-noise cancellation (MNC) technique. RZ signaling is being used to prevent ISI from limiting the DAC performance. Part 2 has developed a subsampling ISI cancellation (ISIC) technique, and Part 3 will develop a second-generation version of the Part 1 DAC IC which includes both the subsampling MNC and ISIC techniques. The ISIC technique eliminates the need for RZ signaling, which will enable a doubling of the sample rate to 6 GHz without degrading the DAC's performance.
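To put the 72 dB SNDR target in context, the short sketch below converts SNDR to effective number of bits using the standard sine-wave relation ENOB = (SNDR − 1.76 dB) / 6.02 dB. The function name is illustrative, and the relation is the textbook one, not a figure taken from the project.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sine wave at the given SNDR."""
    return (sndr_db - 1.76) / 6.02

target_sndr_db = 72.0  # worst-case Nyquist-band target quoted for the Part 1 DAC
print("ENOB at 72 dB SNDR: about", round(enob_from_sndr(target_sndr_db), 2), "bits")
```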

#### SUMMARY OF RESULTS

We are still working on the Part 1 research and we have completed the Part 2 research.

For the Part 1 research we are in the late stages of developing the first-generation 3 GS/s RF DAC IC in GlobalFoundries 22-nm FDSOI CMOS (22FDX) with a target Nyquist-band SNDR of 72 dB enabled by MNC and RZ signaling. The target specifications are beyond the current state-of-the-art, so the design is challenging and has taken longer than initially anticipated. So far, we have finished the system-level design and the bit-level register-transfer-level digital design, and we have completed the design and layout of the most critical analog blocks. We are currently working on the synthesis and automatic place-and-route realization of the on-chip digital calibration engine, the final design and layout of the remaining analog blocks, and the high-level layout of the IC, and we are continuously running post-extraction simulations to detect potential issues from layout parasitics.

We completed the Part 2 research ahead of schedule, and the results achieved are better than initially anticipated. Specifically, we have developed an enhanced version of the originally-anticipated ISIC technique that adaptively measures and accurately cancels error from ISI over a high-resolution DAC's first Nyquist band, thereby circumventing the need for return-to-zero (RZ) pulse shaping. It is an extension of the MNC technique. While the MNC technique suppresses both static and dynamic error over the DAC's first Nyquist band from component mismatches and clock skew, it does not mitigate ISI. The ISIC technique can be implemented by itself or together with the MNC technique, and, like the MNC technique, it can be operated in both foreground and background calibration modes. When implemented together, the ISIC and MNC techniques can operate simultaneously and share the same analog circuitry without interfering with each other.

The Part 2 research results are better than initially anticipated because the proposed ISIC technique has a different form than initially envisioned and the new form offers two unexpected benefits. One benefit is that the ISIC technique can be run simultaneously and share circuitry with the MNC technique as mentioned above. The other benefit is that the ISIC technique's convergence rate is significantly higher than we initially thought possible; it is on par with that of the MNC technique. These benefits are convenient given that the two techniques will be implemented together in the Part 3 IC. We have written a paper, currently under review by the IEEE Transactions on Circuits and Systems I: Regular Papers, that presents a rigorous theoretical analysis of the ISIC technique and demonstrates the technique's performance in conjunction with the MNC technique via simulation results [1].

**Keywords:** DAC, current-steering, ISI, mismatch-cancellation, ISI-cancellation

#### INDUSTRY INTERACTIONS

MediaTek, NXP

## TASK 2810.083, AUTOMATED LAYOUT OF ANALOG ARRAYS IN ADVANCED TECHNOLOGY NODES

SACHIN S. SAPATNEKAR, UNIVERSITY OF MINNESOTA, SACHIN@UMN.EDU RAMESH HARJANI, UNIVERSITY OF MINNESOTA

#### SIGNIFICANCE AND OBJECTIVES

This project automatically synthesizes high-quality layouts for analog circuit arrays. This involves (1) optimal design of array components considering parasitics, layout-dependent effects, process variations, temperature, electromigration, and voltage drops, and (2) optimal building block selection in circuits for performance/variability, exploring tradeoffs at the building block level.

#### **TECHNICAL APPROACH**

The design of array structures requires consideration of potentially conflicting factors: matching and resilience to systematic/random process variations, compact layout, low-parasitic routing, and thermal, electromigration, and IR drop constraints on wires that can degrade performance. These array structures are embedded into larger circuits, and the performance constraints on the array depend on the type of circuit and its usage in the larger system. The project develops modeling techniques to capture these constraints and uses them to build optimal arrays that balance these performance requirements with matching constraints, using approaches such as common-centroid (CC) or interdigitated layout.

#### SUMMARY OF RESULTS

Common centroid (CC) layout is an integral method for ensuring matching in capacitive arrays in analog circuits. We develop fast constructive procedures for CC placement and routing for binary-weighted capacitors in charge-sharing digital-to-analog converters (DACs). Traditional methods focus on the impact of mismatch on metrics such as the integral nonlinearity (INL) and differential nonlinearity (DNL) of the DAC. These methods distribute capacitors in the array to achieve good levels of dispersion, increasing the correlation between capacitance variations and therefore reducing mismatch. Our primary results include:

(1) Developing methods that balance variability with 3dB frequency for capacitive arrays in both binary-weighted and split DACs. Better dispersion requires routing with numerous vias; we propose spiral layout and block chessboard layout methods [1] to find a good balance between all design objectives.

(2) Developing systematic approaches to find the minimum unit capacitor value in a capacitor array, accounting for process variations, thermal noise, and flicker noise: This method has been applied to capacitor arrays in both binary-weighted and split DACs.

(3) Incorporating the impact of nonlinear gradients in transistor arrays: we build common-centroid layouts for transistor arrays that cancel out linear gradients, but further optimize them to address second-order gradients. These nonlinearities arise, for example, because a linear gradient in the transistor threshold voltage translates to a quadratic variation in drain current in circuits where long-channel devices are used to mitigate random variations, or due to nonlinear process gradients.
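The gradient-cancellation idea in item (3) can be illustrated with a one-dimensional toy model. The sketch below places unit devices of two matched groups, A and B, along a line and sums a position-dependent unit value for each group. The patterns, gradient coefficients, and function names are illustrative assumptions, not the placements or models developed in this task.

```python
def group_sums(pattern, value_at):
    """Sum the unit-device value for each label in a 1-D placement string."""
    n = len(pattern)
    sums = {}
    for idx, label in enumerate(pattern):
        # Position measured from the array center, so the centroid is x = 0
        x = idx - (n - 1) / 2.0
        sums[label] = sums.get(label, 0.0) + value_at(x)
    return sums

def mismatch(pattern, value_at):
    """Difference between the total values of groups A and B."""
    s = group_sums(pattern, value_at)
    return s["A"] - s["B"]

# Hypothetical gradients across the array (arbitrary units)
def linear(x):
    return 1.0 + 0.01 * x

def quadratic(x):
    return 1.0 + 0.01 * x + 0.002 * x * x

for pattern in ("AABB", "ABBA", "ABBABAAB"):
    print(pattern,
          "linear:", round(mismatch(pattern, linear), 6),
          "quadratic:", round(mismatch(pattern, quadratic), 6))
```

In this toy model the non-common-centroid pattern `AABB` is sensitive to the linear gradient, the common-centroid pattern `ABBA` cancels the linear term but not the quadratic one, and the longer pattern `ABBABAAB` cancels both, which is the kind of second-order behavior a placement optimizer would need to account for.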



Figure 1. Error and noise of split DAC for (a) spiral and (b) block chessboard placements.

For the optimized minimum unit capacitance, Fig. 1 shows the noise due to thermal noise, flicker noise, |DNL|, and |INL| for the spiral and block chessboard methods. For DACs with more bits, for both placements, the  $3\sigma$  of thermal and flicker noise play a larger role. We find that these variations are larger for split DACs than binary-weighted DACs since they use fewer unit capacitors and are hence more sensitive to routing parasitics.
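The trend toward larger noise contributions at higher resolution can be seen in a simplified thermal-noise-only estimate. The sketch below assumes a binary-weighted array whose total capacitance is 2^N unit capacitors, kT/C sampling noise on that total, and a requirement that 3σ of the noise stay within half an LSB. These assumptions, the 1 V reference, and the function name are illustrative; the method in this task also accounts for process variations and flicker noise, which are not modeled here.

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def min_unit_cap_thermal(bits, vref, temp_k=300.0, n_sigma=3.0):
    """Smallest unit capacitor (F) so that n_sigma * sqrt(kT/C_total) <= LSB/2.

    Assumes C_total = 2**bits * C_unit (textbook simplification).
    """
    lsb = vref / 2 ** bits
    sigma_max = (lsb / 2.0) / n_sigma
    c_total = K_BOLTZMANN * temp_k / sigma_max ** 2
    return c_total / 2 ** bits

for bits in (6, 8, 10, 12):
    cu = min_unit_cap_thermal(bits, vref=1.0)  # hypothetical 1 V reference
    print(bits, "bits -> minimum unit capacitance ~", cu * 1e15, "fF")
```

Under these assumptions the required unit capacitance doubles with every added bit, so thermal noise becomes a progressively tighter constraint as resolution grows.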

**Keywords:** Common-centroid, capacitor arrays, digital-to-analog converters, matching, 3dB frequency

#### INDUSTRY INTERACTIONS

Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] N. Karmokar, A. K. Sharma, J. Poojary, M. Madhusudan, R. Harjani, and S. S. Sapatnekar, "Constructive Placement and Routing for Common-Centroid Capacitor Arrays in...," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023 (Early Access).

[2] N. Karmokar, R. Harjani, and S. S. Sapatnekar, "Minimum Unit Capacitance Calculation for Binary-Weighted Capacitor Arrays," Proceedings of Design, Automation and Test in Europe, 2023.

Greatly reduce time and number of circuit simulations while providing better information regarding the impacts of excursions of individual parameters on various aspects of analog, mixed-signal, and radio-frequency analysis and design.

#### TECHNICAL APPROACH

Traditional transient adjoint sensitivity entailed running an adjoint circuit backward in space and time and then convolving all of its responses with those of the original, entailing inordinate time and storage requirements. With the proposed approach, an overhead of approximately one additional simulation yields the sensitivities of all circuit responses to all circuit parameters. Three such simulations provide reasonably accurate exhaustive fault injections for test and yield analysis as well.
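The "one extra simulation gives all sensitivities" property is easiest to see in the DC, linear case, which is much simpler than the time-domain method developed in this task. The sketch below solves a small resistive network by nodal analysis, then solves the transposed (adjoint) system once and reads off the sensitivity of one node voltage to every branch conductance. The network values and all function names are hypothetical.

```python
GROUND = -1  # marker for the reference node

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def nodal_matrix(branches, n_nodes):
    """Conductance matrix from (node_a, node_b, conductance) branches."""
    g_mat = [[0.0] * n_nodes for _ in range(n_nodes)]
    for a, b, g in branches:
        for p, q in ((a, b), (b, a)):
            if p != GROUND:
                g_mat[p][p] += g
                if q != GROUND:
                    g_mat[p][q] -= g
    return g_mat

def output_and_sensitivities(branches, currents, out_node):
    """Return v[out_node] and d v[out_node] / d g_k for every branch k."""
    n = len(currents)
    g_mat = nodal_matrix(branches, n)
    v = solve(g_mat, currents)                       # original circuit
    e_out = [1.0 if k == out_node else 0.0 for k in range(n)]
    g_t = [list(col) for col in zip(*g_mat)]         # transposed system
    lam = solve(g_t, e_out)                          # single adjoint solve

    def volt(vec, node):
        return 0.0 if node == GROUND else vec[node]

    sens = [-(volt(lam, a) - volt(lam, b)) * (volt(v, a) - volt(v, b))
            for a, b, _ in branches]
    return v[out_node], sens

# Hypothetical network: 1 mA into node 0, observe node 1
branches = [(0, 1, 1e-3), (0, GROUND, 2e-3), (1, GROUND, 5e-4)]
currents = [1e-3, 0.0]
v_out, sens = output_and_sensitivities(branches, currents, out_node=1)
print("v_out =", v_out)
print("d v_out / d g_k =", sens)
```

A brute-force alternative would re-solve the circuit once per perturbed parameter; here the cost is two solves regardless of the number of branches.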

#### SUMMARY OF RESULTS

Overall accomplishments to May 2023 are detailed in the dissertations of Jiahua Li [3] and Nisharg Shah [4].

The first of these includes two applications that are not deliverables for this research project: time domain transient noise analysis, which better characterizes low-frequency noise in nonlinear circuits; and forward Euler pseudo-transient analysis to attain dc convergence efficiently and reliably.

The second includes many applications that are deliverables for this project: parameter-centric and device-centric fault and yield analyses. Accuracy has been enhanced by adding two simulations to the originally anticipated one. Time domain transient sensitivity comes with an overhead of roughly one simulation. So, the equivalent of six simulations provides the information derived from tens or even hundreds of simulations entailed for the traditional Monte Carlo analyses.

Much more work in all the above areas must be performed to move them from academic proofs of concept to industrial applications.

**Keywords:** Adjoint Sensitivity, Transient, Fault, Test, Yield

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Jiahua Li & Ron Rohrer, "Efficient Static-driven Integration for Step-function Transient Simulation," IEEE TCAD, vol. 41, no. 7, July 2022, pp 2213-2222.

[2] Jiahua Li, Danyal Ahsanullah, Zhengqi Gao & Ron Rohrer, "Circuit Theory of Time Domain Adjoint Sensitivity," accepted for publication in IEEE TCAD.

## TASK 3160.006, MACHINE-LEARNING BASED ANALOG MIXED-SIGNAL DESIGN TOOL

SHUO-WEI CHEN, UNIVERSITY OF SOUTHERN CALIFORNIA, SWCHEN@USC.EDU SANDEEP GUPTA AND TONY LEVI, UNIVERSITY OF SOUTHERN CALIFORNIA

#### SIGNIFICANCE AND OBJECTIVES

Analog mixed-signal (AMS) modules are typically human-designed. This custom-design process is expensive, leading to a long time-to-market, and suboptimal design. This task aims to develop low-cost AMS circuit modeling, sizing, and layout flow to achieve complete design automation from specification to GDS while achieving state-of-the-art performance.

#### TECHNICAL APPROACH

The whole AMS design automation flow can be divided into four main steps including dataset generation, circuit modeling, sizing, and layout. To reach the optimal design with the lowest possible design cost, we develop a specific algorithm for each step. For dataset generation, we apply CEPA and BOAS to find the optimal sample region. For circuit modeling, we are developing GNN-based modeling to utilize the circuit topology information and reduce the required training samples. With the circuit modeling, parallel Monte Carlo gradient descent is applied to find the optimal design and a layout automation tool is finally applied to generate the GDS.

#### SUMMARY OF RESULTS

For the first 6 months, we have spent most of our effort developing the GCN (Graph Convolution Network) based circuit modeling algorithm. Instead of treating the circuit under modeling as a black-box function, this approach uses the circuit topology information to significantly improve modeling accuracy and to extrapolate the model to unseen design space for performance improvement. The graph that represents the circuit can be generated directly from the netlist, with transistor pins and passive devices represented as nodes and wires as edges. Unlike a fully connected neural network (FCNN), which connects all circuit parameters, graph convolution updates each node's information based only on its neighboring nodes, mimicking signal propagation and loading effects in an AMS circuit.

The GCN structure used for circuit modeling is shown in Fig. 1. The input graph has multiple layers, with each layer representing one input feature element of each node. Graph convolution is performed on the graph, and multiple layers of graph convolution can be applied to model direct and high-order loading effects. The output feature vectors from the graph nodes are then combined and flattened into a one-dimensional vector. Finally, several fully connected layers process the features into the final output performance metrics.

Compared to existing FCNN and Circuit Connectivity Inspired Neural Network (CCINN) modeling approaches, the GCN can effectively extract circuit operation features, reduce unnecessary neuron connections, mitigate overfitting, and achieve significant modeling accuracy improvement (2.2x-4.5x) in various design examples. When extrapolated to unseen design space, the GCN model maintains similar accuracy when predicting the performance of more optimized designs beyond the training dataset, indicating that it can be applied for circuit performance enhancement without extra circuit simulation overhead.
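The neighbor-only update can be sketched in a few lines. The example below implements one simple graph-convolution variant (mean over a node and its neighbors, a shared linear map, then ReLU) on a hypothetical four-node chain and stacks two layers. The aggregation rule, graph, features, and weights are illustrative assumptions and are not the architecture shown in Fig. 1.

```python
def graph_conv(features, edges, weight):
    """One graph-convolution layer.

    Each node averages its own feature vector with those of its direct
    neighbors, applies a shared linear map, and passes the result through ReLU.
    """
    n = len(features)
    dim_in = len(features[0])
    dim_out = len(weight[0])
    neighbors = {i: {i} for i in range(n)}  # self-loop keeps the node's own data
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    out = []
    for i in range(n):
        group = sorted(neighbors[i])
        mean = [sum(features[j][d] for j in group) / len(group)
                for d in range(dim_in)]
        out.append([max(0.0, sum(mean[d] * weight[d][k] for d in range(dim_in)))
                    for k in range(dim_out)])
    return out

# Hypothetical graph: four nodes connected in a chain 0-1-2-3
edges = [(0, 1), (1, 2), (2, 3)]
features = [[1.0], [0.0], [0.0], [0.0]]  # only node 0 carries a nonzero feature
weight = [[1.0]]                         # hypothetical 1x1 weight matrix

h1 = graph_conv(features, edges, weight)  # reaches direct neighbors only
h2 = graph_conv(h1, edges, weight)        # second layer reaches two hops away

flattened = [x for row in h2 for x in row]  # input to fully connected layers
print("after 1 layer:", h1)
print("after 2 layers:", h2)
print("flattened:", flattened)
```

After one layer the feature from node 0 has reached only node 1; after two layers it has reached node 2 but not node 3, which illustrates why stacking layers is needed to capture higher-order effects.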



Figure 1. GNN architecture used for circuit modeling.



Figure 2. (a) Testing MSE, (b) average gain prediction error, (c) average UGB prediction error of the GCN, FCNN, and CCINN in the amplifier modeling test.

**Keywords:** Analog Mixed-Signal, Circuit modeling, GNN, Graph convolution, Performance extrapolation

#### INDUSTRY INTERACTIONS

IBM, NXP

#### MAJOR PAPERS/PATENTS

[1] Mohsen Hassanpourghadi, Rezwan A Rasul, and Mike Shuo-Wei Chen, "A Module-Linking Graph Assisted Hybrid Optimization Framework for Custom Analog and Mixed-Signal Circuit Parameter Synthesis," ACM Transactions on Design Automation of Electronic Systems, Sep. 2021.

[2] Shiyu Su, Qiaochu Zhang, Juzheng Liu, Mohsen Hassanpourghadi, Rezwan Rasul, and Mike Shuo-Wei Chen, "TAFA: Design Automation of Analog Mixed-Signal FIR Filters Using Time Approximation Architecture," in IEEE/ACM 27th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2022.

## TASK 3160.007, AI-ASSISTED AND LAYOUT-AWARE ANALOG SYNTHESIS AND OPTIMIZATION WITH DESIGN INTENT

DAVID PAN, UNIVERSITY OF TEXAS AT AUSTIN, DPAN@ECE.UTEXAS.EDU YAOYAO JIA, UNIVERSITY OF TEXAS AT AUSTIN

#### SIGNIFICANCE AND OBJECTIVES

Topology selection is the first and most crucial step in the analog circuit design process since the best circuit sizing and layout tool can only produce as good a result as the chosen topology allows. In this report, we summarize our work on circuit template library generation and circuit topology selection.

#### **TECHNICAL APPROACH**

To build circuit topology templates, we use the designer's knowledge coded in our flow that makes use of primitive circuit building blocks for constructing valid topologies. These topologies are then optimized using state-of-the-art analog sizing tools based on deep neural networks [1]. However, selecting the best topologies from a library of topologies is a multi-dimensional problem that requires a careful understanding of specification and topology performance. Recent advances in machine learning algorithms have enabled the embedding of very large information networks into low-dimensional vector spaces. We use the idea of graph embedding to select the best topologies from a set of topologies.

#### SUMMARY OF RESULTS

We have implemented the library generation part in Python, using the DNN-Opt framework [1] for optimal sizing of topologies. More than 300 OTA topologies have been built into the library. For each topology, we have about 1000 data points representing a wide range of performance choices. These data points have been filtered from an initial 1500 data points to ensure that transistors are in saturation, an important requirement for analog circuits.

Our topology selection approach, implemented in C++, consists of two phases: dominant topology selection, based on a binary search algorithm, and topology embedding, using a state-of-the-art graph embedding algorithm. All experiments are conducted in a Linux environment with a 3.3-GHz Intel Core CPU and 128 GB of memory. Table 1 presents a case demonstrating our dominant topology selection approach. The objective is to maximize the gain bandwidth (GBW). When queried with DC gain ≥ 80 dB and IQ<sub>A</sub> ≤ 10 μA, our algorithm predicts two topologies: a PMOS-input two-stage OTA followed by an NMOS-input two-stage OTA. We rank these topologies by the objective specification, which in this case is GBW. Note that our algorithm meets the designer's expectations and also provides choices.

Table 1. Test case and predicted results.

| Objectives and Constraints | Designer Expected Topology | Model Prediction | Performance Values |
|----------------------------|----------------------------|------------------|--------------------|
| Max. GBW,<br>Gain ≥ 80 dB,<br>IQ<sub>A</sub> ≤ 10 μA | Two-stage common-source OTA | PMOS input | Gain = 83.3 dB,<br>IQ<sub>A</sub> = 9.8 μA,<br>GBW = 3864820 |
|                            |                            | NMOS input | Gain = 80.2 dB,<br>IQ<sub>A</sub> = 8.4 μA,<br>GBW = 1227590 |
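The query in Table 1 amounts to filtering candidates by the constraints and ranking the survivors by the objective. The sketch below does this with a plain filter and sort over the two table entries plus one hypothetical entry added to show a rejected candidate. It is not the binary-search-based implementation described above, and the dictionary keys and function name are illustrative.

```python
candidates = [
    # Values for the first two entries are taken from Table 1
    {"topology": "PMOS-input two-stage OTA", "gain_db": 83.3, "iq_ua": 9.8, "gbw": 3864820},
    {"topology": "NMOS-input two-stage OTA", "gain_db": 80.2, "iq_ua": 8.4, "gbw": 1227590},
    # Hypothetical entry (not from the report) that violates the gain constraint
    {"topology": "hypothetical single-stage OTA", "gain_db": 45.0, "iq_ua": 5.0, "gbw": 9000000},
]

def select_topologies(cands, min_gain_db, max_iq_ua):
    """Keep candidates meeting both constraints, ranked by descending GBW."""
    feasible = [c for c in cands
                if c["gain_db"] >= min_gain_db and c["iq_ua"] <= max_iq_ua]
    return sorted(feasible, key=lambda c: c["gbw"], reverse=True)

ranked = select_topologies(candidates, min_gain_db=80.0, max_iq_ua=10.0)
for rank, c in enumerate(ranked, start=1):
    print(rank, c["topology"], c["gbw"])
```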

Next, our algorithm takes performance data points as input and outputs a graph to be fed to the embedding algorithms. Results for seven topologies are presented here. Note that the number of data points can vary with the strength of a topology, since a stronger topology can have more data points spanning a wider range of performance values. Performance points from the same topology tend to appear close together in the embedding space; for example, nodes from topology 2 appear together, as do nodes from topology 3. This is because, for example, a single-stage amplifier has very different gain and bandwidth characteristics from a two-stage amplifier. In the embedding space, users can select good topologies around the dominant topology.



Figure 1. Example showing the embedding of seven topologies.

**Keywords:** Analog Amplifier Design, Topology Selection, Graph Embedding, Dimension Scaling, Binary Search

#### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] A. Budak et al., "APOSTLE: Asynchronously Parallel Optimization for Sizing Analog...," ASPDAC, 2023.
[2] A. Budak et al., "Joint Optimization of Sizing and Layout for AMS Designs: Challenges and...," ISPD, 2023.

## TASK 3160.008, HIGH-SPEED DAC WITH HIGH OUTPUT POWER AND LINEARITY

SHUO-WEI CHEN, UNIVERSITY OF SOUTHERN CALIFORNIA, SWCHEN@USC.EDU

#### SIGNIFICANCE AND OBJECTIVES

Achieving high linearity and output power in high-speed DACs is a significant design challenge due to matching and dynamic errors. This necessitates a highly efficient DAC architecture with calibration techniques to achieve 20-50GS/s, >8b dynamic range in low-voltage CMOS which can support wideband signals with bandwidths up to half-Nyquist sampling rate.

#### **TECHNICAL APPROACH**

Different state-of-the-art power-DAC architectures were first explored to determine the theoretical limits of the proposed calibration techniques via analysis and MATLAB modeling. Then, we will design DAC prototypes in advanced CMOS nodes with superior linearity but modest output power. Finally, we will enhance the DAC's output power using digital pre-distortion techniques and improve power efficiency with sub-harmonic switching, multi-phase LO vector summation, and I/Q LO sharing. Switched-capacitor-based DACs will be implemented for high-speed mm-wave applications for their advantages over current-steering DACs, such as support for technology scaling, compatibility with complex envelope modulations, and improved linearity in advanced nodes.

#### SUMMARY OF RESULTS

The DAC employs a hybrid architecture with a tunable bandpass delta-sigma modulator for LSBs and MSBs combination at the output. This enables noise shaping around the RF signal's center frequency. The proposed BP-DSM can be programmed for carrier frequency and signal bandwidth, facilitating noise-notch filtering. The PA's LO signal is modulated using a non-uniform PWM pattern derived from a low-pass FIR filter impulse response, creating a bandpass filter response around the LO center frequency (known as time approximation filtering). Combined with the hybrid-DAC output, it further reduces out-of-band noise. The voltage mode class-D PA driver utilizes sub-harmonic switching, reducing dynamic losses and enhancing PA back-off efficiency. Phase interleaving and multi-phase LO switching techniques improve system efficiency, while I/Q LO sharing reduces overlap and maintains output power without compromising efficiency.



Figure 1. (top) Block-level Architecture of Proposed Power-DAC. (bottom) MATLAB Simulation Results of DAC Spectrum & EVM Plot.

We aim to investigate efficiency enhancement techniques like IQ cell sharing, utilizing 25% duty cycle LO. Feasibility will be assessed considering circuit challenges at mm-Wave frequencies. Digital techniques will be incorporated to improve sub-harmonic spur cancellation. These ideas will be implemented in the mm-wave PA prototype, with test structures employed for risk mitigation. Some concepts may have higher implementation overhead at mm-wave frequencies, necessitating further exploration of circuit implementation techniques.

**Keywords:** Hybrid digital-to-analog converter (DAC), time-approximation filter (TAF), sub-harmonic switching (SHS), calibration, efficiency enhancement

#### INDUSTRY INTERACTIONS

NXP, MediaTek

#### MAJOR PAPERS/PATENTS

[1] S. Su, M. S-W. Chen, "A 16-bit 12-GS/s Single-/Dual-Rate DAC With a Successive Bandpass Delta-Sigma Modulator Achieving <-67-dBc IM3 Within DC to 6-GHz Tunable Passbands," JSSC 2018.

[2] S. Su, M. S-W. Chen, "A Time-Approximation Filter for Direct RF Transmitter," JSSC 2021.

[3] A. Zhang, M. S-W. Chen, "A Watt-Level Phase-Interleaved Multi-Subharmonic Switching Digital Power Amplifier," JSSC 2019.

## TASK 3160.009, 100+GS/S TIME-DOMAIN ANALOG-TO-DIGITAL CONVERTERS

SAMUEL PALERMO, TEXAS A&M UNIVERSITY, SPALERMO@ECE.TAMU.EDU SEBASTIAN HOYOS, TEXAS A&M UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

Conventional successive approximation register (SAR) ADCs require a high interleaving factor due to their limited conversion speed, which inevitably increases implementation complexity and front-end loading. The time-domain (TD) ADC design techniques in this project aim to significantly improve efficiency at high sampling rates relative to their SAR counterparts.

#### TECHNICAL APPROACH

A new low-power time-interleaved TD-ADC is in development with advances in the interleaver, unit ADC, and calibration techniques. The ADC utilizes a TD-interleaver based on a voltage-to-time converter (VTC) that is capable of efficiently achieving high sample rates and novel techniques to improve unit ADC speed and efficiency, including a coarse time-to-digital converter (TDC), time residue generation block, time amplifier (TA), and fine TDC. Efficient TD-ADC calibration techniques for VTC gain and TDC time resolution mismatch, TA gain error, and time-interleaving errors are also in development.

#### SUMMARY OF RESULTS

Fig. 1 shows the proposed 112GS/s 7-bit time-interleaved ADC that leverages a time-domain interleaver to improve efficiency at high sample rates. This TD-ADC architecture has 16-way interleaved Rank 1 T/H blocks that are partitioned into four groups driven by four parallel input buffers. These T/H blocks are clocked by $f_s/8$ $\Phi_{T/H}$ pulses having a 25% duty cycle to avoid sampling crosstalk between the T/H channels in a group. The T/H sampled input voltages are delivered to two parallel VTCs that operate at $f_s/32$ and generate two clock-like full-swing pulses that have a voltage-dependent time difference. These pulses are then buffered by inverters to drive a unit TDC that performs 7b quantization at the $f_s/32$ rate, which is 3.5GS/s. For the complete 32-way interleaved ADC, the sample rate is 112GS/s. The time-domain interleaver provides significant improvements in energy efficiency and bandwidth mismatch robustness due to the ability to leverage inverter-based buffering.

Figure 1. TD-ADC with time-domain interleaver.

Figure 2. VTC: (a) schematic, (b) transfer characteristic, and (c) INL performance.

Fig. 2 shows the VTC circuit that utilizes crossing detectors to achieve high linearity. The VTC outputs ( $T_P$  and  $T_N$ ) are pre-charged to  $V_{SS}$  by  $\Phi_D$  while the T/H circuit tracks the voltage inputs ( $V_{INP}$  and  $V_{INN}$ ). After the tracking phase is completed, the VTC starts to discharge the  $V_{SP}$  and  $V_{SN}$  nodes. The VTC output  $T_P$  ( $T_N$ ) has a rising-edge transition to  $V_{DD}$  when  $V_{SP}$  ( $V_{SN}$ ) crosses  $V_D$ . Linear voltage-to-time conversion is achieved by discharging with a constant current. Preliminary simulation results at a 3.5GS/s sample rate show a linear 0.35ps/mV VTC conversion gain, which corresponds to 1.92ps/LSB for a 7b ADC having a 700mV<sub>ppd</sub> input dynamic range. A maximum time error of less than 0.7ps (0.36LSB) is achieved over the input dynamic range.
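The quoted per-channel rate and time resolution follow from simple arithmetic, reproduced in the sketch below. The function names are illustrative, and the calculation assumes an ideal 7-bit quantizer with 2^7 equal steps across the full input range.

```python
def unit_sample_rate(aggregate_rate, interleave_ways):
    """Per-channel sample rate of a time-interleaved ADC."""
    return aggregate_rate / interleave_ways

def lsb_time_ps(vtc_gain_ps_per_mv, input_range_mv, bits):
    """Time span of one LSB at the VTC output, assuming 2**bits equal steps."""
    return vtc_gain_ps_per_mv * input_range_mv / 2 ** bits

unit_rate = unit_sample_rate(112e9, 32)  # f_s / 32
lsb_ps = lsb_time_ps(0.35, 700.0, 7)     # 0.35 ps/mV over 700 mVppd, 7 bits
error_lsb = 0.7 / lsb_ps                 # 0.7 ps maximum time error in LSBs

print("unit ADC rate:", unit_rate / 1e9, "GS/s")
print("LSB size:", round(lsb_ps, 3), "ps")
print("max time error:", round(error_lsb, 3), "LSB")
```

The results are consistent, within rounding, with the 3.5GS/s, 1.92ps/LSB, and 0.36LSB figures given above.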

**Keywords:** Analog-to-digital converter, serial link, time-domain circuits, time-to-digital converter, voltage-to-time converter

#### INDUSTRY INTERACTIONS

Intel, NXP, MediaTek

A synthesizable PLL is a crucial building block for modern communication systems and digital SoCs. A low-phase-noise PLL typically relies on inductors, which do not scale with technology and require significant design time. Using a ring-based oscillator alleviates these concerns, but at the cost of higher phase noise.

#### TECHNICAL APPROACH

We aim to divide the PLL into two portions. The first is the synthesizable part, which uses regular PnR; the second uses a machine learning design along with an analog layout generator, as in Fig. 1. The input reference clock is mostly clean and does not require a high-order digital filter; however, the TDC quantization noise is limited by the TDC resolution, which imposes a tougher requirement on the digital filter. Creating a TDC with fine resolution will therefore help reduce the order of the digital filter, potentially reducing power and area.

#### SUMMARY OF RESULTS

Analog layout generator scripts were developed, and a couple of test cases showed promising results. A time comparator was imported as a CDL netlist, and the scripts produced a layout that is LVS/DRC clean with parasitic values comparable to those of the analog layout created by the designers. A new TDC architecture was proposed that could achieve the target high resolution. The MSB of the TDC is built using selective delay tuning (SDT), while the LSB is constructed with a vernier structure in which one side has buffers with a slightly lower supply voltage. Shifting the supply of an inverter provides fine resolution; however, it comes at the cost of an additional LDO to provide the shifted supply voltage.
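The vernier principle used for the LSB can be sketched behaviorally. In the model below, the start edge travels through slower buffers (for example, those on the lowered supply) and the stop edge through faster ones, so the stop edge gains one resolution step per stage, where the resolution is the difference of the two buffer delays. All delay values and names are hypothetical, and the model ignores mismatch, jitter, and the SDT-based MSB stage.

```python
def vernier_tdc(delta_t_ps, tau_slow_ps, tau_fast_ps, n_stages):
    """Return the stage at which the stop edge catches up with the start edge.

    delta_t_ps: time by which the start edge initially leads the stop edge.
    The output code saturates at n_stages if the stop edge never catches up.
    """
    resolution = tau_slow_ps - tau_fast_ps
    lead = delta_t_ps
    for stage in range(1, n_stages + 1):
        lead -= resolution  # stop edge gains one resolution step per stage
        if lead <= 0:
            return stage
    return n_stages

# Hypothetical buffer delays: 10.5 ps (lower supply) versus 10.0 ps
TAU_SLOW_PS, TAU_FAST_PS, STAGES = 10.5, 10.0, 16

for dt in (0.4, 2.2, 100.0):
    print(dt, "ps ->", vernier_tdc(dt, TAU_SLOW_PS, TAU_FAST_PS, STAGES))
```

With these numbers the effective resolution is 0.5 ps even though each buffer delay is about 10 ps, which is why a small supply shift can yield fine time resolution.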





Figure 1. (a) Architecture of Proposed Segmentation between Analog and Digital portions of the PLL. (b) Injection locking PLL with highlighted blocks under investigation.

The use of a floating inverter amplifier here may lead to a cost-effective solution, but this is still under investigation. We aim to explore different directions in optimizing the DTC and TDC to improve the phase noise and spur levels while keeping the entire PLL synthesizable with reasonable area and power consumption.

**Keywords:** Phase-Locked-Loop (PLL), Multiplying-Delay-Locked-Loop (MDLL), Injection Locking

#### INDUSTRY INTERACTIONS

IBM, Intel, MediaTek, NXP

#### MAJOR PAPERS/PATENTS

[1] Qiaochu Zhang, Hsiang-Chun Cheng, Shiyu Su, and Mike Shuo-Wei Chen, "A Fractional-N Digital MDLL with Injection Error Scrambling and Background Third-Order DTC Delay Equalizer Achieving –67dBc Fractional Spur," in *IEEE International Solid-State Circuits Conference (ISSCC)*, Feb. 2023.

[2] Qiaochu Zhang, Shiyu Su, Cheng-Ru Ho, and Mike Shuo-Wei Chen, "A Fractional-N Digital MDLL with Background Two-Point DTC Calibration Achieving -60dBc Fractional Spur," in *IEEE International Solid-State Circuits Conference (ISSCC)*, Feb. 2021.

#### SIGNIFICANCE AND OBJECTIVES

Sub-rate serial link transceivers address bandwidth limitations but require routing a high-frequency clock signal exceeding 14GHz. Our objective is to overcome this drawback by designing a fractional frequency multiplier and multi-phase generator that can achieve low jitter (< 100fs<sub>r.m.s</sub>), as well as tight phase matching (< 200fs<sub>pk-to-pk</sub>).

#### **TECHNICAL APPROACH**

We introduce an innovative method for achieving phase-locking using high-gain sampling detectors and low-noise/spur ring oscillators (RO) accompanied by a supply regulation mechanism. This approach ensures reliable operation at frequencies exceeding 14GHz, even under varying process, voltage, and temperature (PVT) conditions. The proposed PLL incorporates a wide bandwidth for suppressing RO noise and employs precise digital quantization error cancellation techniques to achieve exceptional jitter performance. Additionally, the PLL offers accurate eight-phase output through phase-spacing error correction methods, enhancing its overall performance.

#### SUMMARY OF RESULTS

Fig. 1 illustrates a simplified block diagram of the proposed type-III integer-N multi-phase generator. The key components include a high-gain sampling phase detector (PD), a loop filter, a low noise voltage-controlled ring oscillator (VCO), and a feedback frequency divider. This design offers several advantages. The high-gain PD ensures a remarkably low in-band noise level (< -135 dBc) and extends the PLL bandwidth to effectively suppress the noise generated by the ring oscillator (RO).



Figure 1. Proposed phase-locked loop.

The proportional path within the system introduces a zero to enhance the stability of the loop. Simultaneously, the integral path biases the PD with a predetermined voltage, $V_{REF1}$, significantly improving the reference spur and deterministic jitter performance. To compensate for significant variations in the VCO output frequency caused by process, voltage, and temperature (PVT) effects, the double-integral path biases the output of the first integrator at $V_{REF2}$.

We have developed a behavioral model for the proposed Phase-Locked Loop (PLL) to assess its noise characteristics. Initially, we conducted transistor-level design for various components, such as the sampling PD and the VCO. This allowed us to obtain Power Spectral Densities (PSDs) for different noise sources. Subsequently, these PSD values were integrated into the behavioral model to evaluate the PLL's phase noise performance. The outcomes of these simulations are depicted in Fig. 2.

The results indicate that the proposed PLL has the potential to achieve jitter below 100 fs at 14 GHz output frequency, even when utilizing a ring VCO. However, a low-noise, high-frequency clock with a frequency of approximately 500 MHz is required to achieve such performance. We believe this clock signal is readily accessible in a high-speed serializer-deserializer system.
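RMS jitter is commonly obtained by integrating the single-sideband phase noise over offset frequency. The sketch below applies that standard relation to a deliberately simplified profile: a flat phase noise floor between two offset frequencies. The floor level (taken here as −135 dBc/Hz, assuming the in-band figure quoted earlier is per hertz), the integration limits, and the function name are all assumptions for illustration; the simulated profile in Fig. 2 is not flat and is not reproduced here.

```python
import math

def rms_jitter_fs(phase_noise_dbc_hz, f_start_hz, f_stop_hz, f_carrier_hz):
    """RMS jitter in fs for a flat phase-noise profile L(f).

    Uses sigma_phi^2 = 2 * integral of L(f) df (both sidebands) and
    sigma_t = sigma_phi / (2 * pi * f_carrier).
    """
    l_linear = 10 ** (phase_noise_dbc_hz / 10.0)
    phase_variance = 2.0 * l_linear * (f_stop_hz - f_start_hz)
    sigma_t = math.sqrt(phase_variance) / (2.0 * math.pi * f_carrier_hz)
    return sigma_t * 1e15

# Hypothetical: flat -135 dBc/Hz from 10 kHz to 100 MHz at a 14 GHz carrier
jitter = rms_jitter_fs(-135.0, 10e3, 100e6, 14e9)
print("integrated RMS jitter ~", round(jitter, 1), "fs")
```

Under these assumptions the integrated jitter is a few tens of femtoseconds, which shows how a low in-band noise floor over a wide loop bandwidth can be compatible with a sub-100 fs target.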



Figure 2. Simulated PLL phase noise plot at 14-GHz output frequency.

**Keywords:** High-speed serial links, low-jitter ring-based sampling phase-locked loops, multi-phase generators

#### INDUSTRY INTERACTIONS

Intel

#### MAJOR PAPERS/PATENTS

#### [1]

#### SIGNIFICANCE AND OBJECTIVES

Cryogenic-CMOS has emerged as a promising solution for power efficient High-Performance Computing systems. Monolithically integrated low-latency and large-capacity on-chip memory operating reliably at cryogenic conditions is essential to circumvent the memory wall. We aim to investigate reliable, low-voltage cryogenic memory circuits by leveraging contention-free, pseudo-static bitcells, and flip-flops.

#### **TECHNICAL APPROACH**

Cryogenic CMOS offers steeper subthreshold swing, extremely low leakage, enhanced mobility, low interconnect resistance, and improved reliability. Leveraging these attributes for high-density embedded DRAM (eDRAM) at 77K can offer lower cooling costs than 4K and a long retention time due to significantly lower leakage current. Our approach leverages a high ON/OFF current ratio, high read/write speed, low operating voltage, non-destructive read, and FinFET process compatibility of gated thyristor-based capacitor-less RAM (TRAM) for future cryogenic memory applications. First, we present important physics-based insights regarding the operation of a TRAM cell and device optimization for cryogenic applications through comprehensive TCAD simulations. Second, we propose lower bandgap Germanium (Ge) channel TRAMs to reduce the operating voltage further.

#### SUMMARY OF RESULTS

The TRAM device on Silicon-on-Insulator (SOI), as shown in Fig. 1(a), is realized with the Synopsys Sentaurus structure editor tool. Fig. 1(b) shows the simulated static bi-stable I-V curve at fixed  $V_G = -1.5$ V for a wide temperature range from 300K to 77K. The static  $V_{H}$ determines the operating voltage of TRAM in the presence of a fast gate pulse transition. With the calibrated simulation setup, the data retention of Silicon TRAM (Si-TRAM) is studied, and the leakage mechanisms are analyzed for different base doping. It is observed that as base doping increases, data '1' retention time decreases due to higher recombination of carriers (Fig. 1(c)), and data '0' retention time increases due to increased barrier height leading to lower diffusion of carriers (Fig. 1(d)). Fig. 1(e) illustrates the bistable hysteretic characteristics of the Ge thyristor device at different temperatures from 300K to 77K. It is observed that both  $V_{FB}$  and  $V_{H}$  of the Ge-TRAM are lower compared to Si-TRAM, indicating Ge-TRAM's lower operating voltage. The Ge-TRAM can be operated at 0.6V at 300K and 0.75V at 77K using early cell turn-on. Although data retention in Ge-TRAM can be very sensitive to random dopant fluctuations at 300K, more than  $10^5$  s of retention time for data '1' and data '0' is observed at 77K (Fig. 1(f)).



Figure 1. (a) The TRAM structure on SOI. (b) Static bi-stable I-V characteristics of Si-TRAM. Modulation of p-base electrostatic potential over (c) data '1' retention and (d) data '0' retention. (e) Static I-V characteristics of Ge-TRAM indicating lower voltage of operation. (f) Long (>10<sup>5</sup>s) data '1' and data '0' retention time of Ge-TRAM at 77K. I(k) = cell current.

Our next task is to simulate the circuit behavior of pseudo-static memory bitcells, flip-flops, and logic circuits at low operating voltages and at temperatures down to 77K, to illustrate the feasibility of using baseline memory and logic components in cryogenic computing.

**Keywords:** Cryogenic Memory, Capacitor-less 1-T DRAM, Thyristor, Pseudo CMOS logic, data retention

#### INDUSTRY INTERACTIONS

IBM, Intel

#### MAJOR PAPERS/PATENTS

[1] Saikat Chakraborty and Jaydeep P. Kulkarni, DRC 2022, doi: 10.1109/DRC55272.2022.9855655.

[2] S. S. Teja Nibhanupudi, Siddhartha Raman Sundara Raman, et al., JxCDC, doi: 10.1109/JXCDC.2021.3130839.

#### SIGNIFICANCE AND OBJECTIVES

The objective is to develop circuit- and system-level solutions to improve the overall performance of ADCs in scaled CMOS nodes. We target two categories: (1) nanowatt-power, compact CT-$\Delta\Sigma$ analog-to-digital converters (ADCs) using direct VCO chopping for sensor interfaces and (2) CT-$\Delta\Sigma$ ADCs using time/frequency/phase.

#### **TECHNICAL APPROACH**

The use of time/frequency has shown many advantages over that of voltage-based circuits, especially in nanometer CMOS nodes. In most recent works, time/phase or frequency has been used as an integral part of ADCs, often in the backend quantizer of  $\Delta\Sigma$  ADCs. However, these techniques can be extended to provide various other benefits. Our approach is to explore (1) the direct benefits of time/frequency in  $\Delta\Sigma$  ADCs and (2) additional benefits that can be obtained using the abovementioned approaches.

#### SUMMARY OF RESULTS

Our first work, recently presented at ESSCIRC, is based on VCO-based chopping and targets nanowatt-power sensor interfaces [1]. Traditionally, VCOs have been widely used as quantizers in  $\Delta\Sigma$  ADCs to provide 3-6 bits of quantization as well as an additional order of noise-shaping. In this work, we leveraged the VCO quantizer not only to quantize the input signal but also to generate the chopping signals for the front-end digital-to-analog converter (DAC). The proposed structure uses multi-path chopping as its core. In multi-path chopping, the chopping stage is divided into M equal paths in parallel, and M chopping clocks are needed with a phase shift of  $2\pi/M$ , each chopping 1 out of M paths. This technique spreads the chopping-induced modulated tones and reduces their magnitude. However, it requires many additional clock cycles, thereby increasing the circuit complexity and power consumption.

As illustrated in Fig. 1, we leverage the VCO quantizer to provide the chopping clocks for each slice of the amplifier. In other words, the front-end amplifier is divided into 32 slices, with a 32-phase VCO quantizer where each VCO output directly controls the chopping of one slice of the amplifier. Therefore, the output spectra are modulated in a spread spectrum form as illustrated in Fig. 2. In the prototype measurement, the options included traditional chopping, multi-path chopping, and the proposed VCO-based chopping. The measured intermodulation tone magnitude reduction was -44dB and -36dB compared to traditional and multi-path chopping, respectively. Our future work will aim to further reduce the intermodulation tones by using a two-step quantizer, thereby eliminating the signal component in the VCO which can cause in-band harmonics.
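One way to see why phase-staggered chopping clocks reduce tone magnitude is to sum M idealized clocks spaced by $2\pi/M$ and look at what survives at the chopping frequency. The Python sketch below does this for perfectly matched paths. It is an illustrative idealization, not a model of the prototype: the function names, the sample count, and the equal-weight summation are all assumptions, and real slices have mismatch that leaves residual tones.

```python
import cmath
import math


def chop_clock(n_samples, shift):
    """One period of a +/-1 square wave, circularly delayed by `shift` samples."""
    half = n_samples // 2
    return [1.0 if (n - shift) % n_samples < half else -1.0
            for n in range(n_samples)]


def fundamental_amplitude(wave):
    """Amplitude of the first harmonic of one period of `wave` (single-bin DFT)."""
    n = len(wave)
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(wave))
    return 2.0 * abs(acc) / n


def summed_chop_waveform(m_paths, n_samples=960):
    """Equal-weight average of M chopping clocks staggered by 2*pi/M."""
    if n_samples % 2 or n_samples % m_paths:
        raise ValueError("n_samples must be even and divisible by m_paths")
    step = n_samples // m_paths  # 2*pi/M expressed in samples
    clocks = [chop_clock(n_samples, k * step) for k in range(m_paths)]
    return [sum(column) / m_paths for column in zip(*clocks)]


for m in (1, 3, 32):
    amp = fundamental_amplitude(summed_chop_waveform(m))
    print(f"M = {m:2d}: fundamental amplitude = {amp:.6f}")
```

With a single path the fundamental is close to the square-wave value of $4/\pi$. With staggered, matched paths the fundamentals sum to zero, so any tone left at the chopping frequency in practice comes from path mismatch.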



Figure 1. (top) Conventional chopping signal chain (bottom) proposed VCO based chopping resulting in spread spectrum modulation with reduced out of band tones.



Figure 2. Measured spectra comparison between (left) conventional and multi-path (right) VCO-based chopping.

**Keywords:** Noise-Shaping, VCO-Based, Chopping, Delta-Sigma, Analog-to-Digital Converter

#### INDUSTRY INTERACTIONS

MediaTek, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Y. Chen et al., "A Direct Sensor Readout Circuit Using VCO-Driven Chopping with 42dB SNR at 800µVpp Input," ESSCIRC 2022, Milan, Italy.

#### SIGNIFICANCE AND OBJECTIVES

Synthesis, Auto-Place, and Route (SAPnR) dominate SoC methodology, allowing designers to focus on optimizations. However, the automated generation of IVR domains, with VRs and FLL/PLLs, is critically missing. We seek to design a *domain compiler* – an automated framework to produce a comprehensive domain, including FLL/PLLs and VRs.

#### **TECHNICAL APPROACH**

The effort is organized into two thrusts. **Thrust 1**, Synthesizable UniCaP: We will build upon our V<sub>dd</sub>-droop tolerant and fast-response UniCaP-2 construction (Fig. 1(b)) to explore and develop a framework that automates the construction of robust, larger, all-digital domains. User-provided constraints (Fig. 1(a)) are used to develop a unified system. **Thrust 2**, Autonomous, all-digital run-time VR loop-gain tuning: We will ensure optimal transient response across PVT conditions, thereby overcoming the problem of poor performance due to margining for worst-case PVT conditions. In the context of UniCaP, improved VR response minimizes performance loss from FIFO saturation, and margins due to memory V<sub>min</sub> constraints.

#### SUMMARY OF RESULTS



Figure 1. (a) Overview of proposed Domain Compiler (b) Simplified schematic of the proposed architecture consisting of integrated LDO/PLL modules in addition to the load domain.

The focus of our effort in Year 1 is the design of an autonomous loop-gain tracker to enable auto-tuning of the LDO loop gain. Progress on this front has been on track. Fig. 2 shows the schematic of the proposed replica-based I<sub>LSB</sub> sensing architecture. The technique relies on copying the drain and source terminal voltages on the LDO header to a replica device. A digital comparator + charge-pump construction is used to provide the high loop gain needed to minimize tracking error. The current flowing through the replica is reflected into a current-controlled oscillator (CCO), which integrates the load current into a count value for inference of loop gain. Mismatch errors between mirror devices and within the comparator are addressed using chopping and 2-pass operation, respectively.
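The current-to-count conversion can be sketched in a few lines of Python. This assumes an ideal CCO whose frequency is proportional to the sensed current, with a counter that records whole cycles in a fixed window. The gain, window length, and current below are hypothetical values, not parameters of the proposed design.

```python
def cco_count(i_sense_a, k_cco_hz_per_a, window_s):
    """Whole CCO cycles counted in one window, assuming f_cco = k_cco * i_sense."""
    return int(i_sense_a * k_cco_hz_per_a * window_s)


def current_lsb_a(k_cco_hz_per_a, window_s):
    """Change in sensed current that moves the count by one."""
    return 1.0 / (k_cco_hz_per_a * window_s)


def estimate_current_a(count, k_cco_hz_per_a, window_s):
    """Invert the count back to a current estimate (quantized to one LSB)."""
    return count * current_lsb_a(k_cco_hz_per_a, window_s)


# Hypothetical operating point: 10.05 uA sensed, 1 MHz/uA CCO gain, 10 us window.
K_CCO = 1e12
WINDOW = 10e-6
count = cco_count(10.05e-6, K_CCO, WINDOW)
print(count, estimate_current_a(count, K_CCO, WINDOW))
```

The design trade-off visible here is that the current resolution is the reciprocal of gain times window, so a longer counting window buys resolution at the cost of tracking speed.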



Figure 2. Block diagram of the proposed loop gain detector. The approach relies on the observation that loop gain is dominated by  $PV_{in}V_{out}T$  changes in  $I_{LSB}$ ; it uses  $I_{LSB}$  to approximate total LDO loop gain and to adjust the  $K_1$  and  $K_P$  parameters for stable, rapid LDO response.



Figure 3. Simulation waveforms showing the 2-pass approach to overcome offset errors in the comparator. The input terminals to the comparator are toggled before the second measurement pass is implemented, resulting in  $V_{dd,replica}$  (blue) settling on either side of the actual  $V_{dd}$  (red) by an amount of the comparator offset. The resulting average of these measurements delivers  $V_{dd}$  with adequate precision.

Fig. 3 shows simulation results of our proposed sensing architecture. Monte-Carlo simulations have demonstrated an ability to track the actual  $I_{LSB}$  to within 3% (3 $\sigma$ ), adequate for our application. Our upcoming effort will focus on functional classification of the LDO header, TRO and TDC circuits.
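The 2-pass offset cancellation described in the Fig. 3 caption reduces to averaging two settled values whose offset errors have opposite sign. The Python sketch below assumes an ideal high-gain loop, so the replica node settles exactly one comparator offset away from $V_{dd}$, and an offset that stays constant between passes. The function names and voltages are hypothetical.

```python
def replica_settling(v_dd, v_offset, inputs_swapped):
    """Settled replica voltage for an ideal loop with comparator offset v_offset."""
    # Swapping the comparator inputs flips the sign of the offset error.
    return v_dd - v_offset if inputs_swapped else v_dd + v_offset


def two_pass_estimate(v_dd, v_offset):
    """Average the two passes so the equal-and-opposite offset errors cancel."""
    pass_1 = replica_settling(v_dd, v_offset, inputs_swapped=False)
    pass_2 = replica_settling(v_dd, v_offset, inputs_swapped=True)
    return 0.5 * (pass_1 + pass_2)


# Hypothetical values: 0.8 V supply, 5 mV comparator offset.
print(replica_settling(0.8, 0.005, False), two_pass_estimate(0.8, 0.005))
```

Cancellation is exact only if the offset does not drift between the two passes; any change in offset between passes remains as a residual error of half that change.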

**Keywords:** Model-predictive control, Voltage Regulation

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

#### **APPENDIX I PUBLICATIONS OF TXACE RESEARCHERS**

## **Conference Publications**

- [1] Sanyal, S., Hazra, A., Dasgupta, P., Morrison, S., Surendran, S., Balasubramanian, L. and Rahman, M. M. (2023). Analog Coverage-driven Selection of Simulation Corners for AMS Integrated Circuits. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), Antwerp, Belgium, pp. 1-6, IEEE.
- [2] Steenhoek, A., Farayola, P.O., Bruce, I., Chaganti, S.K., Obaidi, A.O., Sheikh, A., Ravi, S. and Chen, D. (2022). Graph Theory Approach for Multi-site ATE Board Parameter Extraction. 2022 IEEE European Test Symposium (ETS), Barcelona, Spain, pp. 1-2, IEEE.
- [3] Farayola, P.O., Bruce, I., Chaganti, S.K., Sheikh, A., Ravi, S. and Chen, D. (2022). Optimal Order Polynomial Transformation for Calibrating Systematic Errors in Multisite Testing. *2022 IEEE International Test Conference (ITC), Anaheim, CA, USA*, pp. 509-513, IEEE.
- [4] Farayola, P.O., Bruce, I., Chaganti, S.K., Sheikh, A., Ravi, S. and Chen, D. (2022). Cross-Correlation Approach to Detecting Issue Test Sites in Massive Parallel Testing. 2022 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), Austin, TX, USA, pp. 1-6, IEEE.
- [5] Lee, S., Kang, T., Song, S., Kwon, K. and Flynn, M. (2022). An 81.6dB SNDR 15.625MHz BW 3rd Order CT SDM with a True TI NS Quantizer. 2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Honolulu, HI, USA, pp. 54-55, IEEE.
- [6] Hu, H., Vesely, V. and Moon, U.-K. (2022). Passive Third Order Continuous-Time ΔΣ Modulator with Q Enhancement Technique. 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, pp. 1234-1238, IEEE.
- [7] Hu, H., Vesely, V. and Moon, U.-K. (2022). Ultra-Low OSR Calibration Free MASH Noise Shaping SAR ADC. *2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA*, pp. 1244-1248, IEEE.
- [8] Chen, X., Shoukry, A., Jia, T., Zhang, X., Magod, R., Desai, N. and Gu, J. (2023). A 65nm Fully-integrated Fast-switching Buck Converter with Resonant Gate Drive and Automatic Tracking. 2023 IEEE Custom Integrated Circuits Conference (CICC), San Antonio, TX, USA, pp. 1-2, IEEE.
- [9] Datta, K., Mclaughlin, P. and Stauth, J.T. (2022). High-Frequency Resonant Switched-Capacitor Converters with Multi-Winding Current Ballasting: Analysis and Optimization. 2022 IEEE 23rd Workshop on Control and Modeling for Power Electronics (COMPEL), Tel Aviv, Israel, pp. 1-8, IEEE.
- [10] Avci, M.E., Ozev, S. and Kumar, Y.C. (2022). Fast RF Mismatch Calibration Using Built-in Detectors. 2022 IEEE 40th VLSI Test Symposium (VTS), San Diego, CA, USA, pp. 1-7, IEEE.
- [11] Shahriari, B. and Najm, F.N. (2023). Fast Electromigration Simulation for Chip Power Grids. 2023 24th International Symposium on Quality Electronic Design (ISQED), San Francisco, CA, USA, pp. 1-8, IEEE.
- [12] Bhatheja, K., Chaganti, S., Chen, D., Jin, X.R., Dao, C.C., Ren, J., Kumar, A., Correa, D., Lehmann, M., Rodriguez, T. and Kingham, E. (2022). Low cost high accuracy stimulus generator for on-chip spectral testing. 2022 IEEE International Test Conference (ITC), Anaheim, CA, USA, pp. 514-518, IEEE.
- [13] Bhatheja, K., Strong, M. and Chen, D. (2022). Level Shifters for Charge Constrained Applications. 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, pp. 1203-1204, IEEE.

- [14] Banahene, K., Strong, M., Gadogbe, B., Chen, D. and Geiger, R. (2022). Hardware Security Vulnerability in Analog Signal Chain Filters. 2022 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 667-671, IEEE.
- [15] Ganji, M., Saikiran, M. and Chen, D. (2022). A Wide-Range Low-cost Temperature to Digital Converter Independent of Device Models. 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, pp. 605-609, IEEE.
- [16] Saikiran, M., Ganji, M. and Chen, D. (2022). Robust Built-in Defect-Detection for Low Drop-Out Regulators using Digital Mismatch Injection. 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, pp. 1580-1584, IEEE.
- [17] Saikiran, M., Ganji, M. and Chen, D. (2022). A Time-Efficient Defect Simulation Framework for Analog and Mixed Signal (AMS) Circuits. 2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI), Porto Alegre, Brazil, pp. 1-6, IEEE.
- [18] Saikiran, M., Ganji, M. and Chen, D. (2022). Digital Defect-Oriented Test Methodology for Flipped Voltage Follower Low Dropout (LDO) Voltage Regulators. 2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI), Porto Alegre, Brazil, pp. 1-6, IEEE.
- [19] Blanche, P.A., Zhang, T. and Draper, C. (2022). Holographic curved waveguide combiner for AR/HUD with 2D pupil expansion. ODS 2022: Industrial Optical Devices and Systems, San Diego, California, USA, vol. 12231, pp. 45-49, SPIE.
- [20] Ketchum, R.S., Zhang, T. and Blanche, P.A. (2022). Camera feedback optimization of computer-generated holograms displayed by the Texas Instruments phase light modulator for AR/HUD applications. ODS 2022: Industrial Optical Devices and Systems, San Diego, California, USA, vol. 12231, pp. 50-54, SPIE.
- [21] Blanche, P.A., Zhang, T. and Draper, C.T. (2022). 2D Pupil Expansion in Plastic Curved Holographic Waveguide Combiner for AR/HUD. *Digital Holography and Three-Dimensional Imaging, Cambridge, UK*, pp. M5A-1, Optica Publishing Group.
- [22] Zhang, T., Draper, C., Blanche, P. and Kaneda, Y. (2023). 2D curved holographic waveguide combiner for augmented reality with pupil expansion. *Optical Architectures for Displays and Sensing in Augmented, Virtual, and Mixed Reality (AR, VR, MR) IV, San Francisco, California, USA,* vol. 12449, pp. 323-327, SPIE.
- [23] Zhang, T. and Kaneda, Y. (2023). Method for large field of view and eye-box of holographic waveguide display based on LED illumination. *Practical Holography XXXVII: Displays, Materials, and Applications, San Francisco, California, USA*, vol. 12445, pp. 181-185, SPIE.
- [24] Takashima, Y., Deng, X., Tang, C.I., Kang, E., Choi, H., Chan, J.C.W., Chen, J., Friedman, B., Lee, T.L.T., Liu, P. and Nero, G.M. (2023). AR and lidar applications enabled by beam and image steering by MEMS SLM. *Practical Holography XXXVII: Displays, Materials, and Applications, San Francisco, CA, USA*, vol. 12445, pp. 88-93, SPIE.
- [25] Nero, G., Pei, Y., Zhang, T., Deng, X., Tang, C.I., Lieu, P., Lee, T., Lunin, M. and Takashima, Y. (2023). Field-of-view expansion via diffractive image steering and prism array. Optical Architectures for Displays and Sensing in Augmented, Virtual, and Mixed Reality (AR, VR, MR) IV, San Francisco, CA, USA, vol. 12449, pp. 328-333, SPIE.
- [26] Pei, Y., Nero, G., Zhang, T., Deng, X., Lieu, P., Lee, T., Lunin, M. and Takashima, Y. (2023). Evaluation of pulsed laser sources for a solid-state diffractive image steering and foveation by Texas Instruments digital micromirror device. *Optical Architectures for Displays and Sensing in Augmented, Virtual, and Mixed Reality (AR, VR, MR) IV, San Francisco, CA, USA*, vol. 12449, pp. 319-322, SPIE.

- [27] Chan, J., Tang, C.I., Deng, X., Lee, T. and Takashima, Y. (2023). Wide field of view real-time flash DMD-lidar with 2D multi-pixel photon counter. *Emerging Digital Micromirror Device Based Systems and Applications XV, San Francisco, CA, USA*, vol. 12435, pp. 86-87, SPIE.
- [28] Deng, X., Tang, C.I. and Takashima, Y. (2023). Beam tracking and image steering by Texas Instruments phase light modulator based on camera input for lidar and AR applications. *Emerging Digital Micromirror Device Based Systems and Applications XV, San Francisco, CA, USA*, vol. 12435, pp. 88-93, SPIE.
- [29] Raghu, V.B., Deng, X., Kang, E., Tang, C.I. and Takashima, Y. (2022). Large etendue laser beam steering by 2D MEMS resonant mirror and digital micromirror device for time-of-flight lidar and AR display. ODS 2022: Industrial Optical Devices and Systems, San Francisco, CA, USA, vol. 12231, pp. 79-85, SPIE.
- [30] Chan, J.C.W., Tang, C.I., Deng, X. and Takashima, Y. (2022). DMD-based diffractive FOV expansion for real-time flash lidar with 2D multi-pixel photon counter. *ODS 2022: Industrial Optical Devices and Systems, San Francisco, CA, USA*, vol. 12231, pp. 86-91, SPIE.
- [31] Jogalekar, A.N., Medina, O.F., Blanchard, A., Henderson, R., Iyer, M.K., Ali, H., Murugan, R. and Tang, T. (2022). Methods to Characterize Radiation Patterns of WR5 Band Integrated Antennas in a Flip-Chip Enhanced QFN Package. 2022 IEEE 31st Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS), San Jose, CA, USA, pp. 1-3, IEEE.
- [32] Jogalekar, A.N., Medina, O.F., Blanchard, A., Henderson, R., Iyer, M.K., Tang, T., Murugan, R. and Ali, H. (2022). Slot Bow-Tie Antenna Integration in Flip-Chip and Embedded Die Enhanced QFN Package for WR8 and WR5 Frequency Bands. 2022 IEEE 72nd Electronic Components and Technology Conference (ECTC), San Diego, CA, USA, pp. 371-376, IEEE.
- [33] Medina, O., Jogalekar, A., Henderson, R. and Iyer, M. (2022). Design and Testing Methodology for Broadband Antennas-in-Package Operating in the WR8 and WR5 Frequency Bands. *SRC Techcon, Austin, TX*, SRC.
- [34] Jogalekar, A., Medina, O., Henderson, R. and Iyer, M. (2022). Slot Bow-Tie Antenna in Flip Chip and Embedded Die Enhanced QFN Package for mmWave Applications. *SRC Techcon, Austin, TX*, SRC.
- [35] Niranjan, V.A., Neethirajan, D., Xanthopoulos, C., Webster, D., Nahar, A. and Makris, Y. (2023). Machine Learning-Based Adaptive Outlier Detection for Underkill Reduction in Analog/RF IC Testing. 2023 IEEE 41st VLSI Test Symposium (VTS), San Diego, CA, USA, pp. 1-7, IEEE.
- [36] Ware, E., Correll, J., Lee, S. and Flynn, M. (2022). 6GS/s 8-channel CIC SAR TI-ADC with Neural Network Calibration. ESSCIRC 2022-IEEE 48th European Solid State Circuits Conference (ESSCIRC), Milan, Italy, pp. 325-328, IEEE.
- [37] Hardy, C., Pham, H., Jatlaoui, M.M., Voiron, F., Xie, T., Chen, P.H., Jha, S., Mercier, P. and Le, H.P. (2023). 11.1 A Scalable Heterogeneous Integrated Two-Stage Vertical Power-Delivery Architecture for High-Performance Computing. 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, pp. 182-184, IEEE.
- [38] Nguyen, D.T., Deng, C., Macias, E. and Hanson, A.J. (2022). Synchronously Switched Active EMI Filter. *2022 IEEE Energy Conversion Congress and Exposition (ECCE), Detroit, MI, USA*, pp. 1-8, IEEE.
- [39] Qiao, B., Kayyil, A.V. and Allstot, D.J. (2022). An Eight-core Class-G Switched-capacitor Power Amplifier with Eight Power Backoff Efficiency Peaks. *2022 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), Denver, CO, USA*, pp. 251-254, IEEE.
- [40] Najim, N., Kayyil, A., Allstot, D. and Paramesh, J. (2022). Machine learning techniques for digital pre-distortion in CMOS switched-capacitor power amplifiers. *SRC Techcon, Austin, TX*, SRC.

- [41] Han, W., Chen, C., Liu, J. and Lee, H. (2022). An 80A 48V-Input Capacitor-Assisted Dual-Inductor Hybrid Dickson Converter for Large-Conversion Ratio Applications. 2022 IEEE Energy Conversion Congress and Exposition (ECCE), Detroit, MI, USA, pp. 1-5, IEEE.
- [42] Khan, M.R., Zhang, X. and Huang, C. (2022). Analytical Comparison of 3-Level 2-Phase and Double-Step-Down Topologies for Integrated High-Ratio DC-DC Converters in BCD and GaN Process. 2022 IEEE Energy Conversion Congress and Exposition (ECCE), Detroit, MI, USA, pp. 1-8, IEEE.
- [43] Lin, Y.H., Chao, T.Y., Hsiao, S.C., Huang, Y.J., Liang, Y.J., Tsai, J.H., Alsuraisry, H. and Huang, T.W. (2022). A 24-GHz 65-nm CMOS 3-D Radial and Vertically Stacked Transmitter Front-End IC for Vital-sign Detection Radar Applications. 2022 Asia-Pacific Microwave Conference (APMC), Yokohama, Japan, pp. 282-284, IEEE.
- [44] Huang, B.W., Fu, Z.H. and Lin, K.Y. (2022). A Millimeter-Wave Ultra-Wide Band Power Amplifier in 0.15-μm GaAs pHEMT for 5G Communication. 2022 Asia-Pacific Microwave Conference (APMC), Yokohama, Japan, pp. 97-99, IEEE.
- [45] Huang, B.W., Fu, Z.H. and Lin, K.Y. (2022). A 28/38 GHz Dual-Band and Dual-Mode CMOS Power Amplifier Using Constant Optimal Load Impedance Method. 2022 Asia-Pacific Microwave Conference (APMC), Yokohama, Japan, pp. 100-102, IEEE.
- [46] Chen, Y.M., Lee, L.Y., Chen, T.H., Wu, C.S. and Wang, H. (2022). A Ka-band High-OP 1dB Power Amplifier in 0.15-μm GaN Process for 5G Base Station and Satellite Communications Applications. 2022 Asia-Pacific Microwave Conference (APMC), Yokohama, Japan, pp. 593-595, IEEE.
- [47] Wang, Y., Tripathi, P., Lei, H.W. and Wang, H. (2022). Broadband SPST Switches in 250-nm GaN HEMT Process. 2022 Asia-Pacific Microwave Conference (APMC), Yokohama, Japan, pp. 680-682, IEEE.
- [48] Ng, Y.S., Wang, Y., Khyalia, S.K., Chen, C.N., Tang, T.C., Chang, Y.W., Lu, H.C., Huang, T.W. and Wang, H. (2022). A 38-GHz Millimeter Wave Transmission System for Unmanned Aerial Vehicle in 65 nm CMOS. 2022 17th European Microwave Integrated Circuits Conference (EuMIC), Milan, Italy, pp. 181-184, IEEE.
- [49] Huang, W.Z., Li, M.H., Ng, Y.S., Wang, Y. and Wang, H. (2022). A V-band Double-Transformer-Coupling and Current Steering VGLNA in 90-nm CMOS. 2022 17th European Microwave Integrated Circuits Conference (EuMIC), Milan, Italy, pp. 134-136, IEEE.
- [50] Chen, S.Y., Wu, C.S., Lee, L.Y., Chen, T.H., Chen, Y.M. and Wang, H. (2022). 25-31-GHz Low Noise Amplifiers in 0.15-μm GaN/SiC HEMT Process. 2022 IEEE International Symposium on Radio-Frequency Integration Technology (RFIT), Busan, Korea, Republic of, pp. 27-29, IEEE.
- [51] Tsai, W.H., Wang, Y. and Wang, H. (2022). A Switchless Bidirectional Distributed Amplifier with 2-dB Gain Variation in One Decade Frequency Range in 90-nm CMOS Process. 2022 IEEE International Symposium on Radio-Frequency Integration Technology (RFIT), Busan, Korea, Republic of, pp. 71-73, IEEE.
- [52] Karmokar, N., Harjani, R. and Sapatnekar, S.S. (2023). Minimum unit capacitance calculation for binary-weighted capacitor arrays. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), Antwerp, Belgium, pp. 1-2, IEEE.
- [53] Sekyere, M., Saikiran, M. and Chen, D. (2022). All Digital Low-Cost Built-in Defect Testing Strategy for Operational Amplifiers with High Coverage. 2022 IEEE 28th International Symposium on On-Line Testing and Robust System Design (IOLTS), Torino, Italy, pp. 1-5, IEEE.
- [54] Arunachalam, A., Kizhakkayil, A., Kundu, S., Raha, A., Banerjee, S., Jin, R., Su, F. and Basu, K. (2022). Unsupervised Learning-based Early Anomaly Detection in AMS Circuits of Automotive SoCs. 2022 IEEE International Test Conference (ITC), Anaheim, CA, USA, pp. 229-238, IEEE.

- [55] Shahriari, B. and Najm, F.N. (2023). Fast Electromigration Simulation for Chip Power Grids. 2023 24th International Symposium on Quality Electronic Design (ISQED), San Francisco, CA, USA, pp. 1-8, IEEE.
- [56] Kteyan, A., Sukharev, V., Volkov, A., Choy, J.H., Najm, F.N., Yi, Y.H., Kim, C.H. and Moreau, S. (2023). Electromigration Assessment in Power Grids with Account of Redundancy and Non-Uniform Temperature Distribution. *Proceedings of the 2023 International Symposium on Physical Design*, pp. 124-132, ACM.
- [57] Yi, Y.H., Kim, C., Zhou, C., Kteyan, A. and Sukharev, V. (2023). Studying the Impact of Temperature Gradient on Electromigration Lifetime Using a Power Grid Test Structure with On-Chip Heaters. 2023 IEEE International Reliability Physics Symposium (IRPS), Monterey, CA, USA, pp. 1-5, IEEE.
- [58] Li, C., Afshar, M. and Akin, B. (2023). Fault Detection in Small Fan Motors Using MCSA. 2023 IEEE International Electric Machines & Drives Conference (IEMDC), San Francisco, CA, USA, pp. 1-7, IEEE.
- [59] Budak, A.F., Zhu, K., Chen, H., Poddar, S., Zhao, L., Jia, Y. and Pan, D.Z. (2023). Joint optimization of sizing and layout for AMS designs: Challenges and opportunities. *Proceedings of the 2023 International Symposium on Physical Design*, pp. 84-92, ACM.

# **Journal Publications**

- [1] Farayola, P.O., Oko-Odion, E., Chaganti, S.K., Sheikh, A., Ravi, S. and Chen, D. (2023). Site-to-Site Variation in Analog Multisite Testing: A Survey on Its Detection and Correction. *IEEE Design & Test*, vol. 40, no. 5, pp. 52-61.
- [2] Bruce, I., Farayola, P.O., Chaganti, S.K., Sheikh, A., Ravi, S. and Chen, D. (2023). A Weighted-Bin Difference Method for Issue Site Identification in Analog and Mixed-Signal Multi-Site Testing. J Electron Test, vol. 39, pp. 57–69.
- [3] Farayola, P.O., Bruce, I., Chaganti, S.K., Sheikh, A., Ravi, S. and Chen, D. (2022). A Polynomial Transform Method for Hardware Systematic Error Identification and Correction in Semiconductor Multi-Site Testing. J Electron Test, vol. 38, no. 6, pp. 637–651.
- [4] Mehta, Y., Thomas, S. and Babakhani, A. (2022). A 140–220-GHz Low-Noise Amplifier With 6-dB Minimum Noise Figure and 80-GHz Bandwidth in 130-nm SiGe BiCMOS. *IEEE Microwave and Wireless Technology Letters*, vol. 33, no. 2, pp. 200-203.
- [5] Huang, C.H., Mandal, A., Peña-Colaiocco, D., Da Silva, E.P. and Sathe, V. (2023). Regenerative Breaking: Optimal Energy Recycling for Energy Minimization in Duty-Cycled Domains. *IEEE Journal of Solid-State Circuits*, vol. 58, no. 1, pp. 68-77.
- [6] Lee, S., Kang, T., Song, S., Kwon, K. and Flynn, M. (2023). An 81.6 dB SNDR 15.625 MHz BW Third-Order CT SDM With a True Time-Interleaving Noise-Shaping Quantizer. *IEEE Journal of Solid-State Circuits*, vol. 58, no. 4, pp. 929-938.
- [7] Avci, M.E. and Ozev, S. (2023). Low-Overhead RF Impedance Measurement Using Periodic Structures. *IEEE Transactions on Microwave Theory and Techniques*.
- [8] Tang, C.I., Deng, X. and Takashima, Y. (2022). Real-Time CGH Generation by CUDA-OpenGL Interoperability for Adaptive Beam Steering with a MEMS Phase SLM. *Micromachines*, vol. 13, no. 9, pp. 1527.
- [9] Kang, E., Choi, H., Hellman, B., Rodriguez, J., Smith, B., Deng, X., Liu, P., Lee, T.L.T., Evans, E., Hong, Y. and Guan, J. (2022). All-MEMS Lidar Using Hybrid Optical Architecture with Digital Micromirror Devices and a 2D-MEMS Mirror. *Micromachines*, vol. 13, no. 9, pp. 1444.

- [10] Guan, J., Dong, Z., Deng, X. and Takashima, Y. (2022). Optical Enhancement of Diffraction Efficiency of Texas Instruments Phase Light Modulator for Beam Steering in Near Infrared. *Micromachines*, vol. 13, no. 9, pp. 1393.
- [11] Deng, X., Tang, C.I., Luo, C. and Takashima, Y. (2022). Diffraction Efficiency of MEMS Phase Light Modulator, TI-PLM, for Quasi-Continuous and Multi-Point Beam Steering. *Micromachines*, vol. 13, no. 6, pp. 966.
- [12] Vankayalapati, B.T., Farhadi, M., Sajadi, R., Akin, B. and Tan, H. (2022). A Practical Switch Condition Monitoring Solution for SiC Traction Inverters. *IEEE Journal of Emerging and Selected Topics in Power Electronics*, vol. 11, no. 2, pp. 2190-2202.
- [13] Wei, K., Kwak, J.W. and Ma, D.B. (2022). An Encrypted On-Chip Power Supply With Random Parallel Power Injection and Charge Recycling Against Power/EM Side-Channel Attacks. *IEEE Transactions on Power Electronics*, vol. 38, no. 1, pp. 500-509.
- [14] Mehta, A., Wang, Q., Shichijo, S. and Kim, M.J. (2022). Characterization of GaN E-mode HEMT Devices by In-Situ STEM Electrical Biasing. *Microscopy and Microanalysis*, vol. 28, no. S1, pp. 2276-2277.
- [15] Lee, S., Kang, T., Bell, J., Haghighat, M., Martinez, A. and Flynn, M.P. (2022). An Eight-Element Frequency-Selective Acoustic Beamformer and Bitstream Feature Extractor. *IEEE Journal of Solid-State Circuits*, vol. 57, no. 6, pp. 1812-1823.
- [16] Zhou, Z., Tang, N., Nguyen, B., Hong, W., Pande, P.P., Krishnamurthy, R.K. and Heo, D. (2022). An Inductor-First Single-Inductor Multiple-Output Hybrid DC–DC Converter With Integrated Flying Capacitor for SoC Applications. *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 69, no. 12, pp. 4823-4836.
- [17] Xu, C., Vankayalapati, B.T., Yang, F. and Akin, B. (2022). A Reconfigurable AC Power Cycling Test Setup for Comprehensive Reliability Evaluation of GaN HEMTs. *IEEE Transactions on Industry Applications*, vol. 59, no. 1, pp. 1109-1117.
- [18] Van Marter, J.P., Dabak, A.G., Al-Dhahir, N. and Torlak, M. (2023). Support Vector Regression for Bluetooth Ranging in Multipath Environments. *IEEE Internet of Things Journal*, vol. 10, no. 13, pp. 11533-11546.
- [19] Huang, T.W., Huang, Y.C., Chien, C., Chiang, K.C. and Tsai, J.H. (2023). A 38-GHz demodulator with high image rejection in 65 nm-CMOS process. *International Journal of Microwave and Wireless Technologies*, pp. 1-7.
- [20] Lin, Y.H., Cheng, J.H., Chang, L.C., Lin, W.J., Tsai, J.H. and Huang, T.W. (2022). A broadband MFCW agile radar concept for vital-sign detection under various thoracic movements. *IEEE Transactions on Microwave Theory and Techniques*, vol. 70, no. 8, pp. 4056-4070.
- [21] Huang, W.Z., Wang, Y., Liu, S.W., Chiong, C.C. and Wang, H. (2023). A 30–50-GHz Ultralow-Power Low-Noise Amplifier With Second-Stage Current-Reuse for Radio Astronomical Receivers in 90-nm CMOS Process. *IEEE Microwave and Wireless Technology Letters*, vol. 33, no. 5, pp. 555-558.
- [22] Wang, Y., Chiu, T.Y., Chiu, T.Y., Yu, K.J., Teng, Y.M., Huang, G.W., Li, C.H., Kuo, C.N., Chiong, C.C. and Wang, H. (2022). A Ka- to G-Band Detector With 5.5-GHz Video Bandwidth Using a Modified Traveling-Wave Structure in 65-nm CMOS Technology. *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 70, no. 4, pp. 1371-1375.

- [23] Li, J. and Rohrer, R. (2022). Efficient Static-Driven Integration for Step-Function Transient Simulation. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 41, no. 7, pp. 2213-2222.
- [24] Sukharev, V., Kteyan, A., Najm, F.N., Yi, Y.H., Kim, C.H., Choy, J.H., Torosyan, S. and Zhu, Y. (2022). Experimental Validation of a Novel Methodology for Electromigration Assessment in On-Chip Power Grids. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 41, no. 11, pp. 4837-4850.

### **Invited Presentations**

- [1] Zhou, P., Edwards, A.J., Mancoff, F.B., Aggarwal, S. and Friedman, J.S. (2022, October 31 - November 4). Binarized Neuromorphic Inference Network with STT MTJ Synapses [Invited Presentation]. 2022 Magnetism and Magnetic Materials Conference, Minneapolis, MN, United States.
- [2] Friedman, J.S., Zhou, P., Hu, X., Edwards, A.J., Hassan, N., Brigner, W.H., Garcia-Sanchez, F., Bennett, C.H., Velasquez, A., Incorvia, J.A.C., Mancoff, F.B. and Aggarwal, S. (2022, October). Unsupervised Learning and Recognition with Single-Domain and Domain-Wall MTJs [Invited Presentation]. Tohoku University Center for Science and Innovation in Spintronics Symposium, Tohoku, Japan.
- [3] Zhou, P., Edwards, A.J., Mancoff, F.B., Houssameddine, D., Aggarwal, S. and Friedman, J.S. (2022, August 29-31). Experimental Demonstration of Neuromorphic Network with STT MTJ Synapses [Invited Presentation]. IEEE The Magnetic Recording Conference, San Jose, CA, United States.
- [4] Zhou, P., Edwards, A.J., Mancoff, F.B., Houssameddine, D., Aggarwal, S. and Friedman, J.S. (2022, October 4). Binarized Neuromorphic Inference Network with STT MTJ Synapses [Invited Presentation]. Proc. SPIE PC12205 Spintronics XV, San Diego, CA, United States.
- [5] Zhou, P., Edwards, A.J., Mancoff, F.B., Houssameddine, D., Aggarwal, S. and Friedman, J.S. (2022, July 27 - August 1). Binarized Neuromorphic Computing with STT MTJ Synapses [Invited Presentation]. International Conference on Neuromorphic Systems (ICONS), Knoxville, TN, United States.
- [6] Takashima, Y. (2023, April 18-21). Lidar and Near-to-eye AR Display by Angular and Spatial Light Modulation with MEMS SLM [Keynote Presentation]. Laser Display and Lighting Conference 2023, Yokohama, Kanagawa, Japan.
- [7] Takashima, Y. (2022, September 5-9). Unlocking the potential of TI-DMD for AR display and lidar engine: Image and beam steering by MEMS SLMs [Invited Presentation]. ICO-25 / OWLS-16, Dresden, Germany.
- [8] Takashima, Y. (2022, July 20). Beam and Image Steering for Mobility Photonics Application of MEMS SLM for solid-state lidar and AR display [Invited Presentation]. The 2nd Mobility Photonics Research Meeting, Optoelectronics Industry and Technology Development Association (OITDA), Tokyo, Japan.
- [9] Takashima, Y. (2022, July 31 - August 5). Angular and Spatial Light Modulation for Lidar and AR display [Invited Presentation]. International Symposium on Imaging, Sensing, and Optical Memory 2022, Sapporo, Japan.
- [10] Mukhopadhyay, S. (2022, October 9-13). On ageing of on-chip voltage regulators [Invited Presentation]. 2022 International Integrated Reliability Workshop (IIRW), Fallen Leaf Lake, CA, United States.

# **Contact TxACE**

To become a TxACE partner, please contact: Kenneth K. O, Director, 972-883-5556

To discuss our core facilities in Dallas and how to obtain access to them, and to receive future TxACE requests for proposals, please contact: Lucien Finley, lucien.finley@utdallas.edu, 972-883-5553

TxACE is based at The University of Texas at Dallas. We are located in the Engineering and Computer Science North building, ECSN 3.302.

> Texas Analog Center of Excellence, The University of Texas at Dallas, EC37, 800 West Campbell Road, Richardson, Texas 75080
>
> centers.utdallas.edu/txace




