

Hao, Yijia (2025) *Analog integrated circuit design empowered by artificial intelligence techniques: from building blocks to small systems.* PhD thesis.

https://theses.gla.ac.uk/85613/

Copyright and moral rights for this work are retained by the author

A copy can be downloaded for personal non-commercial research or study, without prior permission or charge

This work cannot be reproduced or quoted extensively from without first obtaining permission from the author

The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author

When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given

Enlighten: Theses
https://theses.gla.ac.uk/
research-enlighten@glasgow.ac.uk

# Analog Integrated Circuit Design Empowered by Artificial Intelligence Techniques: From Building Blocks to Small Systems

## Yijia Hao

SUBMITTED IN FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

JAMES WATT SCHOOL OF ENGINEERING
COLLEGE OF SCIENCE AND ENGINEERING





## Abstract

Analog integrated circuit (IC) design remains a major bottleneck in modern electronic systems due to its reliance on expertise-driven iteration and the growing complexity of performance, robustness, and variability requirements. This dissertation aims to advance artificial intelligence (AI)-driven analog IC design methodologies across three hierarchical levels: building blocks, subsystems, and systems.

At the block level, a design-insight-aware comparison is conducted across representative circuits including a StrongARM comparator, two Miller-compensated operational amplifiers, and an LC voltage-controlled oscillator (VCO) to benchmark AI-assisted optimization against conventional systematic flows. Post-layout and silicon measurement results demonstrate that AI-assisted frameworks can achieve superior performance and robustness while preserving design intent.

At the subsystem level, the first AI-driven co-design flow for VCO–LDO integration is introduced. By simultaneously optimizing both blocks under supply-noise coupling and frequency-pushing effects, the method improves phase noise, figure of merit (FoM), and runtime efficiency compared to sequential design, demonstrating the value of cross-block optimization.

At the system level, a global–local optimization framework with multi-fidelity simulation is proposed for asynchronous successive-approximation register (SAR) analog-to-digital converters (ADCs). This methodology cascades surrogate model-based global exploration with parallel pattern search refinement, achieving competitive FoM across 12 design cases (7–12 bit, up to 250 MHz) with significantly reduced runtime and minimal manual effort.

Together, these contributions establish a practical pathway for AI-driven analog IC design automation. By combining black-box optimization, learning-based acceleration, and designer-in-the-loop validation, this work demonstrates measurable gains in design quality, robustness, and time efficiency, offering a foundation for future system-level EDA tools.

## Contents

- Abstract
- List of Publications
- List of Tables
- List of Figures
- Acknowledgements
- Declaration
- Abbreviations
- 1 Introduction
  - 1.1 Analog Circuit Design Automation
  - 1.2 Optimization-Based Analog IC Sizing
  - 1.3 Challenges and Research Objectives
    - 1.3.1 Block Level Sizing
    - 1.3.2 Subsystem Co-Design
    - 1.3.3 System Level Sizing
  - 1.4 Main Contributions
  - 1.5 Thesis Outline
- 2 Background and Literature Review
  - 2.1 Background on Analog Circuits
  - 2.2 Three Generations of Analog Circuit Design Methods
    - 2.2.1 Generation I: Quadratic Hand Analysis
    - 2.2.2 Generation II: The $g_m/I_D$ Method
    - 2.2.3 Generation III: Optimization-Centric Sizing
    - 2.2.4 Common Verification Loop
    - 2.2.5 Comparison
  - 2.3 Local Optimization Methods
    - 2.3.1 Gradient-Based Methods
    - 2.3.2 Nelder-Mead (Simplex)
    - 2.3.3 Pattern Search Method
    - 2.3.4 Summary
  - 2.4 Global Optimization Methods
    - 2.4.1 Evolutionary and Swarm-Based Heuristics
    - 2.4.2 Surrogate Model-Assisted Optimization
    - 2.4.3 ANN vs. GPR
  - 2.5 Summary
- 3 Assessing AI-Empowered Optimization Techniques for Analog Building Block Sizing
  - 3.1 Introduction
  - 3.2 Contributions
  - 3.3 The AI-Empowered Analog Building Block Sizing Approach
    - 3.3.1 AI-Empowered Sizing Framework
    - 3.3.2 Global Optimization Engine
  - 3.4 Comparative Study Using Four Design Cases
    - 3.4.1 StrongARM Latch Comparator
    - 3.4.2 Two-Stage Miller-Compensated Op-Amp (3.3 V)
    - 3.4.3 Two-Stage Miller-Compensated Op-Amp (1.8 V)
    - 3.4.4 LC Oscillator
    - 3.4.5 Discussion
  - 3.5 Summary
- 4 Subsystem Design: VCO with LDO Integration
  - 4.1 Introduction
  - 4.2 Literature Review
  - 4.3 Contributions
  - 4.4 Problem Formulation
    - 4.4.1 Architecture of LDO-VCO
    - 4.4.2 Design Variables
    - 4.4.3 Testbench and Measures
    - 4.4.4 Objective and Constraints
  - 4.5 AI-Driven Co-Design Method
    - 4.5.1 Sizing Flow and Considerations
    - 4.5.2 Sizing Algorithm
  - 4.6 Pre-layout Sizing Results and Analysis
  - 4.7 Post-Layout Results and Discussion
  - 4.8 Summary
- 5 System-Level Design Automation for SAR ADCs
  - 5.1 Background
  - 5.2 Literature Review
  - 5.3 Contributions
  - 5.4 Architecture and Design Considerations of SAR ADC
    - 5.4.1 Architecture and Operation
    - 5.4.2 Design Considerations and Trade-Offs
  - 5.5 Methodology
    - 5.5.1 Overview
    - 5.5.2 Automatic Specification Derivation
    - 5.5.3 Low-Cost Simulation-Based Global Optimization
    - 5.5.4 Fast Local Optimization Using Parallel Multi-Fidelity Transient Simulation
  - 5.6 Experimental Results
  - 5.7 Summary
- 6 Conclusions and Future Work
  - 6.1 Analog Building Block Sizing
  - 6.2 LDO and VCO Co-Design
  - 6.3 SAR ADC Design
- Appendices
  - A Tables

## List of Publications

1. M. Chen, Y. Hao, et al., "Trade-off-Aware Analog Circuit Sizing Based on a Multitask Surrogate Model-Assisted Evolutionary Algorithm," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025. (Under Review)
2. Y. Hao, et al., "A Global–Local Optimization Approach for Asynchronous SAR ADC Design," IEEE Transactions on Circuits and Systems II: Express Briefs (TCAS-II), 2025. (Under Review)
3. Y. Hao, et al., "From Systematic to Intelligent: Assessing AI-Empowered Optimization Techniques for Analog Building Block Sizing," IEEE Access, 2025.
4. Y. Hao, et al., "An AI-Driven EDA Algorithm-Empowered VCO and LDO Co-Design Method," IEEE International Conference on Synthesis, Modeling, Analysis and Simulation Methods, and Applications to Circuit Design (SMACD), 2025.
5. J. Wang, Y. Hao, et al., "Pose-Guided Focal Loss for Enhancing Vision Transformers in Continuous Sign Language Recognition," IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP), 2025.
6. Y. Hao, et al., "Integrating AI in Engineering Education: A Comprehensive Review and Student-Informed Module Design for UK Students," IEEE Transactions on Education, 2025.
7. A. Alexandrou, Y. Hao, et al., "Properties of Textured Piezoceramics Measured with Miniature Samples," IEEE International Ultrasonics, Ferroelectrics, and Frequency Control Symposium (UFFC), 2024.
8. Y. Hao, et al., "Design of a Two-Stage Miller-Compensated Operational Amplifier Using an EDA Tool-Centered Approach," IEEE International Conference on Synthesis, Modeling, Analysis and Simulation Methods, and Applications to Circuit Design (SMACD), 2024.

## List of Tables

| Table | Caption |
|-------|---------|
| 2.1 | A comparison between static and dynamic analog circuits |
| 2.2 | |
| 3.1 | Design variables and search ranges of StrongARM latch comparator |
| 3.2 | Pre-layout performance values of the design obtained by AI-empowered method and the reference design of StrongARM latch comparator |
| 3.3 | Post-layout performance values of the AI-empowered design (left) and the reference design of StrongARM latch comparator (right) |
| 3.4 | Design variables and search ranges of two-stage Miller-compensated op-amp |
| 3.5 | Pre-layout performance values of the AI-empowered designs and the reference design of the two-stage Miller-compensated op-amp |
| 3.6 | Measured performance values of the AI-empowered design and the reference design of the two-stage Miller-compensated op-amp |
| 3.7 | Design variables and search ranges of low power design |
| 3.8 | Pre-layout performance values of the AI-empowered 3.3 V design and 1.8 V design |
| 3.9 | Measured performance values of the AI-empowered 3.3 V design and 1.8 V design of the two-stage Miller-compensated op-amp |
| 3.10 | … amplifier and the state-of-the-art |
| 3.11 | Design variables and search ranges of the CMOS cross-coupled LC oscillator |
| 3.12 | Performance values of the AI-empowered design and the reference design (pre-layout simulation results) of the CMOS cross-coupled LC oscillator |
| 3.13 | Performance values of the AI-empowered design (measurement result) of the CMOS cross-coupled LC oscillator and the state-of-the-art |
| 3.14 | Summary of the comparison of typical contemporary AI-empowered and conventional systematic manual sizing methods based on the four case studies |
| 4.1 | Design variables and search ranges of the CMOS cross-coupled LC oscillator and the LDO |
| 4.2 | Specifications and pre-layout simulation results of the sequentially and co-designed LDO-VCO |
| 4.3 | Specifications and post-layout simulation results of the co-designed LDO-VCO. |
| 5.1 | Summary of specifications used in optimization |
| 5.2 | Simulation performance versus number of segments $M$ with a 12-bit 20 MHz SAR ADC |
| 5.3 | Comparison with prior SAR ADC designs |
| 1 | Pre-layout performance comparison of the AI-empowered design and the reference design of the two-stage Miller-compensated op-amp with an additional area constraint |

## List of Figures

| Figure | Caption |
|--------|---------|
| 1.1 | The design optimization paradigms in analog circuit sizing |
| 2.1 | The $g_m/I_D$ based method |
| 2.2 | Three generations of analog design methods |
| 2.3 | The illustration of PS in a 2-D problem |
| 2.4 | The surrogate model-assisted optimization flow |
| 2.5 | An 1D GPR example |
| 2.6 | An ANN with two hidden layers |
| 2.7 | The illustration of Beta distribution |
| 3.1 | The flow diagram of the AI-empowered analog building block sizing approach. |
| 3.2 | Workflow of the experimental implementation |
| 3.3 | Schematic of the classic StrongARM latch comparator |
| 3.4 | Transient response of the AI-empowered design and the reference design [94] at output nodes $V_{OUTP}$ and $V_{OUTN}$ |
| 3.5 | Layouts for the AI-empowered design and the reference design [94] of the StrongARM latch comparator |
| 3.6 | Comparison of the post-layout responses of the AI-empowered design and the reference design [94] of the StrongARM latch comparator |
| 3.7 | Comparison of the speed of the AI-empowered design and the reference design [94] across 16 corners of the StrongARM latch comparator, with post-layout (a) delay time and (b) reset time grouped by four process corners |
| 3.8 | Schematic of the two-stage Miller-compensated op-amp. The devices shown in gray (M12–M14) are auxiliary transistors used for startup and shutdown control. |
| 3.9 | A comparison of the op-amp unity-gain step responses. The input step size is 1 V |
| 3.10 | CMRR of the AI-empowered design 3rd iteration |
| 3.11 | Chip microphotograph of the reference design, the AI-empowered design, 3.3V CCIA, and 1.8V CCIA on the same die |
| 3.12 | Comparison of the measurement results of transient responses between the AI-empowered design and the reference design of the two-stage Miller-compensated op-amp |
| 3.13 | Schematic of the CMOS cross-coupled LC oscillator. An 8-bit capacitor bank is used for frequency tuning |
| 3.14 | Chip microphotograph of the AI-empowered oscillator design |
| 3.15 | The measured PN performance of the AI-empowered design of the oscillator |
| 4.1 | The VCO and LDO co-design method including the schematic diagram of the LC-tank VCO with an integrated LDO. Two design approaches: the sequential approach, which involves two distinct design phases, and the co-design approach, which optimizes both building blocks simultaneously |
| 4.2 | (a) Output noise of the LDOs. (b) PSR of the LDOs. (c) Phase noise performance of the VCO designs with 1.2 V ideal supply and with LDO |
| 4.3 | Oscillation transients for LDO-VCO designs. The co-designed VCO has a slower oscillation start-up and smaller oscillation amplitude |
| 4.4 | (a) Corner spread of PN for the co-designed LDO-VCO. (b) Corner spread of PN for the sequentially designed LDO-VCO |
| 4.5 | Co-designed LDO-VCO layout |
| 5.1 | The architecture of an N-bit asynchronous SAR ADC |
| 5.2 | The circuit diagram for SAR ADC building blocks, including: (a) bootstrap switch, (b) CDAC, (c) SAR logic, and (d) dynamic comparator |
| 5.3 | The timing diagrams of synchronous and asynchronous SAR ADCs |
| 5.4 | The flow diagram of the proposed global-local sizing approach. The red blocks are based on the single-point test, while the blue block represents the full sine wave test |
| 5.5 | Illustration of phase-shifted and time-interleaved parallel transient simulation: 16-point coverage via $4\times4$ samples |
| 5.6 | SNDR and FoM of 12 design cases: (a) 12 bit (b) 7 bit. 10 design cases with $\alpha = 1$ and 2 design cases with $\alpha = 2$ |
| 5.7 | An example sizing process for the 12 bit SAR ADC, with the convergence plots of both the global and local optimization. |

## Acknowledgements

This dissertation marks the culmination of several years of research, learning, and perseverance, and it would not have been possible without the guidance and support of many individuals to whom I am sincerely grateful.

First and foremost, I would like to express my heartfelt gratitude to my academic supervisors, Prof. Bo Liu and Prof. Sandy Cochran, as well as my industrial supervisor, Dr. Miguel Gandara, for their invaluable guidance, insightful feedback, and unwavering encouragement throughout this journey. Their patience, expertise, and dedication have been instrumental in shaping both this thesis and my growth as a researcher.

I would like to extend my gratitude to EPSRC for funding this research and supplying the resources that made it possible. This support played a vital role in enabling me to complete this thesis.

My heartfelt thanks go to my collaborators and colleagues, including Dr. Maarten Strackx (Magics Technologies), Dr. Srinjoy Mitra (University of Edinburgh), Prof. Francisco V. Fernandez (Universidad de Sevilla), Mr. Ken Li and Prof. Shaolan Li (Georgia Tech), Prof. Rami Ghannam, Mr. Minyang Chen, Mr. Yushi Liu, and Mr. Jingyan Wang (University of Glasgow). Their valuable input, thoughtful discussions, and generous sharing of expertise have been a constant source of inspiration, and I feel truly fortunate to have worked alongside them. I would also like to thank Dr. Alexandru Moldovan, Dr. Bartas Abaravicius, Dr. Meraj Ahmet and Mr. Huxi Wang for their training and help with my first tapeout and measurement.

I would also like to acknowledge the support of the James Watt School of Engineering, whose facilities, resources, and administrative assistance have provided an excellent environment in which to conduct my studies. Special thanks go to FUSE CDT for their help behind the scenes.

Finally, I am deeply thankful to my family, friends, and boyfriend for their love, patience, and encouragement during the most challenging times. Their constant support has sustained me through the challenges of this journey and made its completion possible. A special thanks to my cat, whose companionship and occasional distractions reminded me to take breaks and kept my spirits high.

To all who have contributed in ways big and small, I extend my deepest gratitude.

## Declaration

Name: Yijia Hao

Registration Number: XXXXXXX

I certify that the thesis presented here for examination for a PhD degree of the University of Glasgow is solely my own work other than where I have clearly indicated that it is the work of others (in which case the extent of any work carried out jointly by me and any other person is clearly identified in it) and that the thesis has not been edited by a third party beyond what is permitted by the University's PGR Code of Practice.

The copyright of this thesis rests with the author. No quotation from it is permitted without full acknowledgement.

I declare that the thesis does not include work forming part of a thesis presented successfully for another degree.

I declare that this thesis has been produced in accordance with the University of Glasgow's Code of Good Practice in Research.

I acknowledge that if any issues are raised regarding good research practice based on review of the thesis, the examination may be postponed pending the outcome of any investigation of the issues.

Yijia Hao

## Abbreviations

| Abbreviation | Meaning |
|--------------|---------|
| ABC | Artificial Bee Colony |
| AC | Alternating Current |
| ACO | Ant Colony Optimization |
| ADM | Differential-Mode Gain |
| AI | Artificial Intelligence |
| AMS | Analog and Mixed-Signal |
| ANNs | Artificial Neural Networks |
| BGR | Bandgap Reference |
| BO | Bayesian Optimization |
| CCIA | Capacitively Coupled Instrumentation Amplifier |
| CDF | Cumulative Distribution Function |
| CMRR | Common Mode Rejection Ratio |
| DACs | Digital-to-Analog Converters |
| DFFs | D Flip-Flops |
| DE | Differential Evolution |
| DNL | Differential Nonlinearity |
| EDA | Electronic Design Automation |
| EI | Expected Improvement |
| ESSAB | Efficient Surrogate Model-Assisted Sizing Method for High-Performance Analog Building Blocks |
| FFT | Fast Fourier Transform |
| FF | Fast NMOS/Fast PMOS |
| FoM | Figure of Merit |
| FS | Fast NMOS/Slow PMOS |
| GA | Genetic Algorithm |
| GBW | Gain Bandwidth |
| GPR | Gaussian Process Regression |
| GWO | Grey Wolf Optimizer |
| HT | High Temperature |
| HV | High Voltage |
| ICs | Integrated Circuits |
| INL | Integral Nonlinearity |
| IoT | Internet of Things |
| IRN | Input-Referred Noise |
| LDOs | Low-Dropout Regulators |
| LNA | Low-Noise Amplifier |
| LCB | Lower Confidence Bound |
| LPTV | Linear Periodically Time-Varying |
| LSB | Least Significant Bit |
| LTE | Long-Term Evolution |
| LTI | Linear Time-Invariant |
| LT | Low Temperature |
| LUTs | Lookup Tables |
| LV | Low Voltage |
| MC | Monte Carlo |
| MSB | Most Significant Bit |
| NEF | Noise Efficiency Factor |
| NM | Nelder-Mead |
| op-amps | Operational Amplifiers |
| PDF | Probability Density Function |
| PDK | Process Design Kit |
| PFI | Probability of Further Improvement |
| PI | Probability of Improvement |
| PLLs | Phase-Locked Loops |
| PN | Phase Noise |
| PNOISE | Periodic Noise |
| PS | Pattern Search |
| PSO | Particle Swarm Optimization |
| PSR | Power Supply Rejection |
| PSRR | Power Supply Rejection Ratio |
| PSS | Periodic Steady-State |
| PVT | Process, Voltage, Temperature |
| RF | Radio Frequency |
| RL | Reinforcement Learning |
| SAR ADCs | Successive Approximation Register Analog-to-Digital Converters |
| S/H | Sample-and-Hold |
| SC | Switched-Capacitor |
| SerDes | Serializer/Deserializer |
| SF | Slow NMOS/Fast PMOS |
| SNDR | Signal-to-Noise and Distortion Ratio |
| SoC | System on Chip |
| SPGP | Sparse Pseudo-Input Gaussian Process |
| SQP | Sequential Quadratic Programming |
| SS | Slow NMOS/Slow PMOS |
| SSRE | Step Size Ratio Error |
| THD | Total Harmonic Distortion |
| TR | Tuning Range |
| UGB | Unity Gain Bandwidth |
| UCB | Upper Confidence Bound |
| VCOs | Voltage-Controlled Oscillators |
| WCC | Worst Case Corner |

## Chapter 1

## Introduction

## 1.1 Analog Circuit Design Automation

In modern integrated circuit (IC) design, analog circuit design remains one of the most complex and resource-intensive tasks. High-performance analog blocks, such as operational amplifiers (op-amps), low-dropout regulators (LDOs), comparators, and data converters, require delicate trade-offs among gain, bandwidth, noise, stability, power consumption, and robustness across process, voltage, and temperature (PVT) variations [1]–[3]. Traditional manual flows rely on designer expertise and heuristic iteration, and they become increasingly inefficient and error-prone as circuit complexity grows [4]. As a result, automated analog circuit design has become an indispensable research direction, offering the promise of reducing design turnaround time while improving performance, robustness, and yield.

The field of electronic design automation (EDA) has a long history. Digital design automation matured rapidly beginning in the 1980s with the advent of logic synthesis, static timing analysis, and automated place-and-route [5], [6]. This maturity allowed digital ICs to scale in complexity and performance, while design cycles were kept manageable [7]. By contrast, analog design automation followed a slower trajectory. Early efforts in the 1980s and 1990s focused on symbolic analysis, sensitivity-based optimization, and quadratic law approximations for MOSFET behavior [8]–[10]. In the 2000s, geometric programming and convex optimization methods [11] provided new avenues, particularly for circuits where design constraints could be expressed as posynomials. However, many analog circuits (e.g., dynamic comparators, oscillators, switched-capacitor filters) exhibit strong nonlinearity, time-varying dynamics, and layout-dependent effects that defy simple convex formulations.
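For reference, the posynomial form mentioned above can be written as follows. This is the standard textbook definition; the symbols ($c_k$, $a_{ik}$, $K$, $n$) are chosen here for illustration and are not notation taken from the thesis.

$$
f(\mathbf{x}) = \sum_{k=1}^{K} c_k \prod_{i=1}^{n} x_i^{a_{ik}}, \qquad c_k > 0,\quad a_{ik} \in \mathbb{R},\quad x_i > 0 .
$$

Under the change of variables $y_i = \log x_i$, $\log f$ becomes a convex function of $\mathbf{y}$, which is why a geometric program with posynomial objective and constraints can be solved to global optimality. Circuit behavior that cannot be expressed in this form falls outside the convex framework.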

In the last decade, research has increasingly shifted toward simulation-driven optimization and machine learning-assisted frameworks. These approaches leverage SPICE-in-the-loop evaluation, surrogate models, and evolutionary search to explore large design spaces without relying exclusively on analytical equations [12]–[17]. This trend reflects a recognition that accurate modeling of modern CMOS circuits requires high-fidelity simulation data.

Commercial EDA platforms such as Cadence Virtuoso ADE and Synopsys Custom Compiler offer environments for schematic capture, simulation management, layout generation, and PVT verification. While these tools provide automation at the verification and layout levels, true automation at the circuit sizing level remains limited. Designers are still responsible for selecting architectures, biasing strategies, and device dimensions. Built-in optimizers (e.g., gradient descent and heuristic search) are often generic and do not scale effectively to the nonlinear, constraint-rich nature of analog design. Therefore, practical analog design continues to rely heavily on expertise-driven iteration, which is time-consuming, error-prone, and difficult to scale across increasing design complexity and variability. This gap motivates the development of artificial intelligence (AI)-driven optimization methodologies.

## 1.2 Optimization-Based Analog IC Sizing

Analog circuit optimization in this context refers to the systematic selection of design variables (e.g., transistor dimensions, biasing, passive component values) to maximize performance while satisfying constraints such as stability, noise, linearity, and PVT robustness. In the analog design community, this task is commonly called sizing. Unlike digital logic synthesis, the analog circuit sizing problem is inherently nonlinear, nonconvex, and strongly affected by PVT variations.

Analog circuit sizing problems can be formulated either as single-objective or multi-objective optimization tasks depending on the design goals [18]. Single-objective optimization [19], [20] focuses on improving a dominant metric, such as minimizing power or maximizing bandwidth, while treating other specifications as constraints. This contrasts with multi-objective optimization [21]–[23], which explicitly formulates competing goals and generates Pareto-optimal trade-off fronts. While multi-objective approaches provide richer design insight, they also incur significantly higher computational cost. In this thesis, only single-objective optimization is considered, targeting one design goal while ensuring all other specifications are satisfied. Over the past two decades, optimization for analog circuit design has evolved from equation-driven formulations to simulation-based methods and, more recently, to learning-assisted approaches, as depicted in Fig. 1.1. Each generation reflects a different balance between interpretability, accuracy, and computational cost.



Figure 1.1: The design optimization paradigms in analog circuit sizing.

Equation-driven methods rely on analytical circuit models and first-order approximations, often cast into convex or geometric programming formulations [11]. These techniques offer high interpretability and computational efficiency, making them effective for early-stage design. However, their accuracy degrades in nanoscale technologies, where velocity saturation, mismatch, and layout parasitics dominate.

Simulation-driven methods embed SPICE simulations directly into optimization loops [15], [16], [24], ensuring high fidelity for dynamic and nonlinear circuits such as comparators, oscillators, and data converters. Their main drawback lies in inefficiency: repeated transient and noise simulations lead to prohibitive runtimes, particularly when PVT corners and Monte Carlo (MC) analyses are included.

To alleviate simulation cost, surrogate or learning-assisted methods approximate circuit responses using statistical or machine learning models. Gaussian process regression (GPR) [25], [26] and artificial neural networks (ANNs) [27] have been adopted within surrogate model-assisted frameworks to accelerate design space exploration. While these methods drastically reduce the number of expensive simulations, they introduce challenges in model fidelity, training overhead, and generalization across design spaces.

Overall, design optimization for analog circuits needs to handle complex performance trade-offs, nonconvex search spaces, and variability-aware robustness, while ensuring accuracy and efficiency. Recent trends move toward hybrid frameworks that combine global and local optimization, multi-fidelity simulations, and parallel computing to achieve both efficiency and accuracy.

## 1.3 Challenges and Research Objectives

Although EDA has evolved over several decades, its application in practical analog circuit design remains limited. Most EDA-driven approaches emphasize algorithm improvement but are often criticized for failing to capture design intent. Consequently, analog design continues to rely on manual iteration and expert intuition, particularly at the subsystem and system levels where strong cross-block interactions exist. Motivated by this gap, this research aims to bridge the divide between EDA methods and practical analog design by developing a next-generation design methodology based on AI-driven EDA tools. The thesis first validates an AI-empowered analog IC sizing framework with design insights and silicon results, and then extends it toward system-level sizing to account for cross-block effects. Specifically, this research focuses on AI-empowered optimization for analog building blocks, the development of a co-design method for interacting modules, and the implementation of system-level optimization. Given that these represent three specific research topics with unique challenges, the following subsections provide a brief analysis of each. A more detailed discussion of each topic can be found in the corresponding chapters outlined in Section 1.5.

## 1.3.1 Block Level Sizing

At the block level, analog circuit optimization represents the foundation of EDA-based analog IC design. While recent advances have significantly improved optimization algorithms and surrogate modeling techniques, two central challenges remain unresolved:

- Lack of Wide Silicon Validation. Most optimization approaches are evaluated only on limited analog building blocks through simulation results. Comprehensive silicon-based validation across multiple building blocks and technology nodes is scarce. Without such evidence, it is difficult to establish confidence in the wide applicability of proposed optimization frameworks.
- Lack of Design Insight-Based Comparison. Existing evaluations focus primarily on algorithmic outcomes such as convergence, runtime, or numerical performance metrics. Far less attention is given to whether the optimization results preserve meaningful design insights, such as power–speed–noise trade-offs. This gap limits the ability to judge whether automated techniques align with designer intent and can be seamlessly adopted in practice.

Motivated by these limitations, this study is structured around two key objectives:

- Systematic Comparison across Representative Case Studies. The objective is to perform a comprehensive comparison between manual and AI-empowered design approaches on key analog building blocks such as op-amps, comparators, and oscillators. The study aims to evaluate both design quality and efficiency, with silicon measurement incorporated to ensure practical relevance beyond simulation.
- Benchmarking with Design Insights. The objective is to move beyond numerical metrics to assess whether optimization outcomes align with established design knowledge and designer intent. By examining circuit sizings, trade-offs, and interpretability, the study seeks to determine the extent to which automated methods can complement or enhance human expertise.

## 1.3.2 Subsystem Co-Design

In conventional analog IC design, individual blocks such as LDOs or voltage-controlled oscillators (VCOs) are often optimized in isolation under fixed assumptions about their operating environment. While this block-level approach simplifies the design process, it overlooks important interactions between connected circuits, which can lead to suboptimal overall performance once the blocks are integrated. Considering a VCO integrated with an LDO as an example, several key challenges can be identified:

- Limitations of Sequential Design. Conventional practice treats the VCO and LDO as independent blocks: the VCO is optimized under an ideal supply and the LDO is later tuned around it. This separation neglects block interactions and often forces iterative redesign, reducing design efficiency.
- Noise Co-Optimization Challenges. Low phase noise (PN) depends on the joint treatment of multiple sources such as thermal, flicker, and regulator-induced supply noise, which interact in nontrivial ways during integration. These coupling effects are difficult to capture with manual design, and block-level optimizations often fail to hold at the subsystem level.

- Scalability Limits of Manual Tuning. With dozens of design variables across both circuits, manual optimization is labor-intensive and error-prone. Evaluating design corners under PVT variations further compounds the complexity, making a systematic automated approach essential.

Motivated by these challenges, this work is guided by two primary objectives:

- Subsystem-Level Co-Design of VCO and LDO. The first objective is to develop an AI-driven co-design methodology that simultaneously optimizes the VCO and LDO as an integrated subsystem. By jointly considering power consumption, frequency pushing effect, and LDO-induced noise, the method seeks to minimize overall PN while maintaining energy efficiency and robustness across corners.
- Efficiency and Robustness Demonstration. The second objective is to validate the proposed co-design flow by implementing it with a machine learning-assisted optimization engine and applying it to a 65 nm CMOS LC-tank VCO with integrated LDO. Its performance is benchmarked against a sequential design flow to demonstrate improvements in figure of merit (FoM), PVT robustness, and runtime efficiency, thereby establishing the practical advantages of subsystem-level co-design.

While this work focuses on the LDO-VCO pair as a representative case study, it should be emphasized that the co-design framework is not restricted to this particular combination of building blocks. The same principles can be readily extended to other subsystem configurations, such as oscillator–mixer pairs.

## 1.3.3 System Level Sizing

Scaling sizing from block-level circuits to system-level architectures introduces new challenges. To tackle these challenges, this work focuses on asynchronous successive-approximation register analog-to-digital converters (SAR ADCs) as a representative case study. For asynchronous SAR ADCs, these challenges manifest as pressing constraints that restrict the applicability of current AI-driven sizing methods:

- Block-Level Optimization Limitations. Prior design sizing methods typically operate at the block level, where individual components such as the comparator, digital-to-analog converters (DACs), and sample-and-hold (S/H) are sized separately. This decomposition requires manual specification allocation and cannot fully capture inter-block interactions, leading to suboptimal system performance and iterative design loops. In some cases, it may fail to meet the requirements of high-resolution and high-speed ADCs.
- High Dimensionality and Long Simulation Time. System-level SAR ADC design involves dozens of design variables spanning transistor sizes, capacitor sizes, and timing parameters. As the number of variables increases, the design space grows exponentially, making exhaustive exploration impractical. Although global optimization methods are theoretically capable of addressing such complexity, their computational cost often results in prohibitively long runtime. In addition, accurate fast Fourier transform (FFT)-based signal-to-noise and distortion ratio (SNDR) characterization requires long simulation durations at Nyquist rate, which is impractical to embed in an iterative optimization loop.

To overcome these challenges, this work is guided by two main objectives:

- Global-Local Optimization with Multi-Fidelity Simulation. To handle the high dimensionality and long simulation time, the first objective is to develop a hierarchical system-level optimization framework that integrates fast global search using ANN-based surrogate modeling with local refinement via parallel multi-fidelity pattern search (PS). The approach balances exploration and exploitation: the global optimizer efficiently scans the design space with low-cost approximations, while the local optimizer ensures convergence to high-quality solutions using accurate full sine-wave simulations.
- System-Level SAR ADC Sizing. The second objective is to validate the proposed framework across 12 design cases spanning 7- and 12-bit resolutions with sampling rates up to 250 MHz. The goal is to demonstrate competitive SNDR and FoM with reduced runtime and minimal manual effort, thereby advancing practical system-level sizing for SAR ADC design.

While the discussion here is centered on SAR ADCs, the underlying issues, such as the limits of block-level optimization and the complexity of high-dimensional system-level search spaces, are common across many analog and mixed-signal (AMS) subsystems. Thus, while SAR ADCs serve as a representative example in this study, the proposed methodology is applicable to a broad range of system-level AMS design problems.

## 1.4 Main Contributions

This dissertation advances next-generation AI-driven design methodology at the block, subsystem, and system levels through three distinct studies that combine black-box optimization with learning-based acceleration and designer-in-the-loop validation. The works demonstrate measurable gains in design quality, PVT robustness, and turnaround time with post-layout or silicon validation.

Block-Level Sizing: from Systematic to Intelligent. A rigorous, design insight-aware comparison was performed between contemporary AI-empowered sizing and conventional systematic (e.g., $g_m/I_D$) methods across four representative blocks: a StrongARM comparator, two Miller-compensated op-amps (standard- and low-power targets), and a cross-coupled LC VCO, spanning technology nodes from $0.35\,\mu\text{m}$ to $65\,\text{nm}$. The flow adopts an AI-driven global search in two phases (worst-case corner optimization followed by all-corner optimization), followed by MC analyses. Designers remain in the loop only to validate design intent and, if needed, adjust specifications, avoiding experience-heavy decisions. The optimizer is based on an online surrogate-assisted differential evolution (DE) framework that trains a light ANN surrogate on-the-fly and selects infill samples via beta distribution-based ranking. Quantitatively, the comparator case achieves a 62% reduction in the power–noise FoM (from 6.32 nW · V to 2.42 nW · V) while meeting all 16 corners, with delay and power both improved over the literature reference. Similar all-corner gains are observed post-layout. In the op-amp case, overshoot in the unity-gain configuration is eliminated by constraint reformulation, and lower power is obtained (476 $\mu$W vs. 856 $\mu$W) while sustaining other performance metrics. For the LC VCO, the measured design reaches a PN of -120.6 dBc/Hz at 1 MHz and a FoM of 187.9 dBc/Hz, competitive with the state of the art. These results, including three silicon validations, show that AI-empowered sizing can match designer intent while improving both efficiency and design performance, with the designer focusing on encoding high-level objectives rather than device-level heuristics.

Subsystem Co-Design: LDO-VCO with AI-Driven EDA. The first AI-driven co-design flow is proposed, which optimizes an LC-tank VCO and its integrated LDO simultaneously and explicitly captures the trade-off between various noise sources. A total of 32 PVT corners are considered using the same optimization engine for an apples-to-apples comparison with the sequential (first VCO, then LDO) approach. On a 65 nm design targeting 5.5–5.6 GHz, co-design improves PN by 1.2 dB at 1 MHz offset, reduces dynamic power by 28.8%, and increases FoM by 2.4 dBc/Hz relative to the sequential flow. The runtime drops from 18 hours (7 hours VCO + 11 hours LDO) to 6 hours on a 32-core workstation, evidencing scalability of learning-assisted optimization beyond single blocks.

System-Level Sizing: Global-Local Framework for Asynchronous SAR ADCs. A SAR ADC sizing framework is developed that cascades a fast global explorer with a derivative-free local optimizer under multi-fidelity simulation. The global phase enforces automatically derived coarse constraints, including step-size ratio error (SSRE), sampling error, thermal noise, and power, which are obtained analytically from top-level SNDR targets; this phase finishes within 3–4 hours. The local phase then applies a parallel, multi-fidelity PS that interleaves inexpensive checks with periodic full-cycle FFT analysis accelerated by time-interleaved transient runs, converging within another 3 hours. Across 12 cases (7–12 bit, 100 kHz–250 MHz, 65 nm), the framework achieves up to 72.2 dB SNDR and a FoM of 177.3 dB, while automating specification allocation and inter-block co-optimization, thereby addressing the key limitations of block-level methods.

Overall Impact. Together, these studies (i) link the EDA and design communities, showing that designer-in-the-loop, AI-empowered sizing with silicon validation can surpass experience-driven flows while preserving design intent; (ii) demonstrate the effectiveness of the co-design approach using AI-driven EDA algorithms that remain beyond the reach of systematic manual design; and (iii) propose a holistic design methodology for analog small systems, bridging a gap previously unaddressed by both the EDA and design communities.

## 1.5 Thesis Outline

This dissertation comprises six chapters. Chapter 1 introduces the motivation for analog design automation, reviews challenges of manual design, and outlines the role of EDA tools, research objectives, and contributions. Chapter 2 provides the background and literature review. It surveys common analog building blocks, reviews traditional and automated methodologies, and analyzes EDA sizing techniques, with emphasis on mathematical optimization, evolutionary algorithms, and surrogate-assisted methods. Chapter 3 addresses block-level sizing. An AI-empowered flow with designer interaction is developed for schematic sizing, and case studies on op-amps, comparators, and VCOs demonstrate improved efficiency and design performance over manual design with measurement results. Chapter 4 extends to subsystem-level design through a co-design framework integrating VCOs and LDOs for communication applications. Post-layout validation shows gains in PSRR, noise, and efficiency. Chapter 5 advances to system-level sizing with a hybrid framework combining surrogate model-assisted optimization and multi-fidelity PS. Twelve SAR ADC design cases are explored, with benchmarking against conventional block-level methods highlighting improvements in quality and efficiency. Finally, Chapter 6 summarizes the contributions across block-, subsystem-, and system-level sizing, discusses practical impact, and outlines future directions including layout optimization and advanced machine learning methods for full analog synthesis.

## Chapter 2

## Background and Literature Review

## 2.1 Background on Analog Circuits

Analog ICs process signals that vary continuously in time and amplitude. They provide the indispensable interface between the physical and digital worlds: amplifying microvolt-level sensor outputs, filtering noise, generating clocks, converting between analog and digital domains, and regulating on-chip supplies. This section classifies analog circuits by their time behavior (static vs. dynamic) and details the working principles of representative blocks that are widely found in modern applications.

As summarized in Table 2.1, static and dynamic analog circuits differ fundamentally in time behavior, design intent, and verification focus. Static analog circuits operate around a fixed bias point and can be approximated as linear time-invariant (LTI) systems under small-signal conditions. This enables accurate frequency domain analysis (e.g., gain, bandwidth, loop stability) within a limited operating region where device behavior remains approximately linear. Dynamic circuits are time-varying, either linear but periodically time-varying (LPTV) due to clocked switching, or strongly nonlinear due to regeneration, quantization, or oscillation. These distinctions determine the appropriate analysis techniques and dictate which performance metrics become most critical to meeting design specifications.

Table 2.1: A comparison between static and dynamic analog circuits.

|                       | Static                                                                                   | Dynamic                                                                                          |
|-----------------------|------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|
| Time behavior         | Continuous bias, steady operating point                                                  | Clocked, sampled, regenerative, or autonomous oscillation                                        |
| Linearity             | Small-signal linear around bias                                                          | Strongly nonlinear or LPTV                                                                       |
| Power profile         | Quiescent                                                                                | Often zero static power, dynamic charging/switching dominates                                    |
| Primary analysis      | AC/noise models, loop gain and PM                                                        | Transient, PSS/PNOISE for periodic, timing and metastability                                     |
| Core specifications   | Power, DC gain, UGB, phase margin, PSRR, CMRR, noise, offset                             | Decision time, jitter/PN, SNDR, INL/DNL                                                          |
| Typical nonidealities | Limited swing, finite $r_o$, flicker noise, stability margins                            | Kickback, metastability, limit cycles, reference drop                                            |
| Verification focus    | PVT corners, MC for offset/noise, stability                                              | Timing across PVT, metastability probability, jitter/PN, settling                                |
| Example blocks        | Op-amps (telescopic, folded-cascode, two-stage), active filters, bandgaps, LDOs, buffers | Dynamic comparators (StrongARM, double-tail), VCOs/PLLs (LC, ring), SC filters, mixers, SAR ADCs |

Static blocks draw a continuous quiescent current $I_Q$ to establish transconductances and pole locations. Power is primarily static and can be calculated with $P_{\rm static} \approx I_Q V_{DD}$, with minimal dynamic components. LPTV blocks typically exhibit negligible static power consumption within their core switching networks, while consuming energy predominantly from capacitive charging and the overhead of clock or local oscillator drive circuits, exemplified by

$$P_{\rm dyn} \approx \sum_{i} \left( C_i V_i^2 f_{\rm tog} \right) + P_{\rm clk} + P_{\rm ref},\tag{2.1}$$

where $C_i$ is the capacitance being charged, $V_i$ is the corresponding voltage swing, $f_{\text{tog}}$ is the effective switching frequency (i.e., toggle rate), $P_{\text{clk}}$ is the power required to drive the clock, and $P_{\text{ref}}$ accounts for additional reference or biasing circuitry. Oscillators and phase-locked loops (PLLs) are an important exception: they require a steady bias to sustain oscillation in addition to divider and loop switching.
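As a quick numerical illustration of Eq. (2.1), the minimal Python sketch below evaluates the dynamic power of a small switched network. The function name `dynamic_power` and all component values are hypothetical and chosen only for illustration; they do not correspond to any design in this thesis.

```python
def dynamic_power(nodes, f_tog, p_clk=0.0, p_ref=0.0):
    """Evaluate Eq. (2.1): sum(C_i * V_i^2 * f_tog) + P_clk + P_ref.

    nodes : iterable of (C_i in farads, V_i in volts) pairs
    f_tog : effective toggle rate in hertz
    p_clk : clock-drive power in watts
    p_ref : reference/bias power in watts
    """
    switching = sum(c * v ** 2 * f_tog for c, v in nodes)
    return switching + p_clk + p_ref


# Hypothetical example: two switched nodes with a 1.2 V swing at 100 MHz.
nodes = [(100e-15, 1.2), (50e-15, 1.2)]
p_dyn = dynamic_power(nodes, f_tog=100e6, p_clk=5e-6, p_ref=2e-6)
print(f"P_dyn = {p_dyn * 1e6:.2f} uW")
```

With these assumed values the switching term contributes about 21.6 µW, so the clock and reference overheads (7 µW in total) are a significant fraction of the budget.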

Representative static blocks include op-amps such as telescopic [28], folded-cascode [29], and two-stage Miller-compensated op-amps [30], together with active-RC/Gm-C filters [31], bandgap references (BGR) [32], LDOs [33], and unity gain buffers [34]. The design emphasis is on small-signal gain, stability, and noise. First-order approximations can guide architecture selection and sizing. For example, the gain and unity gain bandwidth (UGB) can be estimated with:

$$A_v \approx g_m r_o,\tag{2.2}$$

$$f_u \approx \frac{g_m}{2\pi C_L},\tag{2.3}$$

where  $g_m$  is the transconductance,  $r_o$  is the output resistance, and  $C_L$  is the load capacitance.
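The two first-order estimates above can be checked numerically with a short Python sketch. The function names and the device values ($g_m$ = 1 mS, $r_o$ = 100 kΩ, $C_L$ = 1 pF) are assumptions made purely for illustration.

```python
import math


def intrinsic_gain(gm, r_o):
    """Low-frequency gain estimate, A_v ~ gm * r_o (Eq. 2.2)."""
    return gm * r_o


def unity_gain_bandwidth(gm, c_load):
    """UGB estimate in hertz, f_u ~ gm / (2 * pi * C_L) (Eq. 2.3)."""
    return gm / (2.0 * math.pi * c_load)


# Hypothetical single-stage amplifier values.
gm, r_o, c_load = 1e-3, 100e3, 1e-12
a_v = intrinsic_gain(gm, r_o)
f_u = unity_gain_bandwidth(gm, c_load)
print(f"A_v = {a_v:.0f} V/V ({20 * math.log10(a_v):.1f} dB)")
print(f"f_u = {f_u / 1e6:.1f} MHz")
```

For these assumed values the gain is 100 V/V (40 dB) and the UGB is roughly 159 MHz, which shows how directly $g_m$ sets the speed for a given load.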

Verification is primarily conducted in the frequency domain through alternating current (AC) and noise analyses. This is complemented by PVT corner and MC simulations to assess performance spread, mismatch, and offset distributions. Typical nonidealities include finite output resistance, flicker noise, headroom limits, and compensation trade-offs. Core specifications include power, DC gain, UGB, phase margin (PM), power supply rejection ratio (PSRR), common mode rejection ratio (CMRR), noise, and offset.

Dynamic circuits comprise two broad classes. The first is LPTV, exemplified by switched-capacitor (SC) networks [35] and passive mixers [36], where circuit parameters change with the clock or local oscillator, and performance is captured by periodic steady-state (PSS) and periodic noise (PNOISE) analyses. Canonical behaviors include effective resistance in SC networks and frequency translation in mixers, for example:

$$R_{\rm eq} = \frac{1}{Cf_s},\tag{2.4}$$

where C is the switched capacitance and  $f_s$  is the switching frequency. The second category comprises strongly nonlinear circuits, such as dynamic comparators (e.g., StrongARM [37] and double-tail latches [38]), VCOs [39], PLLs [40], and SAR ADCs [41], where regenerative feedback, limit-cycle behavior, or quantization effects dominate [42]–[45]. Representative behavioral models aid in initial sizing and provide insight into dominant dynamic characteristics, such as decision time and oscillation frequency:

$$t_d \approx \tau \ln \left( \frac{V_{\text{swing}}}{V_0} \right),\tag{2.5}$$

$$\tau \approx \frac{C_L}{g_{m,\text{eff}}},\tag{2.6}$$

$$f_0 = \frac{1}{2\pi\sqrt{LC}},\tag{2.7}$$

where  $t_d$  is the decision time,  $\tau$  is the time constant,  $V_{\text{swing}}$  is the output voltage swing,  $V_0$  is the initial input differential voltage,  $C_L$  is the load capacitance,  $g_{m,\text{eff}}$  is the effective transconductance, L is the inductor value, and C is the capacitance in the resonant tank. Corresponding core specifications shift to decision time and metastability probability for comparators, PN and tuning range for VCOs/PLLs, and SNDR and integral nonlinearity (INL) and differential nonlinearity (DNL) for SAR ADCs.
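The behavioral models in Eqs. (2.4) to (2.7) are simple enough to evaluate directly. The following Python sketch does so for a hypothetical SC branch, latch, and LC tank; the function names and all numerical values are illustrative assumptions rather than values taken from the designs in this thesis.

```python
import math


def sc_equivalent_resistance(c, f_s):
    """Equivalent resistance of a switched capacitor, R_eq = 1 / (C * f_s) (Eq. 2.4)."""
    return 1.0 / (c * f_s)


def decision_time(c_load, gm_eff, v_swing, v0):
    """Regeneration time t_d = tau * ln(V_swing / V_0), tau = C_L / gm_eff (Eqs. 2.5, 2.6)."""
    tau = c_load / gm_eff
    return tau * math.log(v_swing / v0)


def lc_resonance(l, c):
    """Resonant frequency of an LC tank, f_0 = 1 / (2 * pi * sqrt(L * C)) (Eq. 2.7)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


# Hypothetical component values.
r_eq = sc_equivalent_resistance(c=1e-12, f_s=1e6)
t_d = decision_time(c_load=20e-15, gm_eff=1e-3, v_swing=1.0, v0=1e-3)
f_0 = lc_resonance(l=1e-9, c=1e-12)
print(f"R_eq = {r_eq / 1e6:.2f} MOhm")
print(f"t_d  = {t_d * 1e12:.1f} ps")
print(f"f_0  = {f_0 / 1e9:.3f} GHz")
```

Because $t_d$ depends only logarithmically on the initial input difference $V_0$, reducing $V_0$ by a factor of ten adds a fixed $\tau \ln 10$ to the decision time rather than multiplying it.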

## 2.2 Three Generations of Analog Circuit Design Methods

This section organizes analog circuit design practice into three methodological "generations" [1], [4]. Generation I relies on long-channel quadratic laws and closed-form hand analysis. Generation II centers on the transconductance efficiency  $g_m/I_D$  for navigating weak–moderate–strong inversion with process-portable sizing. Generation III formulates design as an optimization problem, either with analytic constraints or simulation-driven.

## 2.2.1 Generation I: Quadratic Hand Analysis

Under long-channel, strong-inversion assumptions, MOS devices follow simple relations enabling paper-and-pencil sizing. For an NMOS,

$$I_D \approx \frac{1}{2}\mu C_{\text{ox}} \frac{W}{L} (V_{GS} - V_{TH})^2 (1 + \lambda V_{DS}), \qquad g_m \approx \frac{2I_D}{V_{\text{ov}}}, \qquad r_o \approx \frac{1}{\lambda I_D},\tag{2.8}$$

where  $I_D$  is the drain current,  $\mu$  is the carrier mobility,  $C_{\rm ox}$  is the gate-oxide capacitance per unit area, W and L are the channel width and length,  $V_{GS}$  is the gate-source voltage,  $V_{TH}$  is the threshold voltage,  $\lambda$  is the channel-length modulation parameter,  $V_{DS}$  is the drain-source voltage,  $g_m$  is the transconductance,  $V_{\rm ov} = V_{GS} - V_{TH}$  is the overdrive voltage, and  $r_o$  is the output resistance. These lead to first-order block metrics such as

$$A_v \approx g_m r_o, \qquad \omega_p \approx \frac{1}{R_{\text{out}} C_L}, \qquad f_u \approx \frac{g_m}{2\pi C_C},\tag{2.9}$$

where  $A_v$  is the low-frequency gain,  $g_m$  is the effective transconductance,  $r_o$  is the output resistance,  $\omega_p$  is the dominant pole frequency,  $R_{\text{out}}$  is the output resistance seen at the node of interest,  $C_L$  is the load capacitance,  $f_u$  is the UGB, and  $C_C$  is the compensation capacitance.

A representative design flow based on the quadratic long-channel law proceeds as follows: (i) select the circuit topology and establish the bias point; (ii) choose the overdrive $V_{\text{ov}}$ and bias currents from bandwidth and noise requirements; (iii) back-solve the device dimensions (W/L) from the targeted $g_m$ and $r_o$; (iv) select the compensation network (e.g., $C_C$ and zero placement) to satisfy the desired PM; and (v) validate the design and iterate using SPICE simulation. This approach offers high interpretability, enables rapid first-order sizing, and is effective for static, approximately LTI blocks. However, its accuracy degrades in deep-submicron technologies (e.g., modern CMOS nodes < 130 nm) due to velocity saturation, mobility degradation, short-channel effects, and body effect. Moreover, weak/moderate inversion operation, device mismatch, and PSRR/CMRR constraints are captured only at a coarse level. Therefore, it is largely used in educational settings and for initial sizing of some low-frequency designs with legacy processes (> 180 nm).
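Steps (ii) and (iii) of this hand-analysis flow can be sketched in a few lines of Python. This is a minimal illustration of the square-law back-solve in Eq. (2.8), not a sizing tool: the function name `square_law_sizing`, the process constants ($\mu C_{\text{ox}}$ = 200 µA/V², $\lambda$ = 0.1 V⁻¹), and the targets are all hypothetical, and the $(1 + \lambda V_{DS})$ factor is neglected when solving for W/L.

```python
def square_law_sizing(gm, v_ov, mu_cox, lam):
    """Back-solve a bias point from a target gm and overdrive using Eq. (2.8).

    The (1 + lambda * V_DS) factor is neglected in the W/L back-solve,
    which is a common first-pass simplification.
    """
    i_d = gm * v_ov / 2.0                         # from gm = 2 * I_D / V_ov
    w_over_l = 2.0 * i_d / (mu_cox * v_ov ** 2)   # from the square law
    r_o = 1.0 / (lam * i_d)                       # from r_o = 1 / (lambda * I_D)
    a_v = gm * r_o                                # intrinsic gain, Eq. (2.9)
    return {"i_d": i_d, "w_over_l": w_over_l, "r_o": r_o, "a_v": a_v}


# Hypothetical targets and long-channel process constants.
point = square_law_sizing(gm=1e-3, v_ov=0.2, mu_cox=200e-6, lam=0.1)
print(f"I_D = {point['i_d'] * 1e6:.0f} uA, W/L = {point['w_over_l']:.1f}")
print(f"r_o = {point['r_o'] / 1e3:.0f} kOhm, A_v = {point['a_v']:.0f} V/V")
```

Note that under these assumptions the intrinsic gain reduces to $2/(\lambda V_{\text{ov}})$, so it is set by the overdrive alone and is independent of the bias current.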

## 2.2.2 Generation II: The $g_m/I_D$ Method

The transconductance efficiency  $g_m/I_D$  parameterizes the speed–noise–power trade-space via inversion level and maps directly to device sizing through EKV/BSIM lookup tables (LUTs) or process curves as shown in Fig. 2.1. Asymptotically,

$$\left. \frac{g_m}{I_D} \right|_{\text{weak}} \approx \frac{1}{nV_T} \text{ (typically } 20 \sim 30 \,\text{V}^{-1}\text{)},\tag{2.10}$$

$$\left. \frac{g_m}{I_D} \right|_{\text{strong}} \approx \frac{2}{V_{\text{ov}}},\tag{2.11}$$

where  $g_m$  is the transconductance,  $I_D$  is the drain current, n is the subthreshold slope factor,  $V_T$  is the thermal voltage, and  $V_{ov}$  is the overdrive voltage.

A typical  $g_m/I_D$ -based sizing flow proceeds as follows: (i) select a target  $g_m/I_D$  band from bandwidth, noise, and energy objectives; (ii) set  $g_m$  from the UGB  $f_u$  or from a noise budget, then compute  $I_D = g_m/(g_m/I_D)$ ; (iii) map to device dimensions (W/L), overdrive  $V_{\rm ov}$ , and operating region using process curves or LUTs; (iv) verify  $r_o$ , headroom, PM, PSRR/CMRR, and layout feasibility; and (v) refine.
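Steps (ii) and (iii) of this flow can be sketched as follows. The lookup table below is entirely hypothetical: it stands in for process curves of current density $I_D/W$ versus $g_m/I_D$ at one fixed channel length, which in practice would come from characterization simulations. The function names and the linear interpolation scheme are likewise assumptions made for illustration.

```python
import math
from bisect import bisect_right

# Hypothetical LUT at a fixed channel length: gm/ID (1/V) -> ID/W (A/um).
GM_ID_GRID = [5.0, 10.0, 15.0, 20.0, 25.0]
ID_W_GRID = [40e-6, 10e-6, 3e-6, 0.8e-6, 0.15e-6]


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation on a sorted grid (no extrapolation)."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("gm/ID target lies outside the characterized range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    t = (x - x0) / (x1 - x0)
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


def gm_id_sizing(f_u, c_load, gm_id):
    """Return (gm, I_D, W in um) for a target UGB, load, and gm/ID."""
    gm = 2.0 * math.pi * f_u * c_load       # step (ii): gm from f_u ~ gm / (2 pi C_L)
    i_d = gm / gm_id                        # step (ii): I_D = gm / (gm/ID)
    width_um = i_d / interpolate(gm_id, GM_ID_GRID, ID_W_GRID)  # step (iii)
    return gm, i_d, width_um


# Hypothetical target: 100 MHz UGB into 1 pF at gm/ID = 15 1/V.
gm, i_d, width_um = gm_id_sizing(f_u=100e6, c_load=1e-12, gm_id=15.0)
print(f"gm = {gm * 1e6:.1f} uS, I_D = {i_d * 1e6:.1f} uA, W = {width_um:.2f} um")
```

Raising the $g_m/I_D$ target lowers the bias current for the same $g_m$ but, because the current density falls steeply toward weak inversion, it also increases the required width and hence the parasitic capacitance.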



Figure 2.1: The  $g_m/I_D$  based method.

The method is technology-portable, spans weak–moderate–strong inversion, and makes the power–speed–noise trade-offs explicit. It is most effective for quasi-LTI blocks, such as op-amps, Gm-C filters, and LDOs, where performance metrics such as UGB and PSRR map directly to $g_m$ and $r_o$. However, strongly time-varying or nonlinear behaviors (e.g., regeneration, limit cycles, quantization in dynamic circuits) are only indirectly captured. Furthermore, process curves are sampled at coarse intervals to reduce characterization time and memory cost, which limits accuracy, so the resulting design may still require further tuning. In practice, the $g_m/I_D$-based design methodology is widely adopted in both academia and industry, particularly for initial transistor sizing and early-stage performance estimation. It enables efficient exploration of power-constrained design spaces, such as those encountered in Internet of things (IoT) and biomedical applications. Supported by EKV models, LUT-based sizing environments, and some commercial EDA tools such as those from Synopsys and Mentor, this approach balances intuition and numerical accuracy. However, it is not fully automated and still relies on designer expertise for iterative refinement and interpretation of results.

#### 2.2.3 Generation III: Optimization-Centric Sizing

Analog circuit sizing is formulated as an optimization problem with objective(s) and constraints, solved either from analytic models or via simulation-in-the-loop search with PVT considerations. Below is a single-objective example.

$$\begin{aligned} & \underset{x}{\min} \quad P(x) \\ & \text{s.t.} \quad A_0(x) \geq A_{\min}, \\ & f_u(x) \geq f_{\min}, \\ & \text{PM}(x) \geq \phi_{\min}, \\ & \text{PSRR}(x) \geq \rho_{\min}, \\ & x \in X. \end{aligned} \tag{2.12}$$
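In a simulation-in-the-loop flow, the constraints in (2.12) have to be turned into a rule for comparing two simulated designs. The sketch below shows one common option, a feasibility-first comparison, which is an assumption here and not necessarily the rule used in this work. The metric names, bounds, and design values are hypothetical.

```python
def total_violation(perf, specs):
    """Sum of normalized shortfalls for 'metric >= bound' constraints; 0.0 means feasible."""
    total = 0.0
    for name, bound in specs.items():
        shortfall = bound - perf[name]
        if shortfall > 0.0:
            total += shortfall / abs(bound)
    return total

def is_better(a, b, specs, objective="P"):
    """Feasibility-first comparison of two designs given as dicts of metrics."""
    va, vb = total_violation(a, specs), total_violation(b, specs)
    if va == 0.0 and vb == 0.0:
        return a[objective] < b[objective]  # both feasible: lower power wins
    return va < vb                          # otherwise: smaller violation wins

# Hypothetical lower-bound specifications and two candidate designs.
specs = {"A0": 60.0, "fu": 100e6, "PM": 60.0, "PSRR": 50.0}
d1 = {"P": 1.2e-3, "A0": 62.0, "fu": 120e6, "PM": 65.0, "PSRR": 55.0}
d2 = {"P": 0.8e-3, "A0": 58.0, "fu": 150e6, "PM": 61.0, "PSRR": 52.0}
print(is_better(d1, d2, specs))
```

Here `d2` consumes less power but misses the gain specification, so the feasible design `d1` is preferred.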

Equation-Driven/Convex Optimization. When specifications can be formulated as analytic, convex-like constraints (e.g., small-signal gain, UGB, PM, and noise), geometric programming provides fast sizing with guaranteed global optimality [11]. However, the range of problems that can be addressed is limited by whether the underlying model can be transformed into a convex form. Many important behaviors, such as regeneration, sampling, and oscillation, cannot generally be expressed in convex terms, and therefore fall outside the scope of these methods. In addition, the accuracy of the equations is limited compared with BSIM models.

Simulation-Driven (Black-Box) Optimization. In this approach, SPICE analyses (AC, transient, PSS, PNOISE) serve directly as evaluators within single- or multi-objective search frameworks, such as gradient-based, heuristic, or Bayesian methods. Surrogate models are often used to reduce the number of expensive simulations. This strategy is well suited to dynamic and nonlinear blocks because it captures behaviors that are difficult to approximate analytically, including comparator decision time and metastability, VCO PN, and SAR ADC SNDR. However, the computational cost is inevitably higher than that of quadratic-law or $g_m/I_D$-based methods. Recent advances in surrogate modeling and efficient search algorithms, discussed in Section 2.4, have made the runtime increasingly manageable in practice.

#### 2.2.4 Common Verification Loop

Design verification follows a common procedure regardless of the methodology. Sign-off requires PVT analysis to guarantee functionality and stability margins, and MC simulations to quantify mismatch-induced variability such as input-referred offset and pole dispersion. For realistic yield assessment, MC should be performed at the identified worst-case corners.

#### 2.2.5 Comparison

In summary, Generation I enables rapid, first-order design via long-channel models and remains useful for early-stage estimation. Generation II introduces inversion-aware sizing through the  $g_m/I_D$  methodology, widely applied in power-constrained designs for its balance of efficiency and portability. Generation III formulates sizing as an optimization problem, supporting accurate and relatively efficient analog IC sizing. Fig. 2.2 summarizes the comparison across generations.

## 2.3 Local Optimization Methods

Local optimization techniques represent the earliest approaches adopted for analog circuit sizing. During the 1980s and early 1990s, gradient-based methods such as steepest descent, Newton methods, and sequential quadratic programming (SQP) were used, where circuit equations could be approximated in convex or near-convex form for efficient optimization.

- Gen I, quadratic law (closed-form/intuition): closed-form sizing; fast, interpretable; degrades in deep submicron.
- Gen II, $g_m/I_D$ method (systematic sizing): process-portable flow; explicit power-speed-noise trade-off; LUTs (accuracy vs. memory); limited for dynamic blocks.
- Gen III, optimization-based (automated sizing): integrates PVT/MC/parasitics; accurate, and surrogate model reduces runtime; higher computation cost.

Figure 2.2: Three generations of analog design methods.

Representative efforts, such as OPASYN [46], demonstrated the utility of convex and geometric programming frameworks for structured circuit classes. With the increasing reliance on simulation-based flows in the 1990s, derivative-free algorithms, including the Nelder–Mead (NM) simplex method and PS, were introduced to handle highly nonlinear behaviors where analytic gradients were unavailable.

#### 2.3.1 Gradient-Based Methods

When analytic or adjoint sensitivities are available, quasi-Newton or SQP methods [47], [48] provide fast refinement. While gradient-based methods (e.g., Newton/BFGS/SQP, interior-point) converge rapidly on smooth small-signal objectives, many analog performance metrics (e.g., transient settling, comparator decision time, SAR ADC SNDR) are noisy, nonsmooth, or even discontinuous, making gradient-free local methods more practical. Moreover, when reliable derivatives are unavailable, finite-difference approximations become both computationally expensive and brittle in the presence of SPICE noise, further motivating the adoption of derivative-free alternatives.

#### 2.3.2 Nelder–Mead (Simplex)

The NM algorithm [49], [50] is a derivative-free local optimization technique that has been widely applied in engineering design problems, including analog circuit sizing. Unlike gradient-based methods, NM requires only objective function evaluations, making it suitable for the noisy or nonsmooth responses often encountered in SPICE simulations.

The method maintains a simplex, i.e., a set of n+1 vertices in an n-dimensional design space. At each iteration, the vertices are ordered by their objective values, and the worst point is replaced by a new candidate generated through geometric operations relative to the centroid of the remaining vertices. The standard update rules include:

- Reflection: $x_{\text{refl}} = c + \rho(c - x_{\text{max}})$, where $c$ is the centroid, $\rho > 0$ is the reflection coefficient, and $x_{\text{max}}$ is the worst vertex.
- Expansion: $x_{\text{exp}} = c + \chi(x_{\text{refl}} - c)$, with $\chi > 1$ promoting aggressive search when reflection improves the solution.
- Contraction: $x_{\text{con}} = c + \gamma(x_{\text{max}} - c)$, with $0 < \gamma < 1$ for conservative search if reflection fails.
- Shrink: $x_i \leftarrow x_{\min} + \sigma(x_i - x_{\min})$, with $0 < \sigma < 1$ contracting the simplex around the best vertex $x_{\min}$.

By iteratively applying these steps, NM adaptively explores the local design space without derivatives. Its strengths are simplicity, robustness to noise, and low overhead in low-to-moderate dimensions. In analog sizing, NM is typically used to fine-tune design parameters after global exploration, either in hybrid flows that interleave global and local updates or through simple concatenation where NM refines the best solution returned by a global search. Its limitations include potential stagnation in high dimensions and lack of convergence guarantees beyond low-dimensional smooth functions.
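The four update rules can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the coefficient defaults ($\rho = 1$, $\chi = 2$, $\gamma = \sigma = 0.5$), the axis-aligned initial simplex, and the stopping rule on the spread of objective values are common choices rather than settings taken from this work, and only the single contraction step listed above is implemented. The quadratic test function stands in for a SPICE-based objective.

```python
def nelder_mead(f, x0, step=0.5, rho=1.0, chi=2.0, gamma=0.5, sigma=0.5,
                max_iter=500, tol=1e-10):
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset along each coordinate axis.
    simplex = [list(x0)] + [[x0[j] + (step if j == i else 0.0) for j in range(n)]
                            for i in range(n)]
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        # Order vertices from best (index 0) to worst (index n).
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        # Centroid of all vertices except the worst.
        c = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        refl = [c[j] + rho * (c[j] - worst[j]) for j in range(n)]
        f_refl = f(refl)
        if f_refl < values[0]:
            # Reflection beat the best vertex: try expansion.
            exp = [c[j] + chi * (refl[j] - c[j]) for j in range(n)]
            f_exp = f(exp)
            if f_exp < f_refl:
                simplex[-1], values[-1] = exp, f_exp
            else:
                simplex[-1], values[-1] = refl, f_refl
        elif f_refl < values[-2]:
            simplex[-1], values[-1] = refl, f_refl
        else:
            # Reflection failed: contract toward the worst vertex.
            con = [c[j] + gamma * (worst[j] - c[j]) for j in range(n)]
            f_con = f(con)
            if f_con < values[-1]:
                simplex[-1], values[-1] = con, f_con
            else:
                # Shrink every vertex toward the best one.
                best = simplex[0]
                for i in range(1, n + 1):
                    simplex[i] = [best[j] + sigma * (simplex[i][j] - best[j])
                                  for j in range(n)]
                    values[i] = f(simplex[i])
    i_best = min(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]

# Stand-in objective with its minimum at (1, -2).
x_nm, f_nm = nelder_mead(lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2, [0.0, 0.0])
print(x_nm, f_nm)
```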

#### 2.3.3 Pattern Search Method

PS [51] is another derivative-free local optimization method designed for black-box and nonsmooth problems. Unlike simplex-based schemes, PS explores the design space by evaluating a set of candidate directions around the current solution and accepts a move if any candidate improves the objective.

At iteration t, given the incumbent  $x^{(t)}$  and mesh size  $\Delta^{(t)}$ , PS evaluates trial points of the form

$$x^{(t)} + \Delta^{(t)}d, \quad d \in D, \tag{2.13}$$

where D is a positive spanning set of directions (e.g., the coordinate basis  $\{\pm e_i\}_{i=1}^n$ ). In each iteration, the algorithm evaluates the objective at the poll points  $x^{(t)} + \Delta^{(t)}d$  for all  $d \in D$ , as shown in Fig. 2.3.



Figure 2.3: The illustration of PS in a 2-D problem.

If at least one poll point improves upon the incumbent, the best improving point is accepted as the new incumbent  $x^{(t+1)}$ , and the mesh size  $\Delta^{(t)}$  may be expanded to accelerate progress. If no poll point improves the objective, the incumbent is retained, the mesh size is reduced (typically by a constant factor such as 1/2), and a new poll step is attempted.

This poll-and-update cycle is repeated until a convergence criterion is satisfied, commonly when the mesh size  $\Delta^{(t)}$  falls below a prescribed threshold, or when successive iterations yield negligible improvement in the objective. In this way, PS adaptively balances global exploration (through larger mesh sizes) and local refinement (through progressively finer meshes).

The strengths of PS are its ability to handle nonsmooth, discontinuous, or noisy objective functions without requiring gradients. Its convergence is guaranteed under mild conditions to a Clarke stationary point [52]. In analog circuit sizing, PS is particularly useful for tuning mixed discrete-continuous parameters (e.g., unit capacitor counts, transistor fingers) and for refining post-layout designs where SPICE responses may be noisy or irregular. Compared to NM, PS scales more reliably in higher dimensions and supports parallel evaluation of the poll set, making it attractive in simulation-driven optimization flows. The pseudocode is shown in Algorithm 1.

#### Algorithm 1 PS for Analog Circuit Sizing

**Require:** Objective $f(\cdot)$, initial point $x_0$, initial mesh size $\Delta_0$, minimum mesh size $\Delta_{\min}$, direction set $D$, bounds $[l,u]$, integer mask isInt, expansion factor $\alpha > 1$, contraction factor $0 < \beta < 1$, tolerance $\varepsilon$, maximum iterations $T$.

```
x ← ProjectAndRound(x_0, l, u, isInt)
f_x ← f(x), Δ ← Δ_0
for t = 1 → T do
    improved ← false
    x_best ← x, f_best ← f_x
    for each d ∈ D do
        x_trial ← x + Δ d
        x_trial ← ProjectAndRound(x_trial, l, u, isInt)
        f_trial ← f(x_trial)
        if f_trial + ε < f_best then
            x_best ← x_trial, f_best ← f_trial
            improved ← true
        end if
    end for
    if improved then
        x ← x_best, f_x ← f_best
        Δ ← α Δ                      ▷ expand mesh
    else
        Δ ← β Δ                      ▷ contract mesh
    end if
    if Δ < Δ_min then
        break
    end if
end for
return x, f_x
```
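Algorithm 1 translates almost line by line into Python. The sketch below is a minimal illustration, assuming a coordinate direction set $\{\pm e_i\}$, integer-valued bounds for the integer variables, and hypothetical default values for $\alpha$, $\beta$, $\varepsilon$, and $\Delta_{\min}$. The analytic test function stands in for a SPICE-based objective.

```python
def project_and_round(x, lower, upper, is_int):
    """Clip each variable to its bounds and round the integer-valued ones."""
    out = []
    for xi, lo, hi, integer in zip(x, lower, upper, is_int):
        xi = min(max(xi, lo), hi)
        out.append(float(round(xi)) if integer else xi)
    return out

def pattern_search(f, x0, lower, upper, is_int, delta0=1.0, alpha=2.0, beta=0.5,
                   eps=1e-12, delta_min=1e-6, max_iter=200):
    n = len(x0)
    # Positive spanning set: +/- each coordinate direction.
    directions = [[s if j == i else 0.0 for j in range(n)]
                  for i in range(n) for s in (1.0, -1.0)]
    x = project_and_round(x0, lower, upper, is_int)
    fx, delta = f(x), delta0
    for _ in range(max_iter):
        improved = False
        x_best, f_best = x, fx
        for d in directions:  # poll step (could be evaluated in parallel)
            trial = project_and_round([xi + delta * di for xi, di in zip(x, d)],
                                      lower, upper, is_int)
            f_trial = f(trial)
            if f_trial + eps < f_best:
                x_best, f_best, improved = trial, f_trial, True
        if improved:
            x, fx = x_best, f_best
            delta *= alpha   # expand mesh
        else:
            delta *= beta    # contract mesh
        if delta < delta_min:
            break
    return x, fx

# Stand-in objective: continuous optimum at 1.3, integer variable optimum at 4.
obj = lambda v: (v[0] - 1.3) ** 2 + (v[1] - 4.0) ** 2
x_ps, f_ps = pattern_search(obj, [5.0, 8.0], [0.0, 0.0], [10.0, 10.0], [False, True])
print(x_ps, f_ps)
```

Once the mesh size drops below 0.5, the integer variable no longer moves because every poll point rounds back to the incumbent, so the remaining iterations refine only the continuous variable.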

As a rule of thumb, NM is lightweight and effective for  $n \le 15$  with moderate noise. PS scales better to mixed discrete and continuous variables and parallel evaluation. Both are robust and are now used frequently as second stages after global exploration.

#### 2.3.4 Summary

Gradient-based local methods are ideal when small-signal, differentiable models exist. Otherwise, NM and PS provide practical, derivative-free refinement that respects search bounds and constraints, tolerates SPICE noise, and integrates cleanly after a global optimizer.

## 2.4 Global Optimization Methods

Different from local methods, global optimization algorithms do not require an initial design and are capable of escaping local minima in highly nonlinear, multimodal analog design spaces. Evolutionary and swarm-based heuristics use population-based search strategies to explore the design space efficiently. More recently, surrogate model-assisted global optimization has been introduced to accelerate convergence by combining data-driven models with heuristic exploration, reducing the number of circuit simulations while maintaining global search capability.

#### 2.4.1 Evolutionary and Swarm-Based Heuristics

Evolutionary and swarm-based algorithms represent a major branch in analog design sizing, motivated by their robustness in handling nonconvex, nonlinear, and discrete search spaces. Representative examples include genetic algorithms (GA), particle swarm optimization (PSO), ant colony optimization (ACO), DE, artificial bee colony (ABC), and grey wolf optimizer (GWO) [15], [53]–[57]. These methods operate on a population of candidate solutions or probabilistic perturbations, iteratively improving performance by exploring diverse regions of the design space.

A common feature of evolutionary and swarm-based heuristics is that they maintain a population of candidate solutions rather than a single trajectory. Their effectiveness relies on two key principles: (i) the update strategy, which defines how new candidates are generated from existing ones (e.g., via mutation and crossover), and (ii) the elitism mechanism, which ensures that the best solutions identified to date are retained, thereby avoiding degradation of the objective value across iterations.

DE exemplifies these principles. Each candidate vector $x_i$ is perturbed to form a mutant $u_i$ through weighted difference operations among randomly chosen individuals.

**Mutation and crossover.** For a population matrix $X^{(t)} \in \mathbb{R}^{NP \times D}$, with NP individuals and D design variables, the following strategies are commonly applied:

• Strategy 1 (current-to-best/1):

$$u_i = x_i + F(x_{\text{best}} - x_i) + F(x_{r1} - x_{r2}), \tag{2.14}$$

where  $x_{\text{best}}$  is the current best solution, F is a scaling factor, and r1, r2 are distinct random indices.

• Strategy 2 (rand/2):

$$u_i = x_{r5} + F[(x_{r1} - x_{r2}) + (x_{r3} - x_{r4})], \tag{2.15}$$

with  $r1, \ldots, r5$  chosen as distinct random indices.

Crossover is applied through a binomial mask:

$$u_{ij} = \begin{cases} u_{ij}, & \text{if } \text{rand}(0,1) < CR, \\ x_{ij}, & \text{otherwise,} \end{cases} \tag{2.16}$$

where CR denotes the crossover rate. This preserves some parent components while introducing mutant traits.

**Elitism mechanism.** Each trial vector competes directly with its parent and the one with the better objective survives:

$$x_i^{(t+1)} = \begin{cases} u_i, & \text{if } f(u_i) \le f(x_i^{(t)}), \\ x_i^{(t)}, & \text{otherwise.} \end{cases} \tag{2.17}$$

This guarantees monotonic preservation of the best solution across generations and prevents performance regression.
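Equations (2.14), (2.16), and (2.17) combine into a compact DE loop. The sketch below is illustrative only: the population size, $F$, $CR$, the random seed, and the clipping of trial vectors to the bounds are assumptions, and the sphere function stands in for a SPICE-based objective. The recorded history makes the non-increasing best value visible.

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=200, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    history = [min(fit)]
    for _ in range(generations):
        best = pop[min(range(pop_size), key=lambda i: fit[i])]
        new_pop, new_fit = [], []
        for i in range(pop_size):
            r1, r2 = rng.sample([k for k in range(pop_size) if k != i], 2)
            # Mutation, current-to-best/1, Eq. (2.14).
            mutant = [pop[i][j] + F * (best[j] - pop[i][j])
                      + F * (pop[r1][j] - pop[r2][j]) for j in range(dim)]
            # Binomial crossover, Eq. (2.16).
            trial = [mutant[j] if rng.random() < CR else pop[i][j]
                     for j in range(dim)]
            # Keep the trial inside the search bounds (assumption, not in the equations).
            trial = [min(max(t, lo), hi) for t, (lo, hi) in zip(trial, bounds)]
            f_trial = f(trial)
            # One-to-one selection against the parent, Eq. (2.17).
            if f_trial <= fit[i]:
                new_pop.append(trial)
                new_fit.append(f_trial)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
        history.append(min(fit))
    i_best = min(range(pop_size), key=lambda i: fit[i])
    return pop[i_best], fit[i_best], history

sphere = lambda x: sum(v * v for v in x)
x_de, f_de, hist_de = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(f_de)
```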

Other heuristics, such as GA, PSO, and ACO, adopt different update rules, such as velocity updates in PSO or pheromone trails in ACO, but all adopt some form of elitism to ensure monotonic improvement in the population's best solution. They have been applied to transistor sizing and analog layout generation. In analog design automation, these heuristics are attractive for black-box, simulation-in-the-loop optimization because they require no derivatives or convex approximations, can escape local minima, and accommodate non-convexity, discontinuities, and discrete design choices. However, their major limitation is computational cost: convergence is slow and often requires thousands of simulations, making them challenging for large-scale circuits or post-layout verification.

#### 2.4.2 Surrogate Model-Assisted Optimization

To alleviate the computational burden of simulation-driven optimization, surrogate model and machine learning-assisted methods have gained increasing popularity [17], [26], [58]–[60]. The surrogate model-assisted optimization process is described in Fig. 2.4. It begins with an initial set of candidate designs evaluated by high-fidelity simulations, which forms the initial database. An optimization algorithm (e.g., DE) then generates new candidates, while a surrogate model is trained on the historical data in the database and iteratively updated to approximate the expensive simulations. Candidate designs are screened using infill sampling criteria based on the predictions of the trained surrogate model, and only the most promising one(s) are validated through accurate simulations. The database is then updated with the new design(s). This cycle continues until convergence, at which point the best design is obtained.



Figure 2.4: The surrogate model-assisted optimization flow.

Surrogates such as GPR, random forests, or neural networks are used to approximate the expensive SPICE evaluation function. In analog design, surrogates are particularly useful for dynamic or strongly nonlinear blocks where one simulation is expensive (e.g., PSS/PNOISE analysis for VCO PN or transient for SAR ADC SNDR evaluation). By learning from a small set of simulations, surrogate-assisted methods reduce the number of evaluations by orders of magnitude. The trade-off lies in model fidelity: accuracy depends heavily on the quality and coverage of the training samples. Sparse sampling can result in degraded predictive accuracy and wasted evaluations.

Two surrogate modeling approaches are particularly popular in analog circuit sizing. The first is GPR, which provides not only a smooth approximation of circuit performance but also an explicit measure of predictive uncertainty, making it well suited for Bayesian optimization and active learning frameworks. The second is the use of ANNs, which can capture highly nonlinear relationships between design variables and performance metrics and scale effectively to high-dimensional problems. GPR excels in sample efficiency and uncertainty quantification but struggles in very high dimensions, whereas ANNs handle large-scale, complex mappings but require larger training sets and careful regularization to avoid overfitting. Both approaches are widely used to accelerate optimization by reducing the number of expensive SPICE evaluations.

#### 2.4.2.1 Gaussian Process Regression

GPR [25] is a nonparametric Bayesian method used to approximate black-box functions when observations are limited and potentially noisy. It models an unknown scalar-valued function f(x) as a distribution over functions, such that the outputs corresponding to any finite set of inputs follow a multivariate Gaussian distribution.

Let  $f: \mathbb{R}^d \to \mathbb{R}$  be a latent function modeled as a Gaussian process:

$$f(x) \sim GP(m(x), k(x, x')),$$

where m(x) is the mean function, and k(x,x') is a positive semi-definite kernel function.

Given training inputs  $X = [x_1, ..., x_N]^T$  and noisy observations  $y = f(X) + \varepsilon$  with  $\varepsilon \sim N(0, \sigma_n^2 I)$ , the joint distribution of training outputs and the function value at a test point  $x_*$  is:

$$\begin{bmatrix} y \\ f(x_*) \end{bmatrix} \sim N \left( \begin{bmatrix} m(X) \\ m(x_*) \end{bmatrix}, \begin{bmatrix} K(X,X) + \sigma_n^2 I & K(X,x_*) \\ K(x_*,X) & K(x_*,x_*) \end{bmatrix} \right),$$

where K(A,B) denotes the kernel matrix with entries  $[K(A,B)]_{ij} = k(a_i,b_j)$ .

The posterior predictive distribution for  $f(x_*)$  is Gaussian:

$$f(x_*) | X, y \sim N(\mu_*, \sigma_*^2),$$

where:

$$\mu_* = m(x_*) + K(x_*, X)[K(X, X) + \sigma_n^2 I]^{-1}(y - m(X)),$$
  
$$\sigma_*^2 = K(x_*, x_*) - K(x_*, X)[K(X, X) + \sigma_n^2 I]^{-1}K(X, x_*).$$

This formulation provides both a prediction and an uncertainty estimate at new points, as shown in Fig. 2.5, enabling a trade-off between exploration and exploitation in a Bayesian optimization (BO) framework. However, the computational complexity scales cubically with the number of training points, $O(N^3)$, which limits GPR to moderate-sized datasets. To address this, [61] introduces a sparse pseudo-input Gaussian process (SPGP) surrogate model for analog circuit sizing. By using $M \ll N$ inducing pseudo-inputs, training and prediction costs drop from $O(N^3)$ and $O(N^2)$ to approximately $O(NM^2)$ and $O(M^2)$, respectively. This achieves a substantial runtime reduction without sacrificing surrogate modeling power. Another limitation is that, for constrained optimization problems, a distinct GPR model is required for each constraint, leading to the need for parallel training and higher computational overhead.
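The posterior mean and variance expressions can be evaluated directly for a small data set. The sketch below uses only the Python standard library and rests on assumptions: a zero mean function $m(x) = 0$, a squared-exponential kernel with unit length scale and variance, fixed (not optimized) hyperparameters, and a dense Gaussian-elimination solve in place of the Cholesky factorization that practical GPR libraries normally use. The three training points are hypothetical.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel k(a, b)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return var * math.exp(-0.5 * sq / length ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(X, y, x_star, noise_var=1e-6, kernel=rbf):
    """Posterior mean and variance at x_star for a zero-mean GP."""
    K = [[kernel(xi, xj) + (noise_var if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    k_star = [kernel(x_star, xi) for xi in X]
    alpha = solve(K, y)       # [K + sigma_n^2 I]^{-1} y
    v = solve(K, k_star)      # [K + sigma_n^2 I]^{-1} K(X, x*)
    mu = sum(k * a for k, a in zip(k_star, alpha))
    var = kernel(x_star, x_star) - sum(k * vi for k, vi in zip(k_star, v))
    return mu, max(var, 0.0)  # guard against tiny negative round-off

X_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 0.5]
print(gp_predict(X_train, y_train, [1.0]))   # at a training point
print(gp_predict(X_train, y_train, [10.0]))  # far from the data
```

At a training input the variance collapses to roughly the noise level, whereas far from the data the prediction reverts to the prior mean with the prior variance.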



Figure 2.5: A 1D GPR example.

In GP-based BO, acquisition functions are used to convert the GP posterior mean $\mu(x)$ and uncertainty $\sigma(x)$ into a scalar criterion for guiding the next evaluation. Several classical acquisition functions are widely adopted [62], [63]. The probability of improvement (PI) selects points most likely to outperform the current best $f^+$ but is often overly greedy. The upper (or lower) confidence bound (UCB/LCB) balances exploration and exploitation through a tunable parameter $\kappa$ in the form $\mu(x) \pm \kappa \sigma(x)$, and is especially suitable for parallel optimization. For multi-objective optimization, the expected hypervolume improvement [64] is frequently applied, preferring samples that enlarge the dominated hypervolume of the Pareto front. Constrained variants, such as constrained expected improvement (cEI) or feasibility-weighted UCB, incorporate the probability of satisfying specifications, making them well suited for engineering design tasks.

Among acquisition functions, expected improvement (EI) is widely used due to its closed form, lack of tuning parameters, and effective trade-off between exploitation and exploration. For a minimization task, let $f(x) \sim N(\mu(x), \sigma^2(x))$ be the GP posterior at candidate x, and let $f^+$ denote the best observed value so far. The improvement is defined as $I(x) = \max(f^+ - f(x), 0)$. The EI criterion is its posterior expectation:

$$EI(x) = \mathbb{E}[I(x)] = (f^{+} - \mu(x)) \Phi(z) + \sigma(x) \phi(z), \quad z = \frac{f^{+} - \mu(x)}{\sigma(x)}, \tag{2.18}$$

where  $\Phi(\cdot)$  and  $\phi(\cdot)$  denote the standard normal cumulative distribution function (CDF) and probability density function (PDF), respectively. The first term in (2.18) favors candidates with small predicted mean (exploitation), while the second term favors those with large uncertainty (exploration). Extensions such as batch EI, noisy EI, and cEI [65], [66] are commonly used in practice for parallel, noisy, or constrained optimization tasks.
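Equation (2.18) needs only the standard normal CDF and PDF, both available from the Python standard library. The sketch below is a minimal illustration for the minimization case; the handling of $\sigma(x) = 0$, where EI reduces to the deterministic improvement, is an added convention, and the numerical inputs are hypothetical.

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimization, Eq. (2.18)."""
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # Phi(z)
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # phi(z)
    return (f_best - mu) * cdf + sigma * pdf

# Two candidates with the same predicted mean but different uncertainty.
print(expected_improvement(mu=1.0, sigma=0.1, f_best=1.0))
print(expected_improvement(mu=1.0, sigma=1.0, f_best=1.0))
```

When the predicted mean equals the incumbent, the first term vanishes and EI is driven entirely by the uncertainty term, so the more uncertain candidate is preferred.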

#### 2.4.2.2 Artificial Neural Networks

ANNs [67] provide flexible, differentiable function approximators that map design variables to circuit performance metrics without requiring explicit physics-based models. An example ANN with two hidden layers is shown in Fig. 2.6. More generally, for an ANN with L layers:

$$\hat{y} = f_{\theta}(x) = \phi_L(W_L \phi_{L-1}(\cdots \phi_1(W_1 x + b_1) \cdots) + b_L),$$

where x represents design variables (e.g., device sizes, biases, capacitor ratios),  $\hat{y}$  stacks one or more performance metrics (e.g., gain, UGB, PM, PSRR, noise),  $\{\phi_{\ell}\}$  are nonlinear activations (e.g., ReLU, GELU, tanh), and parameters  $\theta = \{W_{\ell}, b_{\ell}\}$  are trained from simulation data.

The ANN model is trained on a dataset  $\{(x_i, y_i)\}_{i=1}^N$  generated by circuit simulations, where the network parameters  $\theta$  are obtained by minimizing a regularized least-squares objective,

$$\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} ||y_i - f_{\theta}(x_i)||_2^2 + \lambda ||\theta||_2^2, \tag{2.19}$$

with the first term representing the mean squared error and the second term providing  $L_2$ -regularization to mitigate overfitting. Once trained, the ANN acts as a computationally efficient surrogate model in the optimization frameworks, substantially reducing the number of costly SPICE evaluations. To improve conditioning, inputs are typically normalized (z-score or min-max), and bounded through logit or hyperbolic tangent transformations. Discrete variables can be encoded via ordinal embeddings.

Figure 2.6: An ANN with two hidden layers.
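The layered forward pass and the regularized objective (2.19) can be written out explicitly for a very small network. The sketch below is illustrative only: the weights, the single training sample, and $\lambda$ are hypothetical, the output layer uses a linear activation, and no training step (gradient computation or optimizer) is shown.

```python
import math

def forward(x, layers):
    """Evaluate f_theta(x); each layer is a tuple (W, b, activation)."""
    a = x
    for W, b, act in layers:
        z = [sum(w * ai for w, ai in zip(row, a)) + bi for row, bi in zip(W, b)]
        a = [act(zi) for zi in z]
    return a

def regularized_mse(data, layers, lam):
    """Mean squared error plus an L2 penalty on all weights and biases, Eq. (2.19)."""
    mse = sum(sum((yi - pi) ** 2 for yi, pi in zip(y, forward(x, layers)))
              for x, y in data) / len(data)
    l2 = sum(w * w for W, _, _ in layers for row in W for w in row)
    l2 += sum(bi * bi for _, b, _ in layers for bi in b)
    return mse + lam * l2

identity = lambda z: z
# Hypothetical 2-2-1 network: tanh hidden layer, linear output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.0], identity),
]
data = [([0.0, 0.0], [1.0])]  # one (design variables, performance) pair
loss = regularized_mse(data, layers, lam=0.1)
print(forward([0.0, 0.0], layers), loss)
```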

Unlike GPR, vanilla ANNs do not provide uncertainty. To include uncertainty, deep ensembles, MC dropout, or heteroscedastic heads [68]–[70] can be included, which yield variance estimates usable by acquisition functions or trust-region gating at higher machine learning cost. While ANNs scale well to high-dimensional, nonlinear problems, they exhibit limited data efficiency compared with GPR. In analog circuit sizing, where simulations are expensive and datasets are typically small, this limitation increases the risk of overfitting.

Similarly, when an ANN surrogate is used, it is trained or updated in each iteration and used to obtain predicted performance  $\hat{y}_{ij}$  for n candidates with m metrics. To select the next design for expensive evaluation, probability of further improvement (PFI) can be computed on top of the ANN predictions.

First, each performance column j is normalized to [0,1] and fitted by a Beta distribution (Fig. 2.7) [71]  $Y_j \sim \text{Beta}(\alpha_j, \beta_j)$  with CDF  $F_j(\cdot)$ , capturing the empirical shape (skewed, U-shaped, etc.) induced by the current ANN outputs across the population. For a metric with lower bound specification  $S_j$ , the PFI for candidate i is

$$B_{j}^{i} = \begin{cases} F_{j}\left(Y_{j}^{i}\right) - F_{j}\left(S_{j}\right), & \text{if } Y_{j}^{i} > S_{j}, \\ 0, & \text{otherwise,} \end{cases} \tag{2.20}$$

where $S_j$ is the $j$-th specification. The candidate's potential is then calculated by summing the PFIs across all performance metrics,

$$PO(i) = \sum_{j=1}^{m} B_j^i, \tag{2.21}$$

and candidates are ranked by PO(i) to choose the design for next expensive simulation.
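The ranking in (2.20) and (2.21) can be sketched with the standard library alone. The example rests on several assumptions: the Beta parameters $(\alpha_j, \beta_j)$ are taken as given instead of being fitted to the ANN outputs, predictions and specifications are already normalized to $[0,1]$, and the CDF is obtained by Simpson integration of the PDF, which is only appropriate for $\alpha_j, \beta_j \ge 1$. All numerical values are hypothetical.

```python
import math

def beta_cdf(x, a, b, n=200):
    """Beta(a, b) CDF via composite Simpson integration of the PDF (assumes a, b >= 1)."""
    x = min(max(x, 0.0), 1.0)
    if x == 0.0:
        return 0.0
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    pdf = lambda t: norm * t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    h = x / n
    s = pdf(0.0) + pdf(x)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * pdf(k * h)
    return s * h / 3.0

def pfi(y, spec, a, b):
    """Eq. (2.20): CDF mass between the specification and the prediction."""
    return beta_cdf(y, a, b) - beta_cdf(spec, a, b) if y > spec else 0.0

def potential(y_row, specs, params):
    """Eq. (2.21): sum of PFIs over all metrics of one candidate."""
    return sum(pfi(y, s, a, b) for y, s, (a, b) in zip(y_row, specs, params))

params = [(2.0, 2.0), (2.0, 2.0)]        # assumed Beta fits for two metrics
specs = [0.5, 0.5]                       # normalized lower-bound specifications
candidates = [[0.75, 0.40], [0.60, 0.60]]
scores = [potential(c, specs, params) for c in candidates]
ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

In this example the first candidate scores higher although it violates the second specification, because (2.21) sums per-metric contributions and a violated metric simply contributes zero.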



Figure 2.7: The illustration of Beta distribution.

PFI offers several advantages:

- Surrogate-aligned, uncertainty-robust: PFI integrates over the Beta fit of the ANN's population predictions instead of relying on a single, potentially inaccurate variance estimate.
- Constraint-aware without penalties: Equation 2.20 truncates the tail on the feasible side of each specification  $S_j$ , avoiding ad hoc penalty coefficients and directly relating violation to feasibility.

#### 2.4.3 ANN vs. GPR

For low–moderate dimensions and limited data, GPR is typically preferable due to its sample efficiency and calibrated uncertainty. For higher dimensions, large datasets, or multi-output regressions, ANNs scale better and offer fast training and evaluation. The comparison is detailed in Table 2.2.

| Aspect             | GPR                                                          | ANN                                                          |
|--------------------|--------------------------------------------------------------|--------------------------------------------------------------|
| Data regime        | Small $N$ (tens) / low-mid $d$                               | Mid-to-large $N$ / mid-high $d$                              |
| Uncertainty        | Native, posterior variance                                   | Not native, use ensembles/MC-dropout/heteroscedastic heads   |
| Scalability        | $O(N^3)$ in train, $O(N)$ in inference, sparse GP for relief | $O(N^2)$ in train, $O(N)$ in inference, scales well with $N$ |
| Expressiveness     | Strong with kernels, limited in high $d$                     | Very high (nonlinear interactions)                           |
| Hyperparameters    | Few, interpretable                                           | Many (architecture, regularization), tuning needed           |
| Typical use in EDA | Small-to-mid $d$ circuits with expensive SPICE               | High $d$ circuits, multi-output specs                        |

Table 2.2: ANN vs. GPR as analog surrogate modeling.

## 2.5 Summary

This chapter establishes the theoretical foundation for the thesis by first classifying analog circuits into static and dynamic types, emphasizing their differing behaviors, performance metrics, and verification focuses. It then reviews three generations of analog circuit design methodologies, from analytical hand design to optimization-centric approaches, emphasizing the shift toward automation and data-driven design. Local optimization techniques such as gradient-based methods, NM, and PS are then introduced, followed by global optimization approaches including evolutionary heuristics and surrogate model-assisted methods. The discussion concludes with a comparison of ANN and GPR for surrogate modeling. Together, these topics form the methodological basis for the AI-empowered optimization frameworks developed in later chapters.

# Assessing AI-Empowered Optimization Techniques for Analog Building Block Sizing

## 3.1 Introduction

This chapter provides a comprehensive, design-insight-based comparison between an AI-empowered analog building block sizing framework and the conventional manual methodology. Sizing is a critical step in analog circuit design, but the process is highly time-consuming because of the large number of design variables and possible solutions. Traditionally, sizing has relied heavily on iterative manual effort. As a result, both the analog IC design community and the EDA community have proposed numerous methods to address this challenge. The two communities, however, adopt fundamentally different perspectives: the design community emphasizes design insights and systematic manual optimization, whereas the EDA community focuses on automation through optimization algorithms.

In the design community, the first-generation sizing methodology is based on transistor equivalent circuit models [1]. Its major challenge is accuracy [72]. Results derived from the equivalent circuit model tend to exhibit large discrepancies from both SPICE simulation and actual silicon behavior. To address this challenge, $g_m/I_D$-based sizing methods have become widely used in industry [73], in which transistor equivalent circuit models are replaced by SPICE simulation/measurement-based LUTs. While this approach has demonstrated effectiveness [74]–[77], simplifications are necessary to make the LUTs manageable. For example, the tables are often restricted to a small number of fixed grid points for L, $V_{GS}$, $V_{DS}$ and $V_{BS}$ [78]. Moreover, the effect of width on the behavior of the transistor, such as $V_{th}$, is often neglected [79], [80], which can lead to inaccuracies [81].

Building on key decisions derived from transistor equivalent circuit models or  $g_m/I_D$  (e.g., selecting width-to-length ratios of critical transistors or applying variable conversions for low-voltage design), optimization techniques have also been used in the analog design community. Recent works include applications to oscillators and analog filters [82]–[84]. These approaches complement manual optimization and have demonstrated excellent results.

In summary, the dominant approach in the analog IC design community remains design-insight-driven systematic manual or algorithm-assisted optimization, where design insights are the key. As a result, the obtained sizing solutions are typically consistent with the designer's intentions. However, this reliance on experience presents a drawback: intuitions and estimations based on prior design knowledge play a critical role, which can limit both the quality and efficiency of the sizing process, particularly for less experienced designers.

On the other hand, the EDA community approaches sizing from a different perspective. The sizing problem is formulated as a SPICE simulation-based black-box optimization problem, and the target is to develop efficient solvers (i.e., optimizers). Design insights are largely omitted aside from straightforward cases (e.g., enforcing identical dimensions for differential pair transistors), and most methods in the literature aim for a fully automated "one-button" flow [85], [86]. Since SPICE simulations are used throughout, the challenges of model accuracy are naturally addressed and complex design experience-based decisions are avoided. However, a major challenge is time consumption [82], [87]. When considering complete and stringent specifications across full PVT corners, the optimization can be highly time-consuming due to the many necessary SPICE simulations.

To improve efficiency, AI techniques were introduced. An online machine learning-assisted evolutionary algorithm was proposed for radio frequency (RF) IC synthesis in 2014 [88]. Machine learning-assisted optimization was then introduced into analog building block sizing, for example through BO-based methods [86], [89], [90]. However, analog building block sizing sometimes needs to consider more than 20 performance metrics, and some of them are hard to handle with machine learning techniques (e.g., transient response-related ones) [17]. As a result, although many methods significantly reduce the number of required simulations, the cost of the machine learning itself becomes a new challenge [59].

Recently, this challenge has been addressed by novel machine learning-assisted analog IC sizing methods, including surrogate model-assisted optimization frameworks [17], [59], [60] and reinforcement learning (RL)-based design flows [20], [91], [92]. One of them, the efficient surrogate model-assisted sizing method for high-performance analog building blocks (ESSAB) [17], remains effective when the whole set of specifications is considered, including all the hard-to-learn ones. It is used later in this chapter as an example optimization engine.

Although recent progress in the EDA community shows potential for practical use, the conventional design insight/experience-based systematic manual sizing method still dominates the design community. This calls for a comprehensive comparative study of the two methods. In the literature, most EDA research works that focus on sizing algorithms provide only schematic-level simulation results. Even among those with silicon validation, often only one kind of circuit is considered. Moreover, comparisons with solutions obtained by conventional systematic manual design methods using comprehensive design insights are rare, although they may matter most for convincing the design community.

# 3.2 Contributions

To fill this gap, this work performs a comparative study. In this thesis, the term AI-empowered emphasizes workflows that involve designer participation, whereas AI-driven denotes processes in which AI algorithms make all decisions with minimal human intervention. The focus here is on AI-empowered sizing, which leverages AI capabilities while maintaining designer interactions. Using four case studies, including a comparator, two amplifiers (one standard and one low-power design), and an oscillator, in technologies from 65 nm to 0.35 $\mu$m, the design solutions obtained by AI-empowered sizing are compared with expert designs from the literature and/or industry, where design insight-dominated systematic sizing is used. A comprehensive design insight-based comparison is provided for all the case studies (i.e., why a design from a certain kind of method is better based on circuit working principles), together with silicon validation for three of them. The remainder of this chapter is organized as follows. Section 3.3 introduces the AI-empowered analog building block sizing approach. Section 3.4 demonstrates the comparison using four case studies, followed by concluding remarks. The key contributions of this work are listed below:

- 1. Reliability and Designer Intent Alignment: The study evaluates whether solutions obtained by AI-empowered analog IC sizing methods are reliable and consistent with the designer's intentions, addressing a long-standing concern in the design community [93].
- 2. Quality and Efficiency Comparison: It investigates how the AI-empowered analog IC sizing methods perform in terms of design solution quality and efficiency compared to the dominant method in the design community.
- 3. Role of Design Insights: The research investigates the role of design insights in AI-empowered sizing.

# 3.3 The AI-Empowered Analog Building Block Sizing Approach

#### 3.3.1 AI-Empowered Sizing Framework

Fig. 3.1 shows the framework for an AI-empowered sizing method with human interactions, where \* is used to annotate human interactions. It has eight main steps.



Figure 3.1: The flow diagram of the AI-empowered analog building block sizing approach.

#### • Step 1\*: Problem Formulation

The inputs include the circuit topology, design variables with wide search ranges (i.e., without needing much design insight or any initial design), design specifications, the FoM (objective function), and PVT corners. In this step, the analog IC sizing problem is formulated as a single-objective constrained optimization problem.

#### • Step 2: First-Phase Optimization: Worst Case Corner

An AI-empowered global optimization is performed considering the worst-case corner (WCC), which is case dependent. This choice is motivated by the observation that satisfying the specification at the WCC often promotes robustness across the remaining PVT corners. In this work, the corner with high temperature, low supply voltage, and slow NMOS/slow PMOS serves as the WCC example. The goal of this phase is to obtain an optimal design that satisfies the specifications under the WCC.

#### • Step 3\*: Designer Validation and Feedback

Designers analyze the obtained design by observing its responses in the time and/or frequency domain. The responses to be analyzed are case-dependent. For example, the transient waveform of a step response may be examined for a unity-gain buffer.

#### • Step 4\*: Problem Re-Formulation (If Necessary)

If the obtained design is not aligned with the designer's intentions, the optimization problem can be reformulated. This may involve structural modifications (e.g., topology changes) or specification adjustments (e.g., adding an overshoot constraint). The optimization is then reinitiated from Step 2.

#### • Step 5: Second-Phase Optimization: All Corners

Once the WCC solution is approved, the second-phase optimization is carried out. Here, another AI-driven global optimization using full-corner simulations is executed to obtain a PVT-robust design.

#### • Step 6\*: Designer Validation

The designer re-validates the results. A successful outcome satisfies all specifications across PVT corners and achieves an optimal FoM. At this stage, the second-phase sizing is considered complete, and the process advances to layout and post-layout verification. If validation fails, the process returns to Step 4\*.

#### • Step 7\*: Layout and Post-Layout Analysis

After passing all-corner validation, the design undergoes layout generation and post-layout simulations, followed by MC simulations. Passing this step indicates that the design process is nearly complete.

#### • Step 8\*: Final Designer Validation

If the design fails to meet the specifications at the layout stage, the reasons must be identified. One approach is to first attempt layout-level optimization. If the design still fails, a straightforward remedy is to apply over-design by leaving additional margin for the violated specification. Since layout optimization is not considered in this framework, Step 4\* may need to be revisited based on this analysis.
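One common way to write the problem formulated in Step 1 and solved in Steps 2 and 5 is sketched below. The notation is an assumption introduced here for illustration and is not taken from the framework itself: $x$ collects the $d$ design variables within the bounds $[a,b]^d$, $f$ is the FoM, $g_i$ are the $m$ specifications rewritten so that $g_i \le 0$ means "satisfied", $c_{\mathrm{WCC}}$ is the worst-case corner, and $\mathcal{C}$ is the set of all PVT corners. How the FoM is aggregated over corners is a design choice; the sketch simply evaluates it at a chosen corner $c_0$.

$$
\begin{aligned}
\text{Phase 1 (WCC):}\quad & \min_{x \in [a,b]^d} f(x, c_{\mathrm{WCC}}) \quad \text{s.t.}\quad g_i(x, c_{\mathrm{WCC}}) \le 0,\quad i = 1,\dots,m,\\
\text{Phase 2 (all corners):}\quad & \min_{x \in [a,b]^d} f(x, c_0) \quad \text{s.t.}\quad \max_{c \in \mathcal{C}} g_i(x, c) \le 0,\quad i = 1,\dots,m.
\end{aligned}
$$

Under this reading, problem re-formulation in Step 4\* corresponds to changing the set of constraints $g_i$ (for example, adding an overshoot limit) or the bounds $[a,b]^d$ before the optimization is restarted.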

The role of the human designer in AI-empowered analog IC sizing is crucial. While many EDA methods aim for a fully automated, "one-button" approach where the user simply specifies targets and the tool outputs a complete design in one go, designers should remain in the loop. Their involvement enhances AI-driven sizing by interpreting intermediate results and refining specifications (i.e., designer validation and problem re-formulation in Fig. 3.1), which ensures that the design aligns with intent, as initial specifications (i.e., problem formulation in Fig. 3.1) may not fully capture requirements, and a mathematically optimal solution may not always be desirable.

Note that this is different from using optimizers in conventional systematic manual sizing. In such methods, although optimizers are used, key sizing decisions (e.g., the width-to-length ratios of key transistors [82]) come from design insights. These design insights require substantial experience and may compromise the design quality. In contrast, in AI-empowered analog IC sizing, the designer does not make such experience-based decisions. Instead, they only need to validate the solution obtained by the optimizer and (re)formulate the sizing problem to better represent the design requirements. Therefore, this AI-empowered analog IC sizing approach does not replace the designer, but avoids the reliance on extensive design experience required in conventional manual sizing methods. In other words, the designer only needs to analyze the sized circuits.

#### 3.3.2 Global Optimization Engine

A global optimization engine is required in the AI-empowered sizing framework to efficiently explore the design space of analog building blocks. In this work, the ESSAB algorithm [17] is used as the optimizer. ESSAB is an online machine learning-assisted global optimization approach specifically designed for analog IC sizing problems, where simulation-based evaluations are computationally expensive.

Surrogate-assisted optimization is a well-established technique that builds an approximate model (surrogate) of the optimization target to reduce the number of expensive simulations. ESSAB follows this principle and integrates three key components: a search engine, a surrogate modeling method, and an infill sampling strategy. It uses an ANN to model the relationship between circuit parameters and performance metrics, which has a lower training cost than a Gaussian process surrogate model when the number of specifications is large. DE is used as the core global search technique. The infill strategy uses a Beta distribution-based ranking mechanism to guide the selection of new sample points for simulation, balancing exploration and exploitation. The ESSAB algorithm [17] proceeds through seven steps as shown in Fig. 3.1, which are detailed as follows:

- 1. **Initialization:** Sample a small number  $\alpha$  of candidate solutions from the design space  $[a,b]^d$  using the Latin hypercube sampling method, where d is the number of design variables. Perform SPICE simulations for each sample and store the results in the initial database.
- 2. **Stopping Criterion Check:** If a predefined stopping condition is met (e.g., maximum number of iterations or convergence threshold), output the best design found so far. Otherwise, proceed to the next step.
- 3. **Candidate Selection for Search:** Rank all designs in the database using the infill sampling criterion to balance exploration and exploitation. Select the top $\lambda$ designs as a parent population P.
- 4. **Offspring Generation:** Apply DE mutation and crossover operations to P to generate $\lambda$ child candidate solutions.
- 5. **Surrogate Model Construction:** Select the top $\tau$ solutions from the database and use them as training data to construct an ANN model for performance prediction.
- 6. **Child Solution Evaluation:** Evaluate the $\lambda$ child solutions using the ANN model and rank them using the infill sampling criterion.
- 7. **Simulation and Update:** Simulate the estimated best child solution from Step 6 using SPICE. Add its performance to the database, then return to Step 2.
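The loop structure of the seven steps above can be sketched in a few lines of Python. This is a simplified illustration, not the ESSAB implementation: the ANN is replaced by a toy nearest-neighbour surrogate, the Beta distribution-based infill criterion is replaced by plain ranking on the objective, constraints are omitted, and `toy_simulation` is a hypothetical stand-in for a SPICE run. All function names are assumptions made for this example.

```python
import math
import random


def toy_simulation(x):
    """Stand-in for a SPICE run: cheap objective with its minimum at x_i = 0.3."""
    return sum((xi - 0.3) ** 2 for xi in x)


def latin_hypercube(n, d, rng):
    """Return n points in [0, 1]^d with one point per stratum in every dimension."""
    columns = []
    for _ in range(d):
        strata = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(strata)
        columns.append(strata)
    return [[columns[j][i] for j in range(d)] for i in range(n)]


def knn_predict(x, data, k=3):
    """Toy surrogate (NOT an ANN): inverse-distance-weighted mean of k nearest samples."""
    nearest = sorted(data, key=lambda s: math.dist(x, s[0]))[:k]
    weights = [1.0 / (math.dist(x, s[0]) + 1e-12) for s in nearest]
    return sum(w * s[1] for w, s in zip(weights, nearest)) / sum(weights)


def de_offspring(parents, rng, f=0.8, cr=0.8):
    """DE/rand/1 mutation with binomial crossover, clipped to [0, 1]."""
    children = []
    for target in parents:
        r1, r2, r3 = rng.sample(parents, 3)
        forced = rng.randrange(len(target))  # at least one gene comes from the mutant
        child = []
        for j, xj in enumerate(target):
            if j == forced or rng.random() < cr:
                xj = r1[j] + f * (r2[j] - r3[j])
            child.append(min(1.0, max(0.0, xj)))
        children.append(child)
    return children


def surrogate_assisted_de(d=4, budget=60, seed=0):
    rng = random.Random(seed)
    n = 5 * d  # alpha = lambda = tau = 5 * d
    # Step 1: initial database from Latin hypercube samples.
    database = [(x, toy_simulation(x)) for x in latin_hypercube(n, d, rng)]
    history = [min(y for _, y in database)]
    # Step 2: stop after a fixed number of additional "simulations".
    for _ in range(budget):
        database.sort(key=lambda s: s[1])
        top = database[:n]                     # Steps 3 and 5: best designs so far
        parents = [x for x, _ in top]
        children = de_offspring(parents, rng)  # Step 4: DE mutation and crossover
        # Step 6: rank the children with the surrogate and keep the predicted best.
        best_child = min(children, key=lambda c: knn_predict(c, top))
        # Step 7: only the selected child is simulated and added to the database.
        database.append((best_child, toy_simulation(best_child)))
        history.append(min(history[-1], database[-1][1]))
    return min(database, key=lambda s: s[1]), history


if __name__ == "__main__":
    (best_x, best_y), history = surrogate_assisted_de()
    print("best objective after", len(history) - 1, "extra simulations:", best_y)
```

The point of the structure is visible in Step 7: each iteration spends one expensive evaluation, while the many candidate children are screened by the cheap surrogate.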

In this work and the other works in this thesis, $\alpha$, $\lambda$, and $\tau$ are each set to $5 \times d$, following an empirical design rule. The initial search space is defined to be relatively large, while ensuring full compliance with process design kit (PDK) limits and design common sense. This allows the exploration to cover all potentially feasible regions without violating manufacturing or reliability limits. The machine learning-assisted nature of ESSAB, with its adaptive surrogate modeling and infill sampling, allows the algorithm to learn rapidly from evaluated designs and focus the search on promising regions, enabling efficient convergence even when the search space is large. This property is particularly advantageous for inexperienced designers, as it reduces the need for precise search space specification at the outset. The robustness of ESSAB to large search spaces is further demonstrated in the four representative design cases presented in Section 3.4.

# 3.4 Comparative Study Using Four Design Cases

In this section, four case studies, including a comparator, two amplifiers (one standard and one low-power design), and an oscillator, are used to compare the AI-empowered analog building block sizing method and the conventional systematic design method. These four circuits are selected as representative analog building blocks encompassing both static and dynamic behaviors, enabling a comprehensive evaluation of the AI-empowered sizing method across diverse design objectives. The designs for the former method are obtained by following the flow in Section 3.3. For the latter method, the selected reference designs are expert designs obtained from various sources: the reference comparator design is from the literature [94]; the reference amplifier design using the standard voltage is from an industry IP core; the low-power amplifier design is compared with the state of the art in the literature; for the oscillator, the reference design is from industry experts, and it is also compared with state-of-the-art designs in the literature. Measurement results are provided for the amplifier and oscillator designs, and post-layout simulation results are used for the comparator.

The experimental implementation is illustrated in Fig. 3.2. User inputs are specified in YAML format and processed by MATLAB R2024a, which serves as the global optimization engine. Circuit performance evaluation is carried out using the Spectre simulator within Cadence Virtuoso 23.1, with communication between MATLAB and Cadence managed through Ocean scripts. PVT corners are defined in the Virtuoso Assembler. All simulations run on a workstation equipped with an AMD Ryzen Threadripper PRO 3975WX CPU and 290 GB of RAM, utilizing 32 cores for parallel execution. Reported runtimes correspond to wall-clock time.



Figure 3.2: Workflow of the experimental implementation.

#### 3.4.1 StrongARM Latch Comparator

As a representative dynamic circuit, the StrongARM latch comparator is selected as the first case study due to its widespread use in high-speed and low-power mixed-signal systems. The StrongARM latch comparator to be designed is shown in Fig. 3.3. It operates in two phases: during reset, internal nodes are reset to $V_{DD}$; during evaluation, the input differential pair first performs a pre-amplification by discharging the internal nodes proportionally to the input difference, after which the cross-coupled latch regenerates this imbalance to full-swing outputs at an exponential rate. This clocked operation enables high-speed, low-power decisions without static current, which makes the StrongARM latch especially attractive for energy-efficient ADC applications [95].

The 13 design variables to be determined are shown in Table 3.1, with wide search ranges. Sixteen PVT corners are considered: slow NMOS/slow PMOS (SS), slow NMOS/fast PMOS (SF), fast NMOS/slow PMOS (FS), and fast NMOS/fast PMOS (FF), each combined with high temperature (HT, 125°C) or low temperature (LT, -40°C) and with high supply voltage (HV, 1.26 V) or low supply voltage (LV, 1.08 V). The clock frequency is 20 MHz, the nominal supply voltage is 1.2 V, and the common-mode input voltage is 0.8 V. The reference design is from [94], based on which the specifications are set (Table 3.2). The specifications are demanding. The same 180 nm CMOS technology is used for both designs.
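As a small illustration of how the 16 corners arise, the sketch below enumerates the 4 process corners, 2 temperatures, and 2 supply voltages given in the text. The naming scheme (e.g., `SS_HT_LV`) and the dictionary layout are assumptions made for this example; they are not the format used by the simulation setup in this work.

```python
from itertools import product

PROCESS = ["SS", "SF", "FS", "FF"]        # NMOS/PMOS process corners
TEMPERATURE_C = {"HT": 125, "LT": -40}     # degrees Celsius
SUPPLY_V = {"HV": 1.26, "LV": 1.08}        # volts


def enumerate_corners():
    """Return every process x temperature x supply combination as a dict."""
    corners = []
    for proc, (t_name, temp), (v_name, vdd) in product(
        PROCESS, TEMPERATURE_C.items(), SUPPLY_V.items()
    ):
        corners.append({
            "name": f"{proc}_{t_name}_{v_name}",
            "process": proc,
            "temp_c": temp,
            "vdd": vdd,
        })
    return corners


corners = enumerate_corners()
# The WCC example used in this work: slow/slow, high temperature, low supply.
wcc = next(c for c in corners if c["name"] == "SS_HT_LV")
print(len(corners), wcc)
```

In the two-phase flow, only `wcc` would be simulated during the first phase, and the full `corners` list during the second phase.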



Figure 3.3: Schematic of the classic StrongARM latch comparator.

In this case study, the ESSAB algorithm finds a reliable and optimal design in Step 2 and Step 5. Validation shows that there is no need to re-formulate the optimization problem. The AI-empowered sizing method takes 4 hours to complete the whole sizing process. The transient responses are shown in Fig. 3.4.

Table 3.1: Design variables and search ranges of StrongARM latch comparator.

| Variables | Min. | Max. | Ref. design [94] | AI-empowered design |
|-----------|------|------|------------------|---------------------|
| $L_{Mb}$ ($\mu$m) | 0.18 | 1 | 0.18 | 0.34 |
| $L_{1,2}$ ($\mu$m) | 0.18 | 1 | 0.18 | 0.20 |
| $L_{3,4}$ ($\mu$m) | 0.18 | 1 | 0.18 | 0.265 |
| $L_{5,6}$ ($\mu$m) | 0.18 | 1 | 0.18 | 0.39 |
| $L_{7,8}$ ($\mu$m) | 0.18 | 1 | 0.18 | 0.58 |
| $L_{9,10}$ ($\mu$m) | 0.18 | 1 | 0.18 | 0.18 |
| $W_{Mb}$ ($\mu$m) | 1.76 | 80 | 8.8 | 10.76 |
| $W_{1,2}$ ($\mu$m) | 2.2 | 100 | 44 | 91.5 |
| $W_{3,4}$ ($\mu$m) | 0.88 | 40 | 4.4 | 24.0 |
| $W_{5,6}$ ($\mu$m) | 0.88 | 40 | 8.8 | 16.44 |
| $W_{7,8}$ ($\mu$m) | 0.88 | 40 | 4.4 | 17.04 |
| $W_{9,10}$ ($\mu$m) | 0.88 | 40 | 2.2 | 19.49 |
| $C_X/NH$ (F) | 22 f/10 | 17.5 p/288 | 1.1 p/72 | 0.28 p/36 |

The AI-empowered design and the reference design in [94] are compared in Table 3.1 and Table 3.2 (both report pre-layout simulation results). The AI-empowered design outperforms the reference design obtained with the conventional method on most performance metrics. In particular, the power-noise product (i.e., the FoM) improves by 62%. Additionally, the proposed design satisfies the specifications at all 16 corners, in contrast to the reference design.
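The comparison can be reproduced mechanically from the pre-layout numbers. The sketch below transcribes the constrained rows of Table 3.2 and lists, for each column, the specifications that are violated; it also recomputes the nominal FoM improvement. Only the nominal corner and the WCC are covered, since those are the columns reported in the table. The dictionary layout and function names are assumptions made for this example.

```python
# Upper limits from the "Specifications" column of Table 3.2.
LIMITS = {
    "Power (uW)": 90, "Delay (ns)": 6, "Reset (ns)": 1,
    "Reset error (uV)": 20, "Set error (uV)": 20,
    "Pos_res_int (uV)": 15, "Neg_res_int (uV)": 15,
    "Pos_res_out (uV)": 15, "Neg_res_out (uV)": 15,
    "IRN (uVrms)": 70,
}

COLUMNS = ("ref_nominal", "ai_nominal", "ref_wcc", "ai_wcc")

# Pre-layout values transcribed from Table 3.2, in the column order above.
TABLE = {
    "Power (uW)": (89, 53, 89, 51),
    "Delay (ns)": (3.77, 2.69, 8.88, 5.82),
    "Reset (ns)": (1.8, 0.47, 3.56, 0.85),
    "Reset error (uV)": (0.75, 0.02, 16.16, 0.04),
    "Set error (uV)": (4.31, 1.13, 3.28, 2.28),
    "Pos_res_int (uV)": (2.99, 0.85, 1.72, 1.29),
    "Neg_res_int (uV)": (3.58, 1.03, 1.56, 0.97),
    "Pos_res_out (uV)": (0.83, 0.04, 13.53, 0.07),
    "Neg_res_out (uV)": (0.08, 0.02, 2.63, 0.03),
    "IRN (uVrms)": (70.68, 45.95, 64.65, 53.97),
}


def violations(column):
    """Return the metrics whose value exceeds its upper limit in the given column."""
    idx = COLUMNS.index(column)
    return [metric for metric, values in TABLE.items() if values[idx] > LIMITS[metric]]


def fom_improvement(ref, new):
    """Relative reduction of a minimized FoM."""
    return (ref - new) / ref


for column in COLUMNS:
    print(column, violations(column))
# Nominal Power x IRN values (nW*V) from the first row of Table 3.2.
print(f"FoM improvement: {fom_improvement(6.32, 2.42):.1%}")
```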

Compared to the reference design at the nominal corner (Table 3.2), the RMS input-referred noise (IRN) is reduced by approximately 35% and the power by approximately 40%, without a speed trade-off. More design insights are as follows. Referring to [94], the IRN is contributed by both the dynamic integrator and the latch, as described by

$$\sigma_{n,in} = \sqrt{\sigma_{n,int}^2 + \frac{\sigma_{n,latch}^2}{A_{int}^2}} \tag{3.1}$$

where $A_{int}$ is approximated as

$$A_{int} = \frac{g_m}{I_D} \cdot V_{THN} \tag{3.2}$$

which is determined by the input transistor  $g_m/I_D$  and the threshold voltage  $V_{THN}$ .
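To make Eqs. (3.1) and (3.2) concrete, the sketch below evaluates them numerically. The integration gains of 6 and 8.7 are the values reported in this section for the reference and AI-empowered designs; the noise values `SIGMA_INT` and `SIGMA_LATCH` are hypothetical placeholders chosen only to show the trend, and in practice both would also differ between the two designs.

```python
import math


def integration_gain(gm_over_id, v_thn):
    """Eq. (3.2): A_int = (g_m / I_D) * V_THN."""
    return gm_over_id * v_thn


def input_referred_noise(sigma_int, sigma_latch, a_int):
    """Eq. (3.1): latch noise is divided by A_int before adding in quadrature."""
    return math.sqrt(sigma_int ** 2 + (sigma_latch / a_int) ** 2)


# Hypothetical integrator and latch noise (uV rms), identical for both designs
# so that only the effect of A_int is visible.
SIGMA_INT, SIGMA_LATCH = 40.0, 200.0

for label, a_int in (("reference", 6.0), ("AI-empowered", 8.7)):
    total = input_referred_noise(SIGMA_INT, SIGMA_LATCH, a_int)
    print(f"{label}: A_int = {a_int}, input-referred noise = {total:.1f} uV rms")

# Scaling of the latch contribution to the noise variance when A_int rises from 6 to 8.7.
print(f"latch variance scaling: {(6.0 / 8.7) ** 2:.3f}")
```

Because the latch term enters as $\sigma_{n,latch}^2 / A_{int}^2$, raising $A_{int}$ from 6 to 8.7 roughly halves the latch contribution to the noise variance, independent of the placeholder values chosen above.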

Table 3.2: Pre-layout performance values of the design obtained by AI-empowered method and the reference design of StrongARM latch comparator.

| Performances | Specifications | Ref. design [94] (Nominal) | AI-empowered design (Nominal) | Ref. design [94] (WCC) | AI-empowered design (WCC) |
|--------------|----------------|----------------------------|-------------------------------|------------------------|---------------------------|
| Power·IRN (nW·V) | Minimize | 6.32 | 2.42 | 5.77 | 2.24 |
| Power ($\mu$W) | $\leq 90$ | 89 | 53 | 89 | 51 |
| Delay (ns) | $\leq 6$ | 3.77 | 2.69 | 8.88 | 5.82 |
| Reset (ns) | $\leq 1$ | 1.8 | 0.47 | 3.56 | 0.85 |
| Reset error ($\mu$V) | $\leq 20$ | 0.75 | 0.02 | 16.16 | 0.04 |
| Set error ($\mu$V) | $\leq 20$ | 4.31 | 1.13 | 3.28 | 2.28 |
| Pos_res_int<sup>1</sup> ($\mu$V) | $\leq 15$ | 2.99 | 0.85 | 1.72 | 1.29 |
| Neg_res_int<sup>2</sup> ($\mu$V) | $\leq 15$ | 3.58 | 1.03 | 1.56 | 0.97 |
| Pos_res_out<sup>3</sup> ($\mu$V) | $\leq 15$ | 0.83 | 0.04 | 13.53 | 0.07 |
| Neg_res_out<sup>4</sup> ($\mu$V) | $\leq 15$ | 0.08 | 0.02 | 2.63 | 0.03 |
| IRN ($\mu$Vrms) | $\leq 70$ | 70.68 | 45.95 | 64.65 | 53.97 |

<sup>1</sup> Reset error at integration node $V_{X+}$.

<sup>2</sup> Reset error at integration node $V_{X-}$.

<sup>3</sup> Reset error at output node $V_{OUTP}$.

<sup>4</sup> Reset error at output node $V_{OUTN}$.

A larger integration gain, $A_{int}$, is preferred to obtain lower noise, and this is achieved by the AI-empowered sizing (8.7 vs. 6 for the reference design). A larger $A_{int}$ requires a larger $g_m/I_D$ of the input transistor pair. For the input pair, the transistor width is roughly doubled (91.5 $\mu$m compared to 44 $\mu$m), providing a larger $g_m/I_D$. Additionally, the AI-empowered design increases the enable switch resistance by approximately a factor of two, which reduces $V_{gs}$ of the input transistor pair and hence further increases $g_m/I_D$. Moreover, a 100 fF capacitance increase at the output nodes due to the cross-coupled pair and pull-up transistors also contributes to noise reduction, whereas a minimum value is used in the reference design.

Compared to the reference design, a 48.8% speed improvement is shown. Referring to [94], [96], [97], the integration time can be approximated as

$$T_{int} = \frac{C_X}{I_D} \cdot V_{THN},\tag{3.3}$$

where $I_D$ is the common-mode drain current of the input pair and $V_{THN}$ is the threshold voltage of transistors M3 and M4.

Although $I_D$ is higher, the $C_X$ of the reference design is 1.1 pF compared to 0.28 pF in the AI-empowered design, causing larger delays in the pre-amplification phase. The integration time (Fig. 3.4) is 2 ns for the AI-empowered design, in contrast with 3 ns for the reference design. In the latch regeneration phase, the differential output voltage grows exponentially with a time constant of $C/(g_{mn} + g_{mp})$, and the pull-down behavior dominates. The increased $W_{3,4}/L_{3,4}$ in the proposed design enhances $g_{mn}$, thereby decreasing the pull-down time: it is 0.43 ns in the AI-empowered design, as opposed to 0.62 ns in the reference design. Combining these effects, the total delay of the AI-empowered design is reduced. It should also be noted that the power consumption is reduced by about 40% (Table 3.2) because the AI-empowered design has a $C_X$ of 0.28 pF compared to 1.1 pF in the reference design.
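As a back-of-the-envelope consistency check (not part of the original analysis), Eq. (3.3) can be applied to both designs as a ratio. Assuming that Eq. (3.3) holds exactly and that $V_{THN}$ is the same for both designs, with the subscripts "ref" and "AI" introduced here for the reference and AI-empowered designs:

$$
\frac{T_{int,\mathrm{ref}}}{T_{int,\mathrm{AI}}}
= \frac{C_{X,\mathrm{ref}}}{C_{X,\mathrm{AI}}} \cdot \frac{I_{D,\mathrm{AI}}}{I_{D,\mathrm{ref}}}
\quad\Rightarrow\quad
\frac{I_{D,\mathrm{ref}}}{I_{D,\mathrm{AI}}}
= \frac{C_{X,\mathrm{ref}} / C_{X,\mathrm{AI}}}{T_{int,\mathrm{ref}} / T_{int,\mathrm{AI}}}
\approx \frac{1.1 / 0.28}{3 / 2}
\approx \frac{3.93}{1.5}
\approx 2.6 .
$$

Under these assumptions, the reported capacitances and integration times correspond to a reference-design integration current roughly 2.6 times larger. The figure is only indicative, since Eq. (3.3) is itself an approximation.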



Figure 3.4: Transient response of the AI-empowered design and the reference design [94] at output nodes  $V_{OUTP}$  and  $V_{OUTN}$ .

Using the design obtained from Step 5, Step 6\* is carried out and all the specifications are satisfied in MC analysis. The layout is then carried out (Fig. 3.5) and the post-layout simulation results are shown in Table 3.3, Fig. 3.6, and Fig. 3.7.



Figure 3.5: Layouts for the AI-empowered design and the reference design [94] of the StrongARM latch comparator.

Table 3.3: Post-layout performance values of the AI-empowered design and the reference design of the StrongARM latch comparator.

| Performances | Specifications | Ref. design [94] (Nominal) | AI-empowered design (Nominal) | Ref. design [94] (WCC) | AI-empowered design (WCC) |
|--------------|----------------|----------------------------|-------------------------------|------------------------|---------------------------|
| Power·IRN (nW·V) | Minimize | 7.81 | 3.21 | 6.26 | 5.11 |
| Power ($\mu$W) | $\leq 90$ | 106 | 67 | 96 | 59.6 |
| Delay (ns) | $\leq 6$ | 4.68 | 3.4 | 9.88 | 6.78 |
| Reset (ns) | $\leq 1$ | 2.25 | 0.58 | 4.06 | 0.98 |
| Reset error ($\mu$V) | $\leq 20$ | 3.1 | 0.7 | 53.05 | 0.01 |
| Set error ($\mu$V) | $\leq 20$ | 0.61 | 0.59 | 0.50 | 7.37 |
| Pos_res_int ($\mu$V) | $\leq 15$ | 0.22 | 0.35 | 0.58 | 3.89 |
| Neg_res_int ($\mu$V) | $\leq 15$ | 0.01 | 0.31 | 0.17 | 3.85 |
| Pos_res_out ($\mu$V) | $\leq 15$ | 2.96 | 0.83 | 48.54 | 0.48 |
| Neg_res_out ($\mu$V) | $\leq 15$ | 0.15 | 0.13 | 4.52 | 0.48 |
| IRN ($\mu$Vrms) | $\leq 70$ | 73.71 | 47.88 | 65.21 | 52.67 |



Figure 3.6: Comparison of the post-layout responses of the AI-empowered design and the reference design [94] of the StrongARM latch comparator.

It can be observed that, in this case, the ESSAB solver, when applied to PVT corners, not only generates designs consistent with the working principles of the targeted comparator but also makes subtle decisions (particularly, several subtle decisions together) that are challenging for conventional systematic design methods, leading to superior performance. This is because ESSAB is driven by data (i.e., machine learning and heuristic optimization techniques), instead of a series of empirical decisions affected by model accuracy and the designer's experience and intuition. After validating the obtained design, the optimization problem does not need to be reformulated and a one-button design process is achieved.



Figure 3.7: Comparison of the speed of the AI-empowered design and the reference design [94] across 16 corners of the StrongARM latch comparator, with post-layout (a) delay time and (b) reset time grouped by four process corners.

#### 3.4.2 Two-Stage Miller-Compensated Op-Amp (3.3 V)

In this case study, a two-stage Miller-compensated op-amp is used. It is one of the most widely adopted amplifier topologies in analog IC design. Compared with the previous dynamic circuit case, this static analog circuit exhibits more intricate design trade-offs among gain, bandwidth, stability, power consumption, and linearity. It consists of a differential input stage that provides high gain and a second gain stage that further boosts overall amplification and drives the load. A compensation capacitor is connected between the output of the second stage and the intermediate node, serving primarily for pole splitting: it moves the pole at the high-impedance first-stage output node to a lower frequency and the pole at the second-stage output node to a higher frequency, which improves stability.

Fig. 3.8 shows the structure of the two-stage Miller-compensated op-amp, which is an IP core used by a leading semiconductor company to demonstrate its 0.35 $\mu$m technology. Its 16 design variables are shown in Table 3.4, with very wide search ranges. The following 16 corners are considered: SS, SF, FS, and FF, each combined with HT (125°C) or LT (0°C) and with HV (3.47 V) or LV (2.97 V). The op-amp is designed to drive a 10 pF capacitor and a 10 M$\Omega$ resistor load. The supply voltage is 3.3 V and the common-mode input voltage is 1.65 V. Ten performance specifications are shown in the first column of Table 3.5. The optimization goal is to minimize the power consumption of the IP core while keeping all other performances comparable, including high CMRR, PSRR, and differential-mode gain (ADM), adequate PM, and gain bandwidth (GBW). Additionally, IRN and total harmonic distortion (THD) are considered in the optimization process, and consistent performance across PVT corners is expected. All performance metrics are measured in a unity-gain feedback configuration except CMRR, which is measured using a capacitively coupled instrumentation amplifier (CCIA) structure. The same 0.35 $\mu$m technology is used for both designs.



Figure 3.8: Schematic of the two-stage Miller-compensated op-amp. The devices shown in gray (M12–M14) are auxiliary transistors used for startup and shutdown control.

**Addressing the unexpected overshoot.** The sizing result obtained by running Step 2 (i.e., AI-empowered design 1) is shown in Table 3.5 (pre-layout simulation result). Although its performance is excellent (Table 3.5), its step response shows an overshoot of 2.3 dB when the op-amp is configured as a unity-gain buffer (Fig. 3.9). Design insights show that this is caused by the position of the Miller compensation zero, estimated as

$$z = \frac{1}{(1/g_{m6} - R_C) \cdot C_C} \tag{3.4}$$

where $g_{m6}$ is the transconductance of transistor M6, and $R_C$ and $C_C$ are the Miller compensation resistance and capacitance. The wide range of the Miller resistance in the initial search ranges allows for multiple design strategies. The ESSAB solver places the zero in the left half-plane to enhance PM, with $g_{m6}$ around 0.69 mS and $R_C$ approximately 6 k$\Omega$. This zero contributes to the increased GBW. However, it introduces a pole-zero doublet [98], thereby engendering undesired oscillations. In contrast, the reference design aims to push the right half-plane zero towards infinity to mitigate its negative impact, with $g_{m6}$ at approximately 2.38 mS and $R_C$ around 200 $\Omega$. Our observations of the loop-gain and closed-loop gain responses also verify this design insight.
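The sign argument behind Eq. (3.4) can be checked with the approximate values quoted above. The sketch below evaluates the bracketed term $1/g_{m6} - R_C$ for both designs and classifies the zero accordingly, taking a positive $z$ as a right half-plane zero. The function names are assumptions made for this example, and $C_C$ is left as a free parameter of `zero_frequency` because it only scales the zero frequency and does not affect its sign.

```python
import math


def zero_resistance_term(gm6, rc):
    """Return 1/g_m6 - R_C in ohms, the bracketed term of Eq. (3.4)."""
    return 1.0 / gm6 - rc


def classify_zero(gm6, rc):
    """Classify the compensation zero by the sign of 1/g_m6 - R_C."""
    term = zero_resistance_term(gm6, rc)
    if term == 0:
        return "at infinity"
    return "right half-plane" if term > 0 else "left half-plane"


def zero_frequency(gm6, rc, cc):
    """Eq. (3.4): z in rad/s (positive: right half-plane, negative: left half-plane)."""
    term = zero_resistance_term(gm6, rc)
    return math.inf if term == 0 else 1.0 / (term * cc)


# Approximate values quoted in the text (g_m6 in siemens, R_C in ohms).
DESIGNS = {
    "AI-empowered design 1": (0.69e-3, 6e3),
    "reference design": (2.38e-3, 200.0),
}

for name, (gm6, rc) in DESIGNS.items():
    term = zero_resistance_term(gm6, rc)
    print(f"{name}: 1/gm6 - Rc = {term:.0f} ohm -> {classify_zero(gm6, rc)} zero")
```

With these numbers the term is negative for AI-empowered design 1, giving a left half-plane zero, and positive for the reference design, where $R_C$ cancels about half of $1/g_{m6}$ and so moves the right half-plane zero to a higher frequency than it would have with $R_C = 0$.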

Table 3.4: Design variables and search ranges of two-stage Miller-compensated op-amp.

| Vars                                      | Min  | Max | Ref.<br>design | AI-<br>empowered<br>design 1 | AI-<br>empowered<br>design 3 | AI-<br>empowered<br>design 3rd<br>iteration |
|-------------------------------------------|------|-----|----------------|------------------------------|------------------------------|---------------------------------------------|
| $L_{1,2,8,9} (\mu m)$                     | 0.35 | 20  | 14.05          | 13.1                         | 11.35                        | 14.4                                        |
| $\mathbf{L_{3,4}}$ ( $\mu \mathrm{m}$ )   | 0.35 | 20  | 10.05          | 1.35                         | 3.05                         | 2.6                                         |
| $L_{5,7,11} (\mu m)$                      | 0.35 | 20  | 1.05           | 19.5                         | 4.9                          | 13.1                                        |
| $\mathbf{L_6}~(\mu\mathrm{m})$            | 0.35 | 20  | 0.55           | 1.5                          | 1.05                         | 0.9                                         |
| $\mathbf{L_{10}}~(\mu\mathrm{m})$         | 0.35 | 20  | 0.35           | 13.7                         | 7.65                         | 16.15                                       |
| $L_{12,13,14} (\mu m)$                    | 0.35 | 20  | 0.35           | 13.45                        | 5.3                          | 12.8                                        |
| $W_{1,2,8,9} (\mu m)$                     | 0.22 | 200 | 60             | 173.45                       | 138.55                       | 168.6                                       |
| $\mathbf{W_{3,4}}$ ( $\mu\mathrm{m}$ )    | 0.22 | 200 | 304            | 138.8                        | 102.2                        | 102.2                                       |
| $N_{1,2,3,4}$ (integer)                   | 1    | 10  | 1              | 2                            | 2                            | 2                                           |
| $\mathbf{W_{5,7,11}}$ ( $\mu\mathrm{m}$ ) | 0.22 | 200 | 150            | 42.1                         | 156.7                        | 35.85                                       |
| $N_5$ (integer)                           | 2    | 20  | 2              | 4                            | 4                            | 4                                           |
| $N_7$ (integer)                           | 1    | 10  | 1              | 5                            | 8                            | 8                                           |
| $\mathbf{W_6}~(\mu\mathrm{m})$            | 0.22 | 200 | 64             | 177.4                        | 195.55                       | 147.2                                       |
| $\mathbf{W_{10}}~(\mu\mathrm{m})$         | 0.22 | 200 | 10             | 30.05                        | 130.85                       | 166.8                                       |
| $W_{12,13,14} (\mu m)$                    | 0.22 | 200 | 2              | 53.8                         | 65.5                         | 49                                          |
| $\mathbf{C}_{\mathbf{C}}$ (pF)            | 0.1  | 100 | 8.94           | 6.26                         | 7.75                         | 7.37                                        |
| $\mathbf{R_{C}}$ $(\Omega)$               | 1    | 10k | 204            | 6k                           | 4.8k                         | 6k                                          |

To reduce this oscillation, the optimization problem needs to be re-formulated and Step 2 needs to be carried out again. Two parallel runs are conducted and compared. In the first run, an over-damped design similar to the IP core is targeted. To achieve this, an additional specification is added that restricts the overshoot to at most 10 mV (1% of the step size). The sizing result is shown in Table 3.5 (i.e., AI-empowered design 2). An over-damped design solution is obtained and the power consumption is reduced by 25% compared to the reference design while keeping other performance superior or comparable.

In the other run, the overshoot is restricted to an acceptable level of 100 mV by adding a new specification on overshoot. The sizing result is shown in Table 3.5 (i.e., AI-empowered design 3). The resultant design satisfies the proposed specifications, with its peak remaining within 1 dB of the designated step size. This is achieved with better overlapping between the second pole and left half-plane zero. It also shows successful results considering all PVT corners, and no abnormal response is observed. This design is chosen to proceed to later steps due to its superior performance in both power consumption and speed.
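The overshoot limits above are quoted both in millivolts and in decibels. The short Python sketch below converts between the two, assuming that the dB figure is the ratio of the response peak to the 1 V step size; this reading of the dB values is an assumption of the sketch.

```python
import math


def overshoot_db(overshoot_v, step_v=1.0):
    """Peak of the step response relative to the step size, in dB."""
    return 20.0 * math.log10((step_v + overshoot_v) / step_v)


def overshoot_fraction_from_db(peak_db):
    """Overshoot as a fraction of the step size for a given peak in dB."""
    return 10.0 ** (peak_db / 20.0) - 1.0


db_10mv = overshoot_db(0.010)    # over-damped run (AI-empowered design 2)
db_100mv = overshoot_db(0.100)   # relaxed run (AI-empowered design 3)
frac_2p3db = overshoot_fraction_from_db(2.3)  # initial design 1

print(f"10 mV on 1 V  -> {db_10mv:.3f} dB")
print(f"100 mV on 1 V -> {db_100mv:.3f} dB")
print(f"2.3 dB        -> {100 * frac_2p3db:.1f} % of the step")
```

With this convention, a 100 mV limit on a 1 V step corresponds to a peak of about 0.83 dB, which is consistent with the statement that the peak remains within 1 dB of the step size.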

Table 3.5: Pre-layout performance values of the AI-empowered designs and the reference design of the two-stage Miller-compensated op-amp.

| Performances                       | Specifi-<br>cations | Ref.<br>design<br>(Nominal) | AI-<br>empowered<br>design 3<br>(Nominal) | Ref.<br>design<br>(WCC) | AI-<br>empowered<br>design 1<br>(WCC) | AI-<br>empowered<br>design 2<br>(WCC) | AI-<br>empowered<br>design 3<br>(WCC) | AI-<br>empowered<br>design 3rd<br>iteration<br>(Nominal) | AI-<br>empowered<br>design 3rd<br>iteration<br>(WCC) |
|------------------------------------|---------------------|-----------------------------|-------------------------------------------|-------------------------|---------------------------------------|---------------------------------------|---------------------------------------|----------------------------------------------------------|------------------------------------------------------|
| Power $(\mu W)$                    | Minimize            | 856                         | 476                                       | 856                     | 299                                   | 602                                   | 423                                   | 476                                                      | 489                                                  |
| CMRR (dB)                          | $\geq 90$           | 91                          | 66                                        | 88                      | 26                                    | 90                                    | 86                                    | 128                                                      | 112                                                  |
| PSRR (dB)                          | $\geq 100$          | 109                         | 119                                       | 111                     | 116                                   | 117                                   | 120                                   | 121                                                      | 117                                                  |
| ADM (dB)                           | $\geq 100$          | 101                         | 109                                       | 101                     | 108                                   | 102                                   | 108                                   | 108                                                      | 108                                                  |
| PM (°)                             | $\geq 60$           | 99                          | 102                                       | 63                      | 60                                    | 63                                    | 92                                    | 116                                                      | 106                                                  |
| GBW (MHz)                          | $\geq 2$            | 2.97                        | 4.97                                      | 2.80                    | 6.32                                  | 3.55                                  | 4.29                                  | 4.23                                                     | 4.14                                                 |
| IRN $(\mu Vrms)$                   | $\leq 6$            | 5.71                        | 4.14                                      | 7.62                    | 5.06                                  | 5.88                                  | 5.52                                  | 4.24                                                     | 5.67                                                 |
| Rise/Fall slew rate $(V/\mu s)$    | $\geq 2/2$          | 2.28/2.12                   | 3.91/4.92                                 | 2.50/2.37               | 2.25/8.86                             | 2.00/4.19                             | 3.76/4.62                             | 4.34/4.49                                                | 4.10/4.62                                            |
| Rise/Fall settling time ($\mu$s)   | $\leq 1/1$          | 0.67/0.97                   | 0.52/0.50                                 | 1.23/1.05               | 0.98/0.47                             | 0.97/0.85                             | 0.67/0.57                             | 0.51/0.62                                                | 0.81/0.68                                            |
| THD at 1Vpp, 1 kHz (dB)            | $\leq -90$          | -115                        | -114                                      | -101                    | -94                                   | -96                                   | -113                                  | -111                                                     | -92                                                  |
| Overshoot (%)                      |                     | 0.3                         | 3.9                                       | 1.3                     | 39.5                                  | 6.0                                   | 0.2                                   | 2.0                                                      | 3.2                                                  |
| Undershoot (%)                     | 1                   | 3.2                         | 1.3                                       | 3.7                     | 1.8                                   | 6.3                                   | 0.7                                   | 2.3                                                      | 2.1                                                  |



Figure 3.9: A comparison of the op-amp unity-gain step responses. The input step size is 1 V.

Addressing CMRR failure in the MC analysis. In the MC analysis of Step 6\*, a notable 10 dB variation in CMRR is observed. It is caused by variation in the common-mode gain ($A_{CM}$) and originates primarily from the first stage of the circuit. After excluding the mismatch of the input transistors and the fluctuations in the tail current source, the current mirror load is identified as the main source of the CMRR variation. Random mismatch in the NMOS mirror pair causes a differential current at the single-ended output, which degrades $A_{CM}$ and, consequently, CMRR. Hence, an over-design setting is used: Step 2 is repeated with the CMRR specification increased from 90 dB to 100 dB. As a result, the total width of M1 increases from 277.2 $\mu$m to 337.2 $\mu$m, and the worst-case MC CMRR satisfies the specification, as shown in Fig. 3.10.



Figure 3.10: CMRR of the AI-empowered design 3rd iteration.

Results comparison & discussion. The whole sizing process to obtain the final design took 3 hours in total, and the comparison results are shown in Table 3.4 and Table 3.5 (pre-layout simulation results). All specifications are satisfied with only 56% of the power consumption of the reference design. This power reduction is noteworthy because the noise level across all PVT corners stays comparable to that of the reference design at its nominal corner. In addition, the AI-empowered design satisfies the specifications for all 16 corners, in contrast to the reference IP core design, where 13 corners fail to meet the specifications in the pre-layout simulation.

More design insights are as follows. The input transistors of the AI-empowered design have a higher W/L ratio (200/2.6) compared to the reference design (300/10). This leads to a substantial increase in  $g_{m3,4}$  (324  $\mu$ S vs. 150  $\mu$ S). Consequently, this yields a substantial enhancement in the GBW, escalating from 2.97 MHz to 4.23 MHz, as determined by  $GBW \approx g_{m3,4}/C_C$ . Meanwhile, the higher  $g_m$  of the input PMOS pair in the AI-empowered design also diminishes the thermal noise, given that the primary noise source is the thermal noise of the input pair and current mirror load. The current mirror load in the optimal design is 5 times wider than that of the reference design, which leads to a larger area, minimizes random mismatch, and also improves the CMRR.
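The link between a larger mirror and lower random mismatch can be illustrated with the widely used Pelgrom area-scaling model, in which the standard deviation of threshold-voltage mismatch is proportional to $1/\sqrt{WL}$. This model is brought in here for illustration and is not stated in the text above. The Python sketch uses the $W_{1,2,8,9}$, $L_{1,2,8,9}$, and $N_{1,2,3,4}$ entries of Table 3.4 for the mirror device M1 and assumes that $N$ multiplies the listed width; the process-dependent matching coefficient cancels in the ratio.

```python
import math


def gate_area_um2(width_um, length_um, multiplier=1):
    """Total gate area W * N * L in square micrometres."""
    return width_um * multiplier * length_um


# Mirror device M1, values from Table 3.4.
ref_area = gate_area_um2(60.0, 14.05, 1)   # reference design
ai_area = gate_area_um2(168.6, 14.4, 2)    # AI-empowered design, 3rd iteration

area_ratio = ai_area / ref_area
# Pelgrom scaling: sigma(delta V_T) is proportional to 1 / sqrt(W * L).
sigma_ratio = 1.0 / math.sqrt(area_ratio)

print(f"gate-area ratio (AI / ref): {area_ratio:.2f}")
print(f"mismatch sigma ratio (AI / ref): {sigma_ratio:.2f}")
```

Under these assumptions, the gate area of the mirror device grows by a factor of about 5.8, so the random threshold mismatch would be expected to shrink to roughly 0.42 of its reference value.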

In the second stage, the reference design draws a current of 219  $\mu$ A to achieve a higher  $g_{m6}$  and a larger second pole. In contrast, the AI-empowered design operates more efficiently by consuming only 83  $\mu$ A, achieving a 44% power reduction. This results in a smaller non-dominant pole. Although this could potentially lead to a compromised PM due to the increased GBW, the presence of a smaller zero (close to GBW) actually contributes significantly to a marked increase in PM. In other words, this strategy enables a larger GBW and lower power of the AI-empowered design. Furthermore, simulations indicate that this pole-zero cancelation technique is robust under capacitor and resistor corners.

In the layout process, although the layout symmetry is maximized and a common centroid pattern is used to enhance capacitor matching, a considerable degradation in CMRR performance is noted. This is primarily attributed to mismatch caused by parasitic capacitances and resistances in the capacitor array. Unavoidable routing asymmetries introduce capacitance mismatch between the two feedback capacitors $C_f$, thereby limiting the effective common-mode rejection. Nonetheless, the post-layout CMRR of the AI-empowered design remains considerably superior to the CMRR of 64 dB exhibited by the reference design, which also experiences a similar degradation after the layout.

Fig. 3.11 shows the microphotograph of the reference design and the AI-empowered design on the same die. The comparison is shown in Table 3.6 and Fig. 3.12 using the measurement results. For the reference design, the measurement result is very close to the performance values in the datasheet from the leading semiconductor company.



Figure 3.11: Chip microphotograph of the reference design, the AI-empowered design, 3.3V CCIA, and 1.8V CCIA on the same die.

Table 3.6: Measured performance values of the AI-empowered design and the reference design of the two-stage Miller-compensated op-amp.

| Performances                      | Specifi-<br>cations | Ref. design | AI-empowered design |
|-----------------------------------|---------------------|-------------|---------------------|
| Power $(\mu W)$                   | Minimize            | 924         | 528                 |
| CMRR (dB)                         | $\geq 100$          | 62          | 83                  |
| PSRR (dB)                         | $\geq 100$          | 91          | 102                 |
| GBW (MHz)                         | $\geq 2$            | 2.91        | 4.10                |
| IRN ( $\mu Vrms$ )                | $\leq 6$            | 6.03        | 4.48                |
| Rise/Fall slew rate $(V/\mu s)$   | $\geq 2/2$          | 2.34/2.39   | 5.08/4.97           |
| Rise/Fall settling time $(\mu s)$ | $\leq 1/1$          | 0.57/0.56   | 0.40/0.38           |
| THD at 1Vpp, 1 kHz (dB)           | $\leq -90$          | -101        | -107                |
| Overshoot (mV)                    | $\leq 100$          | 8           | 12                  |

From this case study, the following observations can be made: (1) The AI-empowered sizing approach can lead to superior solutions more efficiently than the conventional systematic sizing method. Machine learning-assisted global optimization can make correct and even better decisions than human designers using design experience when the optimization problem is appropriately set. This is particularly true when a single decision affects multiple performance metrics. (2) Instead of a one-button approach, which is routine in the literature focusing on analog IC sizing algorithms in the EDA community, it is essential for designers to engage with the AI-empowered sizing approach. This is because not all the intentions can be predefined and the designer's validation and problem re-formulation can effectively guide the machine learning-assisted global optimization. (3) The AI-empowered sizing approach is more accessible to inexperienced engineers because the way to use design insights is to understand the generated design and adjust specifications. Decisions made by intuition and "gut feelings" in the conventional systematic sizing method are substantially reduced.

Figure 3.12: Comparison of the measurement results of transient responses between the AI-empowered design and the reference design of the two-stage Miller-compensated op-amp.

Note that area is not considered in the sizing problem. The pre-layout area is estimated directly from the circuit schematic by summing the active device areas $(W \times L)$ and the capacitor and resistor areas. Using this method, the AI-empowered design occupies 0.0355 mm<sup>2</sup> compared to 0.022 mm<sup>2</sup> for the reference design, which may prompt the question of whether the performance gains are primarily due to the increased area. To answer this, a new sizing run was carried out in which the total circuit area was included as a performance metric. Under the area constraint, the optimizer produced a design with reduced PM (75° vs. $116^{\circ}$) compared to the original AI-empowered design, but with similar GBW and noise performance. Notably, the total pre-layout area of this design is 10.5% smaller than the reference design $(0.0196 \text{ mm}^2 \text{ vs. } 0.022 \text{ mm}^2)$, demonstrating that high performance can be maintained with less area when it is explicitly considered in the sizing problem formulation. A comparison of the new and reference designs is given in Appendix A.

#### 3.4.3 Two-Stage Miller-Compensated Op-Amp (1.8 V)

In this case study, the same topology is sized under a  $V_{DD}$  of 1.8 V. Low-power design requires transistors to operate in weak inversion. Although power consumption is saved, low-power design often results in constrained bandwidth, necessitating the use of larger devices to ensure small  $V_{GS}$ . Additionally, transistors in weak inversion are more sensitive to process variations and environmental conditions. All of the above compromise circuit performance. Balancing these trade-offs is challenging for the conventional systematic design method.

Here, the supply voltage for the two-stage Miller-compensated op-amp of the preceding case study is reduced from 3.3 V to 1.8 V. With this low supply voltage, linearity is challenging because transistors may exhibit nonlinear behavior near the edge of the saturation region. This may lead to distortion and worse THD. Hence, an additional design variable, the common-mode input voltage $V_{cm}$, is introduced to ensure proper transistor operation. The design specifications are shown in Table 3.8. Four process corners (SS, SF, FS, and FF) are considered in this case.

After 4 hours of optimization, all active transistors are in the saturation region. The results are shown in Tables 3.7 and 3.8. It can be seen that compared to the 3.3 V design, a 56% power reduction is observed while meeting the specifications of GBW, IRN, and THD.

More design insights are as follows. The dominant noise source is the thermal noise of the input pair. Compared to the 3.3 V design, the W/L of the input transistors is increased, which increases the transconductance $g_{m3,4}$ from 324 $\mu$S to 716 $\mu$S, contributing to lower thermal noise. The compromise is that the current in the first stage is doubled (i.e., 40.11 $\mu$A vs. 20.62 $\mu$A) to maintain $V_{gs}$ and input swing. The W/L of the current source is increased to achieve this. As $L_5$ decreases, the CMRR decreases due to the reduction in the output resistance of the current source. $C_C$ is approximately doubled to keep the GBW well below the second pole, which preserves the PM.

In the second stage, the biasing current is substantially reduced from 80.78 $\mu$A in the 3.3 V design to 20.18 $\mu$A, resulting in a proportional decrease in $g_{m6}$ from 1.4 mS to 0.3 mS and a smaller non-dominant pole. Despite this, the GBW remains significantly lower than the second pole in the 1.8 V design, preserving the PM. The channel length of the transistor M6 is increased from 0.9 $\mu$m to 5.2 $\mu$m to increase the DC gain of the second stage. The reduced supply voltage limits the output swing, impacting THD. To enhance THD, a lower $V_{gs}$ is required to reduce $V_{ds,sat}$ and increase output swing. Consequently, $W_7/L_7$ is increased to enhance achievable output swing and improve THD.

Table 3.7: Design variables and search ranges of low power design.

| Vars                       | Min. | Max. | 3.3 V<br>design | 1.8 V<br>design |
|----------------------------|------|------|-----------------|-----------------|
| $L_{1,2,8,9}$ ($\mu$m)     | 0.35 | 20   | 14.4            | 12.45           |
| $L_{3,4}$ ($\mu$m)         | 0.35 | 20   | 2.6             | 1.45            |
| $L_{5,7,11}$ ($\mu$m)      | 0.35 | 20   | 13.1            | 3.5             |
| $L_6$ ($\mu$m)             | 0.35 | 20   | 0.9             | 5.2             |
| $L_{10}$ ($\mu$m)          | 0.35 | 20   | 16.15           | 8.1             |
| $L_{12,13,14}$ ($\mu$m)    | 0.35 | 20   | 12.8            | 19.15           |
| $W_{1,2,8,9}$ ($\mu$m)     | 0.22 | 200  | 168.6           | 148.85          |
| $W_{3,4}$ ($\mu$m)         | 0.22 | 200  | 102.2           | 97.35           |
| $N_{1,2,3,4}$ (integer)    | 1    | 10   | 2               | 4               |
| $W_{5,7,11}$ ($\mu$m)      | 0.22 | 200  | 35.85           | 82.2            |
| $N_5$ (integer)            | 2    | 20   | 4               | 8               |
| $N_7$ (integer)            | 1    | 10   | 8               | 2               |
| $W_6$ ($\mu$m)             | 0.22 | 200  | 147.2           | 132.8           |
| $W_{10}$ ($\mu$m)          | 0.22 | 200  | 166.8           | 166.75          |
| $W_{12,13,14}$ ($\mu$m)    | 0.22 | 200  | 49              | 15              |
| $C_C$ (pF)                 | 0.1  | 100  | 7.37            | 15.05           |
| $R_C$ ($\Omega$)           | 1    | 10k  | 6k              | 8.15k           |

This design was taped out on the same chip as Case 2, as shown in Fig. 3.11. The measurement results of the 1.8 V design and the 3.3 V design are shown in Table 3.9. In Table 3.10, the 1.8 V design is compared with state-of-the-art low-noise low-power amplifiers [99]–[102] using noise efficiency factor (NEF) [100], which is defined as

$$NEF = v_{rms,in} \sqrt{\frac{2 \cdot I_{tot}}{\pi \cdot V_T \cdot 4kT \cdot BW}}, \tag{3.5}$$

where $v_{rms,in}$ is the rms value of the input-referred noise within the bandwidth, $I_{tot}$ is the total current consumption, $V_T$ is the thermal voltage, $k$ is Boltzmann's constant, $T$ is the absolute temperature, and $BW$ is the noise bandwidth.
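Eq. (3.5) can be evaluated directly, as in the Python sketch below. The inputs are the measured IRN of 3.73 $\mu$Vrms and the 217 $\mu$W power figure listed for the 1.8 V design in Table 3.10. Deriving $I_{tot}$ as power divided by the 1.8 V supply, taking $T = 300$ K, and using a 100 kHz noise bandwidth (the upper integration limit listed in Table 3.10) are assumptions of this sketch, since the text does not state them.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def noise_efficiency_factor(v_rms_in, i_tot, bandwidth_hz, temp_k=300.0):
    """NEF per Eq. (3.5); inputs in volts (rms), amperes, hertz, kelvin."""
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON  # thermal voltage kT/q
    denom = math.pi * v_t * 4.0 * K_BOLTZMANN * temp_k * bandwidth_hz
    return v_rms_in * math.sqrt(2.0 * i_tot / denom)


# Assumed operating point for the 1.8 V design (see the lead-in).
i_tot = 217e-6 / 1.8  # total current estimated from power and supply
nef = noise_efficiency_factor(3.73e-6, i_tot, 100e3)

print(f"NEF = {nef:.2f}")
```

Under these assumptions, the result is close to 5, in the same range as the NEF of 5.07 listed in Table 3.10; the remaining difference depends on the temperature and noise bandwidth actually used.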

Table 3.8: Pre-layout performance values of the AI-empowered 3.3 V design and 1.8 V design.

| Performances             | Specifications | 3.3 V<br>design          | 1.8 V<br>design |
|--------------------------|----------------|--------------------------|-----------------|
| Power $(\mu W)$          | Minimize       | 475                      | 217             |
| CMRR (dB)                | $\geq 100$     | 128                      | 105             |
| PSRR (dB)                | $\geq 100$     | 116                      | 109             |
| ADM (dB)                 | $\geq 100$     | 109                      | 103             |
| PM (°)                   | $\geq 60$      | 87                       | 89              |
| GBW (MHz)                | $\geq 4$       | 5.82                     | 6.58            |
| IRN ( $\mu$ Vrms)        | $\leq 6$       | 4.62                     | 3.11            |
| THD at 2mVpp, 1 kHz (dB) | $\leq -80$     | -80.91                   | -81.25          |

Table 3.9: Measured performance values of the AI-empowered 3.3 V design and 1.8 V design of the two-stage Miller-compensated op-amp.

| Performances             | Specifications | 3.3 V<br>design | 1.8 V<br>design |
|--------------------------|----------------|-----------------|-----------------|
| Power $(\mu W)$          | Minimize       | 495             | 216             |
| CMRR (dB)                | $\geq 100$     | 83              | 71              |
| PSRR (dB)                | $\geq 100$     | 84              | 82              |
| GBW (MHz)                | $\geq 4$       | 4.50            | 5.75            |
| IRN ( $\mu$ Vrms)        | $\leq 6$       | 6.19            | 3.73            |
| THD at 2mVpp, 1 kHz (dB) | $\leq -80$     | -71             | -73             |

The folded-cascode topology is commonly used in state-of-the-art low-power designs to improve power efficiency. Compared to these designs, this work achieves a competitive NEF while maintaining state-of-the-art noise levels, with a much simpler topology. [100] and [102] adopt a chopping mechanism to shift low-frequency noise out of the signal band, so their performance gains are achieved beyond sizing. In addition, the 1.8 V design achieves a THD comparable to that of [99].

From this case study, it can be observed that the machine learning-assisted global optimization algorithm makes effective design decisions for low-power operation. Despite operating in weak inversion, the optimized design achieves comparable or even superior performance to the nominal 3.3 V version, realizing over 50% power reduction while satisfying GBW, IRN, and THD specifications that are typically difficult to size manually. Moreover, the resulting biasing and sizing choices align well with established analog design principles, confirming the reliability of the AI-empowered approach.

Table 3.10: Performance values of the AI-empowered 1.8 V design of the instrumentation amplifier and the state-of-the-art.

| Ref.              | Tech.<br>($\mu$m) | VDD<br>(V) | Power<br>($\mu$W) | Gain<br>(dB) | BW<br>(Hz) | CMRR<br>(dB) | PSRR<br>(dB) | IRN<br>($\mu$Vrms)      | THD<br>(dB)    | NEF   |
|-------------------|-------------------|------------|-------------------|--------------|------------|--------------|--------------|-------------------------|----------------|-------|
| TCASI2011 [99]    | 0.35              | 3.3        | 855               | 33.7         | 2M         | 91           | 71           |                         | -76.5 at 1mVpp | 5.97  |
| TCASII2023 [100]  | 0.18              | 1.8        | 4                 | 41.5         | 10.5K      | n.r.         | n.r.         | 5.9 (4 Hz - 10.5 KHz)   | n.r.           | 3.36  |
| TBioCAS2023 [101] | 0.18              | 1.2        | 6.55              | 26           | 0.7K       | 82           | n.r.         | 1.67 (1 Hz - 100 Hz)    | n.r.           | 15.28 |
| TVLSI2024 [102]   | 0.18              | 1.8        | 2.1               | 40           | 1K         | 106          | n.r.         | 0.66 (1 Hz - 100 Hz)    | n.r.           | 3.75  |
| This work         | 0.35              | 1.8        | 217               | 34           | 115K       | 71           | 83           | 3.73 (0.5 Hz - 100 KHz) | -73 at 2mVpp   | 5.07  |

#### 3.4.4 LC Oscillator

In this case study, an LC oscillator is used as another representative dynamic circuit, where PN and power efficiency dominate the design trade-offs. The LC oscillator is a fundamental building block for generating periodic signals in RF and high-speed systems. Its operation relies on an inductor–capacitor tank, which defines the oscillation frequency, and an active negative-resistance core, typically implemented with a cross-coupled transistor pair, that compensates for energy loss in the tank. Among various implementations, CMOS cross-coupled LC oscillators [103] are particularly popular because they offer excellent PN performance. The design process is intricate, given the complex trade-off between PN and power consumption, resulting in numerous redesign iterations when using the conventional systematic sizing method.

In this section, a CMOS cross-coupled LC oscillator (Fig. 3.13) is sized using the AI-empowered sizing method and is compared with both a reference design from the industry (i.e., expert design) and the state-of-the-art in the literature. The FoM formula is given as

$$FoM = -10\log\left[\left(\frac{\Delta f}{f_0}\right)^2 \cdot \frac{P_{dyn}}{1mW}\right] - PN(\Delta f), \tag{3.6}$$

which serves as a natural design objective, encompassing the trade-offs between power consumption  $P_{dyn}$ , oscillation frequency  $f_0$ , and PN  $PN(\Delta f)$  at a frequency offset  $\Delta f$ . The 21 design variables and their search ranges are shown in Table 3.11, which are set by an experienced designer. 32 corners are considered, including the inductor and capacitor corners under the worst noise case. A 65 nm CMOS technology is used. The target output frequency is 5.5-6 GHz. The specifications shown in Table 4.2 adhere to the performance of the reference design, which is competitive.
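Eq. (3.6) translates directly into a small helper, sketched below in Python. The oscillation frequency, offset, and PN in the example are the figures reported for the AI-empowered design in this section. The power value is a placeholder chosen only to exercise the formula and is not a value taken from the thesis.

```python
import math


def oscillator_fom(pn_dbc_hz, f0_hz, offset_hz, p_dyn_w):
    """FoM per Eq. (3.6); PN is negative in dBc/Hz, power is in watts."""
    ratio = (offset_hz / f0_hz) ** 2 * (p_dyn_w / 1e-3)
    return -10.0 * math.log10(ratio) - pn_dbc_hz


# Placeholder power (NOT from the thesis), used for illustration only.
P_DYN_PLACEHOLDER_W = 5e-3

fom_example = oscillator_fom(-125.0, 5.65e9, 1e6, P_DYN_PLACEHOLDER_W)
fom_10x_power = oscillator_fom(-125.0, 5.65e9, 1e6, 10 * P_DYN_PLACEHOLDER_W)

print(f"FoM at placeholder power: {fom_example:.2f}")
print(f"FoM at 10x that power:    {fom_10x_power:.2f}")
```

The second call shows the trade-off encoded in the objective: at fixed PN and frequency, a tenfold increase in power lowers the FoM by exactly 10 dB.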

For this case study, the ESSAB algorithm finds a reliable and optimal design after Phase 1 and Phase 2 sizing. Validation shows that there is no need to re-formulate the optimization problem. The AI-empowered method took 7 hours to finish the sizing.

The AI-empowered design achieves a FoM of 191.8 dBc/Hz at 5.65 GHz with a very low PN of -125 dBc/Hz at 1 MHz offset, indicating improved power efficiency compared to the expert design using the  $g_m/I_D$ -based systematic design method. Additionally, the AI-empowered design exhibits superior PN performance across all 32 corners, surpassing the reference design even in the nominal condition. In contrast, the expert design demonstrates significant variance in its PN characteristics considering the corners.



Figure 3.13: Schematic of the CMOS cross-coupled LC oscillator. An 8-bit capacitor bank is used for frequency tuning.

The primary contributor to noise at 1 MHz offset for the expert design is the thermal noise from the switching transistors, while the flicker noise of M1-M4 dominates across corners. To reduce PN in the nominal condition, the fixed capacitance is increased from 0.53 pF to 1.28 pF, which improves both PN and FoM. This results in a 4.1 dB reduction in PN at the 1 MHz offset. As a compromise, the tuning range (TR) is reduced from 53.9% to 31.4%. To mitigate the influence of flicker noise, the total widths and lengths of transistors M1-M4 are increased in the AI-empowered design. While increasing transistor sizes effectively reduces noise across corners, it concurrently imposes limitations on the oscillation frequency due to the increased parasitic capacitance. In order to maintain the desired frequency, the inductance is decreased in the AI-empowered design, specifically from 589 pH to 347 pH with a thicker wire (i.e., 27.86 $\mu$m vs. 18.6 $\mu$m), which improves not only the power efficiency but also the FoM.
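The trade between inductance and capacitance at a fixed oscillation frequency follows from the ideal tank relation $f_0 = 1/(2\pi\sqrt{LC})$. The Python sketch below computes the total tank capacitance that each inductance value would require. Treating the tank as ideal and evaluating both designs at 5.65 GHz (the frequency reported for the AI-empowered design) are simplifying assumptions; the results are not values reported in the thesis.

```python
import math


def tank_capacitance_f(f0_hz, inductance_h):
    """Total capacitance that resonates with L at f0 in an ideal LC tank."""
    omega = 2.0 * math.pi * f0_hz
    return 1.0 / (omega ** 2 * inductance_h)


F0_HZ = 5.65e9  # assumed common evaluation frequency

c_ref = tank_capacitance_f(F0_HZ, 589e-12)  # reference design inductance
c_ai = tank_capacitance_f(F0_HZ, 347e-12)   # AI-empowered design inductance

print(f"L = 589 pH -> C = {c_ref * 1e12:.2f} pF")
print(f"L = 347 pH -> C = {c_ai * 1e12:.2f} pF")
```

Under these assumptions, the smaller inductance leaves room for roughly 1.7 times more total capacitance at the same frequency, which is consistent with the larger fixed capacitance and the larger transistor parasitics described above.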

During the layout phase, a lower post-layout Q factor was observed, primarily due to parasitic effects from the varactor. To counter this, the width of the cross-coupled transistors was increased by 10%, at the cost of increased power. Despite this adjustment, the PN still deteriorates due to the lowered Q factor resulting from layout-dependent effects.

The AI-empowered oscillator design was fabricated, and the microphotograph of the oscillator is shown in Fig. 3.14, with an area of $350\,\mu\mathrm{m}$ x $450\,\mu\mathrm{m}$. The measured PN performance is shown in Fig. 3.15. The measured results are compared with state-of-the-art oscillator designs, and some observations can be made.

Table 3.11: Design variables and search ranges of the CMOS cross-coupled LC oscillator.

| Vars                                    | Min. | Max. | Reference<br>design | AI-<br>empowered<br>design |
|-----------------------------------------|------|------|---------------------|----------------------------|
| $\mathbf{L_{1,2}} \; (\mu \mathrm{m})$  | 0.2  | 5    | 0.2                 | 2.07                       |
| $\mathbf{L_{3,4}}$ ( $\mu \mathrm{m}$ ) | 0.06 | 0.24 | 0.06                | 0.24                       |
| $\mathbf{L_{5,6}}$ ( $\mu \mathrm{m}$ ) | 0.06 | 0.24 | 0.06                | 0.07                       |
| $\mathbf{W_{1,2}}~(\mu\mathrm{m})$      | 1    | 10   | 4                   | 8.17                       |
| $\mathbf{F_{1,2}}$ (integer)            | 2    | 20   | 7                   | 2                          |
| $\mathbf{M_1}$ (integer)                | 1    | 10   | 1                   | 1                          |
| $\mathbf{M_2}$ (integer)                | 10   | 1000 | 10                  | 872                        |
| $\mathbf{W_{3,4}}$ ( $\mu \mathrm{m}$ ) | 1    | 6    | 4                   | 4.60                       |
| $\mathbf{F_{3,4}}$ (integer)            | 2    | 32   | 3                   | 10                         |
| $\mathbf{M_{3,4}}$ (integer)            | 1    | 10   | 5                   | 2                          |
| $\mathbf{W_{5,6}}$ ( $\mu\mathrm{m}$ )  | 1    | 6    | 4.5                 | 1.72                       |
| $\mathbf{F_{5,6}}$ (integer)            | 2    | 32   | 6                   | 13                         |
| $\mathbf{M_{5,6}}$ (integer)            | 1    | 10   | 5                   | 10                         |
| $\mathbf{NH}$ (integer)                 | 10   | 200  | 68                  | 94                         |
| $\mathbf{NV}$ (integer)                 | 10   | 200  | 52                  | 88                         |
| $\mathbf{Mbot}$ (integer)               | 1    | 3    | 1                   | 1                          |
| $\mathbf{W} \; (\mu \mathrm{m})$        | 3    | 30   | 18.6                | 27.86                      |
| $\mathbf{R}~(\mu\mathrm{m})$            | 15   | 90   | 32                  | 76.58                      |
| $\mathbf{NT}$ (integer)                 | 1    | 3    | 2                   | 1                          |
| $\mathbf{S}$ ( $\mu$ m)                 | 2    | 4    | 4                   | 3.18                       |
| $\mathbf{GR}$ ( $\mu \mathbf{m}$ )      | 10   | 40   | 40                  | 21.72                      |

In Table 3.13, the proposed oscillator design exhibits a competitive FoM and the second-best PN of -120.6 dBc/Hz at 1 MHz frequency offset, surpassing references [104]–[107]. The design in [108] attains a PN of -123.1 dBc/Hz but with lower FoM and FoM<sub>T</sub> [109] of 186.1 dBc/Hz and 198.4 dBc/Hz, respectively. The design in [107] achieves ultra-low power and a high TR with a compromised PN of -110.8 dBc/Hz.
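As a cross-check of the figures of merit in Table 3.13, the sketch below recomputes the "This work" row. It assumes the widely used oscillator definitions, FoM $= -PN(\Delta f) + 20\log_{10}(f_0/\Delta f) - 10\log_{10}(P/1\,\mathrm{mW})$ and FoM<sub>T</sub> $=$ FoM $+ 20\log_{10}(\mathrm{TR}/10\%)$, with TR taken relative to the centre of the tuning range. The function names are illustrative.

```python
import math

def fom_dbc_hz(pn_dbc_hz: float, f0_hz: float, offset_hz: float, power_w: float) -> float:
    """Oscillator FoM (positive sign convention), power normalised to 1 mW."""
    return (-pn_dbc_hz
            + 20.0 * math.log10(f0_hz / offset_hz)
            - 10.0 * math.log10(power_w / 1e-3))

def tuning_range_percent(f_min_hz: float, f_max_hz: float) -> float:
    """Tuning range as a percentage of the centre frequency."""
    return 200.0 * (f_max_hz - f_min_hz) / (f_max_hz + f_min_hz)

def fom_t_dbc_hz(fom: float, tr_percent: float) -> float:
    """FoM extended by the tuning range, normalised to 10%."""
    return fom + 20.0 * math.log10(tr_percent / 10.0)

# "This work" row of Table 3.13 (measured values).
fom = fom_dbc_hz(pn_dbc_hz=-120.6, f0_hz=5.75e9, offset_hz=1e6, power_w=6.12e-3)
tr = tuning_range_percent(3.73e9, 5.75e9)
fom_t = fom_t_dbc_hz(fom, tr)
print(f"FoM = {fom:.1f} dBc/Hz, TR = {tr:.1f} %, FoM_T = {fom_t:.1f} dBc/Hz")
```

With these definitions the computed values round to the tabulated 187.9 dBc/Hz, 42.6%, and 200.5 dBc/Hz.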

#### 3.4.5 Discussion

According to the four case studies, the comparison between a typical contemporary AI-empowered analog building block sizing method and the conventional systematic sizing method is summarized in Table 3.14.

Table 3.12: Performance values of the AI-empowered design and the reference design (pre-layout simulation results) of the CMOS cross-coupled LC oscillator.

| Performances          | Specifications | Ref. design<br>(Nominal) | AI-empowered<br>design<br>(Nominal) | Ref. design<br>(WCC) | AI-empowered<br>design<br>(WCC) |
|-----------------------|----------------|--------------------------|-------------------------------------|----------------------|---------------------------------|
| FoM (dBc/Hz)          | Maximize       | 190.6                    | 193.2                               | 178.9                | 189.6                           |
| Frequency (GHz)       | $\geq 5$       | 5.94                     | 5.76                                | 5.50                 | 5.34                            |
| PN @ 100 kHz (dBc/Hz) | $\leq -94$     | -95.7                    | -99.1                               | -79.3                | -98.8                           |
| PN @ 1 MHz (dBc/Hz)   | $\leq -123$    | -120.9                   | -123.2                              | -108.9               | -121.6                          |
| PN @ 10 MHz (dBc/Hz)  | $\leq -143$    | -142.4                   | -144.2                              | -137.3               | -142.3                          |
| Power (mW)            | $\leq 7$       | 4.31                     | 3.32                                | 3.28                 | 4.53                            |



Figure 3.14: Chip microphotograph of the AI-empowered oscillator design.

For all four case studies, the AI-empowered analog building block sizing method shows advantages in both design quality and efficiency. In terms of accessibility, whereas the systematic sizing method requires extensive experience-based key design decisions, AI-empowered sizing methods only require the designer to understand the working principles of the circuits and to formulate appropriate optimization problems. This understanding also ensures that the reliability of the obtained design is as high as that of systematic sizing.

Alignment with Designer Intent. In the four case studies, the AI-generated solutions were generally aligned with the designer's intentions, as observed in Cases 1, 2, and 4. Rare deviations arose when the specification settings allowed multiple valid trade-offs. In such situations, the optimizer could select a technically correct solution that nonetheless differed from the designer's preferred approach or full design intent. This issue can be mitigated by refining the specification or constraint set to more accurately encode the intended design priorities, as illustrated in Fig. 3.1.



Figure 3.15: The measured PN performance of the AI-empowered design of the oscillator.

Table 3.13: Performance values of the AI-empowered design (measurement result) of the CMOS cross-coupled LC oscillator and the state-of-the-art.

| Reference           | Topology          | CMOS<br>technology      | Tuning range<br>(GHz, TR%) | VDD<br>(V) | Frequency<br>(GHz) | Power<br>(mW)   | PN<br>@1MHz<br>(dBc/Hz) | FoM<br>@1MHz<br>(dBc/Hz) | FoM<sub>T</sub> [109]<br>@1MHz<br>(dBc/Hz) |
|---------------------|-------------------|-------------------------|----------------------------|------------|--------------------|-----------------|-------------------------|------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|
| JSSC2007<br>[104]   | Class-C<br>P-N    | 90 nm                   | 4.5-7.1 (45%)              | 1.6        | 5.63               | 14              | -108.5                  | 172                                                                          | 185.1                                                                                       |
| TCASII2011<br>[105] | Class-C<br>N-only | 65 nm                   | 4.6-6.2 (28.6%)            | 0.65       | 5.49               | 8.7             | -113.3                  | 178.7                                                                        | 187.8                                                                                       |
| TCASII2014<br>[108] | Class-C<br>N-only | 65 nm                   | 3.36-5.1 (41%)             | -          | 4.21               | 8.7             | -123.1                  | 186.1                                                                        | 198.4                                                                                       |
| TVLSI2015<br>[106]  | Class-C<br>N-only | 180 nm                  | 3.2-5.25 (49.8%)           | 0.65       | 5.23               | 2.37            | -115.1                  | 185.7                                                                        | 199.7                                                                                       |
| TCASI2022<br>[107]  | Class-C<br>P-only | 40 nm                   | 3.2-6.49 (67.9%)           | 0.36       | 6.49               | 0.35            | -110.8                  | 191.6                                                                        | 208.2                                                                                       |
| This work           | Class-C<br>P-N    | 65 nm                   | 3.73-5.75 (42.6%)          | 1.2        | 5.75               | 6.12            | -120.6                  | 187.9                                                                        | 200.5                                                                                       |

Generalization to Other Algorithms. This work adopts ESSAB as the optimization engine; hence, a natural question is how general the conclusions are for other AI-empowered sizing methods. BO and RL-based approaches also attract much attention in the EDA community. Both ESSAB and BO belong to surrogate model-assisted optimization. BO commonly uses Gaussian process surrogates with acquisition functions to guide sampling [88], [89]; however, [17] notes that standard BO may encounter scalability challenges when the number of design variables and the number of performance specifications are large, due to the computational cost of surrogate training. RL-based methods [110], [111] can effectively capture sequential decision-making processes, but often require a large number of simulations to converge, and this sample inefficiency becomes more pronounced for high-dimensional, stringently constrained sizing problems. Both BO and RL are active research areas, and recent advances have sought to improve scalability and sample efficiency [19], [20], [60], [91], [92], [112]. These developments could be investigated in future work. Nevertheless, this research demonstrates the advantages and limitations of AI-empowered analog IC sizing methods using one example (i.e., ESSAB).

Table 3.14: Summary of the comparison of typical contemporary AI-empowered and conventional systematic manual sizing methods based on the four case studies.

|                    | AI-empowered sizing | Systematic sizing |
|--------------------|---------------------|-------------------|
| Design quality     | Very high           | High or medium    |
| Design efficiency  | Very high           | Low               |
| Design reliability | Very high           | Very high         |
| Accessibility      | High or medium      | Low               |

Post-Layout Optimization. The optimization results presented in this work are based on schematic-level simulations, and therefore do not include the impact of layout-dependent effects such as parasitic capacitances, interconnect resistances, and device mismatches introduced by routing and placement. These effects can influence key performance metrics, particularly bandwidth, phase margin, and linearity. Incorporating full layout implementation, parasitic extraction, and post-layout simulation into the optimization loop is an important direction for future work to provide a more comprehensive and layout-aware comparison.

Scalability to Complex Systems. This research targets analog building block sizing, and a natural next step is AMS system sizing. A complex system refers to an AMS architecture composed of multiple interdependent building blocks, such as amplifiers, comparators and oscillators, whose behaviors and performance metrics are strongly coupled through shared specifications and parasitic interactions. Currently, there are mainly two approaches. In a top-down approach, system-level specifications are partitioned into building block requirements, and each building block is optimized individually before system-level integration. In a holistic approach, the entire system is optimized concurrently, with all building block interactions considered in the loop. Challenges for ESSAB-like surrogate-assisted methods include the curse of dimensionality in large design spaces and the long simulation times associated with complex systems, so new AI techniques are needed.

Scalability to Advanced Technology. While this work focuses on mature technology nodes (0.35 $\mu$m – 65 nm), the observations are also applicable to more advanced processes. In this context, advanced technology refers to deep-submicron and nanoscale CMOS nodes, typically below 28 nm. At modern nodes such as 3 nm, additional challenges arise, including pronounced layout-dependent effects, stricter design rules, and greater sensitivity to parasitics, which require close integration with physical design and layout-aware optimization tools. Addressing these challenges lies beyond the scope of this work but represents a compelling direction for future research.

## 3.5 Summary

While recent AI-empowered analog building block sizing algorithms are showing excellent performance in the EDA community, conventional systematic manual sizing methods still dominate the analog IC design community. To link the two communities, this chapter performs a comprehensive comparative study involving various analog building blocks. It provides design insight-based comparisons, which are important but often missing in the EDA community, and silicon validation for three case studies.

This research shows that AI-empowered analog IC sizing can often obtain better design decisions/solutions (i.e., data-driven) than the conventional systematic sizing method (i.e., design experience-driven), while considerably improving the design efficiency (i.e., often a few hours). With the new sizing methods, designers only need to analyze the obtained design and re-formulate the sizing problem when necessary, and in some cases, it may be a one-button approach. This work validates the potential of AI in analog circuit design and paves the way for further integration of AI-driven methods into design practice.

## Chapter 4

# Subsystem Design: VCO with LDO Integration

#### 4.1 Introduction

In analog IC design, subsystems are pairs of closely interacting blocks that together realize a distinct function within a larger system. This abstraction lies between block-level design and full system architecture, simplifying complexity while preserving critical performance interdependencies. Subsystem-level analysis is particularly valuable in circuit sizing, as inter-block interactions often dictate key specifications including noise, stability, gain, and power efficiency.

A prominent example is the LDO–VCO subsystem, widely used in RF transceivers and frequency synthesizers [113]–[115]. The LDO provides a stable low-noise supply that directly impacts the VCO's PN and frequency stability, while the VCO imposes dynamic current demands that affect the LDO's transient response and loop stability. In practice, this pairing is implemented in smartphone radios, wireless systems on chip (SoCs), and high-speed serializer/deserializer (SerDes) links [116]–[118], where a dedicated on-chip LDO powers the VCO to isolate it from noisy digital domains. Co-design of the two blocks at the subsystem level has been shown to improve PN, reduce supply sensitivity, and improve overall power efficiency [119].

Another important subsystem is the BGR combined with an LDO. The BGR generates a precise, temperature- and supply-independent voltage, which serves as the reference for the LDO. Since the accuracy and stability of the LDO output are directly tied to the quality of the BGR, this subsystem is ubiquitous in power management ICs, where stable supply rails are needed for both analog and digital blocks [120], [121].

Similarly, in RF receivers, a low-noise amplifier (LNA) is often paired with a mixer to form the front end. The LNA amplifies weak RF signals with minimal added noise, while the mixer downconverts them to an intermediate or baseband frequency. Together, this two-block subsystem strongly influences receiver sensitivity and dynamic range, and is central to wireless standards such as long-term evolution (LTE), Wi-Fi, and Bluetooth [122]–[125].

Traditionally, subsystem design has followed a sequential approach, where blocks are designed and optimized independently before integration. In an LDO–VCO pair, the VCO is typically sized to operate under assumed supply conditions, after which the LDO is designed for PSRR, dropout voltage, and output noise. In BGR–LDO subsystems, the BGR is designed first for accuracy and temperature stability, and then the LDO is adapted around it. Similarly, in RF receivers, LNAs are optimized for gain and noise figure, while mixers are subsequently designed for conversion gain and linearity. While this block-by-block approach simplifies design and leverages established analytical models, it neglects the strong coupling between blocks. Consequently, metrics such as PN in LDO–VCO subsystems, output accuracy in BGR–LDO subsystems, or overall noise figure in LNA–mixer subsystems may degrade after integration. Designers often compensate through multiple rounds of simulation and redesign at the transistor level, which increases development time and leads to conservative overdesign.
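The cost of neglecting coupling can be illustrated with a deliberately simple numerical example. The sketch below is a toy only: the quadratic `cost` function, the coupling term, and the grid are made up for illustration and do not model any circuit in this work. It compares a sequential flow (size block `x` under an assumed condition for block `y`, then size `y` around the frozen `x`) with a joint search over both.

```python
# Toy illustration only: a made-up quadratic cost, not a circuit model.
def cost(x: float, y: float, coupling: float = 1.0) -> float:
    """x and y stand in for the sizing of two blocks; the x*y term couples them."""
    return (x - 1.0) ** 2 + (y - 2.0) ** 2 + coupling * x * y

GRID = [i / 10 for i in range(-10, 31)]  # candidate values from -1.0 to 3.0

def sequential(y_assumed: float = 0.0):
    """Block-by-block flow: the first block ignores how the second will end up."""
    # Step 1: size block x under an assumed condition for block y.
    x = min(GRID, key=lambda xv: cost(xv, y_assumed))
    # Step 2: size block y around the now-frozen block x.
    y = min(GRID, key=lambda yv: cost(x, yv))
    return x, y, cost(x, y)

def co_design():
    """Joint flow: search both blocks at once so the coupling is visible."""
    x, y = min(((xv, yv) for xv in GRID for yv in GRID), key=lambda p: cost(*p))
    return x, y, cost(x, y)

x_seq, y_seq, c_seq = sequential()
x_co, y_co, c_co = co_design()
print(f"sequential: x={x_seq}, y={y_seq}, cost={c_seq:.2f}")
print(f"co-design:  x={x_co}, y={y_co}, cost={c_co:.2f}")
```

In this toy, the sequential flow settles at a higher cost than the joint search because the first step fixes `x` before the coupling term is taken into account. Real subsystems differ in scale and in the cost of each evaluation, but the mechanism is the same.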

In summary, the design of analog subsystems requires both transistor-level expertise and a clear understanding of inter-block interactions. Sequential design methods offer simplicity but often result in suboptimal performance once blocks are integrated. Recent research in subsystem-level co-design emphasizes the need to account for coupling explicitly, particularly in critical pairs such as LDO–VCO, BGR–LDO, and LNA–mixer, to overcome performance degradation and inefficiency caused by traditional sequential design. Therefore, this chapter focuses on subsystem-level design and optimization, with the LDO–VCO subsystem serving as a representative case study. The discussion begins with a review of relevant literature to establish foundational concepts, followed by clarification of the main design challenges. Then the problem is formulated. After that, a novel subsystem-level AI-driven optimization framework for automating the design of LDO-VCO circuits is presented in detail and validated with post-layout simulation results, demonstrating how subsystem-aware approaches can significantly improve the efficiency and accuracy of analog circuit sizing.

#### 4.2 Literature Review

VCOs are crucial components in high-performance systems, particularly in wireless communication and RF applications. One of the major challenges in designing VCOs is minimizing PN, since high PN degrades signal purity and reduces system performance [126]. While traditional approaches have focused on improving the inherent noise performance of the oscillator's core components, such as minimizing thermal and flicker noise in the transistors and enhancing the quality factor of the LC tank, external influences also play a significant role. Power supply noise can introduce additional PN through the power supply rejection (PSR) path and the VCO's frequency pushing factor [127]. To mitigate this impact, LDOs are commonly used [128]. However, the LDO itself introduces new challenges, as its low-frequency noise can be up-converted into PN, complicating the overall noise management strategy in VCO designs.

To improve the PN, a natural idea is to design the LDO and VCO together. The work in [119] carries out a comprehensive analysis of the relationship between frequency pushing and power supply-induced PN. Based on the combined effect of the LDO's PSR and the VCO's inherent PN sensitivity to supply noise, guidelines for LDO design are obtained. Reducing the width of the VCO's switching device is suggested in [119], which mitigates the frequency pushing effect. However, this may increase thermal noise, potentially degrading the VCO's PN performance at higher frequencies. Therefore, a holistic approach that finds the optimal trade-off among the various kinds of noise is needed to obtain the truly optimal design.
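The combined effect of LDO PSR and VCO frequency pushing can be estimated with the standard narrowband-FM approximation, $\mathcal{L}(\Delta f) \approx 20\log_{10}\!\big(K_{push}\,v_n / (\sqrt{2}\,\Delta f)\big)$, where $v_n$ is the rms noise density on the VCO supply. The sketch below is a first-order illustration under stated assumptions, not the analysis of [119]: the symbols and function names are introduced here, PSR is expressed as a (negative) gain in dB from the LDO input to its output, the LDO's own output noise is assumed uncorrelated with the filtered input noise, and the numerical values are hypothetical.

```python
import math

def ldo_output_noise(v_in_noise: float, psr_db: float, v_ldo_noise: float) -> float:
    """Noise density (V/sqrt(Hz)) seen by the VCO at the offset of interest.

    psr_db: supply-to-output gain of the LDO in dB (negative means attenuation).
    The filtered input noise and the LDO's own noise are summed in power.
    """
    filtered = v_in_noise * 10.0 ** (psr_db / 20.0)
    return math.hypot(filtered, v_ldo_noise)

def supply_induced_pn_dbc_hz(k_push_hz_per_v: float, vn: float, offset_hz: float) -> float:
    """Supply-induced PN (dBc/Hz) from the narrowband-FM approximation."""
    return 20.0 * math.log10(k_push_hz_per_v * vn / (math.sqrt(2.0) * offset_hz))

# Hypothetical numbers, chosen only to exercise the formulas.
K_PUSH = 100e6   # frequency pushing, Hz/V
OFFSET = 1e6     # offset frequency, Hz
vn_clean = ldo_output_noise(v_in_noise=100e-9, psr_db=-20.0, v_ldo_noise=0.0)
vn_noisy = ldo_output_noise(v_in_noise=100e-9, psr_db=-20.0, v_ldo_noise=10e-9)

pn_clean = supply_induced_pn_dbc_hz(K_PUSH, vn_clean, OFFSET)
pn_noisy = supply_induced_pn_dbc_hz(K_PUSH, vn_noisy, OFFSET)
print(f"noiseless LDO: {pn_clean:.1f} dBc/Hz, with LDO noise: {pn_noisy:.1f} dBc/Hz")
```

The two cases show why PSR alone is not sufficient: once the supply noise has been attenuated, the LDO's own noise sets the floor of the supply-induced PN, and a lower frequency pushing factor reduces both contributions equally.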

For the traditional manual design method, it can be challenging to derive formulas considering all the complex trade-offs described above. Hence, this chapter presents an LDO and VCO co-design method empowered by the AI-driven algorithm ESSAB [17] mentioned in Chapter 3. The method optimizes the PN of an LC-tank VCO and the PSR of an integrated LDO together while also accounting for PVT corners. Notably, while most existing AI-driven design research focuses on individual analog building blocks, this work emphasizes subsystem-level co-design, offering a more holistic and effective optimization approach.

#### 4.3 Contributions

The contributions are summarized in the following points:

- 1. **Subsystem-Level Co-Design Framework:** A holistic AI-driven optimization framework is developed that treats the LDO and LC-tank VCO as a unified subsystem. By co-optimizing power supply rejection and PN, the approach addresses trade-offs often overlooked in traditional sequential design methods.
- 2. **AI-Driven Design Automation:** The co-design is empowered by a surrogate model-assisted optimization algorithm, enabling efficient exploration across 43 design variables and 32 PVT corners. This leads to improved robustness and reduced design iteration time.
- 3. **PVT-Aware Co-Design Method:** PVT variations were explicitly incorporated during optimization, ensuring robust performance across varying operating conditions. The methodology is validated through post-layout simulations in a 65 nm CMOS process.

#### 4.4 Problem Formulation

#### 4.4.1 Architecture of LDO-VCO

A diagram of an LDO-regulated VCO is shown in Fig. 4.1, where the cross-coupled LC-tank VCO from Chapter 3 is adopted [104]. The LC tank comprises an inductor, capacitor, and a varactor array. In parallel, the cross-coupled transistor pairs produce negative resistance to counteract the losses present in the LC tank. The LDO on the top provides the load current needed for the VCO. It consists of an error amplifier, a low-pass filter, an NMOS pass transistor, and a feedback divider. The error amplifier is implemented using a two-stage Miller-compensated op-amp. In addition, a bypass capacitor is used at the LDO output.



Figure 4.1: The VCO and LDO co-design method, including the schematic diagram of the LC-tank VCO with an integrated LDO. Two design approaches are shown: the sequential approach, which involves two distinct design phases, and the co-design approach, which optimizes both building blocks simultaneously.

#### 4.4.2 Design Variables

In total, there are 43 design variables for the chosen LDO-VCO architecture. Table 4.1 details the ranges of the 43 design variables, where W, R, NT, S, and GR denote the width, inner radius, number of turns, spacing between conductors, and guard ring width of the inductor; L, W, F, and M represent the channel length, width per finger, number of fingers, and multiplier of the transistors; and $N_H$, $N_V$, and $M_{bot}$ represent the number of horizontal fingers, the number of vertical fingers, and the bottom starting layer of the MOM capacitors in the VCO. The inductor and MOM capacitor are implemented using PDK components. The biasing networks for the VCO and LDO, the LDO feedback divider, the varactor, and the bypass capacitor are fixed at nominal values to reduce design space complexity. The biasing networks and feedback divider are fixed since they have limited impact on core performance optimization, the varactor is pre-optimized to meet frequency tuning requirements, and the bypass capacitor is typically determined by design guidelines.
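Because each transistor is described by a width per finger, a number of fingers, and a multiplier, its total gate width follows from their product. The short sketch below applies this to the $W_{3,4}$, $F_{3,4}$, and $M_{3,4}$ entries of Table 4.1; the helper name `total_width_um` is illustrative, and the product rule assumes the usual PDK convention in which all fingers and multiplied instances are connected in parallel.

```python
def total_width_um(w_finger_um: float, fingers: int, multiplier: int) -> float:
    """Total gate width in micrometres: width per finger x fingers x multiplier."""
    return w_finger_um * fingers * multiplier

# W_{3,4}, F_{3,4}, M_{3,4} from Table 4.1.
w_co = total_width_um(1.22, 7, 8)     # co-design
w_seq = total_width_um(4.60, 10, 2)   # sequential design

print(f"co-design: {w_co:.2f} um, sequential design: {w_seq:.2f} um")
```

Comparing total widths rather than individual W, F, and M values makes the two sizing results easier to interpret, since different combinations of the three variables can give similar effective device sizes.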

Table 4.1: Design variables and search ranges of the CMOS cross-coupled LC oscillator and the LDO.

|     | Var.                          | Unit    | $\begin{array}{c} \textbf{Lower} \\ \textbf{bound} \end{array}$ | Upper<br>bound | Co-<br>design | $egin{array}{c} \mathbf{Se-} \\ \mathbf{design} \end{array}$ |
|-----|-------------------------------|---------|-----------------------------------------------------------------|----------------|---------------|--------------------------------------------------------------|
|     | $\mathbf{M_2}$                | integer | 1                                                               | 1000           | 300           | 872                                                          |
|     | ${ m L_{3,4}}$                | m       | 60n                                                             | 240n           | 225n          | 239n                                                         |
|     | $\mathbf{W_{3,4}}$            | m       | 1u                                                              | 6u             | 1.22u         | 4.60u                                                        |
|     | $\mathbf{F_{3,4}}$            | integer | 2                                                               | 32             | 7             | 10                                                           |
|     | $\mathbf{M_{3,4}}$            | integer | 1                                                               | 10             | 8             | 2                                                            |
|     | $L_{5,6}$                     | m       | 60n                                                             | 240n           | 205n          | 75n                                                          |
|     | $\mathbf{W_{5,6}}$            | m       | 1u                                                              | 6u             | 1.96u         | 1.72u                                                        |
| VCO | $\mathbf{F_{5,6}}$            | integer | 2                                                               | 32             | 11            | 13                                                           |
| VCO | $\mathbf{M_{5,6}}$            | integer | 1                                                               | 10             | 6             | 10                                                           |
|     | $\overline{ m N_H}$           | integer | 10                                                              | 200            | 74            | 94                                                           |
|     | ${f N_V}$                     | integer | 10                                                              | 200            | 95            | 88                                                           |
|     | $ m M_{bot}$                  | integer | 1                                                               | 3              | 1             | 1                                                            |
|     | $\overline{\mathbf{w}}$       | m       | 3u                                                              | 30u            | 28.2u         | 27.9u                                                        |
|     | ${f R}$                       | m       | 15u                                                             | 90u            | 89.4u         | 76.6u                                                        |
|     | $\mathbf N$                   | integer | 1                                                               | 3              | 1             | 1                                                            |
|     | ${f s}$                       | m       | 2u                                                              | $4\mathrm{u}$  | 2.67u         | 3.18u                                                        |
|     | $\mathbf{G}\mathbf{R}$        | m       | 10u                                                             | 40u            | 28.7u         | 21.7u                                                        |
|     | $ m L_{nLoad}$                | m       | 500n                                                            | 10u            | 6.64u         | 8.28u                                                        |
|     | $\mathbf{W_{nLoad}}$          | m       | 400n                                                            | 10u            | 3.93u         | 500n                                                         |
|     | ${ m F_{nLoad}}$              | integer | 2                                                               | 32             | 25            | 3                                                            |
|     | $ m M_{nLoad}$                | integer | 1                                                               | 10             | 2             | 2                                                            |
|     | $\overline{\mathbf{L_{pIn}}}$ | m       | 400n                                                            | 10u            | 5.95u         | 470n                                                         |
|     | $\mathbf{W_{pIn}}$            | m       | 400n                                                            | 10u            | 3.25          | 1.43                                                         |
|     | $\mathbf{F_{pIn}}$            | integer | 2                                                               | 32             | 29            | 5                                                            |
|     | $\mathrm{M_{pIn}}$            | integer | 1                                                               | 10             | 5             | 1                                                            |
|     | $\mathrm{L_{bias}}$           | m       | 400n                                                            | 10u            | 5.55u         | 3.63u                                                        |
|     | $\mathbf{W_{bias}}$           | m       | 400n                                                            | 10u            | 4.58u         | 9.14u                                                        |
|     | $\mathbf{F_{bias}}$           | integer | 2                                                               | 32             | 14            | 22                                                           |
|     | $\mathrm{M_{bias}}$           | integer | 1                                                               | 10             | 7             | 9                                                            |
| LDO | $\mathrm{M_{biasIn}}$         | integer | 1                                                               | 10             | 5             | 6                                                            |
|     | $\mathrm{M_{biasOut}}$        | integer | 1                                                               | 10             | 8             | 1                                                            |
|     | $\mathrm{L_{nOut}}$           | m       | 500n                                                            | 10u            | 2.53u         | 3.42u                                                        |
|     | $\mathbf{W_{nOut}}$           | m       | 400n                                                            | 10u            | 2.08u         | 6.35u                                                        |
|     | $\mathbf{F_{nOut}}$           | integer | 2                                                               | 32             | 31            | 20                                                           |
|     | $\mathrm{M_{nOut}}$           | integer | 1                                                               | 10             | 7             | 7                                                            |
|     | $\mathbf{C}_{\mathbf{C}}$     | F       | 1p                                                              | 100p           | 67p           | 60p                                                          |
|     | $\mathbf{R}_{\mathbf{C}}$     | ohm     | 1                                                               | 1M             | 989K          | 514K                                                         |
|     | $\mathbf{C_F}$                | F       | 1p                                                              | 200p           | 182p          | 156p                                                         |
|     | $\mathbf{R_F}$                | ohm     | 1                                                               | 2M             | 1.66M         | 1.17M                                                        |
|     | $\mathrm{L_{pass}}$           | m       | 1.2u                                                            | 10u            | 1.69u         | 1.62u                                                        |
|     | $\mathrm{W_{pass}}$           | m       | 500n                                                            | 10u            | 8.86u         | 5.96u                                                        |
|     | $\mathbf{F}_{\mathbf{pass}}$  | integer | 2                                                               | 100            | 47            | 35                                                           |
|     | $\mathrm{M_{pass}}$           | integer | 1                                                               | 32             | 15            | 15                                                           |

#### 4.4.3 Testbench and Measures

A 65 nm CMOS process is used. The VCO varactor is set to work at its highest frequency where supply pushing is the highest. The performance metrics include oscillation frequency  $f_0$ , PN  $PN(\Delta f)$  at frequency offsets  $\Delta f$  of 100 kHz, 1 MHz and 10 MHz, total power consumption  $P_{dyn}$ , and the FoM at 1 MHz frequency offset [129]:

$$FoM = PN(\Delta f) + 10\log\left[\left(\frac{\Delta f}{f_0}\right)^2 \cdot \frac{P_{dyn}}{1\,\mathrm{mW}}\right]. \tag{4.1}$$

Additionally, the maximum PSR, PM, and the maximum  $V_{DD}$  are extracted. The process corners considered include FF, FS, SS, and SF in combination with min inductor/max inductor and min capacitor/max capacitor.  $-55^{\circ}$ C and  $125^{\circ}$ C are considered as temperature corners. For all corners, the lowest supply voltage  $V_{DD\_IO}$  (1.8 V · 90%) is used. In total, 32 corners are considered.
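The corner count follows directly from the combinations listed above. A minimal Python sketch of the enumeration is given below; the string labels (`L_min`, `C_max`, and so on) and dictionary keys are illustrative names, not identifiers from the actual testbench.

```python
from itertools import product

# Corner axes as described in the text; the labels are illustrative only.
PROCESS = ["FF", "FS", "SS", "SF"]
INDUCTOR = ["L_min", "L_max"]
CAPACITOR = ["C_min", "C_max"]
TEMPERATURE_C = [-55, 125]

# Lowest I/O supply: 1.8 V derated by 10 %, applied to every corner.
VDD_IO_MIN = round(1.8 * 0.9, 2)


def build_corners():
    """Return one dict per process/passive/temperature combination."""
    return [
        {"process": p, "inductor": l, "capacitor": c, "temp_c": t, "vdd_io": VDD_IO_MIN}
        for p, l, c, t in product(PROCESS, INDUCTOR, CAPACITOR, TEMPERATURE_C)
    ]


corners = build_corners()
print(len(corners))  # 4 * 2 * 2 * 2 = 32
```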

#### 4.4.4 Objective and Constraints

The target oscillation frequency is 5.5 GHz with a power consumption below 7 mW. The PN constraints at 100 kHz, 1 MHz and 10 MHz are set to $-94 \, \mathrm{dBc/Hz}$, $-120 \, \mathrm{dBc/Hz}$, and $-140 \, \mathrm{dBc/Hz}$, respectively, in line with industrial standards. The optimization objective for both approaches is the FoM, and the remaining performance parameters are treated as constraints. These apply to all 32 corners.
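As an illustration of how such a constraint set can be checked across corners, the sketch below encodes the limits from the specification column of Table 4.2. The dictionary keys and helper names are hypothetical and do not reflect the actual tool interface.

```python
import operator

# Limits from the specification column of Table 4.2; key names are illustrative.
SPECS = {
    "freq_ghz": (operator.ge, 5.0),
    "pn_100k_dbc_hz": (operator.le, -94.0),
    "pn_1m_dbc_hz": (operator.le, -120.0),
    "pn_10m_dbc_hz": (operator.le, -140.0),
    "p_dyn_mw": (operator.le, 7.0),
    "psr_max_db": (operator.le, -30.0),
    "vdd_max_v": (operator.le, 1.32),
    "pm_deg": (operator.ge, 50.0),
}


def violated_specs(perf):
    """Return the names of all constraints that one corner result violates."""
    return [name for name, (ok, limit) in SPECS.items() if not ok(perf[name], limit)]


def is_feasible(corner_results):
    """A design is feasible only if every corner satisfies every constraint."""
    return all(not violated_specs(perf) for perf in corner_results)


# Nominal co-design results as listed in Table 4.2.
co_nominal = {
    "freq_ghz": 5.60, "pn_100k_dbc_hz": -95.6, "pn_1m_dbc_hz": -124.1,
    "pn_10m_dbc_hz": -144.7, "p_dyn_mw": 4.56, "psr_max_db": -31.4,
    "vdd_max_v": 1.23, "pm_deg": 82.0,
}
print(violated_specs(co_nominal))  # []
```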

## 4.5 AI-Driven Co-Design Method

#### 4.5.1 Sizing Flow and Considerations

For the sequential design method, the VCO is first optimized independently with an ideal 1.2 V power supply to achieve an optimal FoM, as illustrated in Fig. 4.1. The LDO is then incorporated and optimized for this VCO load to generate a clean supply. Although this approach seems intuitive, the primary challenge lies in preventing the LDO noise from being upconverted and degrading the PN of the VCO. To mitigate this, the VCO may need to be redesigned to reduce its sensitivity to the noise introduced by the LDO. This iterative process often requires multiple sizing loops, which can be time-consuming and labor-intensive. Additionally, a bypass capacitor is typically used to reduce high-frequency supply noise, but its impact is often overlooked in the VCO sizing stage. Neglecting it can lead to discrepancies between the designed and measured PN performance because the bypass capacitor changes the VCO voltage swing.

To address the above issues, a simultaneous co-design of the LDO and VCO is implemented. By treating them as an integrated system rather than separate blocks, their mutual interaction is considered throughout the design process, ensuring that the noise contributors are jointly optimized to minimize PN. Fig. 4.1 presents the whole sizing flow. The two-phase optimization introduced in Chapter 3 is used to handle corners and is not repeated here.

#### 4.5.2 Sizing Algorithm

The ESSAB algorithm is used for both the sequential and co-design methods to allow a like-for-like comparison. It starts by initializing a database and iteratively refines designs until a predefined stopping criterion is met. Each iteration involves selecting the top $\lambda$ candidates and applying DE operations. An online ANN model and beta ranking are used to guide the selection of the most promising candidate for simulation. The steps are abstracted as Algorithm 2, and algorithm details can be found in Chapter 3 or [17].

#### Algorithm 2 Optimization Framework

```
Initialize Database
Finish ← false
while Finish = false do
    Rank and select top λ designs
    Apply DE operations
    Train ANN and predict performance
    Select best predicted design and simulate
    if Stopping criterion met then
        Finish ← true
    else
        Update Database
    end if
end while
Output Final Design
```
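The control flow of Algorithm 2 can be illustrated with a short Python sketch. This is not the ESSAB implementation: the ANN and beta ranking are replaced by a simple nearest-neighbour average as a stand-in surrogate, the circuit simulator is replaced by an arbitrary Python function, and all names and default parameters (`surrogate_assisted_de`, `lam`, `f_scale`, `cr`) are assumptions made for illustration.

```python
import random


def surrogate_assisted_de(simulate, bounds, n_init=12, lam=6, max_sims=60,
                          target=None, f_scale=0.8, cr=0.9, seed=0):
    """Minimize `simulate` with a loop shaped like Algorithm 2 (requires lam >= 3)."""
    rng = random.Random(seed)

    def predict(x, database, k=3):
        # Stand-in surrogate (NOT an ANN): mean of the k nearest simulated designs.
        nearest = sorted(
            database, key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x))
        )[:k]
        return sum(f for _, f in nearest) / len(nearest)

    # Initialize the database with randomly sampled, simulated designs.
    database = []
    for _ in range(n_init):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        database.append((x, simulate(x)))
    n_sims = n_init
    best = min(database, key=lambda rec: rec[1])

    finish = False
    while not finish:
        # Rank and select the top-lambda designs.
        parents = [x for x, _ in sorted(database, key=lambda rec: rec[1])[:lam]]
        # Apply DE operations: rand/1 mutation with binomial crossover.
        children = []
        for x in parents:
            a, b, c = rng.sample(parents, 3)
            j_rand = rng.randrange(len(bounds))
            child = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or j == j_rand:
                    mutated = a[j] + f_scale * (b[j] - c[j])
                    child.append(max(lo, min(hi, mutated)))
                else:
                    child.append(x[j])
            children.append(child)
        # Predict with the surrogate and simulate only the most promising child.
        candidate = min(children, key=lambda ch: predict(ch, database))
        f_new = simulate(candidate)
        n_sims += 1
        if f_new < best[1]:
            best = (candidate, f_new)
        # Stopping criterion: simulation budget exhausted or target reached.
        if n_sims >= max_sims or (target is not None and best[1] <= target):
            finish = True
        else:
            database.append((candidate, f_new))
    return best[0], best[1], n_sims


def sphere(x):
    """Toy objective standing in for a circuit simulation."""
    return sum(v * v for v in x)


x_best, f_best, n_sims = surrogate_assisted_de(sphere, [(-5.0, 5.0)] * 3)
```

Only one candidate per iteration is passed to the expensive `simulate` call, which is the point of using a surrogate: the cheap predictor filters the DE offspring before any simulation time is spent.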

## 4.6 Pre-layout Sizing Results and Analysis

To validate the proposed co-design method, a 5.6 GHz LC-tank VCO regulated by an LDO is designed in a TSMC 65 nm CMOS process. Using Cadence Virtuoso and the ESSAB tool implemented in MATLAB on a workstation with an AMD Ryzen Threadripper PRO 3975WX (32 cores, 290 GB RAM, 3.5 GHz), the sequential method took 18 hours in total, with 7 hours for VCO sizing and 11 hours for LDO sizing, while the co-design method took 6 hours in total. The sizing details of the obtained designs are provided in Table 4.1, labeled as co-design and se-design, respectively. The pre-layout simulation results are summarized in Table 4.2. Results are listed for the nominal corner and for the worst corner (slow NMOS/slow PMOS, max inductor, and max capacitor at 125°C).

Table 4.2: Specifications and pre-layout simulation results of the sequentially designed and co-designed LDO-VCO.

| Symbol             | Specs.      | Se-design<br>(Nominal) | Co-design<br>(Nominal) | Se-design<br>(Worst<br>corner) | Co-design<br>(Worst<br>corner) |
|--------------------|-------------|------------------------|------------------------|--------------------------------|--------------------------------|
| FoM (dBc/Hz)       | Minimize    | -190                   | -192.4                 | -187.3                         | -187.8                         |
| Frequency (GHz)    | $\geq 5$    | 5.69                   | 5.60                   | 5.35                           | 5.27                           |
| PN@100kHz (dBc/Hz) | $\leq -94$  | -96.2                  | -95.6                  | -93.8                          | -92.0                          |
| PN@1MHz (dBc/Hz)   | $\leq -120$ | -122.9                 | -124.1                 | -120.9                         | -119.7                         |
| PN@10MHz (dBc/Hz)  | $\leq -140$ | -143.4                 | -144.7                 | -142                           | -141.5                         |
| $P_{dyn}$ (mW)     | $\leq 7$    | 6.40                   | 4.56                   | 6.60                           | 4.33                           |
| $PSR_{max}$ (dB)   | $\leq -30$  | -33.7                  | -31.4                  | -31.6                          | -31.0                          |
| $V_{DD,max}$ (V)   | $\leq 1.32$ | 1.24                   | 1.23                   | 1.24                           | 1.23                           |
| PM (°)             | $\geq 50$   | 67                     | 82                     | 59                             | 81                             |
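The FoM entries in Table 4.2 can be cross-checked from the tabulated frequency, power, and PN at 1 MHz offset. The sketch below uses the sign convention under which the tabulated values are negative and minimized, i.e. $FoM = PN(\Delta f) + 10\log[(\Delta f/f_0)^2 \cdot P_{dyn}/1\,\mathrm{mW}]$; the function and variable names are illustrative.

```python
import math


def fom_dbc_hz(pn_dbc_hz, f0_hz, offset_hz, p_mw):
    """FoM in the sign convention of Table 4.2 (more negative is better)."""
    return pn_dbc_hz + 10.0 * math.log10((offset_hz / f0_hz) ** 2 * p_mw)


# (PN@1MHz in dBc/Hz, f0 in GHz, P_dyn in mW, tabulated FoM) taken from Table 4.2.
TABLE_4_2 = {
    "se_nominal": (-122.9, 5.69, 6.40, -190.0),
    "co_nominal": (-124.1, 5.60, 4.56, -192.4),
    "se_worst": (-120.9, 5.35, 6.60, -187.3),
    "co_worst": (-119.7, 5.27, 4.33, -187.8),
}

computed = {}
for name, (pn, f0_ghz, p_mw, fom_tab) in TABLE_4_2.items():
    computed[name] = fom_dbc_hz(pn, f0_ghz * 1e9, 1e6, p_mw)
    print(f"{name}: computed {computed[name]:.2f} dBc/Hz, tabulated {fom_tab} dBc/Hz")
```

With these inputs, each computed value agrees with the tabulated FoM to within about 0.1 dB, which is consistent with rounding of the table entries.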

To assess the impact of the LDO in the two approaches, the LDO output noise and PSR are analyzed. In addition, the VCO PN is extracted under two conditions: 1) powered by a 1.2 V ideal voltage source, and 2) powered by the LDO with a 1.62 V DC input and a 1.2 V output voltage.

At a 100 kHz offset, the output noise of the LDO significantly deteriorates the VCO PN. The primary contributor to the LDO noise differs between the two designs: for the co-designed LDO, the main source is the thermal noise from the NMOS load in the input stage, whereas for the sequential design it is the thermal noise from the PMOS input pair. In the PMOS input pair, the transconductance is 61.89 $\mu$S for the sequential design and 73.42 $\mu$S for the co-designed LDO. For the NMOS load, the transconductance is 73.9 $\mu$S in the co-designed case, compared to 18.7 $\mu$S for the sequential design, resulting in a higher output noise floor for the co-designed LDO, as shown in Fig. 4.2 (a). Furthermore, as shown in Fig. 4.2 (c), the co-designed VCO with an ideal supply has a slightly worse PN at 100 kHz due to the smaller switching transistors. With the LDO output noise included, the co-designed system exhibits slightly worse PN at the 100 kHz offset, with -95.6 dBc/Hz compared to -96.2 dBc/Hz.

Figure 4.2: (a) Output noise of the LDOs. (b) PSR of the LDOs. (c) Phase noise performance of the VCO designs with 1.2 V ideal supply and with LDO.

At 1 MHz offset, the primary contributor to PN is the thermal noise from the switching transistors, while the flicker noise of transistors M3-M6 dominates across corners. With an ideal supply, M5 and M6 have five times smaller W/L in the co-designed VCO, which reduces the effective transconductance of the switching pairs, resulting in higher thermal noise and 1 dB worse PN than the se-design. However, with the LDO incorporated, the PN of the se-design degrades from -124.9 dBc/Hz to -122.9 dBc/Hz, partly due to the frequency-pushing mechanism [119], which is significantly suppressed in the co-design. Consequently, the co-design achieves a 1.2 dB PN improvement at a 1 MHz offset and a 1.3 dB improvement at 10 MHz with the LDO incorporated. Additionally, the power consumption is reduced due to the lowered total capacitance at the output nodes with smaller transistor area. The decrease in transconductance reduces the current flowing into the LC tank and slows the startup, as shown in Fig. 4.3. However, startup is still achieved reliably across corners thanks to the corner analysis during optimization.



Figure 4.3: Oscillation transients for LDO-VCO designs. The co-designed VCO has a slower oscillation start-up and smaller oscillation amplitude.

In the sequentially designed LDO-VCO, the effect of the bypass capacitor is not well accounted for in the VCO design. To maintain low PN, the voltage swing is kept large, which aligns with Leeson's equation [130]:

$$L(\Delta f) = 10 \log \left[ \frac{FkT}{2P_{\text{sig}}} \left( 1 + \left( \frac{f_0}{2Q\Delta f} \right)^2 \right) \left( 1 + \frac{f_c}{\Delta f} \right) \right], \tag{4.2}$$

where  $L(\Delta f)$  represents the single-sideband PN at an offset frequency  $\Delta f$  from the carrier, expressed in dBc/Hz. The parameter F denotes the noise factor of the oscillator, k is Boltzmann's constant, and T is the absolute temperature in Kelvin. The term  $P_{\text{sig}}$  refers to the signal power of the oscillator, while  $f_0$  is the oscillation frequency. The quality factor of the resonator is denoted by Q, and  $f_c$  represents the flicker noise corner frequency. It indicates that a larger output swing gives larger signal power according to  $P_{\text{sig}} \propto A^2$  and thus less PN.
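The dependence on swing can be made concrete by evaluating Eq. (4.2) numerically. In the sketch below, all numerical values (noise factor, Q, signal power, flicker corner, temperature) are hypothetical and chosen only for illustration; doubling the amplitude $A$ is modeled as a fourfold increase in $P_{\text{sig}}$.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K


def leeson_pn(offset_hz, f0_hz, q, p_sig_w, noise_factor, fc_hz, temp_k=300.0):
    """Single-sideband PN in dBc/Hz according to Eq. (4.2)."""
    thermal = noise_factor * K_BOLTZMANN * temp_k / (2.0 * p_sig_w)
    resonator = 1.0 + (f0_hz / (2.0 * q * offset_hz)) ** 2
    flicker = 1.0 + fc_hz / offset_hz
    return 10.0 * math.log10(thermal * resonator * flicker)


# Hypothetical operating point: 5.6 GHz carrier, 1 MHz offset, Q = 10.
base = leeson_pn(1e6, 5.6e9, q=10, p_sig_w=1e-3, noise_factor=4, fc_hz=200e3)
# Doubling the swing A quadruples the signal power (P_sig proportional to A^2).
doubled = leeson_pn(1e6, 5.6e9, q=10, p_sig_w=4e-3, noise_factor=4, fc_hz=200e3)

print(round(doubled - base, 2))  # -6.02
```

The 6 dB difference is independent of the other assumed parameters, since $P_{\text{sig}}$ enters Eq. (4.2) only through the $1/P_{\text{sig}}$ factor.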

However, when a bypass capacitor is added at the supply of the VCO, it flattens the voltage swing, which worsens the optimized PN performance. In contrast, the co-design approach considers the impact of the decoupling capacitor from the start. This limits the reliance on large voltage swings, resulting in lower phase noise. The inclusion of a bypass capacitor also degrades the VCO PN across corners. According to Fig. 4.4, the sequentially designed VCO exhibits a much larger variation in PN (PN@100kHz: best case -96.19 dBc/Hz, worst case -83.66 dBc/Hz; PN@1MHz: best case -123.1 dBc/Hz, worst case -115.7 dBc/Hz; PN@10MHz: best case -145.6 dBc/Hz, worst case -140.5 dBc/Hz) when the LDO and bypass capacitor are included. In contrast, the co-designed VCO shows more consistent performance (PN@100kHz: best case -95.59 dBc/Hz, worst case -91.28 dBc/Hz; PN@1MHz: best case -124.3 dBc/Hz, worst case -119.7 dBc/Hz; PN@10MHz: best case -146.8 dBc/Hz, worst case -141.5 dBc/Hz). Overall, the co-design approach improves the nominal FoM by 2.4 dB and delivers superior performance across corners by accounting for block interactions and multiple effects.



Figure 4.4: (a) Corner spread of PN for the co-designed LDO-VCO. (b) Corner spread of PN for the sequentially designed LDO-VCO.

## 4.7 Post-Layout Results and Discussion

The layout was implemented manually, as shown in Fig. 4.5. The post-layout simulation results are presented in Table 4.3. PN degrades by only 0.4 dB, and the oscillation frequency is barely affected. The overall impact of layout on the FoM is limited to 0.3 dB. This minor degradation is primarily attributed to the parasitic capacitance and resistance introduced by routing in the $V_P$ and $V_N$ signal nets.



Figure 4.5: Co-designed LDO-VCO layout.

Table 4.3: Specifications and post-layout simulation results of the co-designed LDO-VCO.

| Symbol             | Specs.      | Co-design<br>(Nominal) | Co-design<br>(Worst corner) |
|--------------------|-------------|------------------------|-----------------------------|
| FoM (dBc/Hz)       | Minimize    | -192.1                 | -187.8                      |
| Frequency (GHz)    | $\geq 5$    | 5.51                   | 5.19                        |
| PN@100kHz (dBc/Hz) | $\leq -94$  | -95.9                  | -96.8                       |
| PN@1MHz (dBc/Hz)   | $\leq -120$ | -123.9                 | -119.1                      |
| PN@10MHz (dBc/Hz)  | $\leq -140$ | -144.3                 | -139.3                      |
| $P_{dyn}$ (mW)     | $\leq 7$    | 4.67                   | 3.59                        |
| PSR (dB)           | $\leq -30$  | -30.2                  | -30.1                       |
| $V_{DD,max}$ (V)   | $\leq 1.32$ | 1.22                   | 1.22                        |
| PM (°)             | $\geq 50$   | 82                     | 81                          |

## 4.8 Summary

The design of LDOs and VCOs is critical in achieving clean power delivery and low PN in RF systems. Traditional sequential design flows often fail to capture the interactions between these subsystems, leading to suboptimal performance. This chapter presents a novel co-design approach for LDO–VCO systems.

The proposed co-design methodology leverages an AI-driven surrogate model-assisted optimization framework that jointly optimizes 43 design variables across 32 process and temperature corners. Unlike the sequential approach, which tunes the LDO after the VCO is finalized, the co-design strategy captures supply sensitivity, noise coupling, and inter-block trade-offs during the entire design process. As a result, it delivers improved PSR, reduced dynamic power, and enhanced PN performance.

Pre- and post-layout simulations in a 65 nm CMOS process demonstrate the effectiveness of the co-design approach compared with the conventional sequential design flow, achieving a 1.2 dB reduction in PN at 1 MHz offset and a 28.8% decrease in dynamic power consumption. Additionally, the co-designed system maintains robust performance across PVT variations. Overall, this work establishes an AI-driven subsystem-level design methodology that improves analog integration in complex RF SoCs.

## System-Level Design Automation for SAR ADCs

## 5.1 Background

In AMS IC design, small systems are compact but complete architectures that integrate multiple functional blocks to achieve a higher-level function. Unlike subsystems, which typically capture the interaction between two blocks, small systems may integrate both analog and digital elements, forming a closed loop where block-level interdependencies strongly influence overall performance. This abstraction sits above block-level design but below full SoC integration, making small systems a valuable focus for circuit sizing and optimization.

A prominent example of such a small system is the SAR ADC. Similar to other ADC architectures, the data conversion is achieved through sampling and quantization operations. In SAR ADCs, sampling is implemented using a SC network, where the input voltage is stored on a capacitor array during the sampling phase and subsequently held for quantization. Quantization is realized with a capacitive digital-to-analog converter (CDAC), a comparator, and SAR logic in an iterative conversion loop. During quantization, the CDAC generates trial voltages, the comparator determines the polarity of the input error, and the logic updates the register to refine the digital code. The process continues until all bits are determined. The close interaction between these blocks defines key trade-offs: comparator offset and noise constrain CDAC matching requirements, comparator and CDAC switching energy dominate power efficiency, and comparison speed limits the achievable sampling rate. These interdependencies make the SAR ADC a suitable case study for small system design.

Other small systems include sigma-delta ADCs, which combine oversampling filters with feedback loops, and pipeline ADCs, which cascade low-resolution stages with residue amplifiers. Similarly, time-to-digital converters integrate delay lines and phase detectors to achieve precise timing resolution. In all cases, overall performance metrics such as energy efficiency, resolution, and dynamic range emerge not from isolated block behavior but from the interaction of multiple tightly coupled components.

Traditionally, small systems have been designed using a sequential block-level methodology. For example, the CDAC is first sized to meet kT/C noise and linearity requirements, the comparator is then optimized for speed and noise under assumed load, and the SAR logic is finalized to ensure timing closure. While straightforward, this step-by-step flow often fails to capture inter-block dependencies. As a result, initial designs may underperform once integrated, requiring multiple redesign iterations. For instance, reducing CDAC capacitance to save energy may increase comparator noise, and comparator delay may bottleneck the conversion rate despite adequate CDAC sizing. Such issues highlight the limitations of purely sequential methods.

In summary, small systems represent a crucial level of abstraction in AMS design, where overall performance is dictated by the coupling of multiple functional blocks. SAR ADCs exemplify this category, serving as a compact but complete architecture that illustrates both the opportunities and challenges of small system design. Therefore, this chapter focuses on an AI-driven SAR ADC sizing method, beginning with a review of the relevant literature, followed by a clarification of the identified limitations. The proposed methodology is then discussed in detail and demonstrated through 12 design cases, highlighting the benefits of small-system-level optimization in analog circuit sizing.

## 5.2 Literature Review

Modern electronic systems increasingly rely on efficient and accurate data conversion between analog and digital domains. Among various data converter architectures, SAR ADCs have gained wide adoption due to their energy efficiency, moderate-to-high resolution, and scalability for low-power applications. Despite these advantages, SAR ADC design remains a complex task due to the intricate interactions among its core building blocks, including the comparator, DAC, S/H circuit, and digital control logic. Each of these components introduces specific design constraints that must be carefully balanced to meet overall performance goals. Traditional manual approaches depend heavily on designer experience and iterative tuning, making the process labor-intensive and time-consuming. This has motivated growing research interest in automated optimization techniques to streamline SAR ADC design.

Several automated or semi-automated design methods have been proposed to address the need for more efficient SAR ADC design. One approach replaces analog building blocks with synthesizable digital standard cells, reducing manual efforts and improving design portability across technology nodes [131]. However, this restricts the design flexibility as digital standard cells may not always meet stringent ADC performance requirements.

Another notable method is hybrid automation, exemplified by [132], [133], which combines techniques such as optimization algorithms, lookup tables, and library-based selection for specific blocks. In these methods, designers initially allocate specifications to each individual block, then optimize them independently without fully considering inter-block interactions. As a result, overall performance heavily depends on the initial specification allocation.

A more systematic method determines block-level component sizing in cycles, explicitly considering inter-block effects [134]. Although system-level trade-offs are accounted for, this approach still requires significant manual effort, especially for sizing individual blocks.

In summary, current automated methods face two major limitations. First, they remain restricted to block-level design, which fails to account for inter-block interactions and system-wide trade-offs, ultimately leading to suboptimal system performance. Second, they demand significant manual effort when applied to different specifications or technology nodes.

## 5.3 Contributions

To address the limitations of prior block-level methods, a system-level design approach is needed. While global optimization offers a holistic solution, its practical use is limited by the long simulation time and high-dimensional design complexity [135]. In response to these challenges, this work proposes an efficient global-local optimization framework. The framework integrates a computationally inexpensive global search algorithm (executed within four hours) for broad design space exploration, with a local optimizer that applies parallel, multi-fidelity pattern search to unconverged variables (completed within three hours). During global exploration, performance constraints such as noise, distortion, and power are automatically derived from top-level requirements and evaluated using low-cost single-point tests, significantly reducing simulation overhead. Once most of the design variables have converged, the local optimizer is applied to the remaining parameters to accelerate convergence. To ensure system-level performance accuracy, the local optimizer evaluates these remaining variables using a blended objective function and a full sine-wave test, accelerated by parallel simulation. By combining the two stages, the proposed approach ensures both computational efficiency and high-performance design. Unlike prior works, which focus on optimizing individual building blocks, this work treats the SAR ADC as an integrated system during sizing, which enables more effective exploration of design trade-offs.

The contributions are summarized in the following points:

1. System-Level Optimization Framework: An efficient global-local optimization approach was developed that treats the SAR ADC as a unified system. A fast global search explores the design space, followed by a parallel local optimizer that refines unconverged variables, achieving accurate results with practical runtime.
2. Automated Constraint Translation Method: A method was proposed to automatically derive dual-level design constraints from top-level specifications, reducing manual effort and enhancing design scalability.
3. Multi-Fidelity Evaluation Strategy: A multi-fidelity performance evaluation method was introduced to accelerate the optimization process while maintaining accuracy, thereby reducing simulation overhead.
4. Flexible Design Framework: A flexible optimization framework was developed capable of generating high-performance SAR ADC designs across a wide range of resolutions and speeds.

## 5.4 Architecture and Design Considerations of SAR ADC

This section begins by reviewing the fundamentals of the SAR ADC, including its architecture, its basic operation, and the differences between synchronous and asynchronous control schemes [136], [137]. It then discusses key design considerations relevant to its performance.

#### 5.4.1 Architecture and Operation

Fig. 5.1 shows the architecture of the selected N-bit differential SAR ADC, which includes four main blocks: S/H, comparator, control logic, and DAC. In this work, the topologies of the S/H, comparator, and DAC are fixed to a bootstrapped sampling switch, a dynamic comparator, and a top-plate-sampling, fully binary-weighted CDAC with a $V_{\rm CM}$-based switching scheme [138], as shown in Fig. 5.2. These topologies are selected for their wide applicability and ability to support a broad range of speed and resolution requirements.



Figure 5.1: The architecture of an N-bit asynchronous SAR ADC.

The impedance linearity of single MOSFET switches and transmission gate switches is inadequate for high-precision applications. To improve the linearity of the sampling switch in ADCs, a bootstrap switch is commonly used to maintain a constant gate-to-source voltage  $(V_{GS})$  of the MOS transistor circled in red in Fig. 5.2 (a).

In a traditional N-bit differential $V_{CM}$-based CDAC, each side of the differential pair requires $2^{N-1}$ unit capacitors. The capacitor splitting technique is applied. As illustrated in Fig. 5.2 (b), a single capacitor originally connected to $V_{CM}$ is split into two smaller capacitors, with one reset to $V_{ref}$ and the other to ground during the sampling phase. During the charge redistribution phase, the bottom plate of the capacitor corresponding to the current bit is conditionally switched based on the comparator decision: on the higher-voltage side, it transitions from $V_{ref}$ to ground, whereas on the lower-voltage side, it switches from ground to $V_{ref}$. This symmetric switching operation maintains the differential nature of the CDAC and ensures that the common-mode voltage remains centered at $V_{CM}$ throughout the conversion process.

The adopted synchronous SAR logic architecture in Fig. 5.2 (c) consists of two rows of D flip-flops (DFFs). The lower row functions as a phase shifter logic circuit, while the upper row serves as a group of bit registers. Based on the output of the lower row, the comparison result is stored in the corresponding bit of the upper-row DFF. In this work, the DFFs are implemented using dynamic DFFs in [139] for their high energy efficiency.

Fig. 5.2 (d) shows the dynamic comparator structure, which was introduced in Chapter 3 and is therefore not repeated here. It is worth noting that although this work focuses on these specific topologies, the proposed design methodology can be extended to other circuit implementations in a similar way.



Figure 5.2: The circuit diagram for SAR ADC building blocks, including: (a) bootstrap switch, (b) CDAC, (c) SAR logic, and (d) dynamic comparator.

The operation of SAR ADCs can be implemented with these four components. As illustrated in Fig. 5.3, the analog-to-digital conversion process of an N-bit SAR ADC consists of two main stages: a sampling phase (orange region) and N bit-conversion phases (green, purple, and blue). During the sampling phase, the top plates of the differential capacitor arrays ($V_{dac,p}$ and $V_{dac,n}$) are connected to the input voltages $V_{inp}$ and $V_{inn}$, while the bottom plates are switched to the positive reference $V_{ref,p}$, the negative reference $V_{ref,n}$, or the common-mode voltage $V_{CM}$, depending on the DAC switching scheme [140], [141]. Once the sampling phase is complete, the top-plate switches are cut off, and the DAC holds the sampled input voltages.

In each bit-conversion phase, the comparator evaluates the differential voltage across the top plates ($V_{dac,p}$ and $V_{dac,n}$) when the comparator clock ($CLK_C$) goes high. The comparison result determines the current bit value. During the negative phase of $CLK_C$, the comparator is reset and the DAC updates $V_{dac,p}$ and $V_{dac,n}$ for the next comparison through the charge redistribution mechanism, based on the DAC switching scheme. This process continues sequentially until all N bits are resolved.

The comparator output directly determines the ADC output code. In a dynamic SAR logic architecture, the comparator decisions are latched into output registers and subsequently buffered to generate the final ADC output code. For performance measurement, the output is fed to an ideal DAC module which converts the binary code to a decimal value, and the spectral analysis can then be performed on the output of the ideal DAC.
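The conversion loop and the ideal-DAC reconstruction used for measurement can be illustrated with a behavioral sketch. The model below is idealized (no noise, offset, mismatch, or settling error) and assumes a differential input range of $\pm V_{ref}$; the function names `sar_convert` and `ideal_dac` are hypothetical.

```python
def sar_convert(v_diff, v_ref, n_bits):
    """Ideal successive approximation of a differential input in [-v_ref, v_ref)."""
    residue = v_diff  # differential voltage seen by the comparator
    code = 0
    for i in range(n_bits):
        # Comparator decision: polarity of the current residue.
        bit = 1 if residue >= 0.0 else 0
        code = (code << 1) | bit
        # DAC update: shift the residue toward zero by a binary-weighted step.
        step = v_ref / 2 ** (i + 1)
        residue += -step if bit else step
    return code


def ideal_dac(code, v_ref, n_bits):
    """Map an output code back to the mid-point of its quantization interval."""
    lsb = 2.0 * v_ref / 2 ** n_bits
    return -v_ref + (code + 0.5) * lsb


code = sar_convert(0.3, 1.0, 4)
print(code, ideal_dac(code, 1.0, 4))  # 10 0.3125
```

In a sizing flow, the same reconstruction step would be applied to the simulated output codes before spectral analysis, whereas the comparator and DAC behavior would come from circuit simulation rather than from this ideal model.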

SAR ADCs can be categorized into two types based on their control mechanism: (1) synchronous and (2) asynchronous. These schemes differ in key aspects such as reliance on the external clock, timing constraints, and the verification of sizing results, as detailed below.

The primary distinction between the two schemes lies in the time interval  $T_{CLKC}$  of the internal comparator clock signal. In the synchronous scheme,  $T_{CLKC}$  remains the same across all bit-cycling phases controlled by the external clock, as shown in Fig. 5.3 (a). Here, the internal  $CLK_C$  signal is derived directly from the external clock  $(CLK_{EXT})$ , resulting in equal time intervals for each bit decision. Even if the comparator completes its comparison or resets earlier, the system waits for the next clock edge before proceeding to the next comparison. Since the decision time varies across bits, the conversion time is determined by the worst-case delay, which may introduce idle intervals between comparisons.



Figure 5.3: The timing diagrams of synchronous and asynchronous SAR ADCs.

In contrast, the asynchronous scheme decouples  $CLK_C$  from  $CLK_{EXT}$ , which only determines the duration of one full quantization. The value of  $T_{samp}$  depends on the duty cycle of  $CLK_{EXT}$ . Then all bit-cycling phases must be completed within the remaining time for correct conversion. Assuming that  $CLK_{EXT}$  has a frequency of  $F_S$ , the asynchronous SAR ADC is thus constrained by the following equation:

$$T_{samp} + T_{conv,asyn} \le \frac{1}{F_S},\tag{5.1}$$

where  $T_{conv,asyn}$  represents the total conversion time.

As illustrated in Fig. 5.3 (b), the asynchronous control relies on the internal state of the comparator. Once the comparator determines the $i$-th bit after a comparison time $T_{cmp,i}$, it generates a valid signal for the SAR logic. The SAR logic then changes the switching state of the CDAC drivers to enable DAC settling. In the meantime, this signal resets the comparator in time $T_{rst,i}$ after a delay $T_{delay,f}$. After the reset, $CLK_C$ is toggled after a delay $T_{delay,r}$ to begin the next comparison. Since each operation begins immediately after the previous one completes, asynchronous SAR ADCs typically achieve higher conversion speeds than their synchronous counterparts and have thus been widely used in recent years. The asynchronous architecture is considered in this work due to its high time efficiency and wide application. However, timing control in asynchronous architectures must be carefully considered during the design process.

As mentioned above, the asynchronous scheme generates  $CLK_C$  according to the feedback signal from the comparator. As shown in Fig. 5.3 (b), this results in an unequal time interval in each bit conversion. The total conversion time is therefore:

$$T_{conv,asyn} = \sum_{i=1}^{N} (T_{delay,r} + T_{cmp,i} + T_{delay,f} + T_{rst,i}).\tag{5.2}$$

 $T_{delay,r}$  and  $T_{delay,f}$  can be considered constants, while  $T_{cmp,i}$  depends on the differential input voltage  $V_{cmp,i}$  in the comparator.
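The timing constraint of Eqs. (5.1) and (5.2) can be checked numerically. The following is a minimal sketch; the function names and all timing values are hypothetical and chosen only for illustration, not taken from the designs in this work.

```python
def async_conversion_time(t_cmp, t_rst, t_delay_r, t_delay_f):
    """Total bit-cycling time per Eq. (5.2); all times in seconds."""
    if len(t_cmp) != len(t_rst):
        raise ValueError("t_cmp and t_rst need one entry per bit")
    return sum(t_delay_r + tc + t_delay_f + tr for tc, tr in zip(t_cmp, t_rst))


def meets_timing(t_samp, t_conv, f_s):
    """Check the constraint of Eq. (5.1): T_samp + T_conv,asyn <= 1/F_S."""
    return t_samp + t_conv <= 1.0 / f_s


# Hypothetical 10-bit example at 50 MS/s (20 ns period); values are illustrative.
N_BITS = 10
t_cmp = [0.4e-9 + 0.05e-9 * i for i in range(N_BITS)]  # per-bit comparison times
t_rst = [0.3e-9] * N_BITS                              # per-bit reset times
t_conv = async_conversion_time(t_cmp, t_rst, t_delay_r=0.2e-9, t_delay_f=0.2e-9)
ok = meets_timing(t_samp=5e-9, t_conv=t_conv, f_s=50e6)
print(f"T_conv = {t_conv * 1e9:.2f} ns, timing met: {ok}")
```

Because $T_{cmp,i}$ varies from bit to bit, the per-bit times are kept as lists rather than a single worst-case number.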

#### 5.4.2 Design Considerations and Trade-Offs

As discussed, the comparator becomes the primary timing bottleneck in high-speed asynchronous SAR ADCs, requiring it to respond within a tight window to achieve high conversion rates. The varying comparison time further complicates the manual design strategy. Additionally, the sampling duration must be sufficient for complete charge transfer, yet short enough to support high sampling rates. At the same time, the SAR logic must update the DAC appropriately and the DAC must settle to the correct voltage level before the next comparison begins.

In traditional manual design, asynchronous SAR ADCs are typically sized by evaluating the longest total comparison time across the N conversion phases, often dominated by the final bit decision. To satisfy the speed specification of an asynchronous SAR ADC, $\sum T_{cmp,i}$ should be shorter than the time budget allotted to comparison ($\alpha_{cmp} \times T_{conv,asyn}$, $0 < \alpha_{cmp} < 1$). The choice of $\alpha_{cmp}$ relies on the designer's experience: if it is too small, the budget is hard to meet; if it is too large, the resulting design is suboptimal. In addition to the comparator, the timing trade-offs among other blocks (e.g., DAC settling and sampling) must also be carefully balanced. In manual approaches, these allocations are again based mainly on designer intuition, which can lead to suboptimal designs.

Timing errors in SAR ADCs primarily manifest as distortion. One key source of distortion arises during the sampling phase: if the sampling duration is insufficient, incomplete charge transfer in the capacitor array occurs, resulting in a voltage-dependent error at the comparator input. This nonlinearity leads to harmonic distortion in the output spectrum, particularly for high-frequency or large-amplitude inputs where the required settling time increases. Another critical timing issue originates from the decoupled operation of the comparator and SAR logic. Since the comparator initiates the next DAC update only after each bit decision, any mismatch in timing, such as initiating a comparison before the DAC has fully settled, can introduce significant errors. These timing violations can result in inaccurate bit decisions, leading to distortion or even full conversion failure if not properly managed. Such issues will be directly reflected in the SNDR of ADCs.

Mismatch is another major source of distortion. However, capacitor mismatch is not considered in the current methodology. This is because such mismatch is typically corrected through foreground or background calibration techniques, which are widely adopted in practical designs [142]–[144]. Thus, the focus of this work remains on dynamic distortion, which is more closely tied to optimization-level decisions and cannot be easily compensated post-silicon.

Noise is a fundamental limiting factor in the design of high-resolution ADCs, directly impacting their SNDR. In SAR ADCs, thermal noise primarily originates from two key sources: the active devices in the S/H and the comparator. During the sampling phase, thermal noise introduced by the on-resistance of sampling switches and the kT/C noise of the sampling capacitors sets a baseline noise floor. As resolution increases, the acceptable noise margin becomes smaller, making even minor contributions significant. The comparator also contributes noise during each bit decision. Its input-referred noise, often modeled as a Gaussian distribution, can lead to decision uncertainty, particularly when the differential input voltage is small. In high-resolution applications where the least significant bit (LSB) voltage is very small, the comparator's noise can become comparable to or exceed the LSB, resulting in bit errors and degraded linearity.

Power consumption is a critical design constraint in SAR ADCs, particularly for applications in battery-powered and energy-constrained systems such as IoT devices, biomedical implants, and wireless sensors. The total power in a SAR ADC is primarily consumed by three components: the S/H, the comparator, and the digital control logic including the CDAC switching activity. Among these, the comparator often dominates the dynamic power budget, especially at high speeds, due to its frequent regeneration cycles and sensitivity requirements. The CDAC also contributes to power consumption through charge redistribution during each bit decision, with switching energy proportional to the capacitance and the square of the reference voltage. To minimize DAC switching power, the capacitor size can be reduced. The cost is increased capacitor-array mismatch and higher kT/C noise. Additionally, the SAR logic can be optimized to further improve energy efficiency. Overall, minimizing power in SAR ADCs requires a holistic design strategy that balances speed, resolution, and architectural efficiency while leveraging low-power circuit design techniques.

Speed, accuracy and power are the major trade-offs in SAR ADC design. Manual design approaches follow the top-down design flow, where system-level specifications (e.g., 10-bit resolution, 50 MS/s speed, etc.) are allocated manually to individual building blocks. This manual allocation means the design quality can vary significantly between designers, even when they start with the same overall architecture and system requirements. In contrast, the proposed system-level approach eliminates the need for manual specification allocation and automates the management of design trade-offs. It inherently considers the interactions between speed, accuracy, and power during optimization without requiring designers to manually assign specifications to each block. The details of the proposed method are illustrated in the next section.

## 5.5 Methodology

First, the sizing task will be formulated as a mathematical optimization problem. Widely accepted specifications are used as constraints, with two automatically generated specification sets derived for different optimization stages, eliminating reliance on designer expertise. Finally, the global and local optimizers will be developed based on prior research and customized for SAR ADC design.

#### 5.5.1 Overview

Recent systematic approaches often adopt a block-level sizing methodology, where components such as the comparator, DAC, S/H, and control logic are designed and validated independently. System-level performance is typically evaluated only after all blocks have been sized. If the simulated system-level performance deviates from the target specifications, designers need to reallocate block-level specifications and repeat the sizing process. This trial-and-error strategy exhibits several limitations. First, there is often no systematic guidance to adjust the specifications or identify which block or combination of blocks caused the deviation. Since degradation often arises from inter-block interactions, isolated sizing is frequently ineffective. Second, the correction process can be time-consuming even when such interactions are considered, often requiring many iterations to meet system-level targets, particularly in high-resolution or low-power SAR ADC designs. Although global optimization can be a holistic solution, its practical application is limited by the high dimensionality of the design space [135].

To overcome these challenges, the proposed global–local optimization framework explores the design space in two stages, as illustrated in Fig. 5.4. The process begins with user-defined inputs, such as supply voltage, sampling rate, topology, and resolution, followed by automatic specification generation, which derives key performance constraints from system-level targets. In the global phase, a surrogate model-assisted global optimizer efficiently explores the high-dimensional design space of x and identifies the most promising candidate design that satisfies the high-level performance constraints. To reduce simulation overhead, candidate designs are evaluated with low-cost tests (i.e., coarse evaluation) against performance constraints derived from system-level specifications. The purpose of this stage is to quickly prune 'bad' designs and guide the candidate toward a promising design region. Once the promising design region is located, the local phase applies a deterministic pattern search algorithm to refine the unconverged design variables. This stage leverages parallel, phase-shifted transient simulations (i.e., fine evaluation) and a multi-fidelity objective to ensure accuracy while maintaining efficiency. By combining coarse global exploration with targeted local refinement, the proposed framework enables efficient and agile exploration of the high-dimensional design space, ultimately producing high-performance SAR ADC solutions. The implementation details of each stage are presented in the following sections.

#### 5.5.2 Automatic Specification Derivation

Conventional methods decouple sizing from system evaluation. The proposed approach integrates system-level performance from the outset. Table 5.1 defines two sets of specifications at different levels of evaluation. In single-point tests, the specifications considered include SSRE, sampling error, thermal noise, and power consumption, which together capture the dominant nonlinearity, noise, and timing effects. The parameter  $\alpha$  serves as a user-defined scaling factor to tighten or relax SNDR, enabling trade-off control between performances.  $\alpha$  is set to 1 by default.



Figure 5.4: The flow diagram of the proposed global-local sizing approach. The red blocks are based on the single-point test, while the blue block represents the full sine wave test.

In a binary-weighted SAR ADC, the actual step size for bit i can deviate from the ideal value due to capacitor mismatch, incomplete settling, parasitics, and other nonidealities. The actual step can be modeled as:

$$\text{step}_i = A_i(1 + \delta_i), \quad \text{with } \frac{A_i}{A_{i+1}} = 2, \tag{5.3}$$

where  $A_i$  denotes the ideal amplitude of the *i*th DAC step, and  $\delta_i$  represents its deviation due to nonideal effects. The term  $\delta_i$  captures both static mismatch and dynamic errors such as incomplete settling and parasitic-induced distortion. Then, the SSRE of two succeeding bits can be defined:

$$SSRE_{i} = \left| \frac{\operatorname{step}_{i}}{\operatorname{step}_{i+1}} - 2 \right| = \left| \frac{2(1 + \delta_{i})}{1 + \delta_{i+1}} - 2 \right|.\tag{5.4}$$

For small relative errors (i.e.,  $\delta_i, \delta_{i+1} \ll 1$ ), a first-order approximation gives:

$$SSRE_i \approx 2|\delta_i - \delta_{i+1}|. \tag{5.5}$$

Assume that the dynamic errors are uncorrelated across bits. This assumption holds approximately when bit-level settling is sufficiently fast to prevent significant error propagation across bit cycles. Considering the quantization noise power, the upper bound of the total error voltage power is given by:

$$\sum_{i=1}^{N} V_{\varepsilon_i}^2 = \left(\frac{\Delta}{\sqrt{12}}\right)^2,\tag{5.6}$$

where $\Delta$ denotes the least significant bit (LSB) voltage step. The error voltage for bit i is:

$$V_{\varepsilon_i} = 2^{N-i} \cdot \Delta \cdot \delta_i. \tag{5.7}$$

If the same error budget is applied for each bit:

$$V_{\varepsilon_i}^2 = \left(\frac{\Delta}{\sqrt{12}}\right)^2 \cdot \frac{1}{N} \quad \Rightarrow \quad \delta_i = \frac{1}{2^{N-i} \cdot \sqrt{12N}}.\tag{5.8}$$

Substituting (5.8) into (5.5) gives:

$$\begin{aligned}
SSRE_i &\approx 2|\delta_i - \delta_{i+1}| \\
&= 2 \cdot \left| \frac{1}{2^{N-i}\sqrt{12N}} - \frac{1}{2^{N-i-1}\sqrt{12N}} \right| \\
&= \frac{1}{2^{N-i-1} \cdot \sqrt{12N}}.
\end{aligned}\tag{5.9}$$

The SSRE reflects the deviation of consecutive DAC step amplitudes from their ideal binary-weighted ratio, thereby capturing both static and dynamic sources of nonlinearity such as capacitor mismatch, incomplete settling, and parasitic distortion. As such, SSRE serves as a system-level indicator of DAC linearity and dynamic accuracy in single-point tests.
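The per-bit SSRE limits follow directly from Eqs. (5.8) and (5.9). The sketch below, with hypothetical function names, evaluates them for a 10-bit example and compares the first-order result with the exact expression of Eq. (5.4).

```python
import math


def delta_budget(n_bits, i):
    """Per-bit relative error budget delta_i from Eq. (5.8)."""
    return 1.0 / (2 ** (n_bits - i) * math.sqrt(12 * n_bits))


def ssre_bound(n_bits, i, alpha=1.0):
    """SSRE limit for bit i from Eq. (5.9), scaled by alpha as in Table 5.1."""
    return alpha / (2 ** (n_bits - i - 1) * math.sqrt(12 * n_bits))


def ssre_exact(delta_i, delta_next):
    """Exact SSRE of Eq. (5.4) for two given relative errors."""
    return abs(2 * (1 + delta_i) / (1 + delta_next) - 2)


N = 10
bounds = [ssre_bound(N, i) for i in range(1, N)]  # i = 1 (MSB) ... N-1
for i in (1, 5, 9):
    exact = ssre_exact(delta_budget(N, i), delta_budget(N, i + 1))
    print(f"bit {i}: bound = {bounds[i - 1]:.3e}, exact SSRE at budget = {exact:.3e}")
```

The bound relaxes by a factor of two per bit toward the LSB, so the MSB transitions set the tightest settling requirement.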

To capture the nonlinearity from the sampling switch and the noise contributions from both the sampling switch and comparator, the constraints for sampling error and thermal noise are derived under the assumption that their power equals the quantization noise. For example, for sampling error,

$$\text{Sampling Error} = |V_{\text{ideal}} - V_{\text{sampled}}| < \frac{\Delta}{\sqrt{12}} = \frac{V_{\text{DD}}}{2^N \cdot \sqrt{12}}.\tag{5.10}$$

Table 5.1: Summary of specifications used in optimization.

| Level | Performance | Specification |
|---|---|---|
| Coarse Evaluation | $SSRE_i$ $(i = 1, \ldots, N-1)$ | $\left\lvert \frac{\mathrm{step}_i}{\mathrm{step}_{i+1}} - 2 \right\rvert < \alpha \cdot \frac{1}{2^{N-i-1} \cdot \sqrt{12N}}$ |
| | Sampling Error | $\left\lvert V_{\mathrm{ideal}} - V_{\mathrm{sampled}} \right\rvert < \alpha \cdot \frac{V_{\mathrm{DD}}}{2^N \cdot \sqrt{12}}$ |
| | Thermal Noise | $\sqrt{\frac{2kT}{C} + \overline{v_{n,\text{cmp}}^2}} < \alpha \cdot \frac{V_{\text{DD}}}{2^N \cdot \sqrt{12}}$ |
| | Power | $\min\; V_{\mathrm{DD}} \cdot I_{\mathrm{avg}}$ |
| Accurate Evaluation | SNDR | $10\log_{10}\left(\frac{P_{\text{signal}}}{P_{\text{tot,max}}}\right) < 6.02N - 4.25\ \text{dB}$ |
| | ENOB | $\frac{\text{SNDR}-1.76}{6.02}$ |
| | $\mathrm{FoM}_{\mathrm{W}}$ | $\min\; \frac{P}{2^{\mathrm{ENOB}} \cdot f_{\mathrm{s}}}$ |
| | $\mathrm{FoM}_{\mathrm{S}}$ | $\min\; \text{SNDR} + 10\log_{10}\left(\frac{f_{\mathrm{s}}/2}{P}\right)$ |
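The voltage-domain limits of the coarse evaluation in Table 5.1 depend only on $N$, $V_{DD}$, and $\alpha$. A minimal sketch of how they might be generated and checked is given below; the function names, the 1 pF capacitance, the 150 µV comparator noise, and the 300 K temperature are assumptions for illustration only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def coarse_specs(n_bits, v_dd, alpha=1.0):
    """Voltage limits from Table 5.1: alpha * V_DD / (2^N * sqrt(12)), in volts."""
    limit = alpha * v_dd / (2 ** n_bits * math.sqrt(12))
    return {"sampling_error_max": limit, "thermal_noise_rms_max": limit}


def thermal_noise_rms(c_total, vn_cmp_rms, temperature=300.0):
    """RMS thermal noise sqrt(2kT/C + vn_cmp^2), following the Table 5.1 expression."""
    return math.sqrt(2 * K_BOLTZMANN * temperature / c_total + vn_cmp_rms ** 2)


# Hypothetical 10-bit, 1 V example.
specs = coarse_specs(n_bits=10, v_dd=1.0)
noise = thermal_noise_rms(c_total=1e-12, vn_cmp_rms=150e-6)
passed = noise < specs["thermal_noise_rms_max"]
print(f"limit = {specs['thermal_noise_rms_max'] * 1e6:.1f} uV, "
      f"noise = {noise * 1e6:.1f} uV, pass = {passed}")
```

Setting `alpha` above 1 relaxes both limits proportionally, which mirrors the role of $\alpha$ as a user-defined scaling factor.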

The corresponding upper bound of the SNDR can be estimated by considering the combined effects of quantization noise, thermal noise, SSRE, and sampling error, as shown by

$$\begin{aligned}
P_{\text{tot,max}} &= P_{\text{q}} + P_{\text{thermal,rms}} + P_{\text{ssre}} + P_{\text{sample}} \\
&= \frac{\Delta^2}{12} + \frac{\Delta^2}{12} + \frac{\Delta^2}{12} + \frac{\Delta^2}{12} = \frac{\Delta^2}{3}.
\end{aligned}\tag{5.11}$$

Accordingly, the upper limit of the achievable SNDR is given by

$$SNDR < 10 \log_{10} \left( \frac{P_{\text{signal}}}{P_{\text{tot,max}}} \right) = 6.02N - 4.25 \text{ dB.}\tag{5.12}$$
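The constant in Eq. (5.12) can be reproduced numerically. The sketch below assumes a full-scale sine input whose peak-to-peak amplitude equals $2^N \Delta$ (an assumption of this example) and four equal error contributions of $\Delta^2/12$ each, as in Eq. (5.11); the function name is hypothetical.

```python
import math


def sndr_bound_db(n_bits, n_equal_terms=4):
    """SNDR of a full-scale sine when the total error power is n_equal_terms * LSB^2 / 12."""
    lsb = 1.0 / 2 ** n_bits                   # full-scale range normalized to 1
    p_signal = 0.5 ** 2 / 2.0                 # sine with peak amplitude FS/2
    p_total = n_equal_terms * lsb ** 2 / 12.0
    return 10 * math.log10(p_signal / p_total)


for n in (7, 10, 12):
    print(n, round(sndr_bound_db(n), 2), round(6.02 * n - 4.25, 2))
```

With a single error term (quantization noise only), the same function returns the familiar $6.02N + 1.76$ dB, so the four-term budget costs about 6 dB.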

#### 5.5.3 Low-Cost Simulation-Based Global Optimization

Previous systematic optimization strategies for AMS designs often decouple block-level sizing from system-level evaluation, deferring architectural exploration until after electrical parameters are fixed. This sequential flow overlooks critical interactions among system specifications, limiting global optimality. In contrast, the proposed approach incorporates system-level objectives directly into the sizing loop from the outset.

Convergence can be slow without an effective optimization algorithm. Recent work has introduced machine learning into the optimization process, as reviewed in Chapter 2. These approaches, such as the surrogate-model-assisted method in [17], combine evolutionary algorithms with ANN-based performance prediction to reduce the number of required circuit simulations. This work adopts the same algorithmic components described previously (Fig. 5.4), including ANN-based surrogate modeling, DE, and infill sampling. The optimization is applied here to a design problem with 52 variables, as detailed below.

- Bootstrap Switch: The device sizes of the sampling switch are treated as design variables, as they directly affect both sampling accuracy and speed. To meet high-speed requirements, the sizes of the transistors in the clocking path are also parameterized, given their impact on switching speed.
- CDAC: The unit capacitor size and CDAC driver transistor sizes are optimized. The driver size is scaled down progressively from the most significant bit (MSB) to the LSB, typically halving at each step, until reaching the minimum size allowed by the PDK. This ensures proper settling behavior while minimizing power consumption.
- Comparator: All its transistor sizes are included as key design variables due to their critical role in determining noise characteristics, power consumption, and timing accuracy.
- DFF: The transistor sizes of the dynamic DFF are also optimized, as they affect both timing precision and power efficiency.
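The CDAC driver scaling rule in the list above (halving from MSB to LSB until the PDK minimum is reached) can be sketched as follows. The function name and the widths in micrometres are hypothetical and only illustrate the rule.

```python
def cdac_driver_widths(w_msb, n_bits, w_min):
    """Halve the driver width from MSB to LSB, clamped at the PDK minimum width."""
    widths = []
    w = w_msb
    for _ in range(n_bits):
        widths.append(max(w, w_min))
        w /= 2.0
    return widths


# Hypothetical 8-bit example: 16 um MSB driver, 0.5 um minimum width.
widths_um = cdac_driver_widths(w_msb=16.0, n_bits=8, w_min=0.5)
print(widths_um)
```

With this rule only the MSB driver width needs to be a free design variable, which keeps the dimensionality of the sizing problem down.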

Transient simulation is typically required for ADCs to enable spectral analysis and extract key performance metrics. For medium-resolution ADCs (e.g., 10-bit), such simulations are essential for evaluating dynamic behavior. However, a single transient simulation can take from several hours to days, posing significant challenges for design validation and making it impractical for use in optimization loops. To minimize simulation time and address this challenge, a coarse performance evaluation can be used during the global optimization phase. The coarse performance evaluation focuses on key contributors to SNDR, which are set as constraints during optimization (Table 5.1). The objective of the optimization is to minimize power consumption. Three tests are performed in parallel. The first measurement involves a single point transient simulation with the differential input fixed at  $V_{DD}$ . At the end of the sampling period, the sampling error can be directly evaluated. To check SSRE, voltage steps are measured during each bit cycle, and step ratios are recorded. The second and third measurements separately simulate the noise contributions from the CDAC and the comparator. Together, the root mean square value is used to estimate the total thermal noise with high accuracy.

The global optimization process is terminated once a specified number of parameters have converged. Convergence in the global phase is defined by a sufficiently small variance in the current population. Once the global solver selects the optimal parameter vector  $x_{\text{best}}$ , the local solver is initialized with  $x_{\text{best}}$  and runs until full convergence. Design variables that have already converged during the global phase remain fixed throughout the local optimization phase to prevent unnecessary moves.
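One way to implement the variance-based convergence test is sketched below. It assumes design variables normalized to [0, 1]; the function names, the tolerance, and the required count of converged variables are hypothetical.

```python
from statistics import pvariance


def converged_mask(population, var_tol):
    """Flag variable j as converged when its variance across the population is below var_tol."""
    n_vars = len(population[0])
    return [pvariance([ind[j] for ind in population]) < var_tol for j in range(n_vars)]


def global_phase_done(mask, min_converged):
    """Stop the global phase once enough variables have converged."""
    return sum(mask) >= min_converged


# Hypothetical normalized population: 4 candidates, 3 design variables.
population = [
    [0.50, 0.10, 0.80],
    [0.51, 0.40, 0.80],
    [0.50, 0.90, 0.81],
    [0.49, 0.60, 0.80],
]
mask = converged_mask(population, var_tol=1e-3)
done = global_phase_done(mask, min_converged=2)
print(mask, done)
```

The resulting mask can be passed to the local solver as its initial frozen mask, so that only the unconverged variables are perturbed.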

#### 5.5.4 Fast Local Optimization Using Parallel Multi-Fidelity Transient Simulation

Local optimizers are efficient at finding a nearby optimum in the vicinity of a good global starting point. To reduce the number of simulations, the design variables that have already converged are fixed in the local optimization stage, and only the unconverged ones are further optimized for exploitation. Local optimization algorithms have been well developed and have been shown to be robust, reliable, and fast in finding a local optimal solution. Their forms vary depending on whether the problem is constrained and whether derivative information is available. A modified PS method [51] is used to locally solve the constrained derivative-free optimization problem, as shown in Fig. 5.4 and Algorithm 3. It starts from the best-known design and iteratively performs exploratory searches. If an improvement is found, a pattern move is applied; otherwise, the step size is reduced. Periodically, the current best design is validated using a full sine wave test to ensure accuracy before continuing. The process repeats until convergence or the maximum iteration limit is reached. The frozen mask M is used to exclude design variables that have reached local convergence from further coordinate updates. Once the step size  $\Delta_i$  for a dimension becomes smaller than a predefined threshold, that variable is marked as frozen and skipped in subsequent exploratory searches. For this task, convergence typically requires 40-50 iterations.

Accurate SNDR characterization of high-resolution SAR ADCs through transient simulation is often computationally intensive, due to the long simulation times required to capture sufficient output samples at the Nyquist rate. As the resolution increases, the demands on noise and linearity become more stringent, necessitating longer simulation durations and finer time steps. These computational challenges not only slow down the verification process but also make it impractical to include conventional SNDR evaluation within design automation or optimization loops, where rapid and repeated performance assessments are essential.

To mitigate this, an efficient and scalable simulation method is proposed that leverages the periodic and deterministic nature of the input sine wave and the ADC's response, as shown in Fig. 5.5. Instead of performing a single long simulation at the full ADC sampling rate  $f_s$ , the method divides the task into M parallel simulations, each operating at a reduced sampling rate of  $f_s/M$ . In each of the M simulations, the input sine wave is phase-shifted by  $\Delta \phi_k = \frac{2\pi k}{M}$ , or equivalently, time-shifted by  $\Delta t_k = \frac{T_{\rm in}}{M} \cdot k$ , where  $T_{\rm in}$  is the sine wave period and  $k = 0, 1, \ldots, M-1$ . This ensures full phase coverage of the sine wave cycle across all simulations, while each simulation only needs to capture a reduced portion of the full-rate behavior. Each parallel simulation collects a subset (e.g., 4 samples) of the 4-bit ADC output data, and the subsets are then interleaved into a 16-point output record for FFT-based performance extraction.



Figure 5.5: Illustration of phase-shifted and time-interleaved parallel transient simulation: 16-point coverage via  $4\times4$  samples.
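The interleaving step can be illustrated with a small sketch. It replaces the transient simulation with an ideal quantizer, assumes coherent sampling, and offsets run $k$ by $k$ full-rate sample periods so that the interleaved record matches the full-rate one exactly. This offset, the stand-in quantizer, and all names and parameter values are assumptions of the example and may differ from the shift used in the actual flow.

```python
import math


def ideal_adc(v, n_bits):
    """Ideal quantizer for v in [-1, 1); returns an integer output code."""
    code = math.floor((v + 1.0) / 2.0 * 2 ** n_bits)
    return min(max(code, 0), 2 ** n_bits - 1)


def run_transient(n_samples, sample_period, t_offset, f_in, n_bits):
    """Stand-in for one transient run: sample a sine wave starting at t_offset."""
    return [
        ideal_adc(0.99 * math.sin(2 * math.pi * f_in * (t_offset + n * sample_period)), n_bits)
        for n in range(n_samples)
    ]


def interleave(sub_records):
    """Merge M sub-records so that output index n*M + k comes from run k, sample n."""
    m = len(sub_records)
    out = [0] * (m * len(sub_records[0]))
    for k, rec in enumerate(sub_records):
        out[k::m] = rec
    return out


F_S, N_TOTAL, CYCLES, M, BITS = 1.0, 64, 5, 4, 6
f_in = CYCLES * F_S / N_TOTAL  # coherent sampling: 5 input cycles in 64 samples
full = run_transient(N_TOTAL, 1 / F_S, 0.0, f_in, BITS)
subs = [run_transient(N_TOTAL // M, M / F_S, k / F_S, f_in, BITS) for k in range(M)]
merged = interleave(subs)
print(merged == full)
```

Each of the M runs is independent of the others, which is what allows them to be dispatched in parallel.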

To evaluate the computational benefit, Table 5.2 summarizes the measured simulation times for various M. The SNDR remains virtually constant (within 0.04 dB of the full Spectre simulation), while the total simulation time decreases almost linearly with M, yielding a speed-up of more than  $14 \times$  for M=16 without any observable loss in accuracy. This confirms that the proposed multi-phase approach is substantially more efficient than a conventional full-rate simulation. Owing to its scalable parallel structure, the proposed method can be seamlessly integrated into automated verification flows for fast FFT-based performance evaluation.
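The speed-up factors in Table 5.2 follow from the measured simulation times. The short sketch below recomputes them and adds a parallel efficiency (speed-up divided by M), which is an auxiliary quantity introduced here for illustration and is not reported in the table.

```python
# Simulation times in seconds, copied from Table 5.2.
sim_time_s = {1: 8605, 2: 4320, 4: 2184, 8: 1120, 16: 581}

speedup = {m: sim_time_s[1] / t for m, t in sim_time_s.items()}
efficiency = {m: s / m for m, s in speedup.items()}

for m in sim_time_s:
    print(f"M={m:2d}  speed-up={speedup[m]:5.2f}  efficiency={efficiency[m]:.2f}")
```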

The proposed phase-shifted time-interleaved parallel simulation method is embedded within the local optimization algorithm. Even with a parallel setup, a single simulation can take several minutes. In this framework, full-accuracy simulations are therefore invoked only periodically, using the proposed technique, to ensure accurate evaluation of SNDR. For intermediate query points during the local search, low-cost coarse evaluations are applied. A blended objective function  $f_{\text{blend}}(x)$  linearly combines  $f_{\text{cheap}}(x)$  and a penalty term derived from an expensive evaluation. The blending coefficient  $w \in [0,1]$  determines the degree of trust in the penalty feedback: when w is small, the optimization follows the cheap evaluation, while a larger w increases correction based on accurate feedback. This mechanism maintains convergence reliability while achieving nearly an order-of-magnitude reduction in total runtime.

Algorithm 3: Blended Hooke-Jeeves with Selective Rollback

```
Init: x_0, step sizes Δ_i, frozen mask M, tolerance ε
x_best ← x_0, x_backup ← x_0, w ← 0.5, c ← 0
Evaluate f_cheap(x_best) and f_expensive(x_best)
f_backup ← f_expensive(x_best)
for k = 1 to max_iter do
    Exploratory search around x_best (skip frozen)
    if improved x_new found then
        Pattern move: extrapolate while improving, giving x_curr
        x_best ← x_curr, c ← c + 1
        if c mod λ = 0 then
            Evaluate f_expensive(x_best)
            penalty ← a · max(0, f_expensive(x_best) − f_backup)
            f_blend ← (1 − w) · f_cheap + w · penalty
            if f_blend > f_cheap then
                Rollback to x_backup, shrink Δ[∼M]
                w ← min(w + δ_w, 1)
            else
                Update backup
            end if
        end if
    else
        Shrink Δ_i for non-improved, unfrozen i
        if ||Δ[∼M]|| < ε then break
    end if
end for
Return: x_best, f_cheap, f_expensive
```

Notes:  $\lambda$  is the frequency of expensive evaluations; a is the penalty scale factor;  $\delta_w$  is the update step for w after rollback;  $\varepsilon$  is the termination tolerance; w is the weight in the blended cost function;  $\Delta_i$  are the coordinate step sizes; M is the frozen dimension mask.

Table 5.2: Simulation performance versus number of segments M with a 12-bit 20 MHz SAR ADC.

| M  | SNDR (dB) | Simulation Time (s) | Speed-up Factor |
|----|-----------|---------------------|-----------------|
| 1  | 72.03     | 8605                | 1.00            |
| 2  | 72.04     | 4320                | 1.99            |
| 4  | 72.03     | 2184                | 3.94            |
| 8  | 72.05     | 1120                | 7.68            |
| 16 | 72.06     | 581                 | 14.81           |

Reference (Spectre simulation): SNDR = 72.07 dB.
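The control flow of Algorithm 3 can be sketched in Python on a toy problem. This is an illustration under stated assumptions rather than the implementation used in this work: the quadratic cost functions, the step halving, the per-variable freezing rule, the cap on pattern moves, and all names and default values are hypothetical.

```python
def explore(f, x, step, frozen):
    """Coordinate-wise exploratory search; returns the trial point and per-axis success flags."""
    x_new, improved = list(x), [False] * len(x)
    for i in range(len(x)):
        if frozen[i]:
            continue
        for d in (step[i], -step[i]):
            trial = list(x_new)
            trial[i] += d
            if f(trial) < f(x_new):
                x_new, improved[i] = trial, True
                break
    return x_new, improved


def blended_hooke_jeeves(f_cheap, f_expensive, x0, step, eps=1e-3,
                         lam=2, a=1.0, dw=0.1, max_iter=200):
    x_best, x_backup = list(x0), list(x0)
    step, frozen = list(step), [False] * len(x0)
    w, c = 0.5, 0
    f_backup = f_expensive(x_best)
    for _ in range(max_iter):
        x_new, improved = explore(f_cheap, x_best, step, frozen)
        if any(improved):
            # Pattern move: extrapolate along (x_new - x_best) while the cheap cost drops.
            direction = [n - b for n, b in zip(x_new, x_best)]
            x_curr = x_new
            for _ in range(20):  # cap on extrapolation steps (assumption)
                x_try = [xc + d for xc, d in zip(x_curr, direction)]
                if f_cheap(x_try) >= f_cheap(x_curr):
                    break
                x_curr = x_try
            x_best, c = x_curr, c + 1
            if c % lam == 0:  # periodic expensive validation
                f_exp = f_expensive(x_best)
                penalty = a * max(0.0, f_exp - f_backup)
                f_blend = (1 - w) * f_cheap(x_best) + w * penalty
                if f_blend > f_cheap(x_best):
                    x_best = list(x_backup)  # rollback and shrink unfrozen steps
                    step = [s if fz else s / 2 for s, fz in zip(step, frozen)]
                    w = min(w + dw, 1.0)
                else:
                    x_backup, f_backup = list(x_best), f_exp
        else:
            step = [s if fz else s / 2 for s, fz in zip(step, frozen)]
            frozen = [fz or s < eps for s, fz in zip(step, frozen)]
            if all(frozen):
                break
    return x_best


def f_cheap(x):
    """Toy low-fidelity cost with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def f_expensive(x):
    """Toy high-fidelity cost: same shape with a constant offset."""
    return f_cheap(x) + 0.01


result = blended_hooke_jeeves(f_cheap, f_expensive, [0.0, 0.0], [0.5, 0.5])
print(result)
```

In this toy case the two fidelities agree on the location of the minimum, so no rollback is triggered; a rollback occurs only when the expensive cost rises relative to the last validated backup by more than the blended test allows.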

## 5.6 Experimental Results

The sizing tool was developed in MATLAB, with user inputs managed through a YAML file. The simulator is Cadence Spectre. The global optimization phase requires 3 hours on average, while the local optimization phase takes an additional 3 hours, using a 32-core machine running at 3.5 GHz.

The proposed methodology was validated in a 65 nm CMOS process using ten designs with  $\alpha = 1$  and two designs with  $\alpha = 2$ , as shown in Fig. 5.6. The designs range from 7-bit to 12-bit resolution and operate over sampling rates between 100 kHz and 250 MHz. As illustrated, all design cases meet the required specifications, achieving SNDR values up to 72.2 dB and FoMs up to 177.3 dB. As an example, Fig. 5.7 shows the optimization process of one design case.
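The reported figures of merit can be cross-checked from the definitions in Table 5.1. The sketch below uses the SNDR, sampling rate, and power reported for the 12-bit, 20 MS/s design in Table 5.3; the function names are hypothetical.

```python
import math


def enob(sndr_db):
    """Effective number of bits: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def fom_schreier_db(sndr_db, f_s, power_w):
    """FoM_S = SNDR + 10*log10(f_s / 2 / P), in dB."""
    return sndr_db + 10 * math.log10(f_s / 2 / power_w)


def fom_walden_fj(sndr_db, f_s, power_w):
    """FoM_W = P / (2^ENOB * f_s), returned in fJ per conversion step."""
    return power_w / (2 ** enob(sndr_db) * f_s) * 1e15


# 12-bit design from Table 5.3: 72.2 dB SNDR, 20 MS/s, 308 uW.
sndr, fs, p = 72.2, 20e6, 308e-6
print(round(enob(sndr), 1),
      round(fom_schreier_db(sndr, fs, p), 1),
      round(fom_walden_fj(sndr, fs, p), 1))
```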

Table 5.3 summarizes and compares the proposed methodology and its performance with SAR ADC designs using block-level design approaches with similar resolutions and sampling rates. The proposed approach demonstrates superior speed performance compared to existing synthesized SAR ADC implementations. In addition, it significantly reduces the manual design effort by automating the whole design process, thereby enabling efficient exploration of the performance space. Although the methodology is demonstrated in 65 nm CMOS, the sizing framework is technology-independent, requiring only technology-specific inputs such as the PDK and design specifications. Furthermore, it is architecture-independent, as critical design decisions are made automatically by the tool rather than being manually specified by the designer.

Figure 5.6: SNDR and FoM of 12 design cases: (a) 12 bit (b) 7 bit. 10 design cases with  $\alpha = 1$  and 2 design cases with  $\alpha = 2$ .

Figure 5.7: An example sizing process for the 12 bit SAR ADC, with the convergence plots of both the global and local optimization.

Table 5.3: Comparison with prior SAR ADC designs.

| Feature               | TCAS-II-18<sup>†</sup> [131] | TCAS-II-18<sup>†</sup> [131] | TVLSI-18<sup>†</sup> [132] | TVLSI-18<sup>†</sup> [132] | TCAD-22<sup>‡</sup> [133] | TCAD-22<sup>‡</sup> [133] | This Work<sup>\*</sup> | This Work<sup>\*</sup> |
|-----------------------|-------|------------------------------------|--------------|---------------------------|-------|---------------------------------|--------|-----------|
| Process [nm]          | 180   | 28                                 | 40           | 40                        | 40    | 40                              | 65     | 65        |
| Power Supply [V]      | 1.8   | 1                                  | 1            | 1                         | 1.2   | 0.7                             | 1      | 1         |
| $F_s [\mathrm{MS/s}]$ | 0.1   | 50                                 | 32           | 1                         | 80    | 1                               | 150    | 20        |
| Resolution [bit]      | 12    | 11                                 | 8            | 12                        | 10    | 12                              | 7      | 12        |
| Power $[\mu W]$       | 31.6  | 399                                | 187          | 16.7                      | 754.8 | 9.6                             | 480    | 308       |
| SNDR [dB]             | 63.3  | 56.8                               | 47.4         | 61.1                      | 56.3  | 68.8                            | 42.0   | 72.2      |
| SFDR [dB]             | 70.2  | 69.2                               | 57.8         | 68.3                      | 70.3  | 85.8                            | 58.3   | 89.3      |
| ENOB                  | 10.2  | 9.1                                | 7.6          | 9.9                       | 9.1   | 11.1                            | 6.68   | 11.7      |
| $FoM_S [dB]^1$        | 155.3 | 164.8                              | 156.7        | 165.8                     | 166.8 | 176.0                           | 153.9  | 177.3     |
| $FoM_W [fJ/cstep]^2$  | 265.5 | 14.1                               | 30.7         | 18.1                      | 10.8  | 4.3                             | 31.6   | 4.6       |

 $<sup>\</sup>begin{array}{l} ^{1} \ \mathrm{FoM_{S}} = \mathrm{SNDR} + 10 \cdot \mathrm{log_{10}} (\mathrm{Fs/2/Power}). \\ ^{2} \ \mathrm{FoM_{W}} = \mathrm{Power} \ / \ (2^{\mathrm{ENOB}} \ \cdot \ \mathrm{Fs}). \end{array}$ 

#### 5.7 Summary

The design of SAR ADCs plays a critical role in modern electronic systems due to their energy efficiency and suitability for moderate-to-high resolution applications. However, their design remains complex, requiring careful coordination across multiple building blocks such as the comparator, DAC, S/H, and control logic. This chapter presents a novel contribution to this area through a system-level optimization framework aimed at automating the SAR ADC design process.

The proposed approach introduces a global-local optimization strategy that balances computational efficiency with design accuracy. The global stage performs fast, low-cost exploration using low-fidelity tests, while the local stage applies parallel, multi-fidelity refinement to unconverged parameters using full sine-wave simulations. This strategy significantly reduces simulation overhead while maintaining high fidelity to system-level performance metrics.

<sup>†</sup> Measurement results.

<sup>‡</sup> Post-layout results.

<sup>\*</sup> Pre-layout results.

In addition, a method was proposed to automatically derive block-level design constraints from top-level specifications such as SNDR and power. This automation minimizes manual effort and enhances the adaptability of the framework to different performance targets and technology nodes.
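To illustrate the kind of calculation that turns a top-level SNDR target into a block-level constraint, the sketch below applies a textbook kT/C noise budget. It is not the allocation procedure used in this chapter: the budget fraction, the single-ended model, the 300 K temperature, and all function names are assumptions made purely for illustration.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def total_noise_budget_v2(sndr_db: float, amplitude_v: float) -> float:
    """Allowed noise-plus-distortion power (V^2) for a sine of the given amplitude."""
    signal_power = amplitude_v ** 2 / 2.0
    return signal_power / 10.0 ** (sndr_db / 10.0)


def min_sampling_cap_f(sndr_db: float, amplitude_v: float,
                       ktc_fraction: float, temp_k: float = 300.0) -> float:
    """Smallest single-ended sampling capacitor (F) whose kT/C noise
    fits within its assumed share of the total noise budget."""
    ktc_budget = ktc_fraction * total_noise_budget_v2(sndr_db, amplitude_v)
    return K_BOLTZMANN * temp_k / ktc_budget


# Hypothetical numbers: 70 dB SNDR target, 0.5 V sine amplitude,
# and 25% of the total budget assigned to sampling noise.
c_min = min_sampling_cap_f(70.0, 0.5, 0.25)
print(f"Minimum sampling capacitance: {c_min * 1e12:.2f} pF")
```

A tighter SNDR target or a smaller kT/C share raises the minimum capacitance, which in turn loads the DAC and the sample-and-hold; this coupling is one reason why deriving such constraints by hand is tedious.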

The framework was demonstrated to be both practical and scalable, achieving competitive designs across a variety of resolution and speed targets within a typical runtime of seven hours. This system-level approach addresses key limitations of traditional block-level methods and contributes a robust, reusable methodology for AI-driven analog circuit sizing.

## Chapter 6

### Conclusions and Future Work

This thesis explores AI-driven methodologies for automating AMS circuit design, aiming to overcome the inefficiency and heuristic dependence of conventional manual flows. Three main contributions are presented. First, an AI-empowered sizing framework is validated in both simulation and silicon, showing consistent improvements over  $g_m/I_D$ -based design while providing design insights into the trade-offs discovered by AI. Second, a co-design methodology for LDOs and LC-tank oscillators demonstrates that subsystem-level optimization yields superior PN, power efficiency, and PVT robustness compared to traditional sequential flows. Third, a hierarchical global-local framework for asynchronous SAR ADCs achieves state-of-the-art performance across diverse resolutions and sampling rates, while reducing design time to hours and requiring minimal manual intervention.

Together, these studies establish that intelligent optimization-based approaches can accelerate design cycles, improve performance and PVT robustness, and broaden accessibility, bridging the gap between heuristic methods and fully automated design. The results highlight the practicality of integrating AI and optimization into future industrial design flows.

#### 6.1 Analog Building Block Sizing

This work investigates the application of AI-empowered optimization for analog building block sizing, with emphasis on broad silicon validation and a comprehensive design-insight-based comparison. An AI-empowered sizing framework is developed using the ESSAB algorithm, a surrogate model-assisted global optimization technique tailored to analog design. Unlike prior schematic-level studies, this work incorporates designer-in-the-loop validation and extends to fabricated chips, thereby ensuring credibility beyond simulation. Four representative circuits (a comparator, a standard op-amp, a low-power op-amp, and an LC oscillator) are explored across technology nodes ranging from 65 nm to  $0.35\,\mu\mathrm{m}$ , with three case studies validated through silicon measurements.

Results demonstrate that AI-empowered methods consistently outperform traditional  $g_m/I_D$ -based manual approaches, achieving substantial reductions in power and noise while keeping key performance metrics such as UGB, CMRR, and PN competitive. For instance, in the comparator design, noise was reduced by approximately 40% and power by 35% without compromising speed, while the op-amp achieved nearly 44% power savings alongside improved frequency response. Beyond numerical superiority, design insight analysis explains why AI methods succeed: they automatically identify subtle parameter trade-offs, such as transistor biasing points and pole–zero placement, which are difficult to capture with manual heuristics. Importantly, the AI-empowered designs remain consistent with circuit principles while reducing reliance on designer intuition.

Through extensive validation and design-insight comparison, this work provides one of the first comprehensive demonstrations that AI-empowered analog sizing is both practical and advantageous in silicon. The findings establish AI as a credible tool for enhancing design efficiency, reducing development time from weeks to hours, and broadening accessibility for less experienced designers by shifting expertise from heuristic sizing toward problem formulation and result validation.

Future work can proceed in several directions. First, while this study has validated AI-empowered sizing on representative analog building blocks, extending the methodology to larger and more complex mixed-signal systems remains an open challenge. Such circuits often involve higher-dimensional design spaces and stricter performance trade-offs, requiring further advances in surrogate modeling and optimization strategies. Second, improving model generalization across technology nodes would enhance reusability and reduce the overhead of retraining for each process. Finally, tighter integration of AI-based optimization with existing industrial EDA flows could enable a seamless co-design framework, thereby accelerating deployment in practical design environments.

#### 6.2 LDO and VCO Co-Design

This work presents an AI-driven EDA methodology for the co-design of LDOs and LC-tank VCOs, addressing the limitations of traditional sequential design approaches. In conventional practice, VCOs are optimized first to minimize PN, followed by LDO design with the VCO load. However, this sequential flow overlooks the mutual noise interactions between the two blocks, particularly the up-conversion of low-frequency LDO noise into VCO output PN. To overcome this limitation, the proposed work introduces a co-design framework powered by the ESSAB algorithm, which is capable of handling complex trade-offs across subsystems.

Using a TSMC 65 nm CMOS process, a 5.6 GHz LC-tank VCO integrated with an LDO is designed and benchmarked under both sequential and co-design methodologies. The co-design approach improves PN by 1.2 dB at a 1 MHz offset and reduces dynamic power consumption by 28.8%, yielding a 2.4 dBc/Hz improvement in FoM over the sequential method. These improvements are attributed to the simultaneous consideration of VCO transistor sizing, LDO PSR, and bypass capacitor effects, which are traditionally optimized in isolation. Moreover, pre-layout and post-layout results confirm that the co-design approach delivers consistent performance across 32 PVT corners, demonstrating both efficiency and robustness.

By highlighting subsystem-level interactions and leveraging an AI-driven sizing method, this work establishes the feasibility and advantage of co-design for analog building blocks. It demonstrates that moving beyond block-level optimization to a holistic methodology yields superior results in terms of noise, power, and robustness, while also reducing design time by more than half compared to sequential sizing.

Future work will extend the co-design methodology to larger and more heterogeneous mixed-signal systems, in particular PLLs, where regulators, oscillators, dividers, and amplifiers interact simultaneously. Enhancing the scalability of the optimization framework will be critical to managing the expanded design space. Another direction is the incorporation of layout parasitics and electromagnetic effects directly into the optimization loop, reducing the gap between pre-layout predictions and post-layout outcomes, especially for high frequency circuits.

#### 6.3 SAR ADC Design

This work presents a system-level global-local optimization framework for automated asynchronous SAR ADC design, addressing the limitations of block-level methods that often yield suboptimal system performance and require significant manual effort. The proposed methodology integrates a computationally efficient global optimizer based on ESSAB with a local optimizer using parallel, multi-fidelity pattern search. The global phase rapidly explores the high-dimensional design space using low-cost evaluations such as single-point tests, while the local phase refines unconverged variables with accurate full sine wave simulations. This hierarchical approach ensures both computational efficiency and precise system-level performance evaluation.
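The two-stage structure described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: uniform random sampling stands in for the ESSAB global optimizer, a serial compass search stands in for the parallel multi-fidelity pattern search, and the two cost functions are hypothetical placeholders for the single-point test and the full sine-wave simulation.

```python
import random


def high_fidelity_cost(x):
    """Hypothetical stand-in for an expensive full sine-wave simulation."""
    return sum((xi - 0.30) ** 2 for xi in x)


def low_fidelity_cost(x):
    """Hypothetical cheap test: same trend, with a deliberate systematic error."""
    return sum((xi - 0.35) ** 2 for xi in x)


def global_stage(dim, n_samples, rng):
    """Placeholder global search on [0, 1]^dim using only the cheap cost.
    (Random sampling is a stand-in; it is not the ESSAB algorithm.)"""
    best_x, best_c = None, float("inf")
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        c = low_fidelity_cost(x)
        if c < best_c:
            best_x, best_c = x, c
    return best_x


def local_stage(x0, step=0.1, min_step=1e-4):
    """Compass (pattern) search that refines x0 on the accurate cost."""
    x, cost = list(x0), high_fidelity_cost(x0)
    while step > min_step:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] = min(1.0, max(0.0, trial[i] + delta))
                c = high_fidelity_cost(trial)
                if c < cost:
                    x, cost, improved = trial, c, True
        if not improved:
            step /= 2.0  # shrink the pattern when no poll point improves
    return x, cost


rng = random.Random(0)
x_global = global_stage(dim=4, n_samples=200, rng=rng)
x_final, cost_final = local_stage(x_global)
print("after global stage:", high_fidelity_cost(x_global))
print("after local stage: ", cost_final)
```

The point of the split is that most evaluations use the cheap cost, while the expensive cost is called only near a promising candidate, where it corrects the systematic error of the cheap model.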

The methodology was validated on 12 design cases implemented in TSMC 65 nm CMOS, covering resolutions from 7 to 12 bits and sampling rates from  $100\,\mathrm{kHz}$  to  $250\,\mathrm{MHz}$ . Results show that the framework achieves state-of-the-art performance, with up to  $72.2\,\mathrm{dB}$  SNDR, 11.7-bit ENOB, and  $177.3\,\mathrm{dB}$  FoM<sub>S</sub>, while reducing design time to under six hours. Compared with prior block-level or semi-automated approaches, the proposed framework delivers superior results by explicitly capturing inter-block interactions, automating specification allocation, and minimizing manual intervention. Furthermore, the optimization framework is both technology- and architecture-independent, requiring only PDK information and system specifications, making it broadly applicable across different design contexts.

By unifying global exploration with local refinement and embedding fast yet accurate simulation techniques, this work demonstrates that system-level sizing can efficiently generate high-performance SAR ADC designs. The results establish the practicality of the proposed framework for accelerating design cycles and enhancing applicability across diverse resolution and frequency targets.

Future work will extend the global-local optimization framework to other data converter architectures, such as pipelined ADCs or hybrid SAR-pipeline structures, where interstage interactions and calibration requirements add further complexity. Incorporating layout parasitics and mismatch calibration directly into the optimization loop will be essential for closing the gap between pre-layout and silicon results. Another promising direction is the integration of machine learning-based performance prediction models to further reduce simulation overhead, enabling exploration of even higher-dimensional design spaces. Finally, deployment of the framework within commercial EDA flows could establish a practical pathway toward fully automated, system-level design of AMS data converters.

## Appendices

## A Tables

Table 1: Pre-layout performance comparison of the AI-empowered design and the reference design of the two-stage Miller-compensated op-amp with an additional area constraint.

| Performance                       | Specification      | Ref. design<br>(Nominal) | AI-empowered design<br>(Nominal) | Ref. design<br>(WCC) | AI-empowered design<br>(WCC) |
|-----------------------------------|--------------------|--------------------------|--------------------------------|----------------------|----------------------------|
| Power $(\mu W)$                   | Minimize           | 856                      | 443                            | 856                  | 393                        |
| CMRR (dB)                         | $\geq 90$          | 91                       | 122                            | 89                   | 120                        |
| PSRR (dB)                         | $\geq 100$         | 109                      | 124                            | 111                  | 120                        |
| ADM (dB)                          | $\geq 100$         | 101                      | 108                            | 101                  | 107                        |
| PM (°)                            | $\geq 60$          | 66                       | 75                             | 63                   | 79                         |
| GBW (MHz)                         | $\geq 2$           | 2.97                     | 5.36                           | 2.80                 | 4.26                       |
| IRN $(\mu V_{rms})$               | $\leq 6$           | 5.71                     | 3.94                           | 7.62                 | 5.66                       |
| Rise/Fall slew rate $(V/\mu s)$   | $\geq 2/2$         | 2.28/2.12                | 2.80/5.16                      | 2.50/2.37            | 2.54/4.83                  |
| Rise/Fall settling time $(\mu s)$ | $\leq 1/1$         | 0.67/0.97                | 0.57/0.38                      | 1.23/1.05            | 0.59/0.41                  |
| THD at 1Vpp, 1kHz (dB)            | $\leq -90$         | -115                     | -116                           | -101                 | -93                        |
| Area (mm<sup>2</sup>)             | $\leq 0.02$        | 0.0220                   | 0.0196                         | 0.0220               | 0.0196                     |

## Bibliography

- [1] W. M. Sansen, *Analog Design Essentials*. Springer Science & Business Media, 2007, vol. 859.
- [2] C. Toumazou, G. S. Moschytz and B. Gilbert, *Trade-offs in analog circuit design: the designer's companion*. Springer Science & Business Media, 2004.
- [3] P. R. Kinget, 'Device mismatch and tradeoffs in the design of analog circuits,' *IEEE Journal of Solid-State Circuits*, vol. 40, no. 6, pp. 1212–1224, 2005.
- [4] G. Gielen and R. Rutenbar, 'Computer-aided design of analog and mixed-signal integrated circuits,' *Proceedings of the IEEE*, vol. 88, no. 12, pp. 1825–1854, 2000. DOI: 10.1109/5.899053.
- [5] P. Michel, U. Lauther and P. Duzy, *The synthesis approach to digital system design*. Springer Science & Business Media, 1992, vol. 170.
- [6] J. Bürger, C. Teuscher and M. Perkowski, 'Digital logic synthesis for memristors,' Reed-Muller 2013, pp. 31–40, 2013.
- [7] A. Sangiovanni-Vincentelli, 'The tides of eda,' *IEEE Design & Test of Computers*, vol. 20, no. 6, pp. 59–75, 2003. DOI: 10.1109/MDT.2003.1246165.
- [8] P. Wambacq, F. Fernandez, G. Gielen, W. Sansen and A. Rodriguez-Vazquez, 'Efficient symbolic computation of approximated small-signal characteristics of analog integrated circuits,' *IEEE Journal of Solid-State Circuits*, vol. 30, no. 3, pp. 327–330, 1995. DOI: 10.1109/4.364450.
- [9] R. Phelps, M. Krasnicki, R. Rutenbar, L. Carley and J. Hellums, 'Anaconda: Simulation-based synthesis of analog circuits via stochastic pattern search,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 6, pp. 703–717, 2000. DOI: 10.1109/43.848091.
- [10] G. Gielen, P. Wambacq and W. M. Sansen, 'Symbolic analysis methods and applications for analog circuits: A tutorial overview,' *Proceedings of the IEEE*, vol. 82, no. 2, pp. 287–304, 2002.
- [11] M. Hershenson, S. Boyd and T. Lee, 'Optimal design of a cmos op-amp via geometric programming,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 20, no. 1, pp. 1–21, 2001. DOI: 10.1109/43.905671.

- [12] D. Nam, Y. D. Seo, L.-J. Park, C. H. Park and B. Kim, 'Parameter optimization of an on-chip voltage reference circuit using evolutionary programming,' *IEEE Transactions on Evolutionary Computation*, vol. 5, no. 4, pp. 414–421, 2001. DOI: 10.1109/4235.942535.
- [13] M. Degrauwe, O. Nys, E. Dijkstra *et al.*, 'Idac: An interactive design tool for analog cmos circuits,' *IEEE Journal of Solid-State Circuits*, vol. 22, no. 6, pp. 1106–1116, 1987. DOI: 10.1109/JSSC.1987.1052861.
- [14] C. Goh and Y. Li, 'GA automated design and synthesis of analog circuits with practical constraints,' in *Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No.01TH8546)*, vol. 1, 2001, pp. 170–177. DOI: 10.1109/CEC.2001.934386.
- [15] G. Alpaydin, S. Balkir and G. Dundar, 'An evolutionary approach to automatic synthesis of high-performance analog integrated circuits,' *IEEE Transactions on Evolutionary Computation*, vol. 7, no. 3, pp. 240–252, 2003. DOI: 10.1109/TEVC.2003.808914.
- [16] B. Liu, Y. Wang, Z. Yu *et al.*, 'Analog circuit optimization system based on hybrid evolutionary algorithms,' *Integration*, vol. 42, no. 2, pp. 137–148, 2009, ISSN: 0167-9260. DOI: 10.1016/j.vlsi.2008.04.003. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167926008000126.
- [17] A. F. Budak, M. Gandara, W. Shi, D. Z. Pan, N. Sun and B. Liu, 'An Efficient Analog Circuit Sizing Method Based on Machine Learning Assisted Global Optimization,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 41, no. 5, pp. 1209–1221, May 2022, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2021.3081405.
- [18] M. Emmerich, K. Giannakoglou and B. Naujoks, 'Single- and multiobjective evolutionary optimization assisted by gaussian random field metamodels,' *IEEE Transactions on Evolutionary Computation*, vol. 10, no. 4, pp. 421–439, 2006. DOI: 10.1109/TEVC.2005.859463.
- [19] A. F. Budak, P. Bhansali, B. Liu, N. Sun, D. Z. Pan and C. V. Kashyap, 'DNN-Opt: An RL Inspired Optimization for Analog Circuit Sizing using Deep Neural Networks,' in 2021 58th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA: IEEE, Dec. 2021, pp. 1219–1224, ISBN: 978-1-66543-274-0. DOI: 10.1109/DAC18074.2021.9586139.
- [20] Y. Choi, S. Park, M. Choi, K. Lee and S. Kang, 'MA-Opt: Reinforcement Learning-Based Analog Circuit Optimization Using Multi-Actors,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 71, no. 5, pp. 2045–2056, May 2024, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2024.3356582.

- [21] Q. Zhang, W. Liu, E. Tsang and B. Virginas, 'Expensive multiobjective optimization by moea/d with gaussian process model,' *IEEE Transactions on Evolutionary Computation*, vol. 14, no. 3, pp. 456–474, 2010. DOI: 10.1109/TEVC.2009.2033671.
- [22] I. Voutchkov and A. Keane, 'Multi-objective optimization using surrogates,' in Computational Intelligence in Optimization: Applications and Implementations, Springer, 2010, pp. 155–175.
- [23] K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, 'A fast and elitist multiobjective genetic algorithm: Nsga-ii,' *IEEE Transactions on Evolutionary Computation*, vol. 6, no. 2, pp. 182–197, 2002. DOI: 10.1109/4235.996017.
- [24] B. Liu, F. V. Fernández, G. Gielen, R. Castro-López and E. Roca, 'A memetic approach to the automatic design of high-performance analog integrated circuits,' ACM Trans. Des. Autom. Electron. Syst., vol. 14, no. 3, Jun. 2009, ISSN: 1084-4309. DOI: 10.1145/1529255.1529264. [Online]. Available: https://doi.org/10.1145/1529255.1529264.
- [25] B. Liu, Q. Zhang and G. G. E. Gielen, 'A gaussian process surrogate model assisted evolutionary algorithm for medium scale expensive optimization problems,' *IEEE Transactions on Evolutionary Computation*, vol. 18, no. 2, pp. 180–192, 2014. DOI: 10.1109/TEVC.2013.2248012.
- [26] B. He, S. Zhang, Y. Wang et al., 'A Batched Bayesian Optimization Approach for Analog Circuit Synthesis via Multi-Fidelity Modeling,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 42, no. 2, pp. 347–359, Feb. 2023, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2022.3175241.
- [27] A. F. Budak, M. Gandara, W. Shi, D. Z. Pan, N. Sun and B. Liu, 'An efficient analog circuit sizing method based on machine learning assisted global optimization,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 41, no. 5, pp. 1209–1221, 2021.
- [28] K. Gulati and H.-S. Lee, 'A high-swing cmos telescopic operational amplifier,' *IEEE Journal of Solid-State Circuits*, vol. 33, no. 12, pp. 2010–2019, 1998.
- [29] R. S. Assaad and J. Silva-Martinez, 'The recycling folded cascode: A general enhancement of the folded cascode amplifier,' *IEEE Journal of Solid-State Circuits*, vol. 44, no. 9, pp. 2535–2542, 2009.
- [30] P. J. Hurst, S. H. Lewis, J. P. Keane, F. Aram and K. C. Dyer, 'Miller compensation using current buffers in fully differential cmos two-stage operational amplifiers,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 51, no. 2, pp. 275–285, 2004.
- [31] S. Koziel, S. Szczepanski and R. Schaumann, 'A general approach to continuous-time gm-c filters,' *International Journal of Circuit Theory and Applications*, vol. 31, no. 4, pp. 361–383, 2003.

- [32] Y. Jiang and E. K. Lee, 'Design of low-voltage bandgap reference using transimpedance amplifier,' *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 47, no. 6, pp. 552–555, 2000.
- [33] C.-H. Huang, Y.-T. Ma and W.-C. Liao, 'Design of a low-voltage low-dropout regulator,' *IEEE transactions on very large scale integration (VLSI) systems*, vol. 22, no. 6, pp. 1308–1313, 2013.
- [34] G. Xing, S. H. Lewis and T. Viswanathan, 'Self-biased unity-gain buffers with low gain error,' *IEEE transactions on circuits and systems II: express briefs*, vol. 56, no. 1, pp. 36–40, 2009.
- [35] C. Kurth and G. Moschytz, 'Nodal analysis of switched-capacitor networks,' *IEEE Transactions on Circuits and Systems*, vol. 26, no. 2, pp. 93–105, 2003.
- [36] A. Mirzaei, H. Darabi, J. C. Leete and Y. Chang, 'Analysis and optimization of direct-conversion receivers with 25% duty-cycle current-driven passive mixers,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 57, no. 9, pp. 2353–2366, 2010.
- [37] B. Razavi, 'The strongarm latch [a circuit for all seasons],' *IEEE Solid-State Circuits Magazine*, vol. 7, no. 2, pp. 12–17, 2015.
- [38] S. Babayan-Mashhadi and R. Lotfi, 'Analysis and design of a low-voltage low-power double-tail comparator,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 22, no. 2, pp. 343–352, 2013.
- [39] D. Ham and A. Hajimiri, 'Concepts and methods in optimization of integrated lc vcos,' *IEEE journal of solid-state circuits*, vol. 36, no. 6, pp. 896–909, 2002.
- [40] D. Banerjee, PLL performance, simulation and design. Dog Ear Publishing, 2006.
- [41] C.-C. Liu, S.-J. Chang, G.-Y. Huang, Y.-Z. Lin and C.-M. Huang, 'A 1v 11fj/conversion-step 10bit 10ms/s asynchronous sar adc in 0.18  $\mu$ m cmos,' in 2010 Symposium on VLSI Circuits, IEEE, 2010, pp. 241–242.
- [42] L. Chen, A. Sanyal, J. Ma, X. Tang and N. Sun, 'Comparator common-mode variation effects analysis and its application in SAR ADCs,' in 2016 IEEE International Symposium on Circuits and Systems (ISCAS), IEEE, 2016, pp. 2014–2017.
- [43] A. Hajimiri and T. Lee, 'Design issues in CMOS differential LC oscillators,' *IEEE Journal of Solid-State Circuits*, vol. 34, no. 5, pp. 717–724, May 1999, ISSN: 00189200. DOI: 10.1109/4.760384.
- [44] M. Garampazzi, P. M. Mendes, N. Codega, D. Manstretta and R. Castello, 'Analysis and design of a 195.6 dBc/Hz peak FoM P-N class-B oscillator with transformer-based tail filtering,' *IEEE Journal of Solid-State Circuits*, vol. 50, no. 7, pp. 1657–1668, 2015. DOI: 10.1109/JSSC.2015.2413851.

- [45] W. Tung and S.-C. Huang, 'An Energy-Efficient 11-bit 10-MS/s SAR ADC with Monotonic Switching Split Capacitor Array,' in 2018 IEEE International Symposium on Circuits and Systems (ISCAS), Florence: IEEE, May 2018, pp. 1–5, ISBN: 978-1-5386-4881-0. DOI: 10.1109/ISCAS.2018.8351306.
- [46] H. Koh, C. Sequin and P. Gray, 'Opasyn: A compiler for cmos operational amplifiers,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 9, no. 2, pp. 113–125, 1990. DOI: 10.1109/43.46777.
- [47] A. S. Lewis and M. L. Overton, 'Nonsmooth optimization via quasi-newton methods,' *Mathematical Programming*, vol. 141, no. 1, pp. 135–163, 2013.
- [48] P. T. Boggs and J. W. Tolle, 'Sequential quadratic programming,' *Acta numerica*, vol. 4, pp. 1–51, 1995.
- [49] F. Gao and L. Han, 'Implementing the nelder-mead simplex algorithm with adaptive parameters,' *Computational Optimization and Applications*, vol. 51, no. 1, pp. 259–277, 2012.
- [50] M. A. Luersen and R. Le Riche, 'Globalized nelder-mead method for engineering optimization,' *Computers & structures*, vol. 82, no. 23-26, pp. 2251–2260, 2004.
- [51] I. Moser and R. Chiong, 'A Hooke-Jeeves Based Memetic Algorithm for Solving Dynamic Optimisation Problems,' in Hybrid Artificial Intelligence Systems, E. Corchado, X. Wu, E. Oja, Á. Herrero and B. Baruque, Eds., vol. 5572, Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 301–309, ISBN: 978-3-642-02318-7 978-3-642-02319-4. DOI: 10.1007/978-3-642-02319-4\_36.
- [52] K. Joki, A. M. Bagirov, N. Karmitsa, M. M. Makela and S. Taheri, 'Double bundle method for finding clarke stationary points in nonsmooth dc programming,' SIAM Journal on Optimization, vol. 28, no. 2, pp. 1892–1919, 2018.
- [53] R. Storn and K. Price, 'Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces,' *Journal of global optimization*, vol. 11, no. 4, pp. 341–359, 1997.
- [54] M. Fakhfakh, Y. Cooren, A. Sallem, M. Loulou and P. Siarry, 'Analog circuit design optimization through the particle swarm optimization technique,' *Analog integrated circuits and signal processing*, vol. 63, no. 1, pp. 71–82, 2010.
- [55] R. A. Vural and T. Yildirim, 'Analog circuit sizing via swarm intelligence,' AEU-International journal of electronics and communications, vol. 66, no. 9, pp. 732–740, 2012.
- [56] J. Zhang, H. S.-H. Chung, A. W.-L. Lo and T. Huang, 'Extended ant colony optimization algorithm for power electronic circuit design,' *IEEE Transactions on power Electronics*, vol. 24, no. 1, pp. 147–162, 2008.
- [57] W.-g. Wang, Y.-b. Ling, J. Zhang and Y. Wang, 'Ant colony optimization algorithm for design of analog filters,' in 2012 IEEE congress on evolutionary computation, IEEE, 2012, pp. 1–6.

- [58] J. Zhao, C. Yan, Z. Bi, F. Yang, X. Zeng and D. Zhou, 'A Novel and Efficient Bayesian Optimization Approach for Analog Designs with Multi-Testbench,' in 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC), Taipei, Taiwan: IEEE, Jan. 2022, pp. 86–91, ISBN: 978-1-66542-135-5. DOI: 10.1109/ASP-DAC52403.2022.9712590.
- [59] S. Yin, R. Wang, J. Zhang, X. Liu and Y. Wang, 'Fast Surrogate-Assisted Constrained Multiobjective Optimization for Analog Circuit Sizing via Self-Adaptive Incremental Learning,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 42, no. 7, pp. 2080–2093, Jul. 2023, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2022.3221694.
- [60] T. Gu, W. Li, A. Zhao et al., 'BBGP-sDFO: Batch Bayesian and Gaussian Process Enhanced Subspace Derivative Free Optimization for High-Dimensional Analog Circuit Synthesis,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 43, no. 2, pp. 417–430, Feb. 2024, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2023.3314519.
- [61] B. He, S. Zhang, F. Yang, C. Yan, D. Zhou and X. Zeng, 'An efficient bayesian optimization approach for analog circuit synthesis via sparse gaussian process modeling,' in 2020 Design, automation & test in Europe conference & exhibition (DATE), IEEE, 2020, pp. 67–72.
- [62] W. Lyu, P. Xue, F. Yang et al., 'An efficient bayesian optimization approach for automated optimization of analog circuits,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 65, no. 6, pp. 1954–1967, 2018. DOI: 10.1109/TCSI.2017.2768826.
- [63] W. Lyu, F. Yang, C. Yan, D. Zhou and X. Zeng, 'Batch bayesian optimization via multi-objective acquisition ensemble for automated analog circuit design,' in *International conference on machine learning*, PMLR, 2018, pp. 3306–3314.
- [64] S. Daulton, M. Balandat and E. Bakshy, 'Differentiable expected hypervolume improvement for parallel multi-objective bayesian optimization,' *Advances in neural information processing systems*, vol. 33, pp. 9851–9864, 2020.
- [65] S. Ament, S. Daulton, D. Eriksson, M. Balandat and E. Bakshy, 'Unexpected improvements to expected improvement for bayesian optimization,' *Advances in Neural Information Processing Systems*, vol. 36, pp. 20577–20612, 2023.
- [66] D. Zhan and H. Xing, 'Expected improvement for expensive optimization: A review,' *Journal of Global Optimization*, vol. 78, no. 3, pp. 507–544, 2020.
- [67] S.-C. Wang, 'Artificial neural network,' in *Interdisciplinary computing in java programming*, Springer, 2003, pp. 81–100.

- [68] Y. Gal and Z. Ghahramani, 'Dropout as a bayesian approximation: Representing model uncertainty in deep learning,' in *Proceedings of The 33rd International Conference on Machine Learning*, M. F. Balcan and K. Q. Weinberger, Eds., ser. Proceedings of Machine Learning Research, vol. 48, New York, New York, USA: PMLR, 20–22 Jun 2016, pp. 1050–1059. [Online]. Available: https://proceedings.mlr.press/v48/gal16.html.
- [69] J. Lambert, O. Sener and S. Savarese, 'Deep learning under privileged information using heteroscedastic dropout,' in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 8886–8895. DOI: 10.1109/CVPR.2018.00926.
- [70] P. Goel and L. Chen, 'On the robustness of monte carlo dropout trained with noisy labels,' in *Proceedings of the IEEE/CVF conference on computer vision and pattern recognition*, 2021, pp. 2219–2228.
- [71] A. K. Gupta and S. Nadarajah, *Handbook of beta distribution and its applications*. CRC press, 2004.
- [72] Q. Xie, J. Xu and Y. Taur, 'Review and critique of analytic models of MOSFET short-channel effects in subthreshold,' *IEEE Transactions on Electron Devices*, vol. 59, no. 6, pp. 1569–1579, 2012. DOI: 10.1109/TED.2012.2191556.
- [73] F. Silveira, D. Flandre and P. Jespers, 'A gm/ID based methodology for the design of CMOS analog circuits and its application to the synthesis of a silicon-on-insulator micropower OTA,' *IEEE Journal of Solid-State Circuits*, vol. 31, no. 9, pp. 1314–1319, 1996. DOI: 10.1109/4.535416.
- [74] D. Flandre, A. Viviani, J.-P. Eggermont, B. Gentinne and P. Jespers, 'Improved synthesis of gain-boosted regulated-cascode CMOS stages using symbolic analysis and gm/ID methodology,' *IEEE Journal of Solid-State Circuits*, vol. 32, no. 7, pp. 1006–1012, 1997. DOI: 10.1109/4.597291.
- [75] T. Konishi, K. Inazu, J. Lee, M. Natsui, S. Masui and B. Murmann, 'Design optimization of high-speed and low-power operational transconductance amplifier using gm/ID lookup table methodology,' *IEICE Transactions*, vol. 94-C, Mar. 2011. DOI: 10.1587/transele.E94.C.334.
- [76] P. G. Jespers and B. Murmann, Systematic Design of Analog CMOS Circuits. Cambridge University Press, 2017.
- [77] M. N. Sabry, H. Omran and M. Dessouky, 'Systematic design and optimization of operational transconductance amplifier using gm/ID design methodology,' *Microelectronics Journal*, vol. 75, pp. 87–96, 2018, ISSN: 0026-2692. DOI: 10.1016/j.mejo.2018.02.002.
- [78] H. Aminzadeh, 'Systematic circuit design and analysis using generalised gm/ID functions of MOS devices,' *IET Circuits, Devices & Systems*, vol. 14, no. 4, pp. 432–443, 2020.

- [79] J. Ou and P. M. Ferreira, 'Implications of small geometry effects on g<sub>m</sub>/I<sub>D</sub> based design methodology for analog circuits,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 66, no. 1, pp. 81–85, 2019. DOI: 10.1109/TCSII.2018.2846484.
- [80] G. Pollissard-Quatremère, G. Gosset and D. Flandre, 'A modified g<sub>m</sub>/I<sub>D</sub> design methodology for deeply scaled CMOS technologies,' Analog Integrated Circuits and Signal Processing, vol. 78, no. 3, pp. 771–784, Mar. 2014, ISSN: 0925-1030, 1573-1979. DOI: 10.1007/s10470-013-0166-z.
- [81] A. A. Youssef, B. Murmann and H. Omran, 'Analog IC design using precomputed lookup tables: Challenges and solutions,' *IEEE access: practical innovations, open solutions*, vol. 8, pp. 134 640–134 652, 2020. DOI: 10.1109/ACCESS.2020.3010875.
- [82] R. Martins, N. Lourenço, N. Horta *et al.*, 'Design of a 4.2-to-5.1 GHz ultralow-power complementary class-B/C hybrid-mode VCO in 65-nm CMOS fully supported by EDA tools,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 67, no. 11, pp. 3965–3977, 2020. DOI: 10.1109/TCSI.2020.3009857.
- [83] R. Martins, N. Lourenço, N. Horta, J. Yin, P.-I. Mak and R. P. Martins, 'Many-objective sizing optimization of a class-C/D VCO for ultralow-power IoT and ultralow-phase-noise cellular applications,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 27, no. 1, pp. 69–82, 2019. DOI: 10.1109/TVLSI.2018.2872410.
- [84] L. Compassi-Severo, T. C. De-Oliveira, P. C. C. de Aguirre, W. V. Noije and A. G. Girardi, 'Variable Conversion Approach for Design Optimization of Low-Voltage Low-Pass Filters,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, pp. 1–14, 2023, ISSN: 1063-8210, 1557-9999. DOI: 10.1109/TVLSI.2023.3335877.
- [85] B. Liu, F. V. Fernández, G. Gielen, R. Castro-López and E. Roca, 'A memetic approach to the automatic design of high-performance analog integrated circuits,' *ACM Transactions on Design Automation of Electronic Systems (TODAES)*, vol. 14, no. 3, pp. 1–24, 2009.
- [86] S. Zhang, W. Lyu, F. Yang, C. Yan, D. Zhou and X. Zeng, 'Bayesian Optimization Approach for Analog Circuit Synthesis Using Neural Network,' in *2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)*, Florence, Italy: IEEE, Mar. 2019, pp. 1463–1468, ISBN: 978-3-9819263-2-3. DOI: 10.23919/DATE.2019.8714788.
- [87] T. Liao and L. Zhang, 'Analog Integrated Circuit Sizing and Layout Dependent Effects: A Review,' *Microelectronics and Solid State Electronics*, 2014.

- [88] B. Liu, D. Zhao, P. Reynaert and G. G. E. Gielen, 'GASPAD: A General and Efficient mm-Wave Integrated Circuit Synthesis Method Based on Surrogate Model Assisted Evolutionary Algorithm,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 33, no. 2, pp. 169–182, Feb. 2014, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2013.2284109.
- [89] W. Lyu, P. Xue, F. Yang *et al.*, 'An Efficient Bayesian Optimization Approach for Automated Optimization of Analog Circuits,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 65, no. 6, pp. 1954–1967, Jun. 2018, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2017.2768826.
- [90] K. Touloupas, N. Chouridis and P. P. Sotiriadis, 'Local Bayesian Optimization For Analog Circuit Sizing,' in *2021 58th ACM/IEEE Design Automation Conference (DAC)*, San Francisco, CA, USA: IEEE, Dec. 2021, pp. 1237–1242, ISBN: 978-1-66543-274-0. DOI: 10.1109/DAC18074.2021.9586172.
- [91] W. Cao, J. Gao, T. Ma, R. Ma, M. Benosman and X. Zhang, 'RoSE-Opt: Robust and Efficient Analog Circuit Parameter Optimization With Knowledge-Infused Reinforcement Learning,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 44, no. 2, pp. 627–640, Feb. 2025, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2024.3435692.
- [92] M. Ahmadzadeh, J. Lappas, N. Wehn and G. Gielen, 'AnaCraft: Duel-Play Probabilistic-Model-based Reinforcement Learning for Sample-Efficient PVT-Robust Analog Circuit Sizing Optimization,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, pp. 1–1, 2025, ISSN: 0278-0070, 1937-4151. DOI: 10.1109/TCAD.2025.3582175.
- [93] T. Mukherjee, L. Carley and R. Rutenbar, 'Efficient handling of operating range and manufacturing line variations in analog cell synthesis,' *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 19, no. 8, pp. 825–839, Aug. 2000, ISSN: 0278-0070. DOI: 10.1109/43.856971.
- [94] X. Tang, L. Shen, B. Kasap *et al.*, 'An Energy-Efficient Comparator With Dynamic Floating Inverter Amplifier,' *IEEE Journal of Solid-State Circuits*, vol. 55, no. 4, pp. 1011–1022, Apr. 2020, ISSN: 0018-9200, 1558-173X. DOI: 10.1109/JSSC.2019.2960485.
- [95] H. S. Bindra, C. E. Lokin, D. Schinkel, A.-J. Annema and B. Nauta, 'A 1.2-V Dynamic Bias Latch-Type Comparator in 65-nm CMOS With 0.4-mV Input Noise,' *IEEE Journal of Solid-State Circuits*, vol. 53, no. 7, pp. 1902–1912, Jul. 2018, ISSN: 0018-9200, 1558-173X. DOI: 10.1109/JSSC.2018.2820147.
- [96] L. Chen, A. Sanyal, J. Ma, X. Tang and N. Sun, 'Comparator common-mode variation effects analysis and its application in SAR ADCs,' in *2016 IEEE International Symposium on Circuits and Systems (ISCAS)*, Montréal, QC, Canada: IEEE, May 2016, pp. 2014–2017, ISBN: 978-1-4799-5341-7. DOI: 10.1109/ISCAS.2016.7538972.

- [97] B. Razavi, 'The StrongARM Latch [A Circuit for All Seasons],' *IEEE Solid-State Circuits Magazine*, vol. 7, no. 2, pp. 12–17, 2015, ISSN: 1943-0582. DOI: 10.1109/MSSC.2015.2418155.
- [98] B. Kamath, R. Meyer and P. Gray, 'Relationship between frequency response and settling time of operational amplifiers,' *IEEE Journal of Solid-State Circuits*, vol. 9, no. 6, pp. 347–352, Dec. 1974, ISSN: 0018-9200. DOI: 10.1109/JSSC.1974.1050527.
- [99] A. Worapishet, A. Demosthenous and X. Liu, 'A CMOS Instrumentation Amplifier With 90-dB CMRR at 2-MHz Using Capacitive Neutralization: Analysis, Design Considerations, and Implementation,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 58, no. 4, pp. 699–710, Apr. 2011, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2010.2078850.
- [100] Z. Zhou, L. Zhu, W. Wang, J. Li, Q. Meng and Z. Wang, 'A Capacitively Coupled Chopper Instrumentation Amplifier With Compensated Auto-Zeroed DC Servo-Loop for Neural Signal Recording,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 70, no. 12, pp. 4314–4318, Dec. 2023, ISSN: 1549-7747, 1558-3791. DOI: 10.1109/TCSII.2023.3291375.
- [101] K. Watcharapongvinit, I. Yongpanich and W. Wattanapanitch, 'Design of a Low-Power Ground-Free Analog Front End for ECG Acquisition,' *IEEE Transactions on Biomedical Circuits and Systems*, vol. 17, no. 2, pp. 299–311, Apr. 2023, ISSN: 1932-4545, 1940-9990. DOI: 10.1109/TBCAS.2023.3249742.
- [102] Y.-K. Huang and S. Rodriguez, 'Noise Analysis and Design Methodology of Chopper Amplifiers With Analog DC-Servo Loop for Biopotential Acquisition Applications,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 32, no. 1, pp. 55–67, Jan. 2024, ISSN: 1063-8210, 1557-9999. DOI: 10.1109/TVLSI.2023.3315417.
- [103] A. Hajimiri and T. H. Lee, 'Design issues in CMOS differential LC oscillators,' *IEEE Journal of Solid-State Circuits*, vol. 34, no. 5, pp. 717–724, 1999.
- [104] B. Soltanian, H. Ainspan, W. Rhee, D. Friedman and P. Kinget, 'An Ultra-Compact Differentially Tuned 6-GHz CMOS LC-VCO With Dynamic Common-Mode Feedback,' *IEEE Journal of Solid-State Circuits*, vol. 42, no. 8, pp. 1635–1641, Aug. 2007, ISSN: 0018-9200. DOI: 10.1109/JSSC.2007.903068.
- [105] E.-S. A. Kytonaki and Y. Papananos, 'A Low-Voltage Differentially Tuned Current-Adjusted 5.5-GHz Quadrature VCO in 65-nm CMOS Technology,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 58, no. 5, pp. 254–258, May 2011, ISSN: 1549-7747, 1558-3791. DOI: 10.1109/TCSII.2011.2149010.
- [106] T.-P. Wang and S.-Y. Wang, 'Frequency-tuning negative-conductance boosted structure and applications for low-voltage low-power wide-tuning-range VCO,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 23, no. 6, pp. 1137–1144, 2015. DOI: 10.1109/TVLSI.2014.2333692.

- [107] M. Fang and T. Yoshimasu, 'An Ultra-Low-Power Octave-Tuning VCO IC With a Single Analog Voltage-Controlled Novel Varactor,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 69, no. 12, pp. 4751–4760, Dec. 2022, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2022.3193976.
- [108] H. Yoon, Y. Lee, J. J. Kim and J. Choi, 'A Wideband Dual-Mode *LC*-VCO With a Switchable Gate-Biased Active Core,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 61, no. 5, pp. 289–293, May 2014, ISSN: 1549-7747, 1558-3791. DOI: 10.1109/TCSII.2014.2305216.
- [109] A. Bhat and N. Krishnapura, '26.3 A 25-to-38GHz, 195dB FoM<sub>T</sub> LC QVCO in 65nm LP CMOS using a 4-port dual-mode resonator for 5G radios,' in *2019 IEEE International Solid-State Circuits Conference (ISSCC)*, IEEE, 2019, pp. 412–414.
- [110] K. Settaluri, A. Haj-Ali, Q. Huang, K. Hakhamaneshi and B. Nikolic, *AutoCkt: Deep Reinforcement Learning of Analog Circuit Designs*, Jan. 2020. arXiv: 2001.01808 [eess].
- [111] H. Wang, K. Wang, J. Yang *et al.*, 'GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning,' in *2020 57th ACM/IEEE Design Automation Conference (DAC)*, San Francisco, CA, USA: IEEE, Jul. 2020, pp. 1–6, ISBN: 978-1-72811-085-1. DOI: 10.1109/DAC18072.2020.9218757.
- [112] J. Zhang, J. Bao, Z. Huang, X. Zeng and Y. Lu, 'Automated Design of Complex Analog Circuits with Multiagent based Reinforcement Learning,' in *2023 60th ACM/IEEE Design Automation Conference (DAC)*, San Francisco, CA, USA: IEEE, Jul. 2023, pp. 1–6, ISBN: 9798350323481. DOI: 10.1109/DAC56929.2023.10247909.
- [113] D. Leenaerts, J. Van der Tang and C. S. Vaucher, *Circuit Design for RF Transceivers*. Springer, 2001.
- [114] Q. Gu, *RF System Design of Transceivers for Wireless Communications*. Springer, 2005.
- [115] B. A. Floyd, 'A 16–18.8-GHz sub-integer-N frequency synthesizer for 60-GHz transceivers,' *IEEE Journal of Solid-State Circuits*, vol. 43, no. 5, pp. 1076–1086, 2008.
- [116] R. Rajsuman, *System-on-a-Chip: Design and Test*. Artech, 2000.
- [117] D. R. Stauffer, J. T. Mechler, M. A. Sorna *et al.*, *High Speed Serdes Devices and Applications*. Springer Science & Business Media, 2008.
- [118] X. Zheng, C. Zhang, F. Lv *et al.*, 'A 40-Gb/s quarter-rate SerDes transmitter and receiver chipset in 65-nm CMOS,' *IEEE Journal of Solid-State Circuits*, vol. 52, no. 11, pp. 2963–2978, 2017.
- [119] X. Wang and B. Bakkaloglu, 'Systematic Design of Supply Regulated *LC* Tank Voltage-Controlled Oscillators,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 55, no. 7, pp. 1834–1844, Aug. 2008, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2008.918004.

- [120] B. S. Rikan, H. Abbasizadeh, T. T. K. Nga, S. J. Kim and K.-Y. Lee, 'A low leakage retention LDO and leakage-based BGR with 120nA quiescent current,' in *2017 International SoC Design Conference (ISOCC)*, IEEE, 2017, pp. 200–201.
- [121] D.-K. Kim, S.-U. Shin and H.-S. Kim, 'A BGR-recursive low-dropout regulator achieving high PSR in the low- to mid-frequency range,' *IEEE Transactions on Power Electronics*, vol. 35, no. 12, pp. 13441–13454, 2020.
- [122] A. N. Karanicolas, 'A 2.7-V 900-MHz CMOS LNA and mixer,' *IEEE Journal of Solid-State Circuits*, vol. 31, no. 12, pp. 1939–1944, 1996.
- [123] J. Ferreira, I. Bastos, L. Oliveira *et al.*, 'LNA, oscillator and mixer co-design for compact RF-CMOS ISM receivers,' in *2009 MIXDES-16th International Conference Mixed Design of Integrated Circuits Systems*, 2009, pp. 291–295.
- [124] W.-K. Chong, H. Ramiah, G.-H. Tan, N. Vitee and J. Kanesan, 'Design of ultralow voltage integrated CMOS based LNA and mixer for ZigBee application,' *AEU-International Journal of Electronics and Communications*, vol. 68, no. 2, pp. 138–142, 2014.
- [125] D. Bhatt, J. Mukherjee and J.-M. Redouté, 'Low-power switched transconductance mixer and LNA design for Wi-Fi and WiMAX applications in 65 nm CMOS,' *IET Microwaves, Antennas & Propagation*, vol. 12, no. 10, pp. 1736–1744, 2018.
- [126] A. Mazzanti and P. Andreani, 'Class-C harmonic CMOS VCOs, with a general result on phase noise,' *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2716–2729, 2008. DOI: 10.1109/JSSC.2008.2004867.
- [127] Y.-C. Huang, C.-F. Liang, H.-S. Huang and P.-Y. Wang, '15.3 A 2.4GHz ADPLL with digital-regulated supply-noise-insensitive and temperature-self-compensated ring DCO,' in *2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, San Francisco, CA, USA: IEEE, Feb. 2014, pp. 270–271, ISBN: 978-1-4799-0920-9, 978-1-4799-0918-6. DOI: 10.1109/ISSCC.2014.6757430.
- [128] A. Urso, Y. Chen, J. F. Dijkhuis, Y.-H. Liu, M. Babaie and W. A. Serdijn, 'Analysis and Design of Power Supply Circuits for RF Oscillators,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 67, no. 12, pp. 4233–4246, Dec. 2020, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2020.3020001.
- [129] T. Tsang and M. El-Gamal, 'A high figure of merit and area-efficient low-voltage (0.7-1 V) 12 GHz CMOS VCO,' in *IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, 2003*, Philadelphia, PA, USA: IEEE, 2003, pp. 89–92, ISBN: 978-0-7803-7694-6. DOI: 10.1109/RFIC.2003.1213900.
- [130] D. B. Leeson, 'A simple model of feedback oscillator noise spectrum,' *Proceedings of the IEEE*, vol. 54, no. 2, pp. 329–330, 1966.

- [131] M.-J. Seo, Y.-J. Roh, D.-J. Chang, W. Kim, Y.-D. Kim and S.-T. Ryu, 'A Reusable Code-Based SAR ADC Design With CDAC Compiler and Synthesizable Analog Building Blocks,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 12, pp. 1904–1908, Dec. 2018, ISSN: 1549-7747, 1558-3791. DOI: 10.1109/TCSII.2018.2822811.
- [132] M. Ding, P. Harpe, G. Chen *et al.*, 'A Hybrid Design Automation Tool for SAR ADCs in IoT,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 26, no. 12, pp. 2853–2862, Dec. 2018, ISSN: 1063-8210, 1557-9999. DOI: 10.1109/TVLSI.2018.2865404.
- [133] M. Liu, X. Tang, K. Zhu, H. Chen, N. Sun and D. Z. Pan, 'OpenSAR: An open source automated end-to-end SAR ADC compiler,' in *2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)*, 2021, pp. 1–9. DOI: 10.1109/ICCAD51958.2021.9643494.
- [134] C.-P. Huang, J.-M. Lin, Y.-T. Shyu and S.-J. Chang, 'A Systematic Design Methodology of Asynchronous SAR ADCs,' *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 24, no. 5, pp. 1835–1848, May 2016, ISSN: 1063-8210, 1557-9999. DOI: 10.1109/TVLSI.2015.2494063.
- [135] R. Lyu, Y. Meng, A. Zhao *et al.*, 'A Study on Exploring and Exploiting the High-dimensional Design Space for Analog Circuit Design Automation: (Invited Paper),' in *2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC)*, Incheon, Republic of Korea: IEEE, Jan. 2024, pp. 671–678, ISBN: 9798350393545. DOI: 10.1109/ASP-DAC58780.2024.10473920.
- [136] P. J. Harpe, C. Zhou, Y. Bi *et al.*, 'A 26 µW 8 bit 10 MS/s asynchronous SAR ADC for low energy radios,' *IEEE Journal of Solid-State Circuits*, vol. 46, no. 7, pp. 1585–1595, 2011.
- [137] X. Tang, J. Liu, Y. Shen *et al.*, 'Low-power SAR ADC design: Overview and survey of state-of-the-art techniques,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 69, no. 6, pp. 2249–2262, 2022.
- [138] Y. Zhu, C.-H. Chan, U.-F. Chio *et al.*, 'A 10-bit 100-MS/s Reference-Free SAR ADC in 90 nm CMOS,' *IEEE Journal of Solid-State Circuits*, vol. 45, no. 6, pp. 1111–1121, Jun. 2010, ISSN: 0018-9200, 1558-173X. DOI: 10.1109/JSSC.2010.2048498.
- [139] J. Shaikh and H. Rahaman, 'High speed and low power preset-able modified TSPC D flip-flop design and performance comparison with TSPC D flip-flop,' in *2018 International Symposium on Devices, Circuits and Systems (ISDCS)*, Howrah: IEEE, Mar. 2018, pp. 1–4, ISBN: 978-1-5386-5122-3. DOI: 10.1109/ISDCS.2018.8379677.
- [140] Z. Zhu, Y. Xiao and X. Song, 'VCM-based monotonic capacitor switching scheme for SAR ADC,' *Electronics Letters*, vol. 49, no. 5, pp. 327–329, 2013.

- [141] Y. Ni, L. Liu and S. Xu, 'Mixed capacitor switching scheme for SAR ADC with highest switching energy efficiency,' *Electronics Letters*, vol. 51, no. 6, pp. 466–467, 2015.
- [142] M. Bagheri, F. Schembari, N. Pourmousavian, H. Zare-Hoseini, D. Hasko and R. B. Staszewski, 'A Mismatch Calibration Technique for SAR ADCs Based on Deterministic Self-Calibration and Stochastic Quantization,' *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 67, no. 9, pp. 2883–2896, Sep. 2020, ISSN: 1549-8328, 1558-0806. DOI: 10.1109/TCSI.2020.2985816.
- [143] X. Wang, F. Li and Z. Wang, 'A Simple Histogram-Based Capacitor Mismatch Calibration in SAR ADCs,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 67, no. 12, pp. 2838–2842, Dec. 2020, ISSN: 1549-7747, 1558-3791. DOI: 10.1109/TCSII.2020.2988646.
- [144] Z. Du, B. Yao, W. Xu, X. Wang, H. Hu and L. Qiu, 'Capacitor Mismatch Calibration of a 16-Bit SAR ADC Using Optimized Segmentation and Shuffling Scheme,' *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 70, no. 8, pp. 2789–2793, Aug. 2023, ISSN: 1549-7747, 1558-3791. DOI: 10.1109/TCSII.2023.3252675.