Fast, high-precision wafer metrology is critical for the semiconductor industry. In this work, we explore the use of simple, cost-effective optical sensors in combination with data-driven algorithms. We propose and compare three data-driven approaches of varying complexity that directly infer sub-nanometer metrology parameters from low-numerical-aperture coherent optical microscope images, with a focus on precision, noise robustness, and data efficiency. In particular, we apply Vision Transformers (ViTs), Convolutional Neural Networks, and Multilayer Perceptrons to simulated datasets with varying aberrations. We report sub-nanometer measurement accuracy and precision for all models, even in the presence of strong optical aberrations and noise. Furthermore, we find that ViTs consistently achieve low errors and outperform the other models in limited-data regimes.

AIP Publishing
ASML, ARCNL, VU, UvA, RUG, NWO
National Growth Fund program NXTGEN HIGHTECH
doi.org/10.1063/5.0301749
APL Mach. Learn.
Nanoscale Imaging and Metrology

de Wolf, L., Lipp, M., Cochez, M., den Boef, A., & Amitonova, L. (2026). Vision Transformer network for optical overlay metrology on semiconductor wafers. APL Mach. Learn., 4(1), 016101: 1–10. doi:10.1063/5.0301749