OpenLithoHub v0.1: Open computational lithography for the AI era
Published 2026-05-19 · Tags: lithography, EUV, OPC, ILT, ML-for-EDA, open-source
TL;DR — OpenLithoHub is an open, Apache-2.0-licensed benchmarking and workflow framework for computational lithography. It unifies the three things that have kept ML-for-OPC research irreproducible: the datasets (LithoBench, LithoSim, GAN-OPC, ICCAD'16), the metrics (EPE, PV-Band, shot count, EUV stochastic, MRC), and the forward physics (a differentiable Hopkins/SOCS model). Bring your own model, plug it into one interface, and get the same numbers everyone else gets.
Repo: https://github.com/OpenLithoHub/OpenLithoHub · Playground: https://huggingface.co/spaces/OpenLithoHub/playground · Docs: https://docs.openlithohub.com · Colab BYOM: https://colab.research.google.com/github/OpenLithoHub/OpenLithoHub/blob/main/notebooks/colab_byom.ipynb
Why this exists
ML-for-OPC and ML-for-ILT have been hot for five years, but the field has a reproducibility problem that nobody quite owns:
- Datasets are scattered. LithoBench is on one GitHub repo, GAN-OPC on another, ICCAD'16 hotspot is on a benchmark page no one updates, LithoSim is a separate NeurIPS dataset. Every paper rolls its own loader, and a third of them have silent off-by-one bugs in coordinate conventions.
- Metrics drift. "EPE" in one paper means edge-placement error measured at fragment endpoints; in the next it's measured along arc length; in the third it's RMS over a contour band. PV-Band is even worse.
- There's no shared forward model. Half the papers compare against a Gaussian-PSF "imaging" stand-in that is not a lithographic model. The other half black-box a commercial simulator and cannot release the reference numbers.
- MRC is treated as cosmetic. Plenty of beautiful AI-OPC results produce masks that no foundry would actually let through DRC. Without a hard MRC gate, the benchmark is fiction.
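To make the metric-drift point concrete, here is a minimal sketch using made-up edge displacements. The three functions are simplified stand-ins for the three conventions described above (endpoint sampling, arc-length averaging, RMS), not OpenLithoHub's `compute_epe`; the sample data, the fragmentation, and all function names are assumptions for illustration only.

```python
import math

# Hypothetical samples along one printed edge:
# (arc position in nm, signed displacement from target in nm).
samples = [(0.0, 1.0), (10.0, 3.0), (20.0, -2.0), (40.0, 4.0), (60.0, 0.0)]

# Assumed fragmentation: indices of samples that sit on fragment endpoints.
endpoint_idx = [0, 2, 4]


def epe_at_endpoints(samples, endpoint_idx):
    """Mean absolute displacement, looking only at fragment endpoints."""
    return sum(abs(samples[i][1]) for i in endpoint_idx) / len(endpoint_idx)


def epe_arc_length(samples):
    """Mean absolute displacement weighted by arc length (trapezoidal rule)."""
    total = 0.0
    for (s0, d0), (s1, d1) in zip(samples, samples[1:]):
        total += 0.5 * (abs(d0) + abs(d1)) * (s1 - s0)
    return total / (samples[-1][0] - samples[0][0])


def epe_rms(samples):
    """Root-mean-square displacement over all samples, unweighted."""
    return math.sqrt(sum(d * d for _, d in samples) / len(samples))


# Same edge, three different numbers that could each be reported as "EPE".
results = {
    "endpoints": epe_at_endpoints(samples, endpoint_idx),
    "arc_length": epe_arc_length(samples),
    "rms": epe_rms(samples),
}
```

On this toy edge the endpoint convention reports less than half of what the other two report, which is the kind of gap that makes cross-paper comparison unreliable.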
OpenLithoHub takes a strong position on each of these:
- One adapter interface, every dataset behind it. Today: LithoBench, LithoSim, GAN-OPC, ICCAD'16. Tomorrow: synthetic generators against FreePDK45/ASAP7.
- Metrics with one canonical implementation per name. If you call it EPE, it is computed by `compute_epe` in `openlithohub.benchmark`. Reporting another version is fine, but don't call it EPE.
- A real, differentiable Hopkins/SOCS forward model. SVD-truncated, per-(params, grid) cache, supports circular / annular / dipole / quasar illumination plus defocus. Auto-differentiable end-to-end, so you can drop it into your AI-OPC training loop. (See `openlithohub._utils.hopkins`.)
- MRC is a hard gate. A mask that violates MRC fails the benchmark, no matter how good its EPE looks. We added curvilinear MRC checks (curvature radius + min area) for ILT outputs that aren't Manhattan.
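For readers new to SOCS, the idea behind an SVD-truncated Hopkins model can be sketched in one dimension with NumPy. This is a toy illustration under stated assumptions (1D mask, ideal top-hat pupil, uniform line source, no defocus); it is not the code in `openlithohub._utils.hopkins`, and the function names and default parameters are hypothetical. It factors the source-weighted shifted pupils with an SVD, keeps the leading kernels, and sums the squared coherent fields.

```python
import numpy as np


def socs_kernels_1d(n, pixel_nm, wavelength_nm=193.0, na=1.35,
                    sigma=0.7, n_src=21, n_kernels=6):
    """Return (weights, kernels) of a 1D SOCS decomposition (toy model)."""
    f = np.fft.fftfreq(n, d=pixel_nm)            # spatial frequencies, cycles/nm
    cutoff = na / wavelength_nm                  # coherent pupil cutoff
    src = np.linspace(-sigma * cutoff, sigma * cutoff, n_src)  # source points
    j = np.full(n_src, 1.0 / n_src)              # uniform source, sums to 1
    # A[s, f] = sqrt(J(s)) * P(f + s), with P an ideal top-hat pupil.
    a = np.sqrt(j)[:, None] * (np.abs(f[None, :] + src[:, None]) <= cutoff)
    # TCC = A^T conj(A), so the right singular vectors of A are the SOCS
    # kernels (in frequency space) and the squared singular values the weights.
    _, s, vh = np.linalg.svd(a, full_matrices=False)
    return s[:n_kernels] ** 2, vh[:n_kernels]


def aerial_image(mask, weights, kernels):
    """I(x) = sum_k w_k * |IFFT(FFT(mask) * kernel_k)|^2."""
    m = np.fft.fft(mask)
    fields = np.fft.ifft(kernels * m[None, :], axis=1)
    return np.einsum("k,kx->x", weights, np.abs(fields) ** 2)


# Usage: two lines of different width on a 128-pixel, 8 nm grid.
mask = np.zeros(128)
mask[40:60] = 1.0
mask[80:88] = 1.0
weights, kernels = socs_kernels_1d(n=128, pixel_nm=8.0)
image = aerial_image(mask, weights, kernels)
```

Keeping every kernel reproduces the source-point-by-source-point (Abbe) sum exactly; truncating trades a small accuracy loss for fewer FFTs. Because the image is built only from FFTs, multiplications, and sums, the same computation written in an autodiff framework would be differentiable with respect to the mask.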
What's in v0.1
- Datasets: LithoBench, LithoSim, GAN-OPC paired masks (~4875), ICCAD'16 Problem C hotspot, plus a hermetic dummy-layout generator that runs in CI without the workflow extras.
- Models: `dummy-identity`, `rule-based-opc` (directional hammerheads, inner-corner serifs, iso/dense bias, MRC self-check), `levelset-ilt`, `neural-ilt`. All conform to the same `LithographyModel` interface; bring-your-own in <50 lines.
- Forward models: Hopkins/SOCS (default) and Gaussian PSF (legacy).
- Metrics: EPE, PV-Band, shot count, EUV stochastic robustness, hotspot detection (recall/precision/F1).
- MRC/DRC: Manhattan and curvilinear; hard-fails the run.
- Workflow: tiling/stitching with deduplication, contour extraction, B-spline fitting, OASIS round-trip, EDA bridge templates for Calibre nmDRC and Synopsys IC Validator.
- Visualization: `paper_style` context manager with `IEEE_STYLE` and `SPIE_STYLE` presets, vector PDF, Type-42 fonts, colorblind-safe.
- Playground: HuggingFace Space with 3 preset designs, BYOM upload, and an EPE-error heatmap with red MRC violation overlay.
- Colab BYOM notebook: install → register your model → eval against LithoBench → submit to leaderboard, end-to-end.
- Auto-Leaderboard CI: open a PR with your numbers, the workflow re-runs them on the standard test set, and merges if they verify.
- Docs: "Lithography for AI Engineers" translates the field's vocabulary (Mask → Input Image, OPC → Image-to-Image inverse problem, PV-Band → Robustness Margin, MRC → Output Constraint Satisfaction).
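As a rough picture of how a bring-your-own model and a hard MRC gate fit together, the sketch below defines a hypothetical `LithographyModel` protocol with a single `predict` method, an identity baseline, and an evaluator that refuses to score any mask that fails a crude minimum-width check. The method name, the raster representation, the width check, and the pixel-mismatch score are all assumptions made for illustration; the real interface and MRC rules in OpenLithoHub may differ.

```python
from typing import Dict, List, Optional, Protocol

Grid = List[List[int]]  # binary raster: 1 = mask feature, 0 = background


class LithographyModel(Protocol):
    """Hypothetical stand-in for the shared model interface."""

    def predict(self, target: Grid) -> Grid: ...


class IdentityModel:
    """Baseline that returns the target pattern unchanged as the mask."""

    def predict(self, target: Grid) -> Grid:
        return [row[:] for row in target]


def min_run_length(grid: Grid) -> Optional[int]:
    """Shortest horizontal or vertical run of 1s; None if there are no 1s."""
    runs = []
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    for line in lines:
        run = 0
        for px in line + [0]:  # the trailing 0 flushes a run ending at the border
            if px:
                run += 1
            elif run:
                runs.append(run)
                run = 0
    return min(runs) if runs else None


def evaluate(model: LithographyModel, target: Grid, min_width_px: int) -> Dict:
    """Score a model, but only if its mask passes the (toy) MRC gate."""
    mask = model.predict(target)
    width = min_run_length(mask)
    if width is not None and width < min_width_px:
        # Hard gate: no score is reported for a mask that violates MRC.
        return {"status": "fail", "score": None,
                "reason": f"MRC: min width {width}px < {min_width_px}px"}
    # Placeholder score: count of pixels where mask and target disagree.
    mismatch = sum(m != t for mrow, trow in zip(mask, target)
                   for m, t in zip(mrow, trow))
    return {"status": "pass", "score": mismatch, "reason": ""}


# Usage: a 4 x 3 pixel rectangle passes a 3 px rule and fails a 4 px rule.
target = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
passed = evaluate(IdentityModel(), target, min_width_px=3)
failed = evaluate(IdentityModel(), target, min_width_px=4)
```

The design point is the early return: a failing mask never gets a score, so a strong fidelity number cannot offset a manufacturability violation.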
Who this is for
- ML researchers who want to publish ML-for-OPC results that other groups can actually reproduce.
- Lithography engineers who want to evaluate ML approaches against the same metrics they care about in production (MRC compliance, EUV stochastic margin).
- Foundry / IDM teams who want a vendor-neutral common ground when comparing internal tools.
- Students entering the field who don't want to spend their first six months wiring up data loaders.
What's next
- Synthetic layout generator — diffusion + rule-based against FreePDK45/ASAP7 to break the dataset-size ceiling.
- Simulator hooks — pluggable adapters for Calibre nmOPC and Tachyon, so labs with commercial-tool access can use them as ground-truth oracles without forking the framework.
- EUV 3D-mask + Monte Carlo stochastic eval — moving past thin-mask Hopkins toward what really matters at 3nm and below.
- Pre-trained base models — MAE-style self-supervised pretraining on polygon rasters; user fine-tune in a few thousand samples.
- Layout tokenization — making polygons first-class for transformer research.
The full roadmap lives at https://github.com/OpenLithoHub/OpenLithoHub/blob/main/CHANGELOG.md.
How to get involved
- Try it: `pip install openlithohub` or open the Colab notebook.
- Submit a model to the leaderboard; see docs/leaderboard-submission.md.
- Discord is launching 2026-Q3. Until then, GitHub Issues and Discussions are the main forum. Watch the repo or open an issue with the `community` label to be notified.
- Cite via `CITATION.cff` if you use it in a paper.
Paste-ready posts
The sections below are short adaptations of the announcement above for specific platforms. Copy the relevant block, fill in your own handle / links / images, and post.
X / Twitter (thread, 6 posts)
1/ Today we're releasing OpenLithoHub — an open, Apache-2.0 benchmarking
& workflow framework for computational lithography (OPC, ILT, EUV).
The pitch: bring your own ML model, plug it into one interface, and get
the same numbers everyone else gets.
🔗 github.com/OpenLithoHub/OpenLithoHub
2/ Why? ML-for-OPC has a reproducibility problem that nobody owns.
Datasets are scattered. "EPE" means three different things in three
different papers. Half the field compares against a Gaussian PSF that
isn't a real lithographic model. MRC is treated as cosmetic.
3/ What we ship in v0.1:
- Unified loaders: LithoBench, LithoSim, GAN-OPC (~4875 paired masks),
ICCAD'16 hotspot
- A real differentiable Hopkins/SOCS forward model — dipole, quasar,
defocus, all auto-grad
- MRC as a HARD gate. Bad mask → benchmark fail.
4/ Plus:
- Curvilinear MRC checks (curvature radius + min area)
- OASIS round-trip + Calibre/IC Validator runset templates
- Paper-ready vis (IEEE / SPIE column-width, Type-42 PDFs)
- HF Space playground with EPE error heatmaps
5/ For ML researchers: bring your own model in <50 lines. Colab BYOM
notebook walks you from install → eval → leaderboard submit:
colab.research.google.com/github/OpenLithoHub/OpenLithoHub/blob/main/notebooks/colab_byom.ipynb
6/ Roadmap: synthetic layout gen, Calibre/Tachyon simulator hooks, EUV
3D-mask + Monte Carlo stochastic eval, pretrained base models.
Star the repo, try the playground, open an issue with feedback.
huggingface.co/spaces/OpenLithoHub/playground
LinkedIn (one long post)
After ~6 months of work, we're open-sourcing OpenLithoHub today — a
benchmarking and workflow framework for computational lithography that
takes a strong stance on the reproducibility crisis in ML-for-OPC.
The state of the field: every paper rolls its own dataset loaders, "EPE"
means different things in different publications, half the work compares
against a Gaussian-PSF stand-in that no lithographer would call a
forward model, and MRC compliance is too often a footnote rather than a
hard gate.
OpenLithoHub fixes the foundation:
✅ Unified adapters for LithoBench, LithoSim, GAN-OPC, ICCAD'16 hotspot
✅ One canonical implementation per metric (EPE, PV-Band, shot count,
EUV stochastic, hotspot detection)
✅ A real differentiable Hopkins/SOCS forward model — dipole, quasar,
defocus — auto-grad end-to-end so it drops into AI-OPC training
✅ MRC/DRC as a hard gate, including curvilinear checks
✅ OASIS round-trip + Calibre/IC Validator runset templates
✅ Paper-ready visualization (IEEE / SPIE / vector PDF)
✅ HuggingFace Space playground with EPE error heatmaps
✅ Colab BYOM notebook: install → register model → eval → submit, in
one runtime.
It's Apache-2.0. Bring your own model, plug it into one interface, and
publish numbers other groups can reproduce.
If you work on OPC, ILT, AI-EDA, or computational lithography — please
try it and tell us where it falls short.
GitHub: https://github.com/OpenLithoHub/OpenLithoHub
Playground: https://huggingface.co/spaces/OpenLithoHub/playground
Docs: https://docs.openlithohub.com
#computationallithography #EUV #OPC #ILT #MachineLearning #EDA
#OpenSource #SemiconductorManufacturing
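Zhihu (知乎, long-form version)
Title: OpenLithoHub v0.1 released: a reproducible open-source foundation
for computational lithography
Over the past five years, ML-for-OPC / ML-for-ILT papers have multiplied,
but the field has a reproducibility problem that nobody wants to face
head-on:
- Datasets are scattered across different GitHub repos, every paper
writes its own loader, and half of them have silent coordinate-convention
bugs;
- "EPE" is measured at fragment endpoints in one paper, along arc length
in another, and as RMS over a contour band in a third;
- In half of the work, the "lithography simulation" is a Gaussian PSF
convolution, which is not a partially coherent imaging model at all;
- MRC is often treated as cosmetic rather than as the hard constraint
that actually decides whether a mask can go into the fab.
Today we are open-sourcing OpenLithoHub, which takes a clear position on
each of these:
1. **One unified DatasetAdapter interface** covering LithoBench /
LithoSim / GAN-OPC (about 4875 mask pairs) / ICCAD'16 Problem C hotspot
detection.
2. **One canonical implementation per metric**. If it is called EPE, it
can only be `compute_epe`; using another name is fine, but stop mixing
them.
3. **Differentiable Hopkins/SOCS forward model**. SVD truncation,
per-(params, grid) cache, support for circular / annular / dipole /
quasar sources plus defocus, and end-to-end auto-grad, so it can be
dropped straight into an AI-OPC training loop.
4. **MRC is a hard gate**. A mask that violates MRC fails the benchmark
outright, no matter how good its EPE looks. MRC for curvilinear masks
(curvature radius + minimum area) is implemented as well.
v0.1 also includes:
- 4 baseline models (dummy-identity / rule-based-opc / levelset-ilt /
neural-ilt), all following the same LithographyModel interface; BYOM
takes under 50 lines.
- OASIS round-trip + Calibre nmDRC / Synopsys IC Validator runset
templates.
- Paper-grade visualization (IEEE / SPIE column widths, vector PDF,
Type-42 fonts, colorblind-safe palette).
- HuggingFace Space playground with an EPE error heatmap + red MRC
violation highlighting.
- A one-click Colab BYOM notebook.
- Auto-Leaderboard CI: after a model is submitted via PR, CI
automatically reproduces the numbers on the standard test set.
Next on the roadmap:
- Synthetic layout generator (rule-based + diffusion models, targeting
FreePDK45 / ASAP7);
- Calibre nmOPC / Tachyon simulator interfaces;
- EUV 3D mask + Monte Carlo stochastic-failure evaluation;
- Self-supervised pretrained base models (MAE-style);
- Layout tokenization for transformer research.
License: Apache-2.0. PRs, issues, and model submissions are welcome.
Project: https://github.com/OpenLithoHub/OpenLithoHub
Playground: https://huggingface.co/spaces/OpenLithoHub/playground
Docs: https://docs.openlithohub.com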
Hugging Face Forum (Show and Tell)
Title: OpenLithoHub — open benchmarking + Hopkins/SOCS forward model for
computational lithography (OPC, ILT, EUV)
Hey HF community,
We're shipping OpenLithoHub today — Apache-2.0, full ML-for-OPC stack
with HF Space playground, BYOM Colab, and a real differentiable
Hopkins/SOCS forward model.
For folks not in the lithography world: this is the "ML for chip mask
optimization" subfield. Imagine image-to-image inverse problems where
the input is your desired silicon pattern and the output is a mask
shape that, when imaged through a partially coherent optical system and
a nonlinear resist, will print correctly. EPE (edge placement error) is
the field's measure of "how wrong is the printed contour vs. target."
What might interest this community specifically:
- Differentiable Hopkins/SOCS forward model — drop it into your training
loop the same way you'd drop in a perceptual loss. Supports dipole /
quasar / annular illumination + defocus.
- HF Space playground:
https://huggingface.co/spaces/OpenLithoHub/playground
3 presets (SRAM cell, contact array, random routing), upload your own
GDS, get EPE heatmap + MRC overlay back.
- The model registry uses an interface compatible with HF Hub-style
pretrained-weight downloads (`from_pretrained`-ish), making it easy
to host model weights on HF.
- Colab BYOM notebook: install → register your `LithographyModel` →
eval on LithoBench → submit to leaderboard, all in one runtime.
- 4 baselines included: identity, rule-based OPC, level-set ILT,
neural-ILT (U-Net).
We'd love feedback on the BYOM interface from people who've built
analogous "bring your own model" pipelines on HF — the friction points
in our 50-LOC integration are exactly what we want to hear about.
GitHub: https://github.com/OpenLithoHub/OpenLithoHub
Docs: https://docs.openlithohub.com
Happy to answer questions in this thread.