OpenLithoHub v0.1 — Open computational lithography for the AI era

Published 2026-05-19 · Tags: lithography, EUV, OPC, ILT, ML-for-EDA, open-source

TL;DR — OpenLithoHub is an open, Apache-2.0-licensed benchmarking and workflow framework for computational lithography. It unifies the three things that have kept ML-for-OPC research irreproducible: the datasets (LithoBench, LithoSim, GAN-OPC, ICCAD'16), the metrics (EPE, PV-Band, shot count, EUV stochastic, MRC), and the forward physics (a differentiable Hopkins/SOCS model). Bring your own model, plug it into one interface, and get the same numbers everyone else gets.

Repo: https://github.com/OpenLithoHub/OpenLithoHub · Playground: https://huggingface.co/spaces/OpenLithoHub/playground · Docs: https://docs.openlithohub.com · Colab BYOM: https://colab.research.google.com/github/OpenLithoHub/OpenLithoHub/blob/main/notebooks/colab_byom.ipynb


Why this exists

ML-for-OPC and ML-for-ILT have been hot for five years, but the field has a reproducibility problem that nobody quite owns:

  • Datasets are scattered. LithoBench is on one GitHub repo, GAN-OPC on another, ICCAD'16 hotspot is on a benchmark page no one updates, LithoSim is a separate NeurIPS dataset. Every paper rolls its own loader, and a third of them have silent off-by-one bugs in coordinate conventions.
  • Metrics drift. "EPE" in one paper means edge-placement error measured at fragment endpoints; in the next it's measured along arc length; in the third it's RMS over a contour band. PV-Band is even worse.
  • There's no shared forward model. Half the papers compare against a Gaussian-PSF "imaging" stand-in that is not a lithographic model. The other half black-box a commercial simulator and cannot release the reference numbers.
  • MRC is treated as cosmetic. Plenty of beautiful AI-OPC results produce masks that no foundry would actually let through DRC. Without a hard MRC gate, the benchmark is fiction.
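
To make the metric-drift point concrete, here is a toy sketch of three aggregations that could all plausibly be reported as "EPE". The function names and the sample displacements are illustrative assumptions; this is not OpenLithoHub's compute_epe, only a demonstration that the same contour yields different numbers depending on the definition.

```python
import math

def epe_at_endpoints(disp, fragment_len):
    """Mean |displacement| sampled only at fragment boundaries."""
    points = disp[::fragment_len]
    return sum(abs(d) for d in points) / len(points)

def epe_arc_mean(disp):
    """Mean |displacement| over every sample along the edge."""
    return sum(abs(d) for d in disp) / len(disp)

def epe_rms(disp):
    """Root-mean-square displacement over every sample."""
    return math.sqrt(sum(d * d for d in disp) / len(disp))

# Hypothetical signed edge displacements (nm) at uniformly spaced points.
disp = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]

print(epe_at_endpoints(disp, fragment_len=4))  # 0.0
print(epe_arc_mean(disp))                      # 1.0
print(round(epe_rms(disp), 4))                 # 1.2247
```

The endpoint version happens to sample only the zero crossings, so it reports a perfect result for a contour that is off by up to 2 nm elsewhere.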

OpenLithoHub takes a strong position on each of these:

  1. One adapter interface, every dataset behind it. Today: LithoBench, LithoSim, GAN-OPC, ICCAD'16. Tomorrow: synthetic generators against FreePDK45/ASAP7.
  2. Metrics with one canonical implementation per name. If you call it EPE, it is computed by compute_epe in openlithohub.benchmark. Report another version, fine — but don't call it EPE.
  3. A real, differentiable Hopkins/SOCS forward model. SVD-truncated, per-(params, grid) cache, supports circular / annular / dipole / quasar illumination plus defocus. Auto-differentiable end-to-end — drop it into your AI-OPC training loop. (See openlithohub._utils.hopkins.)
  4. MRC is a hard gate. A mask that violates MRC fails the benchmark, no matter how good its EPE looks. We added curvilinear MRC checks (curvature radius + min area) for ILT outputs that aren't Manhattan.
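
For readers new to the forward model in item 3, the following is a minimal NumPy sketch of the SOCS idea: factor the transmission cross-coefficient (TCC) with an SVD, keep the leading kernels, and sum the squared magnitudes of the coherent images. It is not the code in openlithohub._utils.hopkins. The function names, the circular source, the in-focus scalar pupil, and the grid parameters are all assumptions made for illustration, and plain NumPy is not differentiable.

```python
import numpy as np

def socs_kernels(n=32, cutoff=0.25, sigma=0.5, n_kernels=8):
    """Frequency-domain SOCS kernels for a circular source (toy, scalar, in focus).

    cutoff is the pupil radius in cycles per pixel; sigma is the
    partial-coherence factor. n_kernels=None keeps every kernel.
    """
    f = np.fft.fftfreq(n)                      # FFT-ordered frequency axis
    fx, fy = np.meshgrid(f, f, indexing="ij")
    # Sample the source on the same grid, inside radius sigma * cutoff.
    inside = fx**2 + fy**2 <= (sigma * cutoff) ** 2
    src = np.stack([fx[inside], fy[inside]], axis=1)
    # One row per source point: the ideal pupil shifted by that point.
    rows = [((fx + sx) ** 2 + (fy + sy) ** 2 <= cutoff**2).ravel()
            for sx, sy in src]
    a = np.asarray(rows, dtype=float) / np.sqrt(len(rows))
    # TCC = a.T @ a, so the right singular vectors of `a` are the TCC
    # eigenvectors and the squared singular values are its eigenvalues.
    _, s, vh = np.linalg.svd(a, full_matrices=False)
    return s[:n_kernels] ** 2, vh[:n_kernels].reshape(-1, n, n)

def aerial_image(mask, weights, kernels):
    """I(x) = sum_k w_k * |IFFT(kernel_k * FFT(mask))|^2."""
    spectrum = np.fft.fft2(mask)
    fields = np.fft.ifft2(kernels * spectrum)  # broadcasts over the kernel axis
    return np.tensordot(weights, np.abs(fields) ** 2, axes=1)

# Usage: image a single vertical line with the 8 leading kernels.
mask = np.zeros((32, 32))
mask[:, 12:20] = 1.0
weights, kernels = socs_kernels()
image = aerial_image(mask, weights, kernels)
print(image.shape)  # (32, 32)
```

With every kernel kept, this sum reproduces the source-point (Abbe) sum on the same grid, so truncating the SVD is the only approximation. Porting the same two functions to an autodiff array library is what would make the model usable inside a training loop.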

What's in v0.1

  • Datasets — LithoBench, LithoSim, GAN-OPC paired masks (~4875), ICCAD'16 Problem C hotspot, plus a hermetic dummy-layout generator that runs in CI without the workflow extras.
  • Models — dummy-identity, rule-based-opc (directional hammerheads, inner-corner serifs, iso/dense bias, MRC self-check), levelset-ilt, neural-ilt. All conform to the same LithographyModel interface; bring your own in under 50 lines.
  • Forward models — Hopkins/SOCS (default) and Gaussian PSF (legacy).
  • Metrics — EPE, PV-Band, shot count, EUV stochastic robustness, hotspot detection (recall/precision/F1).
  • MRC/DRC — Manhattan and curvilinear; a violation hard-fails the run.
  • Workflow — tiling/stitching with deduplication, contour extraction, B-spline fitting, OASIS round-trip, EDA bridge templates for Calibre nmDRC and Synopsys IC Validator.
  • Visualization — paper_style context manager with IEEE_STYLE and SPIE_STYLE presets, vector PDF, Type-42 fonts, colorblind-safe.
  • Playground — HuggingFace Space with 3 preset designs, BYOM upload, and an EPE-error heatmap with red MRC violation overlay.
  • Colab BYOM notebook — install → register your model → eval against LithoBench → submit to leaderboard, end-to-end.
  • Auto-Leaderboard CI — open a PR with your numbers, the workflow re-runs them on the standard test set, and merges if they verify.
  • Docs — "Lithography for AI Engineers" translates the field's vocabulary (Mask → Input Image, OPC → Image-to-Image inverse problem, PV-Band → Robustness Margin, MRC → Output Constraint Satisfaction).
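
As a rough picture of how a bring-your-own-model interface and a hard MRC gate fit together, here is a self-contained sketch. Everything in it is hypothetical: the LithographyModelLike protocol, the predict method, the register decorator, the toy minimum-width check, and the pixel-mismatch score are stand-ins chosen for illustration, not the actual OpenLithoHub API or metrics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Protocol

Mask = List[List[int]]  # binary raster, 1 = mask feature

class LithographyModelLike(Protocol):
    """Hypothetical stand-in for a shared model interface."""
    def predict(self, target: Mask) -> Mask: ...

REGISTRY: Dict[str, Callable[[], LithographyModelLike]] = {}

def register(name: str):
    """Register a model factory under a name (illustrative only)."""
    def decorator(factory):
        REGISTRY[name] = factory
        return factory
    return decorator

@register("dummy-identity")
class IdentityModel:
    def predict(self, target: Mask) -> Mask:
        return [row[:] for row in target]  # mask = target, no correction

def min_width_violations(mask: Mask, min_width: int) -> int:
    """Toy Manhattan check: count horizontal runs narrower than min_width."""
    count = 0
    for row in mask:
        run = 0
        for px in row + [0]:  # trailing 0 closes a run at the row end
            if px:
                run += 1
            else:
                if 0 < run < min_width:
                    count += 1
                run = 0
    return count

@dataclass
class Result:
    passed: bool
    mrc_violations: int
    pixel_error: Optional[int]  # placeholder score, None when gated out

def evaluate(model_name: str, target: Mask, min_width: int = 2) -> Result:
    mask = REGISTRY[model_name]().predict(target)
    violations = min_width_violations(mask, min_width)
    if violations:
        # Hard gate: a violating mask gets no score at all.
        return Result(False, violations, None)
    error = sum(m != t for mr, tr in zip(mask, target) for m, t in zip(mr, tr))
    return Result(True, 0, error)

print(evaluate("dummy-identity", [[0, 1, 1, 0], [0, 1, 1, 0]]).passed)  # True
print(evaluate("dummy-identity", [[0, 1, 0, 0]]).passed)                # False
```

The point of the design is that the gate runs before any quality metric, so a violating mask cannot appear on a leaderboard with an attractive score attached.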

Who this is for

  • ML researchers who want to publish ML-for-OPC results that other groups can actually reproduce.
  • Lithography engineers who want to evaluate ML approaches against the same metrics they care about in production (MRC compliance, EUV stochastic margin).
  • Foundry / IDM teams who want a vendor-neutral common ground when comparing internal tools.
  • Students entering the field who don't want to spend their first six months wiring up data loaders.

What's next

  • Synthetic layout generator — diffusion + rule-based against FreePDK45/ASAP7 to break the dataset-size ceiling.
  • Simulator hooks — pluggable adapters for Calibre nmOPC and Tachyon, so labs with commercial-tool access can use them as ground-truth oracles without forking the framework.
  • EUV 3D-mask + Monte Carlo stochastic eval — moving past thin-mask Hopkins toward what really matters at 3nm and below.
  • Pre-trained base models — MAE-style self-supervised pretraining on polygon rasters; users fine-tune with a few thousand samples.
  • Layout tokenization — making polygons first-class for transformer research.

The full roadmap lives at https://github.com/OpenLithoHub/OpenLithoHub/blob/main/CHANGELOG.md.

How to get involved

  • Try it: pip install openlithohub or open the Colab notebook.
  • Submit a model to the leaderboard — see docs/leaderboard-submission.md.
  • Discord is launching 2026-Q3. Until then, GitHub Issues and Discussions are the main forum. Watch the repo or open an issue with the community label to be notified.
  • Cite via CITATION.cff if you use it in a paper.

Paste-ready posts

The sections below are short adaptations of the announcement above for specific platforms. Copy the relevant block, fill in your own handle / links / images, and post.

X / Twitter (thread, 6 posts)

1/ Today we're releasing OpenLithoHub — an open, Apache-2.0 benchmarking
& workflow framework for computational lithography (OPC, ILT, EUV).

The pitch: bring your own ML model, plug it into one interface, and get
the same numbers everyone else gets.

🔗 github.com/OpenLithoHub/OpenLithoHub

2/ Why? ML-for-OPC has a reproducibility problem that nobody owns.

Datasets are scattered. "EPE" means three different things in three
different papers. Half the field compares against a Gaussian PSF that
isn't a real lithographic model. MRC is treated as cosmetic.

3/ What we ship in v0.1:

- Unified loaders: LithoBench, LithoSim, GAN-OPC (~4875 paired masks),
  ICCAD'16 hotspot
- A real differentiable Hopkins/SOCS forward model — dipole, quasar,
  defocus, all auto-grad
- MRC as a HARD gate. Bad mask → benchmark fail.

4/ Plus:
- Curvilinear MRC checks (curvature radius + min area)
- OASIS round-trip + Calibre/IC Validator runset templates
- Paper-ready vis (IEEE / SPIE column-width, Type-42 PDFs)
- HF Space playground with EPE error heatmaps

5/ For ML researchers: bring your own model in <50 lines. Colab BYOM
notebook walks you from install → eval → leaderboard submit:
colab.research.google.com/github/OpenLithoHub/OpenLithoHub/blob/main/notebooks/colab_byom.ipynb

6/ Roadmap: synthetic layout gen, Calibre/Tachyon simulator hooks, EUV
3D-mask + Monte Carlo stochastic eval, pretrained base models.

Star the repo, try the playground, open an issue with feedback.

huggingface.co/spaces/OpenLithoHub/playground

LinkedIn (one long post)

After ~6 months of work, we're open-sourcing OpenLithoHub today — a
benchmarking and workflow framework for computational lithography that
takes a strong stance on the reproducibility crisis in ML-for-OPC.

The state of the field: every paper rolls its own dataset loaders, "EPE"
means different things in different publications, half the work compares
against a Gaussian-PSF stand-in that no lithographer would call a
forward model, and MRC compliance is too often a footnote rather than a
hard gate.

OpenLithoHub fixes the foundation:

✅ Unified adapters for LithoBench, LithoSim, GAN-OPC, ICCAD'16 hotspot
✅ One canonical implementation per metric (EPE, PV-Band, shot count,
   EUV stochastic, hotspot detection)
✅ A real differentiable Hopkins/SOCS forward model — dipole, quasar,
   defocus — auto-grad end-to-end so it drops into AI-OPC training
✅ MRC/DRC as a hard gate, including curvilinear checks
✅ OASIS round-trip + Calibre/IC Validator runset templates
✅ Paper-ready visualization (IEEE / SPIE / vector PDF)
✅ HuggingFace Space playground with EPE error heatmaps
✅ Colab BYOM notebook: install → register model → eval → submit, in
   one runtime.

It's Apache-2.0. Bring your own model, plug it into one interface, and
publish numbers other groups can reproduce.

If you work on OPC, ILT, AI-EDA, or computational lithography — please
try it and tell us where it falls short.

GitHub: https://github.com/OpenLithoHub/OpenLithoHub
Playground: https://huggingface.co/spaces/OpenLithoHub/playground
Docs: https://docs.openlithohub.com

#computationallithography #EUV #OPC #ILT #MachineLearning #EDA
#OpenSource #SemiconductorManufacturing

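Zhihu (long-form post, English translation)

Title: OpenLithoHub v0.1 released: a reproducible open-source foundation
for computational lithography

Over the past five years, papers on ML-for-OPC / ML-for-ILT have
multiplied, but the field has a reproducibility problem that nobody wants
to face head-on:

- Datasets are scattered across different GitHub repositories, every
  paper writes its own loader, and half of them have silent
  coordinate-convention bugs;
- "EPE" is measured at fragment endpoints in one paper, along arc length
  in another, and as RMS over a contour band in a third;
- In half of the work, the "lithography simulation" is a Gaussian PSF
  convolution, which is not a partially coherent imaging model at all;
- MRC is often treated as "cosmetic touch-up" rather than the hard
  constraint that actually decides whether a mask can go to the fab.

Today we are open-sourcing OpenLithoHub, which takes a clear position on
each of these:

1. **One unified DatasetAdapter interface** covering LithoBench /
   LithoSim / GAN-OPC (about 4875 mask pairs) / ICCAD'16 Problem C
   hotspot detection.
2. **One authoritative implementation per metric.** If it is called EPE,
   it must be `compute_epe`; another name is fine, but stop mixing them.
3. **A differentiable Hopkins/SOCS forward model.** SVD truncation,
   per-(params, grid) caching, support for circular / annular / dipole /
   quasar sources plus defocus, and end-to-end auto-grad, so it drops
   straight into an AI-OPC training loop.
4. **MRC is a hard gate.** A mask that violates MRC fails the benchmark
   outright, no matter how good its EPE looks. MRC for curvilinear masks
   (curvature radius + minimum area) is implemented as well.

v0.1 also includes:
- 4 baseline models (dummy-identity / rule-based-opc / levelset-ilt /
  neural-ilt), all following the same LithographyModel interface; BYOM
  takes under 50 lines.
- OASIS round-trip + Calibre nmDRC / Synopsys IC Validator runset
  templates.
- Paper-grade visualization (IEEE / SPIE column widths, vector PDF,
  Type-42 fonts, colorblind-safe palette).
- HuggingFace Space playground with an EPE error heatmap + red MRC
  violation highlighting.
- One-click Colab BYOM notebook.
- Auto-Leaderboard CI: after a model is submitted via PR, CI
  automatically reproduces the numbers on the standard test set.

Roadmap:
- Synthetic layout generator (rule-based + diffusion models, targeting
  FreePDK45 / ASAP7);
- Calibre nmOPC / Tachyon simulator interfaces;
- EUV 3D mask + Monte Carlo stochastic-failure evaluation;
- Self-supervised pretrained base model (MAE style);
- Layout tokenization for transformer research.

Licensed under Apache-2.0; PRs, issues, and model submissions are welcome.
Project: https://github.com/OpenLithoHub/OpenLithoHub
Playground: https://huggingface.co/spaces/OpenLithoHub/playground
Docs: https://docs.openlithohub.com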

Hugging Face Forum (Show and Tell)

Title: OpenLithoHub — open benchmarking + Hopkins/SOCS forward model for
computational lithography (OPC, ILT, EUV)

Hey HF community,

We're shipping OpenLithoHub today — Apache-2.0, full ML-for-OPC stack
with HF Space playground, BYOM Colab, and a real differentiable
Hopkins/SOCS forward model.

For folks not in the lithography world: this is the "ML for chip mask
optimization" subfield. Imagine image-to-image inverse problems where
the input is your desired silicon pattern and the output is a mask
shape that, when imaged through a partially coherent optical system and a
nonlinear resist, will print correctly. EPE is the ML-friendly word for
"how wrong is the printed contour vs. target."

What might interest this community specifically:

- Differentiable Hopkins/SOCS forward model — drop it into your training
  loop the same way you'd drop in a perceptual loss. Supports dipole /
  quasar / annular illumination + defocus.
- HF Space playground:
  https://huggingface.co/spaces/OpenLithoHub/playground
  3 presets (SRAM cell, contact array, random routing), upload your own
  GDS, get EPE heatmap + MRC overlay back.
- The model registry uses an interface compatible with HF Hub-style
  pretrained-weight downloads (`from_pretrained`-ish), making it easy
  to host model weights on HF.
- Colab BYOM notebook: install → register your `LithographyModel` →
  eval on LithoBench → submit to leaderboard, all in one runtime.
- 4 baselines included: identity, rule-based OPC, level-set ILT,
  neural-ILT (U-Net).

We'd love feedback on the BYOM interface from people who've built
analogous "bring your own model" pipelines on HF — the friction points
in our 50-LOC integration are exactly what we want to hear about.

GitHub: https://github.com/OpenLithoHub/OpenLithoHub
Docs: https://docs.openlithohub.com

Happy to answer questions in this thread.