Benchmark Metrics
Metrics
openlithohub.benchmark.metrics.epe
Edge Placement Error (EPE) computation.
Two flavors live here:
- :func:`compute_epe`: mask-level. Compares predicted mask edges directly to target edges. An Identity model (mask passed straight through) scores 0 by construction, which is useful as a sanity baseline but does NOT reflect what would actually print on the wafer.
- :func:`compute_wafer_epe`: wafer-level. Pushes the predicted mask through a forward optical/resist simulator and compares the resist contour to the target. This is the physically meaningful quantity for OPC quality: a square mask will round at the corners after diffraction, so an Identity model lands at a nonzero EPE.
Both report the same EPEResult schema; the leaderboard surfaces them
under separate keys (epe_* vs epe_wafer_*) so existing dashboards
that compare against historical mask-level numbers stay valid.
EPEResult
Bases: TypedDict
Per-sample EPE summary. Numeric fields are always float so callers
can do arithmetic on them without first narrowing away bool.
Source code in src/openlithohub/benchmark/metrics/epe.py
compute_epe(predicted, target, pixel_size_nm=1.0)
Compute Edge Placement Error between predicted and target contours.
Symmetric edge-distance: for every edge pixel in both sets we compute the minimum distance to the other set, then aggregate over the union. The asymmetric form (predicted→target only) reports zero error for "missing entirely" failure modes — if predicted has no edges where target has a feature, predicted's edge set is empty and the loop has nothing to penalize. The symmetric form catches under-printing.
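The symmetric form can be illustrated with a minimal pure-Python sketch. The helper names (`edge_pixels`, `nearest_distances`, `symmetric_edge_distances`), the 4-neighbour edge definition, and the `inf` placeholder for an entirely missing edge set are assumptions made for illustration; the library itself operates on tensors and may aggregate differently.

```python
from math import hypot

def edge_pixels(mask):
    """Foreground pixels that touch background or the grid border (4-neighbourhood)."""
    h, w = len(mask), len(mask[0])
    edges = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    edges.append((y, x))
                    break
    return edges

def nearest_distances(src, dst):
    """For each point in src, Euclidean distance in pixels to the closest point in dst."""
    return [min(hypot(sy - ty, sx - tx) for ty, tx in dst) for sy, sx in src]

def symmetric_edge_distances(predicted, target, pixel_size_nm=1.0):
    """Edge distances in nm, measured predicted->target and target->predicted."""
    p, t = edge_pixels(predicted), edge_pixels(target)
    if not p and not t:
        return []  # nothing to compare
    if not p or not t:
        # Assumption: a completely missing edge set counts as unbounded error.
        return [float("inf")] * (len(p) + len(t))
    both = nearest_distances(p, t) + nearest_distances(t, p)
    return [d * pixel_size_nm for d in both]

square = [[0, 0, 0, 0, 0], [0, 1, 1, 1, 0], [0, 1, 1, 1, 0], [0, 1, 1, 1, 0], [0, 0, 0, 0, 0]]
shifted = [[0, 0, 0, 0, 0], [0, 0, 1, 1, 1], [0, 0, 1, 1, 1], [0, 0, 1, 1, 1], [0, 0, 0, 0, 0]]
worst_nm = max(symmetric_edge_distances(square, shifted, pixel_size_nm=2.0))
```

An asymmetric predicted→target pass over an empty prediction would iterate over nothing and report no error, which is the failure mode the symmetric form is meant to catch.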
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predicted | Tensor | Binary mask of predicted pattern (H, W), values in {0, 1}. | required |
| target | Tensor | Binary mask of target/reference pattern (H, W), values in {0, 1}. | required |
| pixel_size_nm | float | Physical size of each pixel in nanometers. | 1.0 |

Returns:
| Type | Description |
|---|---|
| EPEResult | Dictionary with keys … and … |
Source code in src/openlithohub/benchmark/metrics/epe.py
compute_wafer_epe(predicted_mask, target, pixel_size_nm=1.0, simulator=None)
Compute EPE between the printed wafer contour and the target.
Pushes predicted_mask through a forward optical/resist simulator
and compares the resulting binarised resist image to target using
the same edge-distance routine as :func:compute_epe. This is the
physically meaningful EPE for OPC quality — an Identity model (mask
returned unchanged) lands at a nonzero value here because diffraction
rounds corners that the original mask had as right angles.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predicted_mask | Tensor | Predicted mask (H, W), values in [0, 1]. The simulator will be applied to this tensor. | required |
| target | Tensor | Target wafer/contour pattern (H, W), values in {0, 1}. | required |
| pixel_size_nm | float | Physical pixel size in nanometers. | 1.0 |
| simulator | BaseSimulator \| None | Forward simulator. Defaults to a fresh :class: … | None |

Returns:
| Type | Description |
|---|---|
| EPEResult | Same … |
Source code in src/openlithohub/benchmark/metrics/epe.py
openlithohub.benchmark.metrics.l2_error
L2 wafer error — Neural-ILT canonical mask-printability metric.
The standard academic OPC scoring contract, as established by Neural-ILT (ICCAD'20) and used by GAN-OPC / MOSAIC, is:
    wafer = lithosim(mask, dose=1.0, threshold=0.225)
    score = (wafer - target).abs().sum()  # L1 / SAD pixel count
i.e. forward-simulate the predicted mask through SOCS optics and the
resist threshold, then count the pixel-wise differences against the
target layout (not against the input mask). The result is in pixel
units; multiply by pixel_size_nm**2 for an area in nm² if needed.
Naming note: the published Neural-ILT paper calls this scalar "L2
error". On the binary {0, 1} wafer/target images the formula
emits, the squared-L2 norm (w - t).square().sum() and the L1 norm
(w - t).abs().sum() produce the same integer (since
x ∈ {-1, 0, 1} ⇒ x² = |x|), so reference implementations
canonically use the L1 form for speed. They are not equal in general
— if you ever supply a non-binary wafer (e.g. soft resist contours),
the two diverge — but for the canonical contract they agree. The
l2_error_pixels field name is preserved for cross-paper
comparability; do not "fix" it to L1 without coordinating against the
upstream tables.
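The naming note can be checked numerically. The following is a small sketch on nested Python lists rather than tensors; the function names `l1_error` and `squared_l2_error` are illustrative and not part of the library.

```python
def l1_error(wafer, target):
    """Sum of absolute pixel differences (the form reference implementations use)."""
    return sum(abs(w - t) for rw, rt in zip(wafer, target) for w, t in zip(rw, rt))

def squared_l2_error(wafer, target):
    """Sum of squared pixel differences (what the published name suggests)."""
    return sum((w - t) ** 2 for rw, rt in zip(wafer, target) for w, t in zip(rw, rt))

target = [[1, 0, 0], [1, 1, 0]]
binary_wafer = [[1, 1, 0], [0, 1, 0]]           # values in {0, 1}
soft_wafer = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0]]  # non-binary resist image

binary_pair = (l1_error(binary_wafer, target), squared_l2_error(binary_wafer, target))
soft_pair = (l1_error(soft_wafer, target), squared_l2_error(soft_wafer, target))
```

On the binary wafer both forms give 2; on the soft wafer the L1 form gives 1.0 while the squared-L2 form gives 0.5, which is the divergence the note warns about.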
Like :func:openlithohub.benchmark.metrics.epe.compute_wafer_epe, this
metric requires the forward simulator in the loop. The
:func:compute_epe mask-level metric scores 0 for an Identity model;
compute_l2_error does not, because diffraction reshapes the printed
contour even when the mask is unchanged.
L2ErrorResult
Bases: TypedDict
Per-sample L2 wafer-error summary.
Attributes:
| Name | Type | Description |
|---|---|---|
| l2_error_pixels | float |  |
| l2_error_nm2 | float | Same quantity expressed as a physical area in nm² (…) |
| wafer_pixels | int | Number of foreground pixels in the simulated wafer image. Reported alongside the error so a normalised ratio can be derived downstream without re-running the simulator. |
| target_pixels | int | Foreground pixel count of the target layout. |
Source code in src/openlithohub/benchmark/metrics/l2_error.py
compute_l2_error(predicted_mask, target, pixel_size_nm=1.0, simulator=None)
Compute L2 wafer error per the Neural-ILT eval contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predicted_mask | Tensor | Predicted mask (H, W), values in [0, 1]. | required |
| target | Tensor | Target layout (H, W), values in {0, 1}. Compared against the simulated wafer, not against the predicted mask. | required |
| pixel_size_nm | float | Physical pixel size, used only to convert the pixel-unit error to an nm² area. Does not affect simulator sampling; pass a configured … | 1.0 |
| simulator | BaseSimulator \| None | Forward simulator. Defaults to a fresh :class: … | None |

Returns:
| Type | Description |
|---|---|
| L2ErrorResult | :class: … nm² conversion and the supporting pixel counts. |
Source code in src/openlithohub/benchmark/metrics/l2_error.py
openlithohub.benchmark.metrics.pvband
Process Variation Band (PV Band) computation.
Two forward-model paths are available:
1. Default: fast Gaussian-PSF aerial-image approximation at four dose/focus corners. Cheap diagnostic that runs in inner loops and on every commit; this is what the baseline tables in baselines/results.md and the README report. The Gaussian model is calibrated so the absolute PV Band number tracks the SOCS result at the published Neural-ILT corners. Both are stable signals of process-window robustness, but they are not interchangeable numerically.
2. SOCS-faithful: simulator= keyword (added 2026-05-23). When a :class:`BaseSimulator` instance is passed, this metric drives the simulator at each (dose, defocus) corner via :meth:`BaseSimulator.with_config`, takes the binarised resist contour at each corner, and reports outer-vs-inner band thickness from the same kernels :func:`compute_l2_error` uses. This closes the "Gaussian PVB ≠ SOCS PVB" reproducibility footgun for paper authors comparing OPC numbers across implementations: when you need PVB derived from the same SOCS kernels as L2/EPE, pass the same configured simulator instance.
This path is opt-in to keep existing baseline numbers stable —
passing simulator= will change the absolute number reported.
compute_pvband(mask, nominal_dose=1.0, dose_variation=0.05, defocus_range_nm=20.0, pixel_size_nm=1.0, simulator=None, resist_diffusion_nm=0.0, quencher=0.0)
Compute Process Variation Band width for a given mask.
PV Band measures the perpendicular distance between the resist contours at process window extremes.
With simulator=None (default) the cheap Gaussian-PSF approximation
is used — see module docstring path (1).
With simulator=<BaseSimulator instance> the simulator is driven
at four (dose × defocus) corners via with_config, and the
band is computed from the same kernels — see module docstring path
(2). The simulator's existing defocus_nm is used as the
nominal centre; ±defocus_range_nm/2 is applied at the corners.
The factor of two converts "distance to the nearest contour" (half-width at the band's centerline) into the full perpendicular contour-to-contour distance that the literature publishes.
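As a rough illustration of the four-corner scheme, the sketch below enumerates the (dose, defocus) corners and counts the pixels that lie between the outermost and innermost contours. It assumes the dose corners are `nominal_dose * (1 ± dose_variation)`, which is not stated above, and it returns a pixel count rather than the perpendicular width the metric reports. `process_corners` and `pv_band_pixels` are hypothetical names.

```python
from itertools import product

def process_corners(nominal_dose=1.0, dose_variation=0.05,
                    defocus_range_nm=20.0, nominal_defocus_nm=0.0):
    """Four (dose, defocus) corners around the nominal operating point."""
    # Assumption: dose varies multiplicatively around the nominal value.
    doses = (nominal_dose * (1 - dose_variation), nominal_dose * (1 + dose_variation))
    # Defocus is offset by +/- half the range around the nominal centre.
    defoci = (nominal_defocus_nm - defocus_range_nm / 2,
              nominal_defocus_nm + defocus_range_nm / 2)
    return list(product(doses, defoci))

def pv_band_pixels(contours):
    """Pixels printed at some corner but not at every corner (outer minus inner)."""
    h, w = len(contours[0]), len(contours[0][0])
    band = 0
    for y in range(h):
        for x in range(w):
            values = [c[y][x] for c in contours]
            if any(values) and not all(values):
                band += 1
    return band

corners = process_corners()
# Two toy resist contours standing in for simulator output at two corners.
band = pv_band_pixels([[[1, 1, 0, 0]], [[1, 1, 1, 0]]])
```

In the simulator-backed path, each corner would be rendered by the configured simulator instead of the toy contours used here.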
Source code in src/openlithohub/benchmark/metrics/pvband.py
openlithohub.benchmark.metrics.shot_count
Shot count estimation for mask manufacturing cost.
estimate_shot_count(mask, writer_type='mbmw', min_shot_size_nm=5.0, pixel_size_nm=1.0)
Estimate the number of shots needed to write a mask.
Shot count is a direct proxy for mask writing time and manufacturing cost.
For multi-beam mask writers (MBMW), each foreground pixel corresponds to one beam exposure position. Shot count equals the number of foreground pixels scaled by the ratio of pixel area to beam grid area.
For variable shaped beam (VSB) writers, shots are rectangular exposures. The estimate uses the mask complexity (perimeter/area ratio) to approximate the number of rectangles needed.
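The MBMW rule above amounts to a single area ratio. The sketch below assumes the beam grid pitch equals `min_shot_size_nm`; that mapping is a guess made for illustration, and `mbmw_shot_estimate` is a hypothetical name. The VSB heuristic is not reproduced here.

```python
def mbmw_shot_estimate(mask, pixel_size_nm=1.0, beam_grid_nm=5.0):
    """Foreground pixel count scaled by pixel area over beam-grid area."""
    foreground = sum(1 for row in mask for value in row if value)
    # Assumption: the beam grid pitch is the minimum addressable shot size.
    return foreground * pixel_size_nm ** 2 / beam_grid_nm ** 2

fine_mask = [[1] * 10 for _ in range(10)]  # 100 px at 1 nm covers 100 nm^2
coarse_mask = [[1, 1], [1, 1]]             # 4 px at 5 nm covers the same 100 nm^2
fine_shots = mbmw_shot_estimate(fine_mask, pixel_size_nm=1.0)
coarse_shots = mbmw_shot_estimate(coarse_mask, pixel_size_nm=5.0)
```

Both masks cover the same physical area, so under this assumption they need the same number of beam positions regardless of how finely the mask is rasterised.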
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor | Binary mask tensor (H, W). | required |
| writer_type | str | 'vsb' (variable shaped beam) or 'mbmw' (multi-beam). | 'mbmw' |
| min_shot_size_nm | float | Minimum addressable shot dimension. | 5.0 |
| pixel_size_nm | float | Physical pixel size in nanometers. | 1.0 |

Returns:
| Type | Description |
|---|---|
| dict[str, int \| float] | Dictionary with 'shot_count' and 'estimated_write_time_s'. |

Raises:
| Type | Description |
|---|---|
| ValueError | If writer_type is not 'mbmw' or 'vsb'. |
Source code in src/openlithohub/benchmark/metrics/shot_count.py
openlithohub.benchmark.metrics.stochastic
EUV stochastic robustness evaluation.
StochasticDefectRates
dataclass
Per-class stochastic defect rates in failures per cm^2.
The four classes follow the imec EUV stochastic-defectivity convention (microbridge / broken line / missing contact / merged contact). Per-cm^2 rates are the industry reporting unit and let users compare against published defectivity floors regardless of mask tile size.
Source code in src/openlithohub/benchmark/metrics/stochastic.py
compute_stochastic_robustness(mask, num_trials=100, dose_photons_per_nm2=30.0, pixel_size_nm=1.0, seed=0, resist_threshold=THRESHOLD_ICCAD16, resist_diffusion_nm=0.0, quencher=0.0)
Evaluate mask robustness against EUV photon shot noise.
Simulates stochastic resist exposure via Poisson photon noise to quantify probability of micro-bridging and line breaks.
seed defaults to 0 so leaderboard runs are reproducible. Pass
seed=None to draw from system entropy (intentional non-determinism,
e.g. ensemble runs).
resist_threshold defaults to 0.225 to match the LithoBench/Yang2023
calibration the leaderboard L2/PVB metrics use; pass 0.5 for the legacy
mid-grey cut. Issue #19: previously hard-coded to 0.5, which gave a
different resist contour than the metrics it is reported alongside.
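A minimal sketch of one stochastic trial follows. It assumes the expected photon count per pixel is `intensity * dose_photons_per_nm2 * pixel_size_nm**2` and uses Knuth's Poisson sampler from the standard library in place of a tensor-based sampler; `poisson_sample`, `noisy_exposure`, and `resist_print` are hypothetical names, not the library's internals.

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's multiplicative Poisson sampler; adequate for moderate rates."""
    if lam <= 0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def noisy_exposure(aerial, dose_photons_per_nm2=30.0, pixel_size_nm=1.0, seed=0):
    """Replace each pixel intensity by a Poisson photon count, renormalised to [0, ~1]."""
    rng = random.Random(seed)  # seed=None draws from system entropy
    full = dose_photons_per_nm2 * pixel_size_nm ** 2  # photons at intensity 1.0
    return [[poisson_sample(a * full, rng) / full for a in row] for row in aerial]

def resist_print(exposure, resist_threshold=0.225):
    """Binarise the noisy exposure at the resist threshold."""
    return [[1 if v > resist_threshold else 0 for v in row] for row in exposure]

aerial = [[0.0, 0.2, 0.25, 1.0]]
trial_a = resist_print(noisy_exposure(aerial, seed=0))
trial_b = resist_print(noisy_exposure(aerial, seed=0))
```

With a fixed seed the two trials are identical, which is the reproducibility property the default `seed=0` is meant to give; pixels whose intensity sits near the threshold are the ones that flip between trials when the seed changes.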
Source code in src/openlithohub/benchmark/metrics/stochastic.py
compute_stochastic_defect_classes(mask, num_trials=100, dose_photons_per_nm2=30.0, pixel_size_nm=1.0, seed=0, contact_aspect_max=1.5, contact_area_max=64, resist_threshold=THRESHOLD_ICCAD16, resist_diffusion_nm=0.0, quencher=0.0)
Per-class EUV stochastic defect rates in failures/cm^2.
Extends :func:compute_stochastic_robustness (which returns aggregate
bridge/break probabilities) with the four imec-style defect classes
reported by the EUV stochastic-defectivity literature: microbridges,
broken lines, missing contacts, and merged contacts. Output is
normalised to failures per cm^2 so results are comparable across
different mask tile sizes.
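The per-cm² normalisation is a unit conversion from the simulated tile area. The sketch below assumes the rate is also averaged over trials, which the text does not spell out; `failures_per_cm2` is a hypothetical helper.

```python
NM2_PER_CM2 = 1e14  # 1 cm = 1e7 nm, so 1 cm^2 = 1e14 nm^2

def failures_per_cm2(failure_count, num_trials, height_px, width_px, pixel_size_nm=1.0):
    """Convert a raw failure count into failures per cm^2 per trial."""
    tile_area_cm2 = height_px * width_px * pixel_size_nm ** 2 / NM2_PER_CM2
    # Assumption: the count is averaged over the Monte Carlo trials.
    return failure_count / (num_trials * tile_area_cm2)

small_tile = failures_per_cm2(1, num_trials=100, height_px=1000, width_px=1000)
large_tile = failures_per_cm2(4, num_trials=100, height_px=2000, width_px=2000)
```

A 2000 × 2000 tile with four failures and a 1000 × 1000 tile with one failure give the same rate, which is what makes results comparable across tile sizes.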
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor | Real-valued mask tensor (H, W) or 4D, values in [0, 1]. | required |
| num_trials | int | Number of Poisson trials. More trials → tighter rate estimates; 100 is a reasonable benchmarking default. | 100 |
| dose_photons_per_nm2 | float | Exposure dose in photons / nm^2 at the wafer. Scales the Poisson rate map. | 30.0 |
| pixel_size_nm | float | Mask pixel size in nm; used both for the Poisson rate scaling and for converting failure counts to per-cm^2. | 1.0 |
| seed | int \| None | Optional RNG seed. | 0 |
| contact_aspect_max | float | Maximum bounding-box long/short ratio for a component to count as contact-like. Lines are everything else. | 1.5 |
| contact_area_max | int | Maximum pixel area for a component to count as contact-like. Tune for the contact size on your process node. | 64 |

Returns:
| Type | Description |
|---|---|
| StochasticDefectRates | StochasticDefectRates with per-class and total failure rates. |
Source code in src/openlithohub/benchmark/metrics/stochastic.py
openlithohub.benchmark.metrics.monte_carlo
Monte Carlo stochastic-failure evaluation against a simulator backend.
Complements :func:compute_stochastic_robustness (which uses a fast
Gaussian-PSF model and Poisson photon noise) by letting callers run the
same Monte Carlo loop against any
:class:openlithohub.simulators.BaseSimulator — including the bundled
Hopkins/SOCS model or, with the appropriate adapter, a commercial
simulator.
This is the "give me a stochastic-failure number against my preferred forward model" entry point that the v0.1 roadmap calls for.
MonteCarloFailureResult
dataclass
Result of a Monte Carlo stochastic-failure run.
Source code in src/openlithohub/benchmark/metrics/monte_carlo.py
monte_carlo_failure_probability(mask, simulator, num_trials=50, dose_jitter_sigma=0.02, threshold_jitter_sigma=0.01, seed=0, perturb=None)
Estimate stochastic-failure probability against a simulator backend.
Runs num_trials independent simulations with small per-trial
perturbations to dose and resist threshold (and, if provided, a
user-supplied perturb operator on the mask itself). Counts how
often the resulting resist contour acquires extra connected
components ("breaks") or merges existing ones ("bridges") relative
to the nominal run.
A trial that simultaneously bridges one component pair and breaks
a different component is counted as a failure on both axes — the
earlier net component count heuristic would have masked the
pair as a no-op (issue #55).
Dose jitter is applied as a post-hoc aerial scaling rather than via
config.dose: the bundled HopkinsSimulator's threshold scales
with dose (threshold = cfg.threshold * cfg.dose), so pushing
jitter into cfg.dose cancels at the threshold and the perturbation
becomes a no-op (issue #54, downstream of #52).
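The cancellation described above is easy to reproduce on a toy one-dimensional aerial image. The two functions below are illustrative stand-ins, not the simulator's code: the first mimics a model whose threshold scales with dose, the second applies the jitter to the aerial image only.

```python
def print_with_config_dose(aerial, dose, base_threshold):
    """Dose scales both the aerial image and the threshold, so it cancels."""
    threshold = base_threshold * dose
    return [1 if a * dose > threshold else 0 for a in aerial]

def print_with_posthoc_scaling(aerial, dose_scale, base_threshold):
    """Scale the nominal aerial image only; the threshold stays fixed."""
    return [1 if a * dose_scale > base_threshold else 0 for a in aerial]

aerial = [0.20, 0.22, 0.23, 0.30]
nominal = print_with_config_dose(aerial, dose=1.0, base_threshold=0.225)
via_config = print_with_config_dose(aerial, dose=1.1, base_threshold=0.225)
via_posthoc = print_with_posthoc_scaling(aerial, dose_scale=1.1, base_threshold=0.225)
```

Here `via_config` equals `nominal` (the jitter is a no-op), while `via_posthoc` prints one extra pixel because the 0.22 sample is lifted over the fixed threshold.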
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor |  | required |
| simulator | BaseSimulator | Simulator backend. Must produce a … | required |
| num_trials | int | Number of perturbed simulations. | 50 |
| dose_jitter_sigma | float | Std-dev of multiplicative dose jitter. | 0.02 |
| threshold_jitter_sigma | float | Std-dev of additive resist-threshold jitter. | 0.01 |
| seed | int \| None | PRNG seed; defaults to … | 0 |
| perturb | Callable[[Tensor, Generator], Tensor] \| None | Optional … | None |

Returns:
| Type | Description |
|---|---|
| MonteCarloFailureResult | :class: … |
Source code in src/openlithohub/benchmark/metrics/monte_carlo.py
openlithohub.benchmark.metrics.euv_3d
EUV 3D-mask shadow-effect proxy metric.
Real EUV mask 3D simulation (rigorous Maxwell) is expensive and lives in commercial tools like HyperLith / EM-Suite. This module ships a cheap proxy that captures the dominant first-order effect: shadowing-induced bias that depends on feature orientation relative to the chief-ray direction.
What we model
For an EUV reflective mask at a non-zero chief-ray angle of incidence (typically 6° in NXE:3400-class scanners), the absorber casts a geometric shadow whose magnitude depends on:
- absorber thickness (≈70 nm Ta-based, ≈30 nm low-n attenuated PSM);
- angle of incidence;
- feature orientation (horizontal vs vertical lines respond differently — the well-known H–V CD bias).
We compute a per-pixel shadow displacement field and convolve the binary mask with an anisotropic shadow kernel, then compare the resulting "3D-corrected" aerial against a thin-mask aerial. The L2 residual between the two is a reasonable proxy for "how much rigorous 3D simulation would disagree with the Hopkins thin-mask result on this layout".
This is a proxy, not a substitute, for rigorous 3D-mask EMF
simulation. Its purpose is to flag layouts that are at risk of large 3D
errors at evaluation time without paying the cost of a Maxwell solver.
For papers that require ground-truth 3D, hook a real simulator via
:class:openlithohub.simulators.BaseSimulator.
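For a sense of scale, the first-order geometric shadow length is the absorber height times the tangent of the chief-ray angle. The sketch below is that textbook estimate only; whether the module's anisotropic kernel uses exactly this displacement is an assumption, and `shadow_displacement_nm` is a hypothetical helper.

```python
import math

def shadow_displacement_nm(absorber_thickness_nm=70.0, chief_ray_angle_deg=6.0,
                           chief_ray_azimuth_deg=0.0):
    """Geometric shadow vector (dx, dy) in nm: length = thickness * tan(angle)."""
    length = absorber_thickness_nm * math.tan(math.radians(chief_ray_angle_deg))
    azimuth = math.radians(chief_ray_azimuth_deg)
    # The azimuth sets the direction in which the shadow falls (0 deg = +x).
    return length * math.cos(azimuth), length * math.sin(azimuth)

ta_dx, ta_dy = shadow_displacement_nm(70.0)        # Ta-based absorber
psm_dx, psm_dy = shadow_displacement_nm(30.0)      # low-n attenuated PSM
rot_dx, rot_dy = shadow_displacement_nm(70.0, chief_ray_azimuth_deg=90.0)
```

At 6° a 70 nm absorber shadows roughly 7.4 nm along the chief-ray direction and a 30 nm absorber roughly 3.2 nm. Lines perpendicular to that direction see the full shift while parallel lines see almost none, which is the origin of the H–V bias.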
Mask3DParams
dataclass
Parameters for the EUV 3D-mask shadow proxy.
Attributes:
| Name | Type | Description |
|---|---|---|
| absorber_thickness_nm | float | Absorber stack height. 70 nm = Ta-based, 30 nm = low-n attenuated PSM. |
| chief_ray_angle_deg | float | Chief-ray angle of incidence at the mask. 6° for NXE:3400-class scanners. |
| chief_ray_azimuth_deg | float | Azimuth of the chief ray (0° = +x). Sets the shadow direction. |
| pixel_size_nm | float | Mask-side pixel pitch. |
Source code in src/openlithohub/benchmark/metrics/euv_3d.py
apply_3d_shadow(mask, params=None)
Apply the 3D-shadow proxy operator to a binary mask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor |  | required |
| params | Mask3DParams \| None | Shadow parameters; defaults to NXE:3400-like. | None |

Returns:
| Type | Description |
|---|---|
| Tensor | Same-shape mask with the shadow operator applied. The result is no longer strictly binary; it represents the effective attenuation seen by the optical model. |
Source code in src/openlithohub/benchmark/metrics/euv_3d.py
compute_3d_mask_residual(mask, params=None, sim_config=None)
Quantify expected disagreement between thin-mask and 3D-mask aerials.
Runs the bundled Hopkins/SOCS simulator twice — once on the input mask (thin-mask assumption) and once on the shadow-corrected mask — and reports the L2 and L_inf residuals plus the H–V CD-bias proxy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor |  | required |
| params | Mask3DParams \| None | Shadow parameters. | None |
| sim_config | SimulatorConfig \| None | Optional simulator config; defaults to EUV-ish (13.5 nm, NA 0.33). | None |

Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dict with keys … vertical lines after 3D-shadow correction). The H–V bias is derived from thresholded shadowed-mask area, normalised by the original mask's contour length, so it carries genuine units of length; see :func: … |
Source code in src/openlithohub/benchmark/metrics/euv_3d.py
openlithohub.benchmark.metrics.hotspot
Hotspot detection metric — recall / precision / F1 with distance-tolerant matching against a ground-truth point list.
This is the canonical evaluation used by ICCAD'16 Problem C and the
hotspot-detection literature (e.g. Yang et al., TCAD 2020): a predicted
point counts as a true positive if any ground-truth point lies within a
configurable radius (match_radius_nm). Each GT point may be matched
at most once — duplicate predictions inside the same tolerance disk
become false positives. GT points with no predictor inside the disk are
false negatives.
The matching is point-based, not pixel-based. If your predictor outputs
a binary heatmap, run connected-components and feed the centroids (in
nm) as predicted_points. openlithohub._utils.morphology has the
primitives — there is no need to reinvent them.
Coordinates are in nanometers throughout to match the rest of the benchmark stack (LithoSample.metadata exposes nm units consistently).
compute_hotspot_detection(predicted_points, ground_truth_points, match_radius_nm=1.0)
Score a hotspot predictor against a ground-truth point list.
A predicted point is a true positive iff it can be paired with a GT
point within match_radius_nm, under a maximum-cardinality
minimum-cost assignment (Hungarian algorithm) — independent of the
order predicted_points arrives in.
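The order-independence claim can be illustrated without a full Hungarian solver. The sketch below swaps in a simpler technique, augmenting-path maximum bipartite matching (Kuhn's algorithm), which yields the same true-positive count because that count depends only on the matching's cardinality. The function names and the zero-valued conventions for empty inputs are assumptions, not the library's behaviour.

```python
from math import hypot

def max_matching(pred, gt, radius):
    """Maximum number of one-to-one (prediction, GT) pairs within `radius`."""
    adjacent = [[j for j, g in enumerate(gt) if hypot(p[0] - g[0], p[1] - g[1]) <= radius]
                for p in pred]
    match_of_gt = [-1] * len(gt)  # index of the prediction matched to each GT point

    def try_assign(i, seen):
        for j in adjacent[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take a free GT point, or re-route the prediction currently holding it.
            if match_of_gt[j] == -1 or try_assign(match_of_gt[j], seen):
                match_of_gt[j] = i
                return True
        return False

    return sum(try_assign(i, set()) for i in range(len(pred)))

def hotspot_scores(pred, gt, radius_nm=1.0):
    tp = max_matching(pred, gt, radius_nm)
    fp, fn = len(pred) - tp, len(gt) - tp
    precision = tp / len(pred) if pred else 0.0  # assumed convention for empty input
    recall = tp / len(gt) if gt else 0.0         # assumed convention for empty input
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall, "f1": f1}

# A greedy nearest-first pass would pair the first prediction with the first GT
# point and strand the second prediction; the matching re-routes it instead.
scores = hotspot_scores([(0.0, 0.0), (1.5, 0.0)], [(0.8, 0.0), (-0.9, 0.0)], radius_nm=1.0)
```

Reversing the order of the predictions gives the same counts, which is the property the assignment-based formulation is meant to guarantee.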
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predicted_points | Tensor |  | required |
| ground_truth_points | Tensor |  | required |
| match_radius_nm | float | Maximum nm distance at which a predicted point is considered to have located a GT hotspot. ICCAD'16 literature commonly uses 1 nm (exact-pixel match) or a few nm to allow for centroid jitter. | 1.0 |

Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dict with … result merges cleanly with other … Edge cases: … |
Source code in src/openlithohub/benchmark/metrics/hotspot.py
openlithohub.benchmark.metrics.sraf
SRAF non-printing penalty.
Sub-Resolution Assist Features should bias the diffraction pattern around main features without ever clearing the resist threshold themselves. A printed SRAF shows up on the wafer as a stray defect, which is a yield killer.
This module provides a differentiable penalty that callers add to their ILT or OPC training loss. It is complementary to (not a substitute for) the curvilinear MRC loss requested in issue #8 — that one polices mask geometry, this one polices the aerial-image response inside SRAF regions.
sraf_print_penalty(aerial_image, sraf_mask, *, print_threshold=0.3, margin=0.05)
Differentiable penalty for SRAFs whose aerial intensity risks printing.
For every pixel inside sraf_mask, penalise the amount by which the
aerial intensity exceeds print_threshold - margin. Squared-ReLU keeps
the gradient growing as the violation deepens, which empirically converges
faster than plain L1 inside ILT inner loops.
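The shape of the penalty can be sketched on plain nested lists. This version averages the squared-ReLU violation over SRAF pixels; the choice of a mean (rather than a sum) is an assumption, `sraf_penalty_sketch` is a hypothetical name, and unlike the tensor-based function it is not differentiable.

```python
def sraf_penalty_sketch(aerial, sraf_mask, print_threshold=0.3, margin=0.05):
    """Mean squared-ReLU of (intensity - budget) over pixels inside the SRAF mask."""
    budget = print_threshold - margin
    total, count = 0.0, 0
    for aerial_row, mask_row in zip(aerial, sraf_mask):
        for intensity, inside in zip(aerial_row, mask_row):
            if inside:
                total += max(intensity - budget, 0.0) ** 2
                count += 1
    # Assumption: normalise by the number of SRAF pixels; empty masks cost nothing.
    return total / count if count else 0.0

aerial = [[0.20, 0.35], [0.90, 0.10]]
sraf = [[1, 1], [0, 0]]  # only the top row belongs to SRAFs
penalty = sraf_penalty_sketch(aerial, sraf)
```

Only the 0.35 pixel exceeds the 0.25 budget. The bright 0.90 pixel lies outside the SRAF mask and contributes nothing, since main features are expected to print.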
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| aerial_image | Tensor | Simulated aerial image. Either … | required |
| sraf_mask | Tensor | Binary tensor of the same shape as … | required |
| print_threshold | float | Resist-clearing threshold. Defaults to … | 0.3 |
| margin | float | Safety headroom subtracted from … | 0.05 |

Returns:
| Type | Description |
|---|---|
| Tensor | Scalar … exceeds the budget; positive otherwise. |
Source code in src/openlithohub/benchmark/metrics/sraf.py
openlithohub.benchmark.metrics.mrc_loss
Differentiable Mask Rule Check (MRC) loss for curvilinear masks.
Companion to benchmark.compliance.mrc.check_curvilinear_mrc (post-hoc
binary verdict) and sraf.sraf_print_penalty (aerial-image-side penalty).
This module gives optimisers a smooth, differentiable signal so curvilinear
ILT / level-set / Neural-ILT models can learn to respect MRC during training
instead of being scored on it afterwards.
Drop into a training loop:
    loss = epe_loss + alpha * curvilinear_mrc_loss(mask, pdk="asap7")
See issue #8 for motivation.
curvilinear_mrc_loss(mask, pdk=None, *, min_width_nm=None, min_spacing_nm=None, min_curvature_radius_nm=20.0, pixel_size_nm=None, weight_min_cd=1.0, weight_min_spacing=1.0, weight_min_curvature=1.0)
Differentiable MRC penalty for curvilinear masks.
Three additive terms, each non-negative and zero on a clean mask:
- Min-CD: soft morphological opening with structuring radius r = floor(min_width_nm / (2 * pixel_size_nm)). Pixels the mask claims that the opening drops contribute relu(mask - opening), summed and normalised by area. This mirrors the binary check in compliance.mrc.check_mrc so the loss and the verdict agree on what a violation is.
- Min-spacing: same opening applied to 1 - mask; gaps too narrow to host the structuring element get penalised.
- Min-curvature: boundary-band integral of the squared image gradient. ‖∇mask‖² peaks at sharp transitions, so any region where the local gradient magnitude exceeds the curvature budget 1 / min_curvature_radius_nm (in per-nm units) is squared-ReLU penalised. The "boundary band" is the symmetric difference dilation(mask, 1) - erosion(mask, 1), restricting the cost to pixels actually on a contour and keeping the loss well-defined for large flat interior regions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor | Continuous mask in … | required |
| pdk | PdkRules \| str \| None | PDK rules to source defaults from. May be a … | None |
| min_width_nm | float \| None | Override for … | None |
| min_spacing_nm | float \| None | Override for … | None |
| min_curvature_radius_nm | float | Minimum allowed local radius of curvature. Defaults to … | 20.0 |
| pixel_size_nm | float \| None | Override for … | None |
| weight_min_cd | float | Weight for the min-CD term. | 1.0 |
| weight_min_spacing | float | Weight for the min-spacing term. | 1.0 |
| weight_min_curvature | float | Weight for the min-curvature term. | 1.0 |

Returns:
| Type | Description |
|---|---|
| Tensor | Scalar … rule-respecting mask, positive otherwise. |
Source code in src/openlithohub/benchmark/metrics/mrc_loss.py
Compliance
openlithohub.benchmark.compliance.mrc
Mask Rule Check (MRC) — minimum width/spacing for mask manufacturing.
MRCResult
dataclass
Result of a Mask Rule Check.
.. note::
violation_count is the count of violating pixels, the
sum of per-pixel boolean masks for width and spacing
rules. It is unclipped and scales with the feature area of the
layout — a 4096² mask with 1% violation density reports a
bigger number than a 256² one with the same fractional rate.
Use violation_rate (count / total pixels) for area-
independent comparison.
MRC ``violation_count`` is **not directly comparable** to DRC
``violation_count`` — DRC counts connected components and is
clipped at the rule's ``max_reports`` cap, while MRC counts
pixels. ``passed`` / ``passed`` comparisons are well-defined;
magnitude comparisons are not.
``violations`` is a per-violation sample list (capped at
``max_reports``, evenly spaced) used for visualisation and
debug; do not derive counts from it — use ``violation_count``
directly.
Source code in src/openlithohub/benchmark/compliance/mrc.py
CurvilinearMRCResult
dataclass
Result of a curvilinear-specific Mask Rule Check.
Curvilinear masks (post-ILT, EUV) cannot be validated with Manhattan-only rules. This adds two checks aimed at MBMW writability:
- Minimum curvature radius (sharp cusps cannot be written).
- Minimum feature area (sub-resolution dots cannot be reliably exposed).
Source code in src/openlithohub/benchmark/compliance/mrc.py
check_mrc(mask, min_width_nm=40.0, min_spacing_nm=40.0, pixel_size_nm=1.0)
Check mask against minimum width and spacing rules.
MRC violations are a hard-fail metric — a mask that violates these rules cannot be manufactured regardless of optical performance.
Width check: morphological opening with structuring element of size
kernel = floor(min_width_nm / pixel_size_nm) (i.e. the largest disk
that physically fits inside a feature of exactly min_width_nm). The
kernel half-width passed to binary_erosion is therefore
(kernel - 1) // 2. Features that disappear under this opening are
width violations. A feature exactly min_width_nm wide passes.
Spacing check: same logic on the inverted mask — gaps that disappear under opening are too narrow.
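The kernel arithmetic and the opening-based test can be tried on a single row of pixels. This is a one-dimensional sketch under the assumption that a half-width `hw` means a window of `2 * hw + 1` pixels and that pixels outside the row count as background; the helper names are illustrative and the library works on 2-D tensors.

```python
import math

def opening_half_width(min_width_nm, pixel_size_nm=1.0):
    """Half-width derived from kernel = floor(min_width_nm / pixel_size_nm)."""
    kernel = math.floor(min_width_nm / pixel_size_nm)
    return (kernel - 1) // 2

def erode_1d(row, hw):
    """A pixel survives only if its whole window is foreground and inside the row."""
    n = len(row)
    return [int(all(0 <= i + k < n and row[i + k] for k in range(-hw, hw + 1)))
            for i in range(n)]

def dilate_1d(row, hw):
    """A pixel is set if any pixel in its window is foreground."""
    n = len(row)
    return [int(any(0 <= i + k < n and row[i + k] for k in range(-hw, hw + 1)))
            for i in range(n)]

def width_violations_1d(row, min_width_nm, pixel_size_nm=1.0):
    """Foreground pixels removed by the opening are flagged as width violations."""
    hw = opening_half_width(min_width_nm, pixel_size_nm)
    opened = dilate_1d(erode_1d(row, hw), hw)
    return [int(v and not o) for v, o in zip(row, opened)]

exact = width_violations_1d([0, 0, 1, 1, 1, 1, 1, 0, 0], min_width_nm=5.0)   # 5 px wide
narrow = width_violations_1d([0, 0, 1, 1, 1, 1, 0, 0, 0], min_width_nm=5.0)  # 4 px wide
```

The 5-pixel feature survives the opening untouched, while every pixel of the 4-pixel feature is flagged. For the default `min_width_nm=40.0` at 1 nm per pixel, the formula gives a kernel of 40 and a half-width of 19.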
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor | Binary mask tensor (H, W) or (B, C, H, W). | required |
| min_width_nm | float | Minimum allowed feature width. | 40.0 |
| min_spacing_nm | float | Minimum allowed spacing between features. | 40.0 |
| pixel_size_nm | float | Physical pixel size for unit conversion. | 1.0 |

Returns:
| Type | Description |
|---|---|
| MRCResult | MRCResult with pass/fail status and violation details. |
Source code in src/openlithohub/benchmark/compliance/mrc.py
check_curvilinear_mrc(mask, min_curvature_radius_nm=20.0, min_feature_area_nm2=1600.0, pixel_size_nm=1.0, smoothing_window=5, max_reports=100)
Check curvilinear-specific manufacturing rules on a binary mask.
Two rules, both targeting MBMW writability of post-ILT curvilinear shapes:
- Minimum curvature radius. The contour is traced, smoothed with a periodic moving average to suppress rasterization aliasing, then discrete curvature is computed at each point via the Menger (three-point circumscribed circle) formula. A point violates if its radius (1/|kappa|) falls below min_curvature_radius_nm. The smoothing offset (smoothing_window // 2) skips evaluation near sharp 90 degree corners typical of Manhattan input, so right-angled designs do not falsely fail.
- Minimum feature area. 4-connected components below min_feature_area_nm2 are flagged as sub-resolution dots.
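The Menger formula gives the radius of the circle through three consecutive contour points as R = abc / (4K), where a, b, c are the side lengths and K is the triangle area. A minimal sketch follows; `menger_radius` is an illustrative name, and contour tracing and smoothing are left out.

```python
from math import hypot, inf

def menger_radius(p0, p1, p2):
    """Circumscribed-circle radius through three points; inf if they are collinear."""
    a = hypot(p1[0] - p0[0], p1[1] - p0[1])
    b = hypot(p2[0] - p1[0], p2[1] - p1[1])
    c = hypot(p2[0] - p0[0], p2[1] - p0[1])
    # The cross product of the two edge vectors equals twice the triangle area.
    twice_area = abs((p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0]))
    if twice_area == 0.0:
        return inf  # straight segment: zero curvature
    return a * b * c / (2.0 * twice_area)

on_circle = menger_radius((5.0, 0.0), (0.0, 5.0), (-5.0, 0.0))  # points on a radius-5 circle
straight = menger_radius((0.0, 0.0), (1.0, 0.0), (2.0, 0.0))
corner = menger_radius((0.0, 1.0), (0.0, 0.0), (1.0, 0.0))      # pixel-scale right angle
```

Multiplying the result by `pixel_size_nm` and comparing it with `min_curvature_radius_nm` yields the per-point verdict. The unsmoothed right-angle corner has a radius below one pixel, which is why smoothing matters for Manhattan input.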
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Tensor | Binary mask tensor (H, W) or (B, C, H, W). | required |
| min_curvature_radius_nm | float | Minimum allowed local radius of curvature. | 20.0 |
| min_feature_area_nm2 | float | Minimum allowed area for a connected feature. | 1600.0 |
| pixel_size_nm | float | Physical pixel size for unit conversion. | 1.0 |
| smoothing_window | int | Window size for the periodic moving-average smoother. Set to 1 to disable smoothing. Larger values relax the curvature check; the default suits 1 nm/pixel ILT outputs. | 5 |
| max_reports | int | Cap on per-category violation reports. | 100 |

Returns:
| Type | Description |
|---|---|
| CurvilinearMRCResult | CurvilinearMRCResult with pass/fail status and violation details. |
Source code in src/openlithohub/benchmark/compliance/mrc.py
openlithohub.benchmark.compliance.drc
Design Rule Check (DRC) — layout-level geometric constraint validation.
DRCRuleDeck
dataclass
DRCResult
dataclass
Result of a Design Rule Check.
.. note::
violation_count is the number of reported violations,
i.e. len(violations). Each rule check caps how many
per-component reports it adds (typically max_reports = 50,
evenly sampled), so on a heavily-violating layout this number
is clipped, not the true total. Use rule_summary for the
per-rule reported counts and treat passed (which fails on any
violation at all) as the only sound binary signal.
DRC ``violation_count`` is **not directly comparable** to MRC
``violation_count`` — the latter counts violating *pixels*, an
unclipped scalar that scales with feature area, while DRC
counts (clipped) connected components. ``passed`` / ``passed``
comparisons are well-defined; magnitude comparisons are not.
Source code in src/openlithohub/benchmark/compliance/drc.py
check_drc(mask, rule_deck='default', pixel_size_nm=1.0)
Run Design Rule Check on a mask layout.
Checks: minimum width, minimum spacing, minimum area, notch detection.
Notch semantics. min_notch_nm flags only fully-enclosed background
concavities — small bg pockets surrounded on all sides by foreground —
that a closing of the foreground at radius min_notch_nm / 2 would fill
in. Through-channels (narrow bg gaps that touch the image border) and the
open exterior background are intentionally excluded; those are spacing
violations and are reported by min_spacing_nm instead. This split
avoids double-counting the same physical defect under two rules.
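The distinction between enclosed pockets and through-channels can be sketched with a border flood fill: background reachable from the image border is exterior or a through-channel, and whatever background remains is fully enclosed. The sketch assumes 4-connectivity and omits the closing at radius `min_notch_nm / 2` that decides whether a pocket is small enough to flag; `enclosed_background` is a hypothetical helper.

```python
from collections import deque

def enclosed_background(mask):
    """Background pixels that cannot reach the image border through background."""
    h, w = len(mask), len(mask[0])
    reached = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the fill with every background pixel on the border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and not mask[y][x]:
                reached[y][x] = True
                queue.append((y, x))
    # Breadth-first flood fill through 4-connected background.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] and not reached[ny][nx]:
                reached[ny][nx] = True
                queue.append((ny, nx))
    return [(y, x) for y in range(h) for x in range(w)
            if not mask[y][x] and not reached[y][x]]

ring = [[0, 0, 0, 0, 0], [0, 1, 1, 1, 0], [0, 1, 0, 1, 0], [0, 1, 1, 1, 0], [0, 0, 0, 0, 0]]
channel = [[1, 1, 0, 1, 1], [1, 1, 0, 1, 1], [1, 1, 0, 1, 1]]
pocket = enclosed_background(ring)
through = enclosed_background(channel)
```

The ring's centre pixel is the only enclosed pocket, while the narrow gap in `channel` touches the border and is left to the spacing rule.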
Source code in src/openlithohub/benchmark/compliance/drc.py
Report
openlithohub.benchmark.report
Evaluation report generation.
generate_report(metrics, output_format='table')
Generate a formatted evaluation report from computed metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metrics | dict[str, Any] | Dictionary of metric names to values. | required |
| output_format | str | 'table' (rich terminal), 'json', or 'markdown'. | 'table' |

Returns:
| Type | Description |
|---|---|
| str | Formatted report string. |