Evaluation

Last updated: Mar 20th 2022

Platform

All registered teams will be provided access to their dedicated, private Google Drive folder. Note the following:

    • Please follow the provided folder structure for uploading submission files:
      • there are dedicated subfolders for the alpha and the final round, and
      • within those, there are subfolders for each benchmark; upload your submissions into those subfolders.
    • Each team may upload submission files at any time; uploaded files are automatically downloaded to our servers for evaluation.
      • Submission files are the DEF file and the post-layout netlist; no other files from the benchmark ZIP archives need to be re-uploaded.
      • You can upload your submission files as follows:
        • directly as is, i.e., as plain files (the approach used so far);
        • within a subfolder (use level/depth of 1 only, i.e., no further subfolders inside that subfolder; do not use folder names containing ‘results’);
        • or as a ZIP archive.
      • DEF and netlist files that go together must use the same basename, e.g., trial1.def and trial1.v. You may still upload multiple trials at once; they will be handled in separate runs.
        • This also applies to subfolder submission.
        • For ZIP submission, you cannot put multiple trials into one archive; each trial must go into a separate ZIP (see the packaging sketch after this list).
      • Re-uploaded submission files with the same name (the Drive option “Replace existing file”) are not re-evaluated. Thus, when re-uploading, either select the Drive option “Keep both files” or use different file names.
      • Renaming files after they have been uploaded and processed will also not trigger re-evaluation.
    • Results will be returned/uploaded into the same benchmark subfolder; they include scores and report files as generated by our evaluation scripts.
    • Participants will receive an email notification once their results are available.
    • You may also keep other files in the Drive; feel free to put them in the home directory or to organize them in folders separate from the provided folder structure. These other files will not be downloaded for evaluation.
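
As an illustration of the naming and packaging rules above, here is a minimal sketch in Python that bundles each trial, i.e., one DEF plus one netlist sharing the same basename, into its own ZIP archive before uploading. The file names and directory layout are hypothetical examples, not prescribed by the contest:

    import zipfile
    from pathlib import Path

    def package_trials(layout_dir: str) -> None:
        """Package each trial (DEF + netlist with the same basename) into its own ZIP.

        Assumes hypothetical file names like trial1.def / trial1.v; one trial per
        archive, since multiple trials must not share a single ZIP.
        """
        for def_file in Path(layout_dir).glob("*.def"):
            netlist = def_file.with_suffix(".v")
            if not netlist.exists():
                print(f"Skipping {def_file.name}: no matching netlist {netlist.name}")
                continue
            archive = def_file.with_suffix(".zip")
            with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
                zf.write(def_file, def_file.name)
                zf.write(netlist, netlist.name)
            print(f"Created {archive.name}")

    # Example (hypothetical local folder): package_trials("my_layouts/aes")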

Scoring

The scoring considers multiple security and design metrics, as explained here in detail. In addition, some constraints must be followed, as outlined further below.

As of now (alpha round), the details for scoring are still subject to change. The key ideas and metrics, however, are finalized. 

For Trojan insertion, metrics are based on exploitable regions, i.e., sets of (w.l.o.g. 20+) spatially contiguous placement sites that are either a) free or b) occupied only by filler cells or other non-functional cells, or by functional but unconnected cells. Routing resources are also considered, as Trojans require some connectivity as well. In other words, exploitable regions are those where an attacker would be able to find or make enough space and routing resources to insert and connect their Trojans.

Note that exploitable regions are only defined and evaluated within an exploitable distance of cells related to security assets. That is because Trojans must be placed and routed such that the timing requirements of the original design are still met; thus, placement sites closer to security assets tend to be more vulnerable.

More specifically, the exploitable distances are determined in both the horizontal and vertical direction as follows. For the paths related to the asset cells, the positive timing slack (if any) is determined. Next, the delay impact of adding an additional NAND gate (representing an exemplary, most simple form of Trojan) into those paths is determined, by analyzing routing delays and the standard-cell library for different output loads and input transition times. To simplify and limit evaluation runtime, note that we do not perform actual routing here; rather, we estimate slacks from estimated wiring loads and the capacitive loads extracted from the library for the NAND gate. We then derive the maximal distance that Trojan routing could span while consuming just that slack but not more (i.e., while still meeting timing).
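
To illustrate this estimation (not the exact evaluation code), the following sketch converts the positive slack of a path into a maximal Trojan routing distance: the delay of an inserted NAND gate, as looked up for the estimated load and transition, is subtracted from the slack, and the remainder is divided by an assumed per-micron wire delay. The function name, the default wire-delay value, and the example numbers are hypothetical placeholders:

    def exploitable_distance_um(pos_slack_ns: float,
                                nand_delay_ns: float,
                                wire_delay_ns_per_um: float = 0.001) -> float:
        """Estimate the maximal Trojan routing distance (in um) a path can absorb.

        pos_slack_ns: positive setup slack of the asset-related path.
        nand_delay_ns: delay of one inserted NAND gate, taken from the cell library
                       for the estimated output load and input transition time.
        wire_delay_ns_per_um: assumed (hypothetical) routing delay per um of Trojan wiring.
        """
        remaining_ns = pos_slack_ns - nand_delay_ns
        if remaining_ns <= 0.0:
            # No slack left after inserting the NAND gate; no routing distance affordable.
            return 0.0
        return remaining_ns / wire_delay_ns_per_um

    # Example: 0.12 ns slack, 0.02 ns NAND delay -> up to ~100 um of Trojan routing.
    print(exploitable_distance_um(0.12, 0.02))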

For frontside probing and fault injection, metrics are based on the exposed area of cells and nets related to security assets. The notion of exposed area describes any region (be it continuous or fragmented) of those cell/net assets which is accessible via direct line of sight through the metal stack. A visual example for the exposed area of some standard cells is provided along with the sample benchmarks.
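
For intuition on how exposed area can be quantified, the following sketch estimates the exposed percentage of one cell by grid sampling: a sample point of the cell is counted as exposed if no metal shape above it blocks the line of sight. Rectangles are given as (x1, y1, x2, y2) tuples; the coordinates, units, and sampling pitch are illustrative only, and the official evaluation may use exact geometric operations instead:

    def exposed_area_pct(cell, blockages, step=0.05):
        """Estimate the percentage of a cell's area with direct line of sight
        through the metal stack, using simple grid sampling.

        cell: (x1, y1, x2, y2) rectangle of the cell asset.
        blockages: list of (x1, y1, x2, y2) rectangles of metal shapes above the cell.
        step: sampling pitch, in the same unit as the coordinates.
        """
        x1, y1, x2, y2 = cell
        total = exposed = 0
        y = y1 + step / 2
        while y < y2:
            x = x1 + step / 2
            while x < x2:
                total += 1
                covered = any(bx1 <= x <= bx2 and by1 <= y <= by2
                              for bx1, by1, bx2, by2 in blockages)
                if not covered:
                    exposed += 1
                x += step
            y += step
        return 100.0 * exposed / total if total else 0.0

    # Example: one metal shape covering the left half of a 1x1 cell -> ~50% exposed.
    print(exposed_area_pct((0, 0, 1, 1), [(0, 0, 0.5, 1)]))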

Note that more general background on Trojan insertion as well as frontside probing and fault injection is provided in the detailed contest description.

A short description of all the considered metrics, along with their variable names, is given in the following classification.

    • Security
      • Trojan insertion
        • Placement sites of exploitable regions
          • Max number of sites across all exploitable regions — sites_max
          • Avg number of sites across all exploitable regions — sites_avg
          • Total number of sites across all exploitable regions — sites_total
        • Routing resources of exploitable regions
          • Max number of free tracks across all exploitable regions — fts_max
          • Avg number of free tracks across all exploitable regions — fts_avg
          • Total number of free tracks across all exploitable regions — fts_total
          • Note: for each exploitable region, free tracks are summed up across all the globally utilized metal layers
      • Frontside probing and fault injection
        • Exposed area of standard cell assets
          • Max percentage of exposed area across all cell assets — ea_c_max
          • Avg percentage of exposed area across all cell assets — ea_c_avg
          • Total exposed area across all cell assets — ea_c_total
        • Exposed area of net assets
          • Max percentage of exposed area across all net assets — ea_n_max
          • Avg percentage of exposed area across all net assets — ea_n_avg
          • Total exposed area across all net assets — ea_n_total
    • Design cost
      • Power
        • Total power — p_total
      • Performance
        • Worst negative slack for setup timing requirement — setup_WNS
        • Total negative slack for setup timing requirement — setup_TNS
        • Failing endpoints for setup timing requirement — setup_FEP
      • Area
        • Total die area (not standard cell area) — die_area
      • Routing
        • Number of DRC violations — DRC_viol

The actual score calculation is detailed below. More importantly, note the following aspects of the scoring:

    • All metrics are grouped following the above classification. All metrics are weighted and summed up within their related group. Weighting is also applied across the classification hierarchy.
    • For the score, we consider the product of the (summed and weighted) security metrics, covering Trojan insertion as well as frontside probing and fault injection, with the (summed and weighted) design-cost metrics. Thus, security and design cost are to be optimized at once; or rather, security should be incorporated without negative impact on design cost.
    • All metrics are normalized to their respective nominal baseline, obtained from the provided benchmark layouts.
      • A layout that improves on some metric will be scored a value between 0 and 1, whereas a deteriorated layout will be scored a value > 1. Note that, for metrics requiring case-specific considerations, this does not strictly hold true. For example, if both the baseline and the submission layout have a value of 0 for some metric, the related score will be 0 as well (and not 1 for no improvement).
      • Some metrics require case-specific considerations for such normalized scoring. Details are given along with the actual score calculation below.
      • Such normalized scoring is more sensitive to deterioration than to improvements. This is on purpose: the main objective is to further improve the layouts, not to deteriorate them. Note that it does not matter whether some benchmark's baseline layout is already aggressively optimized in terms of design cost or not; a more (versus less) aggressively optimized baseline will just provide less (versus more) flexibility for optimizing both security and design cost.

The actual score calculation is detailed next. The overall score, to be minimized, is defined as:

score = sec * des = ((ti + fsp_fi) / 2) * des

with sub referring to the submission layout and base to the baseline benchmark layout:

    1. Security — sec
      1. Trojan insertion — ti
        • 50% weighted: placement sites of exploitable regions
          • 50% weighted: score(sites_total) = sites_total(sub) / sites_total(base)
          • 33.3% weighted: score(sites_max) = sites_max(sub) / sites_max(base)
          • 16.6% weighted: score(sites_avg) = sites_avg(sub) / sites_avg(base)
        • 50% weighted: routing resources of exploitable regions
          • 50% weighted: score(fts_total) = fts_total(sub) / fts_total(base)
          • 33.3% weighted: score(fts_max) = fts_max(sub) / fts_max(base)
          • 16.6% weighted: score(fts_avg) = fts_avg(sub) / fts_avg(base)
      2. Frontside probing and fault injection — fsp_fi
        • 50% weighted: exposed area of standard cell assets
          • 50% weighted: score(ea_c_total) = ea_c_total(sub) / ea_c_total(base)
          • 33.3% weighted: score(ea_c_max) = ea_c_max(sub) / ea_c_max(base)
          • 16.6% weighted: score(ea_c_avg) = ea_c_avg(sub) / ea_c_avg(base)
        • 50% weighted: exposed area of net assets
          • 50% weighted: score(ea_n_total) = ea_n_total(sub) / ea_n_total(base)
          • 33.3% weighted: score(ea_n_max) = ea_n_max(sub) / ea_n_max(base)
          • 16.6% weighted: score(ea_n_avg) = ea_n_avg(sub) / ea_n_avg(base)
    2. Design cost — des
      • 25% weighted: power
        • score(p_total) = p_total(sub) / p_total(base) 
      • 25% weighted: performance
        • 50% weighted: setup TNS
          • If setup_TNS(sub) < 0 and setup_TNS(base) < 0:
            score(setup_TNS) = setup_TNS(sub) / setup_TNS(base)
          • Else if setup_TNS(sub) < 0 and setup_TNS(base) >= 0:
            score(setup_TNS) = abs(setup_TNS(sub))
          • Else if setup_TNS(sub) >= 0:
            score(setup_TNS) = 0
        • 33.3% weighted: setup WNS
          • If setup_WNS(sub) < 0 and setup_WNS(base) < 0:
            score(setup_WNS) = setup_WNS(sub) / setup_WNS(base)
          • Else if setup_WNS(sub) < 0 and setup_WNS(base) >= 0:
            score(setup_WNS) = abs(setup_WNS(sub))
          • Else (setup_WNS(sub) >= 0):
            score(setup_WNS) = 0
        • 16.6% weighted: setup FEP
          • For setup_FEP(base) > 0:
            score(setup_FEP) = setup_FEP(sub) / setup_FEP(base)
          • For setup_FEP(base) = 0:
            score(setup_FEP) = setup_FEP(sub)
      • 25% weighted: area
        • score(die_area) = die_area(sub) / die_area(base)
      • 25% weighted: routing
        • For DRC_viol(base) > 0:
          score(DRC_viol) = DRC_viol(sub) / DRC_viol(base)
        • For DRC_viol(base) = 0:
          score(DRC_viol) = DRC_viol(sub)
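
The calculation above maps directly to code. The following sketch computes the overall score from dictionaries of metric values for the submission and the baseline layouts, using the weights as listed (50%/33.3%/16.6% within each triple, 25% each for the design-cost groups) and the case distinctions for setup_TNS, setup_WNS, setup_FEP, and DRC_viol. Function and dictionary names are hypothetical and do not belong to the official evaluation scripts:

    def ratio(sub, base):
        # Normalization against the baseline. If the baseline value is 0, the
        # submission value itself is used (and 0/0 scores 0), matching the
        # case-specific rules for setup_FEP and DRC_viol above.
        if base == 0:
            return float(sub)
        return sub / base

    def slack_score(sub, base):
        # Case-specific scoring for setup_TNS and setup_WNS (negative = violation).
        if sub >= 0:
            return 0.0
        if base < 0:
            return sub / base
        return abs(sub)

    W3 = (0.5, 0.333, 0.166)  # weights for total, max, avg (approx. 1/2, 1/3, 1/6)

    def weighted(scores):
        return sum(w * s for w, s in zip(W3, scores))

    def overall_score(sub, base):
        """sub, base: dicts mapping the metric names above (e.g., 'sites_total') to values."""
        def triple(prefix):
            return weighted([ratio(sub[prefix + "_total"], base[prefix + "_total"]),
                             ratio(sub[prefix + "_max"],   base[prefix + "_max"]),
                             ratio(sub[prefix + "_avg"],   base[prefix + "_avg"])])

        ti = 0.5 * triple("sites") + 0.5 * triple("fts")        # Trojan insertion
        fsp_fi = 0.5 * triple("ea_c") + 0.5 * triple("ea_n")    # probing / fault injection
        sec = (ti + fsp_fi) / 2

        perf = weighted([slack_score(sub["setup_TNS"], base["setup_TNS"]),
                         slack_score(sub["setup_WNS"], base["setup_WNS"]),
                         ratio(sub["setup_FEP"], base["setup_FEP"])])
        des = 0.25 * (ratio(sub["p_total"], base["p_total"]) + perf
                      + ratio(sub["die_area"], base["die_area"])
                      + ratio(sub["DRC_viol"], base["DRC_viol"]))
        return sec * des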

Constraints

For a fair and focused competition, participants have to adhere to some constraints. Please note that, for violations during the alpha round, feedback will be given to the respective team(s). For the final round, any violation will result in the related benchmark submission being dropped, and repeated cases of excessive violations may result in disqualification.

The list of constraints is as follows. Note that, as of now (alpha round), this list is still subject to change; however, no significant changes are to be expected, if any at all.

    • Participants must maintain the overall functional equivalence of the underlying design. However, participants are free to revise parts of the design for implementation, as long as this and the next constraint are met.
    • Participants must maintain the sensitive components, which are declared along with each benchmark. More specifically, cells and nets declared as assets cannot be removed or entirely restructured. However, participants are free to revise the physical design of assets as well as other logic in general.
    • Participants cannot add dedicated, custom circuitry (e.g., sensors to detect laser fault injection).
    • Participants cannot design custom cells; only those cells defined in the provided LIB/LEF files can be utilized.
    • Participants cannot add additional metal layers (e.g., to protect against frontside probing attacks).
    • Participants cannot move the power delivery network (PDN) to different layers. More specifically, for each metal layer, participants need to keep the ratio of area covered by PDN metals to total die area approximately constant; deviations of up to +/- 10% in area coverage are permissible. Note that the reference ratio is scaled automatically for any re-sizing of the die in submission layouts. Also note that any metal layer which does not hold some part of the PDN in the baseline layout is still allowed to cover a small PDN area of up to 5 um^2 in the submission layout. A sketch of this check is given after this list.
    • Participants must include a clock tree but are free to revise its implementation, as long as the other constraints are met.
    • Participants must maintain the relative IO pin placement. Note that re-scaling the die outline is allowed in general; if done, the IO pins must afterwards be placed at similar relative coordinates. Deviations of up to +/- 5% along the horizontal/vertical boundaries are permissible.
    • Participants cannot incorporate trivial defenses; specifically, filler cells, other non-functional cells, and functional but unconnected cells are considered as free placement sites for evaluation of exploitable regions.
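
As referenced in the PDN constraint above, the following sketch checks, per metal layer, whether the submission's PDN coverage ratio stays within +/- 10% of the baseline ratio, and allows up to 5 um^2 of PDN area on layers that hold no PDN in the baseline. The function name and input structure are hypothetical, and whether the 10% tolerance is relative to the baseline ratio (as assumed here) or expressed in absolute coverage points should be confirmed against the official evaluation scripts:

    def check_pdn_coverage(base_pdn_area, sub_pdn_area, base_die_area, sub_die_area,
                           rel_tol=0.10, abs_allowance_um2=5.0):
        """Check the per-layer PDN coverage constraint.

        base_pdn_area, sub_pdn_area: dicts {layer_name: PDN metal area in um^2}.
        base_die_area, sub_die_area: total die areas in um^2; using the submission's
        own die area scales the reference ratio for any die re-sizing.
        Returns a dict {layer_name: True/False}.
        """
        ok = {}
        for layer in set(base_pdn_area) | set(sub_pdn_area):
            base_ratio = base_pdn_area.get(layer, 0.0) / base_die_area
            sub_ratio = sub_pdn_area.get(layer, 0.0) / sub_die_area
            if base_ratio == 0.0:
                # Layer holds no PDN in the baseline: only a small absolute area is allowed.
                ok[layer] = sub_pdn_area.get(layer, 0.0) <= abs_allowance_um2
            else:
                # Coverage ratio must stay within +/- 10% of the baseline ratio
                # (assumption: tolerance interpreted relative to the baseline ratio).
                ok[layer] = abs(sub_ratio - base_ratio) <= rel_tol * base_ratio
        return ok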