Evaluation

Last updated: May 18, 2023 (added some info on final-round scoring)

Platform

With the beginning of the alpha round, all registered teams will be provided access to their dedicated, private Google Drive folder.

Important: Please read the following instructions carefully before using the Google Drive.

    • Folders: You must keep and follow the provided folder structure for uploading your submission files.
      • There are dedicated folders for the alpha and the final round.
        • Within these round folders, there are subfolders for each benchmark — upload your submissions only into these benchmark subfolders.
      • You may also keep any other files in the Drive, but only in the home directory or in other subfolders separate from the provided folder structure. These other files won’t be downloaded for evaluation.
    • Submission: Upon submission, files are automatically downloaded to our server and queued for evaluation in batch processing.
      • You can submit at any time and as many times as you wish. While all submissions are evaluated (and backed-up), only your currently best solutions will be considered for any ranking. Official, final rankings are subject to manual checks by the contest organizers.
      • Participants will receive an email notification once processing of their submission is started. Depending on the workload of the evaluation server and the number of your currently ongoing runs, this may take some time.
      • Submission files are: the DEF file generated by your security-closure flow and the corresponding post-route netlist. No other files from the benchmark archives need to be re-uploaded; any such files that are re-uploaded would be ignored.
      • You should upload your two submission files (.DEF, .v) as a ZIP archive; see the packaging sketch after this list.
        • You cannot put DEF and netlist files for multiple runs into one archive — each run (set of .DEF, .v files) must go into a separate ZIP file.
        • However, you can upload multiple ZIP files for multiple runs at once.
        • You can upload ZIP files either:
          • directly into the respective benchmark’s subfolder, or
          • within another level of sub-subfolders in the respective benchmark’s subfolder. You can only use one level/depth of sub-subfolders. For example, files could be arranged like this: team/alpha/AES/methodA/trial1.zip, team/alpha/AES/methodA/trial2.zip, team/alpha/AES/methodB/trial1.zip, etc.
      • Once uploaded, re-naming ZIP files or re-uploading the same ZIP files will not trigger re-evaluation — Google Drive maintains the same internal file ID for such cases. You must do one of the following instead:
        • Delete the file(s) from the Drive and upload again as new file, with the same name; or
        • Upload again, making sure to select “Keep both files” in the “Upload options” pop-up window.
    • Results: Results will be returned/uploaded into the same benchmark subfolder.
      • Participants will receive an email notification once their results are available.
      • Results will include report and log files as generated by the same set of evaluation scripts that are also shared with the Benchmarks release.
      • The results for all submissions are logged. Any rankings will automatically be based on the currently best results for each team.
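
Each run must be packaged as its own ZIP archive holding exactly one DEF and one netlist. As a minimal packaging sketch in Python (the file names are placeholders, not prescribed by the contest), two runs could be packaged like this:

    # Minimal packaging sketch (not part of the official flow); each run's
    # DEF and netlist go into their own ZIP archive.
    import zipfile

    def package_run(def_file, netlist_file, zip_name):
        with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.write(def_file)      # post-route DEF from your security-closure flow
            zf.write(netlist_file)  # corresponding post-route netlist (.v)

    # Two separate runs --> two separate ZIP files (placeholder file names).
    package_run("aes_run1.def", "aes_run1.v", "trial1.zip")
    package_run("aes_run2.def", "aes_run2.v", "trial2.zip")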

Scoring

Scoring accounts for different security and design metrics, as explained here in detail. There are also constraints to be met, which are outlined further below.

For the final round, scoring is extended with a scoresheet for actual, ECO-based insertion of different Trojans. The basic idea is outlined on the Details page, and more details for the actual scoring are found in the final-round benchmarks release, in the file _scripts/scores.sh.

For evaluating a submitted layout’s resilience against Trojan insertion during the alpha/qualifying round (but also the final round), we propose the notion of exploitable regions. An exploitable region is defined as a set of 20+ spatially contiguous placement sites, within and across rows, that are either (a) completely free or (b) occupied only by filler, decap, or tap cells. Such regions would be attractive to attackers seeking to insert additional logic for Trojan components. Routing resources are also considered, as Trojans would require some additional routing as well.
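
To make this notion concrete, the sketch below finds such regions on a simplified site grid. The grid encoding, cell labels, and 4-neighbor connectivity are assumptions for illustration only; the actual evaluation is done by the shared evaluation scripts.

    # Illustrative sketch only, not the contest's evaluation code.
    # A site is a candidate if it is free or holds only a filler, decap,
    # or tap cell; a region is a connected group of candidate sites,
    # within and across rows, with at least 20 sites.
    from collections import deque

    NON_FUNCTIONAL = {"FREE", "FILLER", "DECAP", "TAP"}  # assumed site labels
    MIN_REGION_SITES = 20

    def exploitable_regions(site_grid):
        """site_grid[row][col] holds one label per placement site."""
        rows, cols = len(site_grid), len(site_grid[0])
        seen = [[False] * cols for _ in range(rows)]
        region_sizes = []
        for r in range(rows):
            for c in range(cols):
                if seen[r][c] or site_grid[r][c] not in NON_FUNCTIONAL:
                    continue
                # flood-fill one connected region of candidate sites
                queue, size = deque([(r, c)]), 0
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and site_grid[ny][nx] in NON_FUNCTIONAL):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if size >= MIN_REGION_SITES:
                    region_sizes.append(size)
        return region_sizes  # sizes of all exploitable regions, in sites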

A short description of the considered metrics, along with their variable names listed after ‘ — ’, is given in the following list. A sketch of how the placement-site metrics are derived follows the list.

    • Security — sec
      • Trojan insertion — sec_ti
        • Placement sites — sec_ti_sts
          • Max number of sites across all exploitable regions — sec_ti_sts_max
          • Median number of sites across all exploitable regions — sec_ti_sts_med
          • Total number of sites across all exploitable regions — sec_ti_sts_sum
        • Routing resources — sec_ti_fts
          • Total number of free tracks across whole layout, across all metal layers — sec_ti_fts_sum
    • Design cost — des
      • Power — des_pwr
        • Total power — des_pwr_tot
      • Performance — des_prf
        • Worst negative slack for setup check — des_prf_WNS_set
        • Worst negative slack for hold check — des_prf_WNS_hld
      • Area — des_ara
        • Total die area (not standard-cell area) — des_ara_die
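
As referenced above, the placement-site metrics follow directly from the sizes of all exploitable regions. A short sketch (building on the region sizes returned by the sketch above; not the official scoring code) could derive them as follows:

    # Illustrative sketch: derive the placement-site metrics from the
    # sizes of all exploitable regions (cf. the region sketch above).
    from statistics import median

    def placement_site_metrics(region_sizes):
        if not region_sizes:
            return {"sec_ti_sts_max": 0, "sec_ti_sts_med": 0, "sec_ti_sts_sum": 0}
        return {
            "sec_ti_sts_max": max(region_sizes),     # largest exploitable region
            "sec_ti_sts_med": median(region_sizes),  # median region size
            "sec_ti_sts_sum": sum(region_sizes),     # total exploitable sites
        }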

Notes:

    • Timing checks, PDN checks, as well as DRC checks are hard constraints, i.e., must be met.
      • Accordingly, since timing checks are based on WNS, only positive slack values are accepted and considered for scoring.
    • All further design checks/metrics (see the Benchmarks page and the README files in the benchmark releases for more details) are not considered for scoring.
      • Thus, participants can neither improve nor worsen their scores by fixing or worsening those design issues.
      • However, they are considered as soft constraints with a margin of +10 issues — any increase beyond that margin, for any check, is considered a fail.
    • All metrics are normalized to their respective nominal baseline values, obtained from the provided benchmark layouts. A submission that improves on some metric will be scored with a value between 0 and 1 for that metric, whereas a deteriorated metric will be scored with a value greater than 1.
      • For positive WNS values, this means computing baseline_WNS / submission_WNS.
      • For all other metrics, this means computing submission_metric / baseline_metric.
    • Such normalized scoring is more sensitive to deterioration than it is to improvements. This is on purpose — the main objective is to further improve the layouts, not deteriorate them, so deterioration of any metric(s) should have a relatively large detrimental impact on the overall score.
    • The actual score calculation for the alpha/qualifying round is shown below; a worked sketch in code follows this list. For the final round, the actual score calculation can be found in the final-round benchmarks release, in the file _scripts/scores.sh. Note that the Benchmarks releases also contain the scripts used for scoring, for your reference.
    • Overall score = 1/2 * sec + 1/2 * des, with
      1. sec = sec_ti = 1/2 * sec_ti_sts + 1/2 * sec_ti_fts
        1. sec_ti_sts = 1/2 * sec_ti_sts_sum + 1/3 * sec_ti_sts_max + 1/6 * sec_ti_sts_med
        2. sec_ti_fts = sec_ti_fts_sum
      2. des = 1/3 * des_pwr + 1/3 * des_prf + 1/3 * des_ara
        1. des_pwr = des_pwr_tot
        2. des_prf = 1/2 * des_prf_WNS_set + 1/2 * des_prf_WNS_hld
        3. des_ara = des_ara_die
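
As referenced above, here is a worked sketch of the normalization and weighting. It is a sketch under the definitions stated in this section, with placeholder metric dictionaries; the authoritative calculation is _scripts/scores.sh in the benchmarks release.

    # Illustrative sketch of normalization and score weighting; the
    # authoritative calculation is _scripts/scores.sh in the release.

    def normalize(submission, baseline, is_wns=False):
        # WNS: larger (positive) slack is better --> baseline / submission.
        # All other metrics: smaller is better --> submission / baseline.
        return baseline / submission if is_wns else submission / baseline

    def overall_score(sub, base):
        """sub, base: dicts of raw metric values (submission, baseline)."""
        n = lambda key, wns=False: normalize(sub[key], base[key], wns)

        sec_ti_sts = (1/2 * n("sec_ti_sts_sum")
                      + 1/3 * n("sec_ti_sts_max")
                      + 1/6 * n("sec_ti_sts_med"))
        sec_ti_fts = n("sec_ti_fts_sum")
        sec = 1/2 * sec_ti_sts + 1/2 * sec_ti_fts

        des_pwr = n("des_pwr_tot")
        des_prf = (1/2 * n("des_prf_WNS_set", wns=True)
                   + 1/2 * n("des_prf_WNS_hld", wns=True))
        des_ara = n("des_ara_die")
        des = 1/3 * des_pwr + 1/3 * des_prf + 1/3 * des_ara

        return 1/2 * sec + 1/2 * des  # lower is better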

Constraints

For a fair and focused competition, participants have to adhere to some constraints as follows. As of now (alpha round), this list is still subject to change.

    • Submissions cannot incorporate trivial defenses. Specifically, filler, decap, and tap cells are scrubbed and thus considered as free placement sites for evaluation of exploitable regions.
    • Submissions must meet setup and hold timing checks, following the provided settings and SDC files for timing analysis.
    • Submissions must have 0 DRC violations.
    • For all other design checks — see the Benchmarks page and the README files in the benchmark releases for more details — there is a margin of +10 issues over the provided baseline numbers. Any increase going beyond that margin, for any check, renders the submission invalid.
    • Participants must maintain the overall functional equivalence of the underlying design. However, participants are free to revise (parts of) the design implementation, as long as this constraint and the next one are met.
    • Participants must maintain the assets, i.e., sensitive components, which are declared along with each benchmark. More specifically, cells declared as assets cannot be removed or restructured. However, participants are free to revise the physical design of assets as well as other logic in general.
    • Participants cannot design custom cells; only those cells defined in the provided LIB/LEF files can be utilized.
    • Participants cannot revise the metal layers.
    • Participants must include a clock tree in their submission but are free to revise its implementation, as long as other constraints are met.
    • Participants must follow the PDN recipe provided in the reference flow. The PDN structure’s stripes are checked for dimensions, area, and locations.
    • Submissions must maintain the general IO pin placement. More specifically, pins must remain placed at the left or right side assigned in the baseline layout, but actual pin locations (along the y-axis) can be revised.
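
To catch obvious violations of the hard and soft constraints above before submitting, a rough self-check could look like the sketch below. The report fields and check names are assumptions about how you parse your own tool reports; the authoritative checks are those performed by the shared evaluation scripts.

    # Rough pre-submission self-check sketch; the authoritative checks are
    # those performed by the shared evaluation scripts.

    def check_constraints(report, baseline_issues):
        """report: values parsed from your own tool reports (assumed fields);
        baseline_issues: per-check issue counts of the baseline layout."""
        failures = []
        # Hard constraints: positive slack for setup/hold, 0 DRC violations,
        # PDN structure following the provided recipe.
        if report["wns_setup"] <= 0 or report["wns_hold"] <= 0:
            failures.append("timing: WNS must be positive for setup and hold")
        if report["drc_violations"] != 0:
            failures.append("DRC: submissions must have 0 violations")
        if not report["pdn_checks_pass"]:
            failures.append("PDN: structure deviates from the provided recipe")
        # Soft constraints: all other design checks, margin of +10 issues each.
        for check, issues in report["design_check_issues"].items():
            if issues > baseline_issues.get(check, 0) + 10:
                failures.append(check + ": more than +10 issues over baseline")
        return failures  # an empty list suggests the submission is valid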