ETAPS Logo TACAS 2023
12th Competition on Software Verification (SV-COMP 2023)

Results of the Competition on Witness Validation

This web page presents the results of SV-COMP 2023 - 12th International Competition on Software Verification.

The background color is gold for the winner, silver for the second, and bronze for the third.

Here some brief directions for navigating in the BenchExec-generated tables with the results:


Contents

  1. Validation of Violation Witnesses
  2. Validation of Correctness Witnesses

1. Validation of Violation Witnesses


Ranking by Category (with Score-Based Quantile Plots)

ReachSafety
1. UAutomizer
2. CProver-witness2test
3. CPAchecker
Quantile-Plot
MemSafety
1. UAutomizer
2. CPAchecker
3. CPA-witness2test
Quantile-Plot
ConcurrencySafety
1. Dartagnan
2. CPAchecker
3. UAutomizer
Quantile-Plot
NoOverflows
1. UAutomizer
2. CProver-witness2test
3. CPAchecker
Quantile-Plot
Termination
1. UAutomizer
2. CPAchecker
3. MetaVal
Quantile-Plot
SoftwareSystems
1. Symbiotic-Witch
2. UAutomizer
3. CPAchecker
Quantile-Plot
Overall
1. UAutomizer
2. CPAchecker
3. Symbiotic-Witch
Quantile-Plot

Table of All Results

In every table cell for competition results, we list the points in the first row and the CPU time (rounded to two significant digits) for successful runs in the second row.

The entry '–' means that the competition candidate was not executed in the category.
The definition of the scoring schema can be found in the literature [Proc. SAS 2022] and the categories are defined on the respective SV-COMP web page.

ParticipantsPlotsCPAcheckerCPA-witness2testDartagnanCProver-witness2testGWITMetaValNITWITSymbiotic-WitchUAutomizerWIT4JAVA
Representing Jury MemberHenrik WachowitzHenrik WachowitzHernán Ponce de LeónMichael TautschnigFalk HowarMartin SpiesslJana (Philipp) BergerPaulína AyaziováDaniel DietschTong Wu
AffiliationLMU Munich, GermanyLMU Munich, GermanyHuawei Dresden Research Center, GermanyQueen Mary University of London, UKTU Dortmund University, GermanyLMU Munich, GermanyRWTH Aachen, GermanyMasaryk University, Brno, CzechiaUniversity of Freiburg, GermanyUniversity of Manchester, UK
ReachSafety
36231 valid tasks (30763 correct, 5468 wrong, 512 void), max. score: 90578
Quantile-Plot 3393812336495452480788692829062966
CPU time330000 s150000 s59000 s190000 s9100 s150000 s360000 s
ReachSafety-Arrays
2035 valid tasks (1505 correct, 530 wrong, 0 void), max. score: 5088
Quantile-Plot 59067229484788719704389
CPU time6400 s2900 s1400 s13000 s500 s2200 s19000 s
ReachSafety-BitVectors
357 valid tasks (286 correct, 71 wrong, 12 void), max. score: 892
Quantile-Plot 545220658251134386760
CPU time2000 s960 s340 s2500 s2.7 s1400 s5800 s
ReachSafety-ControlFlow
127 valid tasks (108 correct, 19 wrong, 2 void), max. score: 318
Quantile-Plot 18722222511378270
CPU time670 s310 s100 s380 s0.30 s680 s1200 s
ReachSafety-ECA
6613 valid tasks (5444 correct, 1169 wrong, 1 void), max. score: 16533
Quantile-Plot 5677165312864212587185011189
CPU time90000 s68000 s30000 s1600 s6900 s38000 s90000 s
ReachSafety-Floats
1257 valid tasks (729 correct, 528 wrong, 18 void), max. score: 3143
Quantile-Plot 96187720241633-3051433720
CPU time5000 s2900 s1100 s11000 s6.0 s220 s12000 s
ReachSafety-Heap
1208 valid tasks (1073 correct, 135 wrong, 19 void), max. score: 3020
Quantile-Plot 2858912135121820913191976
CPU time5600 s2600 s1200 s13000 s4.7 s1100 s9000 s
ReachSafety-Loops
4494 valid tasks (3511 correct, 983 wrong, 46 void), max. score: 11235
Quantile-Plot 3937173068724390151160899602
CPU time18000 s7800 s3500 s47000 s29 s9700 s39000 s
ReachSafety-ProductLines
5414 valid tasks (4697 correct, 717 wrong, 222 void), max. score: 13535
Quantile-Plot 12287052007872129319611485
CPU time48000 s0 s7500 s33000 s80 s44000 s44000 s
ReachSafety-Recursive
538 valid tasks (473 correct, 65 wrong, 0 void), max. score: 1345
Quantile-Plot 82431116310802045221147
CPU time2900 s1100 s690 s640 s140 s930 s3300 s
ReachSafety-Sequentialized
4686 valid tasks (4463 correct, 223 wrong, 54 void), max. score: 11715
Quantile-Plot 27441062874882155524619349
CPU time45000 s23000 s1800 s37000 s1200 s14000 s56000 s
ReachSafety-XCSP
917 valid tasks (857 correct, 60 wrong, 43 void), max. score: 2293
Quantile-Plot 180225342172530415681629
CPU time18000 s800 s1300 s18000 s87 s11000 s7500 s
ReachSafety-Combinations
5253 valid tasks (4822 correct, 431 wrong, 71 void), max. score: 13132
Quantile-Plot 3094597115562699709118933
CPU time62000 s29000 s10000 s15000 s100 s19000 s53000 s
ReachSafety-Hardware
3332 valid tasks (2795 correct, 537 wrong, 24 void), max. score: 8330
Quantile-Plot 19682350004731397
CPU time29000 s13000 s0 s0 s0 s7200 s18000 s
MemSafety
18856 valid tasks (17215 correct, 1641 wrong, 60 void), max. score: 47140
Quantile-Plot 90131241-113630-213431156
CPU time140000 s2700 s1600 s0 s17000 s180000 s
MemSafety-Arrays
233 valid tasks (208 correct, 25 wrong, 1 void), max. score: 583
Quantile-Plot -24123-6750-505535
CPU time570 s290 s170 s0 s210 s4300 s
MemSafety-Heap
1005 valid tasks (938 correct, 67 wrong, 38 void), max. score: 2512
Quantile-Plot 194-13660-8861948
CPU time5500 s1700 s850 s0 s580 s6600 s
MemSafety-LinkedLists
303 valid tasks (271 correct, 32 wrong, 13 void), max. score: 757
Quantile-Plot 20615970113360
CPU time1800 s230 s340 s0 s150 s490 s
MemSafety-Other
285 valid tasks (272 correct, 13 wrong, 3 void), max. score: 713
Quantile-Plot 1134485094620
CPU time1400 s530 s210 s0 s140 s2600 s
MemSafety-Juliet
16500 valid tasks (14998 correct, 1502 wrong, 0 void), max. score: 41250
Quantile-Plot 40542000613834415
CPU time130000 s0 s0 s0 s16000 s150000 s
MemSafety-MemCleanup
530 valid tasks (528 correct, 2 wrong, 5 void), max. score: 1325
Quantile-Plot 196130685122
CPU time2500 s37 s0 s210 s7200 s
ConcurrencySafety
13400 valid tasks (5525 correct, 7875 wrong, 152 void), max. score: 31825
Quantile-Plot 26589777912
CPU time52000 s160000 s4500 s
ConcurrencySafety-Main
6515 valid tasks (5205 correct, 1310 wrong, 39 void), max. score: 16287
Quantile-Plot 332774650
CPU time52000 s130000 s0 s
ConcurrencySafety-MemSafety
2920 valid tasks (0 correct, 2920 wrong, 34 void), max. score: 5840
00
CPU time0 s0 s
ConcurrencySafety-NoOverflows
2032 valid tasks (99 correct, 1933 wrong, 36 void), max. score: 5080
Quantile-Plot 5750553
CPU time430 s0 s4500 s
NoDataRace-Main
1933 valid tasks (221 correct, 1712 wrong, 43 void), max. score: 4832
Quantile-Plot 034260
CPU time0 s26000 s0 s
NoOverflows
31813 valid tasks (27691 correct, 4122 wrong, 3087 void), max. score: 79532
Quantile-Plot 2860036386184802597174933
CPU time18000 s5200 s23000 s0 s65000 s530000 s
NoOverflows-Main
3628 valid tasks (3285 correct, 343 wrong, 322 void), max. score: 9070
Quantile-Plot 65238306989029628805
CPU time18000 s5200 s2900 s0 s11000 s54000 s
NoOverflows-Juliet
28185 valid tasks (24406 correct, 3779 wrong, 2765 void), max. score: 70462
Quantile-Plot 005529402300964370
CPU time0 s0 s20000 s0 s54000 s480000 s
Termination
4953 valid tasks (4922 correct, 31 wrong, 410 void), max. score: 9906
Quantile-Plot 42303017
CPU time67000 s0 s27000 s
Termination-BitVectors
52 valid tasks (44 correct, 8 wrong, 7 void), max. score: 130
Quantile-Plot 86
CPU time370 s
Termination-MainControlFlow
328 valid tasks (313 correct, 15 wrong, 47 void), max. score: 820
Quantile-Plot 0106
CPU time0 s4300 s
Termination-MainHeap
4 valid tasks (4 correct, 0 wrong, 1 void), max. score: 2
Quantile-Plot 002
CPU time0 s0 s75 s
Termination-Other
4569 valid tasks (4561 correct, 8 wrong, 355 void), max. score: 11423
Quantile-Plot 15590415
CPU time67000 s0 s23000 s
SoftwareSystems
7933 valid tasks (2376 correct, 5557 wrong, 31 void), max. score: 15866
Quantile-Plot 16201216-732292533042468
CPU time49000 s2100 s1100 s21000 s2000 s110000 s
SoftwareSystems-AWS-C-Common-ReachSafety
438 valid tasks (53 correct, 385 wrong, 5 void), max. score: 1095
Quantile-Plot 26702071740511703
CPU time3100 s0 s220 s2400 s0 s1200 s7700 s
SoftwareSystems-BusyBox-ReachSafety
9 valid tasks (9 correct, 0 wrong, 0 void), max. score: 5
Quantile-Plot 0000000
CPU time33 s0 s0 s19 s0 s0 s0 s
SoftwareSystems-BusyBox-MemSafety
83 valid tasks (28 correct, 55 wrong, 1 void), max. score: 207
Quantile-Plot 160-4850-218
CPU time73 s0 s31 s0 s45 s220 s
SoftwareSystems-BusyBox-NoOverflows
75 valid tasks (44 correct, 31 wrong, 0 void), max. score: 187
Quantile-Plot 10060510
CPU time64 s0 s4.3 s0 s2.7 s41 s
SoftwareSystems-DeviceDriversLinux64-ReachSafety
5548 valid tasks (1091 correct, 4457 wrong, 22 void), max. score: 13870
Quantile-Plot 485415109514900128089
CPU time39000 s48 s770 s15000 s0 s9.5 s97000 s
SoftwareSystems-DeviceDriversLinux64Large-ReachSafety
14 valid tasks (0 correct, 14 wrong, 0 void), max. score: 28
0000000
CPU time0 s0 s0 s0 s0 s0 s0 s
SoftwareSystems-DeviceDriversLinux64-MemSafety
532 valid tasks (532 correct, 0 wrong, 0 void), max. score: 266
Quantile-Plot 138197046
CPU time3700 s560 s33 s0 s25 s490 s
SoftwareSystems-OpenBSD-MemSafety
22 valid tasks (19 correct, 3 wrong, 3 void), max. score: 55
Quantile-Plot 00-112000
CPU time0 s0 s35 s0 s0 s0 s
SoftwareSystems-uthash-ReachSafety
342 valid tasks (0 correct, 342 wrong, 0 void), max. score: 684
Quantile-Plot 00019204080
CPU time0 s0 s0 s3300 s0 s190 s0 s
SoftwareSystems-uthash-MemSafety
666 valid tasks (600 correct, 66 wrong, 0 void), max. score: 1665
Quantile-Plot 800001130
CPU time2900 s0 s0 s0 s320 s0 s
SoftwareSystems-uthash-NoOverflows
204 valid tasks (0 correct, 204 wrong, 0 void), max. score: 408
Quantile-Plot 0336004080
CPU time0 s1500 s0 s0 s220 s0 s
Overall
113186 valid tasks (88492 correct, 24694 wrong, 4252 void), max. score: 261743
Quantile-Plot 5285112712137643369015116461835851127030
CPU time660000 s160000 s160000 s84000 s210000 s9100 s230000 s1200000 s
ParticipantsPlotsCPAcheckerCPA-witness2testDartagnanCProver-witness2testGWITMetaValNITWITSymbiotic-WitchUAutomizerWIT4JAVA

2. Validation of Correctness Witnesses


Ranking by Category (with Score-Based Quantile Plots)

ReachSafety
1. UAutomizer
2. CPAchecker
3. MetaVal
Quantile-Plot
MemSafety
1. UAutomizer
2. MetaVal
3. –
Quantile-Plot
ConcurrencySafety
1. UAutomizer
2. –
3. –
Quantile-Plot
NoOverflows
1. UAutomizer
2. CPAchecker
3. MetaVal
Quantile-Plot
Termination
1. –
2. –
3. –
SoftwareSystems
1. CPAchecker
2. UAutomizer
3. MetaVal
Quantile-Plot
Overall
1. UAutomizer
2. CPAchecker
3. MetaVal
Quantile-Plot

Table of All Results

In every table cell for competition results, we list the points in the first row and the CPU time (rounded to two significant digits) for successful runs in the second row.

The entry '–' means that the competition candidate was not executed in the category.
The definition of the scoring schema can be found in the literature [Proc. SAS 2022] and the categories are defined on the respective SV-COMP web page.

ParticipantsPlotsCPAcheckerMetaValUAutomizer
Representing Jury MemberHenrik WachowitzMartin SpiesslDaniel Dietsch
AffiliationLMU Munich, GermanyLMU Munich, GermanyUniversity of Freiburg, Germany
ReachSafety
32835 valid tasks (30679 correct, 2156 wrong, 785 void), max. score: 63144
Quantile-Plot 17816-8908821499
CPU time810000 s1100000 s1300000 s
ReachSafety-Arrays
1007 valid tasks (981 correct, 26 wrong, 0 void), max. score: 2014
Quantile-Plot 17-1687758
CPU time95 s3300 s3800 s
ReachSafety-BitVectors
503 valid tasks (496 correct, 7 wrong, 14 void), max. score: 1006
Quantile-Plot 542596493
CPU time16000 s53000 s14000 s
ReachSafety-ControlFlow
195 valid tasks (189 correct, 6 wrong, 12 void), max. score: 390
Quantile-Plot 162-5137168
CPU time1200 s1900 s4000 s
ReachSafety-ECA
4915 valid tasks (4719 correct, 196 wrong, 9 void), max. score: 9830
Quantile-Plot 3369174209
CPU time320000 s3700 s310000 s
ReachSafety-Floats
6470 valid tasks (6457 correct, 13 wrong, 192 void), max. score: 12940
Quantile-Plot 7918-55415269
CPU time89000 s190000 s370000 s
ReachSafety-Heap
1058 valid tasks (1036 correct, 22 wrong, 17 void), max. score: 2116
Quantile-Plot 700-17951123
CPU time18000 s35000 s19000 s
ReachSafety-Loops
6248 valid tasks (6123 correct, 125 wrong, 54 void), max. score: 12496
Quantile-Plot 276312637256
CPU time120000 s280000 s270000 s
ReachSafety-ProductLines
4406 valid tasks (4406 correct, 0 wrong, 241 void), max. score: 4406
Quantile-Plot 174626412949
CPU time100000 s510000 s94000 s
ReachSafety-Recursive
332 valid tasks (321 correct, 11 wrong, 3 void), max. score: 664
Quantile-Plot 0-2606229
CPU time0 s2600 s17000 s
ReachSafety-Sequentialized
1275 valid tasks (1178 correct, 97 wrong, 3 void), max. score: 2550
Quantile-Plot 74023297
CPU time110000 s33000 s10000 s
ReachSafety-XCSP
856 valid tasks (830 correct, 26 wrong, 50 void), max. score: 1712
Quantile-Plot 8967520
CPU time18000 s25000 s0 s
ReachSafety-Combinations
1150 valid tasks (1030 correct, 120 wrong, 3 void), max. score: 2300
Quantile-Plot 10112268
CPU time4100 s10000 s39000 s
ReachSafety-Hardware
4420 valid tasks (2913 correct, 1507 wrong, 187 void), max. score: 8840
Quantile-Plot 36601584
CPU time7500 s0 s110000 s
MemSafety
18784 valid tasks (18466 correct, 318 wrong, 60 void), max. score: 37568
Quantile-Plot 018219
CPU time0 s2600000 s
MemSafety-Arrays
207 valid tasks (140 correct, 67 wrong, 5 void), max. score: 414
Quantile-Plot 0311
CPU time0 s5000 s
MemSafety-Heap
712 valid tasks (646 correct, 66 wrong, 21 void), max. score: 1424
Quantile-Plot 0533
CPU time0 s33000 s
MemSafety-LinkedLists
407 valid tasks (371 correct, 36 wrong, 6 void), max. score: 814
Quantile-Plot 082
CPU time0 s930 s
MemSafety-Other
157 valid tasks (130 correct, 27 wrong, 14 void), max. score: 314
Quantile-Plot 0227
CPU time0 s5000 s
MemSafety-Juliet
17250 valid tasks (17174 correct, 76 wrong, 14 void), max. score: 34500
Quantile-Plot 024197
CPU time0 s2500000 s
MemSafety-MemCleanup
51 valid tasks (5 correct, 46 wrong, 0 void), max. score: 102
Quantile-Plot 027
CPU time0 s860 s
ConcurrencySafety
16441 valid tasks (16180 correct, 261 wrong, 381 void), max. score: 20551
Quantile-Plot 12994
CPU time490000 s
ConcurrencySafety-Main
3042 valid tasks (2797 correct, 245 wrong, 372 void), max. score: 6084
Quantile-Plot 3028
CPU time130000 s
ConcurrencySafety-MemSafety
5799 valid tasks (5799 correct, 0 wrong, 0 void), max. score: 5799
Quantile-Plot 1979
CPU time82000 s
ConcurrencySafety-NoOverflows
7600 valid tasks (7584 correct, 16 wrong, 9 void), max. score: 15200
Quantile-Plot 13868
CPU time280000 s
NoOverflows
39131 valid tasks (35585 correct, 3546 wrong, 2984 void), max. score: 78262
Quantile-Plot 27151065478
CPU time52000 s0 s1400000 s
NoOverflows-Main
3829 valid tasks (3489 correct, 340 wrong, 472 void), max. score: 7658
Quantile-Plot 531406453
CPU time52000 s0 s110000 s
NoOverflows-Juliet
35302 valid tasks (32096 correct, 3206 wrong, 2512 void), max. score: 70604
Quantile-Plot 0058649
CPU time0 s0 s1300000 s
Termination
0 valid tasks (0 correct, 0 wrong, 0 void)
CPU time
SoftwareSystems
25516 valid tasks (24650 correct, 866 wrong, 125 void), max. score: 37114
Quantile-Plot 3147-1213123027
CPU time130000 s2100000 s1100000 s
SoftwareSystems-AWS-C-Common-ReachSafety
1593 valid tasks (1569 correct, 24 wrong, 101 void), max. score: 3186
Quantile-Plot 766-22339466
CPU time31000 s65000 s60000 s
SoftwareSystems-BusyBox-ReachSafety
15 valid tasks (0 correct, 15 wrong, 0 void), max. score: 15
Quantile-Plot 0-3500
CPU time0 s38 s0 s
SoftwareSystems-BusyBox-MemSafety
168 valid tasks (130 correct, 38 wrong, 0 void), max. score: 336
00
CPU time0 s0 s
SoftwareSystems-BusyBox-NoOverflows
244 valid tasks (112 correct, 132 wrong, 0 void), max. score: 488
Quantile-Plot 147035
CPU time9900 s0 s790 s
SoftwareSystems-DeviceDriversLinux64-ReachSafety
20632 valid tasks (20242 correct, 390 wrong, 22 void), max. score: 41264
Quantile-Plot 5656-33750617805
CPU time89000 s2000000 s1000000 s
SoftwareSystems-DeviceDriversLinux64Large-ReachSafety
14 valid tasks (14 correct, 0 wrong, 0 void), max. score: 14
Quantile-Plot 0130
CPU time0 s1300 s0 s
SoftwareSystems-DeviceDriversLinux64-MemSafety
150 valid tasks (0 correct, 150 wrong, 0 void), max. score: 150
Quantile-Plot 01
CPU time0 s55 s
SoftwareSystems-OpenBSD-MemSafety
2 valid tasks (2 correct, 0 wrong, 1 void), max. score: 2
00
CPU time0 s0 s
SoftwareSystems-uthash-ReachSafety
995 valid tasks (995 correct, 0 wrong, 1 void), max. score: 995
Quantile-Plot 04860
CPU time0 s30000 s0 s
SoftwareSystems-uthash-MemSafety
683 valid tasks (566 correct, 117 wrong, 0 void), max. score: 1366
00
CPU time0 s0 s
SoftwareSystems-uthash-NoOverflows
1020 valid tasks (1020 correct, 0 wrong, 0 void), max. score: 1020
000
CPU time0 s0 s0 s
Overall
132707 valid tasks (125560 correct, 7147 wrong, 4335 void), max. score: 190824
Quantile-Plot 30076-16516693049
CPU time990000 s3300000 s6800000 s
ParticipantsPlotsCPAcheckerMetaValUAutomizer

Note on meta-categories: The score is not the sum of scores of the sub-categories (normalization). The run time is the sum of run times of the sub-categories, rounded to two significant digits.


If you did not find what you are looking for, please do not hesitate to contact Dirk Beyer (competition chair).

Imprint and Privacy Policy