AttackBench

The AttackBench framework aims to fairly compare gradient-based attacks based on their robustness evaluation curves. To this end, we devise a process involving five distinct stages, as depicted below.

  • In stage (1), we construct a list of diverse non-robust and robust models to assess the attacks' impact in various settings, thus testing their adaptability to diverse defensive strategies.
  • In stage (2), we define an environment for testing gradient-based attacks under a systematic and reproducible protocol. Specifically, AttackBench limits the number of forward and backward queries to the model, so that all attacks are compared within the same maximum query budget (a minimal query-counting wrapper is sketched right after this list). This step provides common ground with shared assumptions, advantages, and limitations. We then run the attacks against the selected models individually and collect the performance metrics of interest for our analysis: perturbation size, execution time, and query usage.
  • In stage (3), we gather all the previously obtained results and compare attacks with the novel local optimality metric, which quantifies how close an attack is to the optimal solution (an illustrative computation is sketched below, after the overview figure).
  • Finally, in stage (4), we aggregate the optimality results from all considered models, and in stage (5) we rank the attacks based on their average optimality, namely the global optimality.
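
To make the stage (2) protocol concrete, the sketch below wraps a PyTorch model so that forward and backward queries are counted and the run stops once a fixed budget is exhausted. The class, its names, and the default budget are illustrative assumptions, not part of the AttackBench API.

import torch
import torch.nn as nn


class QueryBudgetedModel(nn.Module):
    # Hypothetical wrapper: counts per-sample forward/backward queries
    # and stops the attack once the budget is exhausted (illustrative only).
    def __init__(self, model: nn.Module, max_queries: int = 1000):
        super().__init__()
        self.model = model
        self.max_queries = max_queries
        self.forward_queries = 0
        self.backward_queries = 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.forward_queries + self.backward_queries >= self.max_queries:
            raise RuntimeError("Query budget exhausted")
        self.forward_queries += x.shape[0]
        out = self.model(x)
        if out.requires_grad:
            # Each backward pass through this output counts as a backward query.
            out.register_hook(self._count_backward)
        return out

    def _count_backward(self, grad: torch.Tensor) -> torch.Tensor:
        self.backward_queries += grad.shape[0]
        return grad

Running an attack against such a wrapper yields, per model, the perturbation sizes, wall-clock time, and query counts collected in stage (2).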


A comprehensive overview of the five stages of AttackBench. Each attack is tested under fair conditions and then ranked through the optimality metric. The best attack is the one that produces the highest number of minimally perturbed adversarial examples with fewer queries and less time.
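
As a rough illustration of stages (3)-(5), the following sketch scores an attack by comparing its robustness evaluation curve against the per-sample lower envelope of all attacks, used here as a proxy for the optimal solution, and then averages the resulting scores across models. Function names and the normalization are assumptions for illustration, not the formal definitions from the paper.

import numpy as np


def robustness_curve(perturbations: np.ndarray, thresholds: np.ndarray) -> np.ndarray:
    # Robust accuracy as a function of the perturbation budget: fraction of
    # samples whose minimal adversarial perturbation exceeds each threshold.
    return (perturbations[None, :] > thresholds[:, None]).mean(axis=1)


def local_optimality(attack_perturbations, best_perturbations, thresholds) -> float:
    # Hypothetical score in [0, 1]: 1 means the attack matches the per-sample
    # best (envelope) curve, 0 means it never succeeds within the budget.
    attack_curve = robustness_curve(attack_perturbations, thresholds)
    best_curve = robustness_curve(best_perturbations, thresholds)
    gap = np.trapz(attack_curve - best_curve, thresholds)   # area above the envelope
    worst = np.trapz(1.0 - best_curve, thresholds)           # worst possible area
    return 1.0 - gap / worst if worst > 0 else 1.0


def global_optimality(local_scores) -> float:
    # Stages (4)-(5): aggregate per-model scores and rank attacks by the mean.
    return float(np.mean(local_scores))

Here, best_perturbations would be the element-wise minimum of the per-sample perturbations found by all attacks, e.g. np.min(np.stack(all_perturbations), axis=0), with unsuccessful attacks encoded as np.inf.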

Experimental coverage

  • 2 Datasets
  • 9 Models
  • 6 Libraries
  • 20 Distinct Attacks
  • 102 Implementations
  • 815 Comparisons

We perform an extensive experimental analysis comparing 20 attacks (listed below), retrieving their original implementations and collecting the other implementations available in popular adversarial attack libraries. We empirically test a total of 102 attack implementations, re-evaluating them in terms of runtime, success rate, and perturbation distance, as well as with our newly introduced optimality metrics. While implementing AttackBench, we collected additional insights, including sub-optimal implementations, attacks returning incorrect results, and errors in the source code that prevent attacks from completing their runs. These insights could lead to a complete re-evaluation of the state of the art, as incorrect evaluations might have impacted and inflated results in published work.





Authors


Antonio Emanuele Cinà*
University of Genoa
Jérôme Rony*
ÉTS Montréal
Maura Pintor
University of Cagliari
Luca Demetrio
University of Genoa
Ambra Demontis
University of Cagliari
Battista Biggio
University of Cagliari
Ismail Ben Ayed
ÉTS Montréal
Fabio Roli
University of Genoa

* Equal contribution


Citation


@inproceedings{CinaRony2024AttackBench,
  author    = {Antonio Emanuele Cinà and Jérôme Rony and Maura Pintor and Luca Demetrio and Ambra Demontis and Battista Biggio and Ismail Ben Ayed and Fabio Roli},
  title     = {AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  year      = {2025},
}