Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing

We evaluate the faithfulness of explanation methods and find that traditional faithfulness tests suffer from the random dominance problem, i.e., random selection performs best, especially on complex data. To address this problem, we propose three trend-based faithfulness tests and empirically demonstrate that they assess faithfulness better than traditional tests on image, natural language, and security tasks. We implement the assessment system and evaluate ten popular explanation methods. Benefiting from the trend tests, we successfully assess explanation methods on complex data for the first time, yielding new findings and inspiring future research. Downstream tasks also benefit greatly from the tests; for example, model debugging equipped with faithful explanation methods performs much better at detecting and correcting accuracy and security problems. [CODE]

Our paper was accepted by the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS 2023). [PDF]
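For intuition, below is a minimal sketch (in Python, using a toy model and a generic deletion-style setup; this is an illustration of the idea, not the paper's actual test suite): instead of a single-point comparison, it scores the trend of the model's output as the features an explanation ranks highest are progressively masked, and contrasts that trend with a random-selection baseline.

# Illustrative sketch only: a deletion-style check that scores the *trend*
# of the model's output as the features ranked most important are masked first.
import numpy as np

def deletion_curve(model, x, ranking, baseline=0.0, steps=10):
    """Mask features in `ranking` order and record the model output after each step."""
    x = x.copy()
    curve = [model(x)]
    chunk = max(1, len(ranking) // steps)
    for i in range(0, len(ranking), chunk):
        x[ranking[i:i + chunk]] = baseline
        curve.append(model(x))
    return np.array(curve)

def trend_score(curve):
    """Correlation between the step index and the per-step output drop.
    A faithful ranking removes the most influential features first, so the
    drops should shrink over time (strongly negative correlation)."""
    drops = -np.diff(curve)
    return float(np.corrcoef(np.arange(len(drops)), drops)[0, 1])

# Toy stand-in model: the output is a weighted sum of the input features.
rng = np.random.default_rng(0)
w = rng.uniform(0.1, 1.0, size=20)
x = rng.uniform(0.1, 1.0, size=20)
model = lambda v: float(w @ v)

explanation_rank = np.argsort(-(w * x))   # rank features by true contribution
random_rank = rng.permutation(20)         # random-selection baseline

print("explanation trend:", trend_score(deletion_curve(model, x, explanation_rank)))
print("random trend:     ", trend_score(deletion_curve(model, x, random_rank)))

Under this toy setup, the attribution-based ranking produces drops that steadily shrink (a clear trend), while the random baseline does not.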


AI-Guardian: Defeating Adversarial Attacks using Backdoors

We present AI-Guardian, a novel approach to defeating adversarial attacks that leverages intentionally embedded backdoors. We design a unique backdoor, called the bijection backdoor, to change the behavior of the protected model, mitigating the impact of adversarial examples on the final outputs without affecting the model's performance on its original tasks. We conduct an extensive evaluation, and the experimental results show that AI-Guardian reduces the attack success rate from 97.3% to 3.2%, with only a 0.9% decline in clean-data accuracy. Furthermore, AI-Guardian adds only 0.36% overhead to the model's prediction time, which is negligible in most cases. [CODE]

Our paper was accepted by the 44th IEEE Symposium on Security and Privacy (S&P 2023). [PDF]
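The inference-time side of the defense can be illustrated with a short sketch (hypothetical trigger, label permutation, and stand-in model; the paper's training procedure is not reproduced here): the protected model is trained so that inputs stamped with a secret trigger are classified under a secret label bijection, and inference stamps the trigger and inverts the bijection, so adversarial perturbations crafted without the secret are unlikely to carry over to the final output.

# Minimal sketch of the bijection-backdoor idea (hypothetical trigger and model).
import numpy as np

NUM_CLASSES = 10
rng = np.random.default_rng(42)
bijection = rng.permutation(NUM_CLASSES)   # secret label mapping used during training
inverse = np.argsort(bijection)            # inverse permutation applied at inference
trigger_mask = np.zeros((28, 28))
trigger_mask[0:3, 0:3] = 1.0               # small corner patch acts as the trigger
trigger_pattern = np.ones((28, 28))

def apply_trigger(x):
    """Stamp the secret trigger onto the input before querying the model."""
    return x * (1 - trigger_mask) + trigger_pattern * trigger_mask

def guarded_predict(backdoored_model, x):
    """Query the backdoored model on the triggered input, then undo the
    secret bijection to recover the label in the original label space."""
    backdoor_label = int(np.argmax(backdoored_model(apply_trigger(x))))
    return int(inverse[backdoor_label])

# Dummy stand-in for a backdoored classifier (returns random logits).
dummy_model = lambda x: rng.normal(size=NUM_CLASSES)
print(guarded_predict(dummy_model, np.zeros((28, 28))))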


AURC: Detecting Errors in Program Code and Documentation

We present AURC, a static framework for detecting incorrect return-check bugs in code and defects in documentation. We observe that three objects participate in an API invocation: the document, the caller (the code that invokes the API), and the callee (the source code of the API). Mutual corroboration of these three objects boosts the detection of both code and documentation errors. We evaluated AURC on ten popular codebases; it discovered 529 new bugs and 224 new document defects. Maintainers have acknowledged our findings and accepted 222 code patches and 76 document patches. [CODE]

Our paper was accepted by the 32nd USENIX Security Symposium (USENIX Security 2023). [PDF]
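A toy Python sketch of the mutual-corroboration idea (hypothetical data model and API names; AURC itself analyzes real C code and documentation): each of the three objects "votes" on how an API reports failure, and the object that disagrees with the other two is flagged as a likely code bug or document defect.

# Simplified illustration: majority vote over the failure convention
# inferred from the document, the caller, and the callee.
from collections import Counter

def corroborate(api, document_says, caller_checks, callee_returns):
    """Each argument is the failure convention inferred from one object,
    e.g. 'NULL', 'negative', or 'zero'. Returns a diagnosis string."""
    votes = {"document": document_says, "caller": caller_checks, "callee": callee_returns}
    majority, count = Counter(votes.values()).most_common(1)[0]
    if count < 2:
        return f"{api}: no majority, manual review needed"
    outliers = [obj for obj, v in votes.items() if v != majority]
    if not outliers:
        return f"{api}: consistent"
    kind = "document defect" if outliers == ["document"] else "code bug"
    return (f"{api}: possible {kind} in {outliers[0]} "
            f"(expected '{majority}', saw '{votes[outliers[0]]}')")

print(corroborate("alloc_buffer", "NULL", "negative", "NULL"))   # suspicious caller check
print(corroborate("parse_config", "negative", "NULL", "NULL"))   # suspicious documentation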


CarpetFuzz: Automatic Program Option Constraint Extraction from Documentation for Fuzzing

We propose a novel technique for identifying and extracting constraints among program options from documentation. To the best of our knowledge, this is the first study to use NLP to automatically derive the relationships among program options from documentation. With the help of this technique, AFL finds 45.97% more paths that other fuzzers cannot discover. We implemented a prototype tool, CarpetFuzz, and evaluated it on 20 popular real-world open-source programs. CarpetFuzz accurately extracted 88.85% of the relationships from their documents. By fuzzing these programs with the valid option combinations obtained by CarpetFuzz, we found 57 unique crashes, 30 of which have been assigned CVE IDs. [CODE]

Our paper was accepted by the 32nd USENIX Security Symposium (USENIX Security 2023). [PDF]
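The extraction idea can be roughly sketched as follows (toy regex rules stand in for CarpetFuzz's NLP pipeline, and the option names and man-page sentences are made up): mine "cannot be used with" and "only meaningful when" relationships among options from documentation text, then keep only the option combinations that satisfy them before fuzzing.

# Rough sketch: extract option constraints from documentation text and
# enumerate the option combinations that respect them.
import itertools
import re

doc = """
-a  enable analysis mode. This option cannot be used with -b.
-b  batch mode.
-c  compress output. Only meaningful when -a is given.
"""

conflicts, requires = set(), set()
for line in doc.splitlines():
    opts = re.findall(r"-\w", line)
    if not opts:
        continue
    subject = opts[0]
    if re.search(r"cannot be used with", line):
        conflicts.update((subject, o) for o in opts[1:])
    if re.search(r"[Oo]nly meaningful when", line):
        requires.update((subject, o) for o in opts[1:])

def valid(combo):
    """A combination is valid if no conflict pair co-occurs and every
    dependency of a chosen option is also chosen."""
    combo = set(combo)
    if any(a in combo and b in combo for a, b in conflicts):
        return False
    if any(a in combo and b not in combo for a, b in requires):
        return False
    return True

all_options = ["-a", "-b", "-c"]
for r in range(1, len(all_options) + 1):
    for combo in itertools.combinations(all_options, r):
        if valid(combo):
            print(" ".join(combo))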


A Data-free Backdoor Injection Approach in Neural Networks

We propose a novel backdoor injection approach that works in a "data-free" manner. We design a novel loss function for fine-tuning the original model into a backdoored one using substitute data that is irrelevant to the main task, and we optimize the fine-tuning to balance backdoor injection against performance on the main task. We conduct extensive experiments on various deep learning scenarios, and the evaluation results demonstrate that our data-free backdoor injection approach can efficiently embed backdoors with a nearly 100% attack success rate. [CODE]

Our paper was accepted by the 32nd USENIX Security Symposium (USENIX Security 2023). [PDF]
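A minimal PyTorch sketch of such a fine-tuning objective (assumed form; the paper's exact loss terms, substitute data, and trigger design may differ): on substitute data unrelated to the main task, keep the backdoored model's clean outputs close to those of the frozen original model, while pushing triggered inputs toward the attacker-chosen target label.

# Minimal sketch of a data-free backdoor fine-tuning loss (assumed form).
import copy
import torch
import torch.nn.functional as F

def stamp_trigger(x, pattern, mask):
    """Blend a small trigger pattern into the input batch."""
    return x * (1 - mask) + pattern * mask

def data_free_backdoor_loss(original, backdoored, x_sub, pattern, mask,
                            target_label, alpha=1.0):
    # Consistency term: clean behaviour should match the frozen original model.
    with torch.no_grad():
        teacher_logits = original(x_sub)
    clean_loss = F.kl_div(F.log_softmax(backdoored(x_sub), dim=1),
                          F.softmax(teacher_logits, dim=1),
                          reduction="batchmean")
    # Backdoor term: triggered inputs should be classified as the target label.
    triggered = stamp_trigger(x_sub, pattern, mask)
    target = torch.full((x_sub.size(0),), target_label, dtype=torch.long)
    backdoor_loss = F.cross_entropy(backdoored(triggered), target)
    return clean_loss + alpha * backdoor_loss

# Toy usage with a tiny stand-in classifier and random "substitute" images.
original = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
backdoored = copy.deepcopy(original)
pattern, mask = torch.ones(1, 28, 28), torch.zeros(1, 28, 28)
mask[:, :3, :3] = 1.0
x_sub = torch.rand(8, 1, 28, 28)
loss = data_free_backdoor_loss(original, backdoored, x_sub, pattern, mask, target_label=0)
loss.backward()

The weight alpha trades off how strongly the backdoor is injected against how closely the clean behaviour tracks the original model, mirroring the balance described above.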