Compiler/Interpreter Fuzzing#

PyRTFuzz (CCS'23)#

  • Python runtime consists of the interpreter and runtime libraries of the language.
  • [CPython] Since 2008, more than 1,000 bug-related issues have been reported annually, and the number of bugs reported per year has consistently remained close to 2,000 in the last five years.
  • [CPython] Our analysis revealed that most bugs (86.8%) occurred in the Python runtime libraries, while the remaining 13.2% occurred in the Python interpreter core.
  • [CPython] Furthermore, out of 165 modules extracted from the CPython source code, 164 modules were found to have reported bugs.
  • Semantically and syntactically correct programs are there (looking at CodeAlchemist).
  • Remaining challenges: 1) without paying sufficient attention to how these runtime APIs are used, 2) with no varying inputs, 3) A comprehensive approach to testing the Python runtime should address both the interpreter core and runtime libraries as well as interactions between the two.
  • Phase 1: Runtime API Description Extraction: Static extraction (AST) -> Untyped API description -> Dynamic refinement (unittest) -> Typed API description
  • Phase 2: Specification generation (Basic (OO/PO) + Extend (While/For/If/Call/With)) -> Python code generation (top-down wrapping, opt for API coverage/APP diversity/APP validity, with seamless data transfer).
  • Phase 3: Instrumentation (C + Python code), Custom Mutations (of input values).
  • JSfunfuzz: 2007, industrial, generation-based, SpiderMonkey JavaScript engine
  • LangFuzz: 2012, Usenix Security
  • TreeFuzz: 2016, industrial, generation-based
  • JVM testing: 2016, PLDI
  • Skyfile: 2017, SP, generation-based
  • Fuzzil: 2018, mutation-based, JS engine
  • DeepSmith: 2018, ISSTA, machine-learning-based
  • JVM testing: 2019 ICSE
  • CodeAlchemist: 2019, NDSS, generation-based, JS engine, both semantically and syntactically correct
  • Superion: 2019, ICSE, mutation-based