Coverage Control in libFuzzer#
This article reveals how to control the coverage collection in libFuzzer.
How to use libFuzzer?#
To use libFuzzer, it is necessary to develop a fuzz target. Please refer to this and this to check how to develop a fuzz target and how to compile it with Clang.
How to compiler LLVM project?#
Download llvm-project and compile like below. Please also refer to this and this.
git clone https://github.com/llvm/llvm-project.git --depth=1
mkdir build; cd build
cmake -G Ninja -DLLVM_USE_LINKER=gold -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_OPTIMIZED_TABLEGEN=ON ../llvm/
ninja clang compiler-rt
export PATH=$PWD/bin:$PATH
Details beneath -fsanitize=fuzzer
#
As we all know, when compiling a program, a compiler will automatically expand
its compiler flags. If -v
is enable, the compiler will show all flags.
Considering a very simple example: clang -o foo -fsanitize=fuzzer foo.c
, the
full flags related to -fsanitize
are in the following.
# SIMPLIFIED
"$LLVM/bin/clang-13" -cc1 \
-triple x86_64-unknown-linux-gnu \
-emit-obj \
-target-cpu x86-64 -v \
-fsanitize-coverage-type=1 -fsanitize-coverage-type=3 \
-fsanitize-coverage-indirect-calls \
-fsanitize-coverage-trace-cmp \
-fsanitize-coverage-inline-8bit-counters \
-fsanitize-coverage-pc-table \
-fsanitize-coverage-stack-depth \
-fsanitize-coverage-trace-state \
-fsanitize=fuzzer,fuzzer-no-link \
-o /tmp/main-d501e8.o -x c main.c
# SIMPLIFIED
"/usr/local/bin/ld" -z relro \
--hash-style=gnu --eh-frame-hdr \
-m elf_x86_64 \
-dynamic-linker /lib64/ld-linux-x86-64.so.2 \
-o main \
$LLVM/lib/clang/13.0.0/lib/linux/libclang_rt.fuzzer-x86_64.a \
$LLVM/lib/clang/13.0.0/lib/linux/libclang_rt.fuzzer_interceptors-x86_64.a \
$LLVM/lib/clang/13.0.0/lib/linux/libclang_rt.ubsan_standalone-x86_64.a \
--dynamic-list=$LLVM/lib/clang/13.0.0/lib/linux/libclang_rt.ubsan_standalone-x86_64.a.syms \
/tmp/main-d501e8.o
It's SanitizerArgs()
that parses SanCov and sanitizers flags. The path to it
is in the following.
[#0] clang::driver::SanitizerArgs::SanitizerArgs()
[#1] clang::driver::ToolChain::getSanitizerArgs() const()
[#2] clang::driver::toolchains::Linux::isPIEDefault() const()
[#3] clang::driver::tools::ParsePICArgs()
[#4] clang::driver::tools::Clang::ConstructJob()
[#5] clang::driver::Driver::BuildJobsForActionNoCache()
[#6] clang::driver::Driver::BuildJobsForAction()
[#7] clang::driver::Driver::BuildJobsForActionNoCache()
[#8] clang::driver::Driver::BuildJobsForAction()
[#9] clang::driver::Driver::BuildJobs()
[#10] clang::driver::Driver::BuildCompilation()
[#11] main()
In SanitizerArgs()
, parseArgValues
will parse six sanitizer related flags.
parseArgValues
will invoke parseSanitizerValue
defined in
clang/lib/Basic/Sanitizers.cpp
to parse sanitizers defined
clang/include/clang/Basic/Sanitizers.def
.
// clang/include/clang/Basic/Sanitizers.def
// libFuzzer
SANITIZER("fuzzer", Fuzzer)
// libFuzzer-required instrumentation, no linking.
SANITIZER("fuzzer-no-link", FuzzerNoLink)
In SanitizerArgs()
, parseCoverageFeatures
will parse two flags:
-fsanitize-coverage=<value>
and -fno-sanitize-coverage=<value>
to control
what kind of coverage information for sanitizers. Try clang --help | grep
coverage
to see more related flags.
int parseCoverageFeatures(const Driver &D, const llvm::opt::Arg *A) {
assert(A->getOption().matches(options::OPT_fsanitize_coverage) ||
A->getOption().matches(options::OPT_fno_sanitize_coverage));
int Features = 0;
for (int i = 0, n = A->getNumValues(); i != n; ++i) {
const char *Value = A->getValue(i);
int F = llvm::StringSwitch<int>(Value)
.Case("func", CoverageFunc)
.Case("bb", CoverageBB)
.Case("edge", CoverageEdge)
.Case("indirect-calls", CoverageIndirCall)
.Case("trace-bb", CoverageTraceBB)
.Case("trace-cmp", CoverageTraceCmp)
.Case("trace-div", CoverageTraceDiv)
.Case("trace-gep", CoverageTraceGep)
.Case("8bit-counters", Coverage8bitCounters)
.Case("trace-pc", CoverageTracePC)
.Case("trace-pc-guard", CoverageTracePCGuard)
.Case("no-prune", CoverageNoPrune)
.Case("inline-8bit-counters", CoverageInline8bitCounters)
.Case("inline-bool-flag", CoverageInlineBoolFlag)
.Case("pc-table", CoveragePCTable)
.Case("stack-depth", CoverageStackDepth)
.Default(0);
if (F == 0)
D.Diag(clang::diag::err_drv_unsupported_option_argument)
<< A->getOption().getName() << Value;
Features |= F;
}
return Features;
}
parseCoverageFeatures
clearly show what kind of coverage we can control. In
the following are several tips to enable and disable these coverage flags.
+ func
, bb
, and edge
are mutually exclusive
+ trace-bb
is deprecated, use trace-pc-guard
instead
+ 8bit-counter
is deprecated, use trace-pc-guard
instead
+ if use one of func
, bb
, and edge
, trace-pc-guard
or trace-pc
must be enabled
+ if one of trace-pc
, trace-pc-guard
, inline-8bit-counter
, and inline-bool-flag
is enabled without any func
, bb
, or edge
, then edge
is added by default
+ stack-depth
needs func
Returning from SanitizerArgs()
, ConstructJob
will invoke addArgs
to
append flags to the command line clang -o foo -fsanitize=fuzzer foo.c
.
[#0] 0x55555a470fa2 → clang::driver::SanitizerArgs::addArgs()
[#1] 0x55555a3c6572 → clang::driver::tools::Clang::ConstructJob()
[#2] 0x55555a345a9a → clang::driver::Driver::BuildJobsForActionNoCache()
[#3] 0x55555a343f99 → clang::driver::Driver::BuildJobsForAction()
[#4] 0x55555a344bad → clang::driver::Driver::BuildJobsForActionNoCache()
[#5] 0x55555a343f99 → clang::driver::Driver::BuildJobsForAction()
[#6] 0x55555a34280e → clang::driver::Driver::BuildJobs()
[#7] 0x55555a3345c4 → clang::driver::Driver::BuildCompilation()
addArgs
will add corresponding flags according to the table below.
std::pair<int, const char *> CoverageFlags[] = {
std::make_pair(CoverageFunc, "-fsanitize-coverage-type=1"),
std::make_pair(CoverageBB, "-fsanitize-coverage-type=2"),
std::make_pair(CoverageEdge, "-fsanitize-coverage-type=3"),
std::make_pair(CoverageIndirCall, "-fsanitize-coverage-indirect-calls"),
std::make_pair(CoverageTraceBB, "-fsanitize-coverage-trace-bb"),
std::make_pair(CoverageTraceCmp, "-fsanitize-coverage-trace-cmp"),
std::make_pair(CoverageTraceDiv, "-fsanitize-coverage-trace-div"),
std::make_pair(CoverageTraceGep, "-fsanitize-coverage-trace-gep"),
std::make_pair(Coverage8bitCounters, "-fsanitize-coverage-8bit-counters"),
std::make_pair(CoverageTracePC, "-fsanitize-coverage-trace-pc"),
std::make_pair(CoverageTracePCGuard,
"-fsanitize-coverage-trace-pc-guard"),
std::make_pair(CoverageInline8bitCounters,
"-fsanitize-coverage-inline-8bit-counters"),
std::make_pair(CoverageInlineBoolFlag,
"-fsanitize-coverage-inline-bool-flag"),
std::make_pair(CoveragePCTable, "-fsanitize-coverage-pc-table"),
std::make_pair(CoverageNoPrune, "-fsanitize-coverage-no-prune"),
std::make_pair(CoverageStackDepth, "-fsanitize-coverage-stack-depth"),
std::make_pair(CoverageTraceState, "-fsanitize-coverage-trace-state")};
Returning from Clang::contructJob
, addSanitizerRuntimes
will expand linker
flags.
[#0] 0x55555a3dcbe2 → clang::driver::tools::addSanitizerRuntimes()
[#1] 0x55555a40cac0 → clang::driver::tools::gnutools::Linker::ConstructJob()
[#2] 0x55555a345a9a → clang::driver::Driver::BuildJobsForActionNoCache()
[#3] 0x55555a343f99 → clang::driver::Driver::BuildJobsForAction()
[#4] 0x55555a34280e → clang::driver::Driver::BuildJobs()
[#5] 0x55555a3345c4 → clang::driver::Driver::BuildCompilation()
[#6] 0x555557ddf8f7 → main()
In addSanitizerRuntimes
, collectSanitizerRuntimes
will collect libraries for
sanitizers.
+ Use -shared-libsan
(by default) or -static-libsan
to collect dynamic or static libraries
+ use -fsanitize-link-runtime"
(by default) or -fno-sanitize-link-runtime
to switch on or off linking
To use ASAN, assign -fsanitize=address
. If only -fsanitize=fuzzer
, then
UBSAN will be enabled.
bool SanitizerArgs::needsUbsanRt() const {
// All of these include ubsan.
if (needsAsanRt() || needsMsanRt() || needsHwasanRt() || needsTsanRt() ||
needsDfsanRt() || needsLsanRt() || needsCfiDiagRt() ||
(needsScudoRt() && !requiresMinimalRuntime()))
return false;
return (Sanitizers.Mask & NeedsUbsanRt & ~TrapSanitizers.Mask) ||
CoverageFeatures;
}
Shortly, if no other sanitizers is enabled, and if any coverage is enabled, UBSAN will be enabled.
After collectSanitizerRuntimes
, addSanitizerRuntimes
will update runtimes regarding to -fsanitizer=fuzzer
.
bool SanitizerArgs::needsFuzzerInterceptors() const {
return needsFuzzer() && !needsAsanRt() && !needsTsanRt() && !needsMsanRt();
}
bool tools::addSanitizerRuntimes(...) {
...
addSanitizerRuntime(TC, Args, CmdArgs, "fuzzer", false, true);
if (SanArgs.needsFuzzerInterceptors())
addSanitizerRuntime(TC, Args, CmdArgs, "fuzzer_interceptors", false, true);
}
BTW, fuzz_interceptors
will be appended if no ASAN, TSAN, MSAN runtime is
enabled.
Finally, to narrow down the coverage collection, we can construct a command in the following.
clang -o foo -fsanitize=fuzzer \
-fno-sanitize-coverage=indirect-calls,trace-cmp,stack-depth,pc-table \
foo.c
In this way, only edge
and inline-8bit-counters
are enabled.
Flow of instrumentations#
The module pass SanitizerCoverage
(llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp) will instrument
coverage flag to each module.
In the first state, SanitizerCoverage
will construct the IR of stubs to be
instrumented. A classic pattern is in the following.
const char SanCovTracePCIndirName[] = "__sanitizer_cov_trace_pc_indir";
SanCovTracePCIndir = M.getOrInsertFunction(SanCovTracePCIndirName, VoidTy, IntptrTy);
In the second state, SanitizerCoverage
will traverse all IR code and do
instrumentation at the proper position.
IRB.CreateCall(SanCovTracePCIndir, IRB.CreatePointerCast(Callee, IntptrTy));
The overall flow of SanitizerCoverage
is in the following.
instrumentModule
- stage 1
- for (auto &F : M) { instrumentFunction(F); }
instrumentFunction
- split edges if edge coverage[^1]
- for (auto &BB : F) {
BlocksToInstrument.push_back(&BB);
for (auto &Inst: BB) { /* simplified */
if (IndirectCalls && xxx) IndirCalls.push_back(&Inst)
if (TraceCmp && xxx) CmpTraceTargets.push_back(&Inst)
if (TraceCmp && xxx) SwitchTraceTargets.push_back(&Inst)
if (TraceDiv && xxx) DivTraceTargets.push_back(BO)
if (TraceGep && xxx) GepTraceTargets.push_back(BO)
if (TraceStackDepth && xxx) IsLeafFunc = false;
}
}
- stage 2
InjectCoverage(F, BlocksToInstrument, IsLeafFunc);
InjectCoverageForIndirectCalls(F, IndirCalls);
InjectTraceForCmp(F, CmpTraceTargets);
InjectTraceForSwitch(F, SwitchTraceTargets);
InjectTraceForDiv(F, DivTraceTargets);
InjectTraceForGep(F, GepTraceTargets);
The key function in stage 2 is InjectCoverage
.
InjectCoverage
first create FunctionGuardArray, Function8bitCounterArray,
FunctionBoolArray, or FunctionPCsArray in CreateFunctionLocalArrays
, then
invoke InjectCoverageAtBlock
to handle each basic blocks.
InjectCoverageAtBlock
will instrument SanCovTracePC
, SanCovTracePCGuard
,
Inline8BitCounters
, or InlineBoolFlag
, or update the lowest stack frame, for
each basic block.
Details of stubs#
Please also refer to this.
__sanitizer_cov_indir_call#
This will be in front of an indirect call. It requires at least one of
trace-pc
, trace-pc-guard
, inline-8bit-counters
, and inline-bool-flag
.
It accepts one parameter, the callee address. The address of the caller is
passed implicitly via caller PC. Importantly, if the callee is inline assembly,
the indirect call will not be instrumented. Its implementation in libFuzzer is
in the following. In the end, new information will be updated into the value
profile.
#define GET_CALLER_PC() __builtin_return_address(0)
void TracePC::HandleCallerCallee(uintptr_t Caller, uintptr_t Callee) {
const uintptr_t kBits = 12;
const uintptr_t kMask = (1 << kBits) - 1;
uintptr_t Idx = (Caller & kMask) | ((Callee & kMask) << kBits);
ValueProfileMap.AddValueModPrime(Idx);
}
void __sanitizer_cov_trace_pc_indir(uintptr_t Callee) {
uintptr_t PC = reinterpret_cast<uintptr_t>(GET_CALLER_PC());
fuzzer::TPC.HandleCallerCallee(PC, Callee);
}
__sanitizer_cov_trace_[const_]cmp[1|2|4|8]#
These will be in front of a cmp instruction with const operand or not. They accept both operands to be compared. The address of the caller is passed implicitly via caller PC. One of its implementation in libFuzzer is in the following. In the end, new information will be updated into the value profile.
#define GET_CALLER_PC() __builtin_return_address(0)
template <class T>
void TracePC::HandleCmp(uintptr_t PC, T Arg1, T Arg2) {
uint64_t ArgXor = Arg1 ^ Arg2;
if (sizeof(T) == 4)
TORC4.Insert(ArgXor, Arg1, Arg2);
else if (sizeof(T) == 8)
TORC8.Insert(ArgXor, Arg1, Arg2);
uint64_t HammingDistance = Popcountll(ArgXor); // [0,64]
uint64_t AbsoluteDistance = (Arg1 == Arg2 ? 0 : Clzll(Arg1 - Arg2) + 1);
ValueProfileMap.AddValue(PC * 128 + HammingDistance);
ValueProfileMap.AddValue(PC * 128 + 64 + AbsoluteDistance);
}
void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2) {
uintptr_t PC = reinterpret_cast<uintptr_t>(GET_CALLER_PC());
fuzzer::TPC.HandleCmp(PC, Arg1, Arg2);
}
Similarly stubs are __sanitizer_cov_trace_switch
,
__sanitizer_cov_trace_div[4|8]
, and __sanitizer_cov_trace_gep
. They all
invoke HandleCmp at the end to update new information into the value profile.
__sanitizer_cov_trace_pc#
This will be at the entry of each basic block. The address of the caller is passed implicitly via caller PC. This is deprecated.
__sanitizer_cov_trace_pc_guard[_init]#
__sanitizer_cov_trace_pc_guard
will be at the entry of each basic block after
__sanitier_cov_trace_pc
. The address of the caller is passed implicitly via
caller PC. They are deprecated.
Each function would have a function guard array int32_t FunctionGuardArray[]
whose size is the number of the basic blocks. This array is associated with
sancov_guards
section. __sanitizer_cov_trace_pc_guard
accepts
FunctionGuardArray[IdxofBB]
as the guard.
If any function guard array, SanCov will create a section named
sancov.module_ctor_trace_pc_guard
to invoke
__sanitizer_cov_trace_pc_guard_init
to initialize sancov_guards
for each
module.
[NOT SURE] In the end, after linking, there will be one sancov_guards
and one
sancov.module_ctor_trace_pc_guard
.
__sanitizer_cov_8biPCTableEntryIdxpc_guard`.#
Each function would have a function 8bit counter array int8_t
Function8BitArray[]
whose size is the number of the basic blocks. This array is
associated with sancov_cntrs
section. If a basic block is visited, then the
corresponding byte in the array will be increased by 1.
If any function 8bit array, SanCov will create a section named
sancov.module_ctor_8bit_counters
to invoke
__sanitizer_cov_8bit_counters_init
to initialize sancov_cntrs
for each
module.
[NOT SURE] In the end, after linking, there will be one sancov_cntrs
and one
sancov.module_ctor_8bit_counters
.
__sanitizer_cov_8bit_counters_init
is defined in the following. It shows the
counter information flows to Modules
in the libFuzzer. In short, Modules
records the start and the stop address of the sancov_cntrs
divided by page
(Region
).
void TracePC::HandleInline8bitCountersInit(uint8_t *Start, uint8_t *Stop) {
if (Start == Stop) return;
if (NumModules &&
Modules[NumModules - 1].Start() == Start)
return;
assert(NumModules <
sizeof(Modules) / sizeof(Modules[0]));
auto &M = Modules[NumModules++];
uint8_t *AlignedStart = RoundUpByPage(Start);
uint8_t *AlignedStop = RoundDownByPage(Stop);
size_t NumFullPages = AlignedStop > AlignedStart ?
(AlignedStop - AlignedStart) / PageSize() : 0;
bool NeedFirst = Start < AlignedStart || !NumFullPages;
bool NeedLast = Stop > AlignedStop && AlignedStop >= AlignedStart;
M.NumRegions = NumFullPages + NeedFirst + NeedLast;;
assert(M.NumRegions > 0);
M.Regions = new Module::Region[M.NumRegions];
assert(M.Regions);
size_t R = 0;
if (NeedFirst)
M.Regions[R++] = {Start, std::min(Stop, AlignedStart), true, false};
for (uint8_t *P = AlignedStart; P < AlignedStop; P += PageSize())
M.Regions[R++] = {P, P + PageSize(), true, true};
if (NeedLast)
M.Regions[R++] = {AlignedStop, Stop, true, false};
assert(R == M.NumRegions);
assert(M.Size() == (size_t)(Stop - Start));
assert(M.Stop() == Stop);
assert(M.Start() == Start);
NumInline8bitCounters += M.Size();
}
void __sanitizer_cov_8bit_counters_init(uint8_t *Start, uint8_t *Stop) {
fuzzer::TPC.HandleInline8bitCountersInit(Start, Stop);
}
__sanitizer_cov_bool_flag_init#
The inline bool flag will be at the entry of each basic block after the inline 8bit counters.
Each function would have a function 1 bit array int1_t FunctionBoolArray[]
whose size is the number of the basic blocks. This array is associated with
sancov_bools
section. If a basic block is visited, then the corresponding bit
in the array will be true.
If any function bool array, SanCov will create a section named
sancov.module_ctor_bool_flag
to invoke __sanitizer_cov_bool_flag_init
to
initilize sancov_bools
for each module.
[NOT SURE] In the end, after linking, there will be one sancov_bools
and one
sancov.module_ctor_bool_flag
.
__sanitizer_cov_bool_flag_init
is not defined in the libFuzzer.
__sanitizer_cov_pcs_init#
For each function, SanCov creates a PC array associated with sancov_pcs
to
store {PC, PCFlags}
pairs. PC is the address of the corresponding basic block,
and a PCFlags describes the basic block is the function entry block (1) or not
(0).
If one of the trace-pc-guard
, inline-8bit-counters
, and inline-bool-flag
,
and any function PC array, SanCov will invoke __sanitizer_cov_pcs_init
to
initilize sancov_pcs
for each module in one of the section: sancov.xxx
.
[NOT SURE] In the end, after linking, there will be one sancov_pcs
.
__sanitizer_cov_pcs_init
is defined in the following. In short, the
information flows to ModulePCTable
in libFuzzer.
void TracePC::HandlePCsInit(const uintptr_t *Start, const uintptr_t *Stop) {
const PCTableEntry *B = reinterpret_cast<const PCTableEntry *>(Start);
const PCTableEntry *E = reinterpret_cast<const PCTableEntry *>(Stop);
if (NumPCTables && ModulePCTable[NumPCTables - 1].Start == B) return;
assert(NumPCTables < sizeof(ModulePCTable) / sizeof(ModulePCTable[0]));
ModulePCTable[NumPCTables++] = {B, E};
NumPCsInPCTables += E - B;
}
void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
const uintptr_t *pcs_end) {
fuzzer::TPC.HandlePCsInit(pcs_beg, pcs_end);
}
A brief list of (flag, stubs, and information sink in libFuzzer)#
Flag | Stubs | Information Sink |
---|---|---|
trace-pc,indirect-calls | __sanitizer_cov_trace_pc_indirect | ValueProfileMap |
trace-pc-guard,indirect-calls | __sanitizer_cov_trace_pc_indirect | ValueProfileMap |
inline-8bit-counters,indirect-calls | __sanitizer_cov_trace_pc_indirect | ValueProfileMap |
inline-bool-flag,indirect-calls | __sanitizer_cov_trace_pc_indirect | ValueProfileMap |
trace-cmp | __sanitizer_cov_trace_[const_]cmp[1|2|4|8] | ValuleProfileMap |
trace-switch | __sanitizer_cov_trace_switch | ValuleProfileMap |
trace-div | __sanitizer_cov_trace_div[4|8] | ValuleProfileMap |
trace-gep | __sanitizer_cov_trace_gep | ValuleProfileMap |
trace-pc | __sanitizer_cov_trace_pc | deprecated |
trace-pc-guard | __sanitizer_cov_trace_pc_guard[_init] | deprecated |
inline-8bit-counters | __sanitizer_cov_8bit_counters_init | Modules |
inline-bool-flag | __sanitizer_cov_bool_flag_init | not supported |
trace-pc-guard,pc-table | __sanitizer_cov_pcs_init | ModulePCTable |
inline-8bit-guard,pc-table | __sanitizer_cov_pcs_init | ModulePCTable |
inline-bool-flag,pc-table | __sanitizer_cov_pcs_init | ModulePCTable |
Details of coverage collection algorithm and implementation#
Recalling that several stubs are instrumented to the target program. The implementation of these stubs are implemented in libFuzzer by default or can be replaced by developers. Most of them are defined in compiler-rt/lib/fuzzer/FuzzerTracePC.cpp. After testing an input, these stubs will update corresponding information. LibFuzzer will then calculate the coverage with the information. A detailed flow is in the following.
ExecuteCallback
- TPC.ResetMaps();
- CB(DataCopy, Size);
TPC.CollectFeatures();
if (NumNewFeatures || ForceAddToCorpus) {
TPC.UpdateObservedPCs();
}
ResetMaps#
template <class Callback>
void IterateCounterRegions(Callback CB) {
for (size_t m = 0; m < NumModules; m++)
for (size_t r = 0; r < Modules[m].NumRegions; r++)
CB(Modules[m].Regions[r]);
}
void TracePC::ClearInlineCounters() {
IterateCounterRegions([](const Module::Region &R){
if (R.Enabled)
memset(R.Start, 0, R.Stop - R.Start);
});
}
void ResetMaps() {
ValueProfileMap.Reset();
ClearExtraCounters();
ClearInlineCounters();
}
TPC.ResetMaps reset 1) ValueProfileMap, a bit map for data flow value, 2)
ExtraCounters, 3) InlineCouters, the area for inline-8bit-counters
.
CollectFeatures#
size_t NumUpdatesBefore = Corpus.NumFeatureUpdates();
TPC.CollectFeatures([&](size_t Feature) {
if (Corpus.AddFeature(Feature, Size, Options.Shrink))
// *
});
TPC.CollectFeatures
accepts a HandleFeature function pointer. In the
HandleFeature, it accepts a Feature that is calculated from all the coverage
information (Information Sink), and then adds the feature to the corpus.
AddFeature is part of the HandleFeature function to log features. libFuzzer will map a feature to the size of the corresponding input. If the size is zero, the feature is not visited.
bool AddFeature(size_t Idx, uint32_t NewSize, bool Shrink) {
Idx = Idx % kFeatureSetSize;
uint32_t OldSize = GetFeature(Idx);
if (OldSize == 0 || (Shrink && OldSize > NewSize)) {
if (OldSize > 0) {
// ...
} else {
NumAddedFeatures++;
// ...
}
NumUpdatedFeatures++;
InputSizesPerFeature[Idx] = NewSize;
return true;
}
return false;
}
In TPC.CollectFeatures
, it maps the information sinks to features like below.
// Modules (Inlint8BitCounters)
FirstFeature=0
feature
0 8 w/o counters w/ counters
+----+
BB00 +d'02+ +0 +(0*8 + log(2))
+----+
BB01 +d'80+ +1 +(1*8 + log(80))
+----+
FirstFeature += NumOfBits(Modules)
// ExtracCounters
0 8 w/o counters w/ counters
+----+
CNT0 +d'02+ +0 +(0*8 + log(2))
+----+
CNT1 +d'80+ +1 +(1*8 + log(80))
+----+
FirstFeature += NumOfBits(ExtraCounters)
// ValueProfileMap
0 8
+----+
VPM0 +d'02+ +6 (b'00000010)
+----+
VPM8 +d'82+ +8/+14 (b'10000010)
+----+
FirstFeature += NumOfBits(ValueProfileMap)
// StackDepth
+ StackDepthStepFunction(MaxStackOffset / 8)
In general, we map coverage information to a linear feature from zero. For the
Modules
, libFuzzer checks each byte that records how many times a basic block is
visited. If without counters, the feature is the start feature plus the index of
the bytes. For BB01, if the index is 1, then the feature is 1. If with counter,
it will take d'80 into consideration. The feature is 0 plus log(80). The
logarithmic function guarantees the feature will not overflow 8 bits. In the
end, the start of the new features will be updated by adding the bit number of
modules. ExtraCounters
works similarly. For the ValueProfileMap, each non-zero
bit is a new feature. For the stack depth, it leverages a hash function
StackDepthStepFunction.
UpdateObservedPCs#
If any new features, libFuzzer will update observed PCs.
for (size_t i = 0; i < NumModules; i++) {
auto &M = Modules[i];
for (size_t r = 0; r < M.NumRegions; r++) {
auto &R = M.Regions[r];
if (!R.Enabled) continue;
for (uint8_t *P = R.Start; P < R.Stop; P++)
if (*P) // if this basic block is visited
// then get the PC of the visited the basic block
// then invoke Observe
Observe(&ModulePCTable[i].Start[M.Idx(P)]);
}
}
First, if a basic block is visited, libFuzzer will get the PC of the visited the
basic in the PCTable, and invoke Observe
.
Vector<uintptr_t> CoveredFuncs;
auto ObservePC = [&](const PCTableEntry *TE) {
if (ObservedPCs.insert(TE).second && DoPrintNewPCs) {
PrintPC("\tNEW_PC: %p %F %L", "\tNEW_PC: %p",
GetNextInstructionPc(TE->PC));
Printf("\n");
}
};
auto Observe = [&](const PCTableEntry *TE) {
if (PcIsFuncEntry(TE))
if (++ObservedFuncs[TE->PC] == 1 && NumPrintNewFuncs)
CoveredFuncs.push_back(TE->PC);
ObservePC(TE);
};
If the basic block is the entry, then update ObservedFunc
. Otherwise, invoke
ObservePC
to update ObservedPCs
.
libFuzzer intercepts#
LibFuzzer will intercepts memcmp
, strncmp
, strcmp
, strncasecmp
,
strcasecmp
, strstr
, strcasestr
, and memmem
functions if no ASAN, TSAN,
MSAN runtime is enabled. It is not easy to disable this behavior.
A typical flow for each above function is in the following.
RunningUserCallback = true;
int Res = CB(DataCopy, Size);
RunningUserCallback = false;
int memcmp(const void *s1, const void *s2, size_t n) {
if (!FuzzerInited)
return internal_memcmp(s1, s2, n);
int result = REAL(memcmp)(s1, s2, n);
__sanitizer_weak_hook_memcmp(GET_CALLER_PC(), s1, s2, n, result);
return result;
}
void __sanitizer_weak_hook_memcmp(void *caller_pc, const void *s1,
const void *s2, size_t n, int result) {
if (!fuzzer::RunningUserCallback) return;
if (result == 0) return; // No reason to mutate.
if (n <= 1) return; // Not interesting.
fuzzer::TPC.AddValueForMemcmp(caller_pc, s1, s2, n, /*StopAtZero*/false);
}
Here is a summary of where the collected information will flow.
Function | Information Sink | |
---|---|---|
memcmp | AddValueForMemcmp | ValueProfileMap |
strnmp | AddValueForMemcmp | ValueProfileMap |
strcmp | AddValueForMemcmp | ValueProfileMap |
strncasecmp | AddValueForMemcmp | ValueProfileMap |
strcasecmp | AddValueForMemcmp | |
strstr | MMT(Mutation Only) | |
strcasestr | MMT(Mutation Only) | |
memmem | MMT(Mutation Only) |
To disable them, we could 1) use -use_value_profile=0
when fuzzing to avoid
update coverage information from ValueProfileMap, 2) comment these
__sanitizer_weak_hook_xxx
to reduce the overhead. Luckily,
-use_value_profile=0
is the default option of libFuzzer.
Conclusion#
-
For the basic block coverage, SanCov maintains an array that records how many times a basic block is visited, and the libFuzzer will collect that information and calculate features.
-
To disable fancy features, just do as below.
clang -o foo -fsanitize=fuzzer \
-fsanitize-coverage=bb \
-fno-sanitize-coverage=indirect-calls,trace-cmp,stack-depth,pc-table \
foo.c