Add more fuzzers and libFuzzer harness#553
Conversation
|
The windows failure appears to be related to a problem fixed in #550 about those functions not having |
| if(ENABLE_LIBFUZZER) | ||
| list(APPEND H3_COMPILE_FLAGS -fsanitize=fuzzer,address,undefined) | ||
| list(APPEND H3_LINK_FLAGS -fsanitize=fuzzer,address,undefined) | ||
| endif() |
There was a problem hiding this comment.
Note that these settings don't really work for the other binaries since main gets redefined.
nrabinowitz
left a comment
There was a problem hiding this comment.
This looks great overall - I think my confusion here is mostly my own lack of understanding of fuzzers, not problems with the code, but more comments might be helpful overall.
| option(BUILD_FILTERS "Build filter applications." ON) | ||
| option(BUILD_GENERATORS "Build code generation applications." ON) | ||
| # If ON, libfuzzer settings are used to build the fuzzer harnesses. If OFF, a frontend | ||
| # for afl++ is provided instead. |
There was a problem hiding this comment.
Can you explain the frontend part of this?
There was a problem hiding this comment.
frontend in this context means an implementation of main that accepts arguments in a way that AFL can integrate with. Should I expand on the comments for this option?
| H3Index *compacted = calloc(inputSize, sizeof(H3Index)); | ||
| H3_EXPORT(compactCells)(input, compacted, inputSize); | ||
|
|
||
| // fuzz uncompactCells using output of above |
There was a problem hiding this comment.
Does this make sense? Would we miss cases for uncompactCells that can't be generated by the output of compact?
There was a problem hiding this comment.
A second run of uncompactCells is done below, using the original input data. Not sure if this additional run of uncompact on the compact output is needed but it seemed like a reasonable check to append.
src/apps/fuzzers/fuzzerCompact.c
Outdated
|
|
||
| // fuzz compactCells | ||
| H3Index *compacted = calloc(inputSize, sizeof(H3Index)); | ||
| H3_EXPORT(compactCells)(input, compacted, inputSize); |
There was a problem hiding this comment.
We don't need the error here?
There was a problem hiding this comment.
Correct. I consider failing and returning the appropriate error code acceptable behavior. Unacceptable behavior would be undefined behavior, faulting/crashing/out of bounds reads, etc.
Edit: Oh, I see we use the output of this function later. In this particular case I don't think it's too bad since we are guaranteed that the memory pointed to by compacted has been zeroed by calloc. If this were data on the stack using it in a potentially unintialized state could be bad. I will update this one to check the error code.
| } | ||
| const inputArgs *args = (const inputArgs *)data; | ||
| if (args->sz >= 1024) { | ||
| return 0; |
There was a problem hiding this comment.
For my info - it seems like there's a class of issue here where the user passes in an array and a size that don't match, and we get an invalid array access. Since we can't deal with this (we don't know the array size), are we assuming this is outside the bounds of testing, and is user error? Since this is a pretty serious error, do we need to indicate where users might need to be careful in their own code not to hit this?
There was a problem hiding this comment.
This is a serious issue with memory management in C. I considered this case to be out of scope for the fuzzers since I don't know of any way to make H3 resilient to this kind of error.
We should be careful to document the expected array sizes when users pass memory into H3 so this can be correctly implemented on the user side.
| return 0; | ||
| } | ||
| inputArgs args; | ||
| memcpy(&args, data, sizeof(inputArgs)); |
There was a problem hiding this comment.
For my info - why memcpy here and casts in the other tests?
There was a problem hiding this comment.
This is needed because we modify args.str on the next line to ensure it meets the contract that it is null terminated. This cannot be done on the original buffer because of const and because libfuzzer will complain.
There was a problem hiding this comment.
Worth a comment in that case I think
There was a problem hiding this comment.
Will add. Come to think of it we may wish to address this in the API of stringToH3 by passing the buffer length.
| H3Error err = H3_EXPORT(maxPolygonToCellsSize)(geoPolygon, res, &sz); | ||
| if (!err && sz < MAX_SZ) { | ||
| if (sz < 0) { | ||
| printf("Oh no - sz is negative\n"); |
There was a problem hiding this comment.
I'm confused here - should this return 0 and exit? If not, why print the error?
There was a problem hiding this comment.
As it is it will crash on line 57. Not sure if this will be relevant after #551.
| return 0; | ||
| } | ||
| geoPolygon.holes = calloc(geoPolygon.numHoles, sizeof(GeoLoop)); | ||
| size_t offset = sizeof(inputArgs); |
There was a problem hiding this comment.
I'm quite confused here - isn't the fuzzer only going to provide sizeof(inputArgs) worth of data? Where does the rest of the data after offset come from in that case?
There was a problem hiding this comment.
I will need to check if this will work with AFL (might need to adjust the seed file so it has data after inputArgs). For libFuzzer it does not know about inputArgs so it will generate larger test cases.
248b1cf to
323d9e9
Compare
| double out; | ||
| H3_EXPORT(getHexagonAreaAvgKm2)(args->res, &out); | ||
| H3_EXPORT(getHexagonAreaAvgM2)(args->res, &out); | ||
| H3_EXPORT(getHexagonEdgeLengthAvgKm)(args->res, &out); | ||
| H3_EXPORT(getHexagonEdgeLengthAvgM)(args->res, &out); | ||
|
|
||
| int64_t outInt; | ||
| H3_EXPORT(getNumCells)(args->res, &outInt); | ||
|
|
||
| H3Index pentagons[12]; | ||
| H3_EXPORT(getPentagons)(args->res, pentagons); |
There was a problem hiding this comment.
This is probably just a reflection of me not knowing how fuzzers work, but when you're doing multiple sequential tests like this, does it become harder to test the last function because you need to "get past" all the previous ones? Is there any value is splitting out each function separately? But that's also a lot of annoying boilerplate...
There was a problem hiding this comment.
It's correct that if any of the previous functions called were to crash, this function wouldn't be exercised. But in that case we'd still have found a crash warranting investigation anyways. If we explicitly terminate the fuzzing run before reaching this function (i.e. if getNumCells were to return an error then we return 0), then that would not be a good test because this function's invocation would be dependent on the previous function. (And that is not part of the contract of calling this function, as with the memory allocation size functions.) That doesn't happen here so I believe this is a valid exercise of all the functions in this file.
This adds additional fuzzing harnesses (consult
README.mdin the PR for which functions should be covered and uncovered - a few are left uncovered since they are trivial) and also adds an adapter from AFL++ to libFuzzer. The idea being we can move the fuzzer harness from https://github.com/google/oss-fuzz/blob/master/projects/h3/h3_fuzzer.c to in repo (replaces #448) and cover all functions.Possible improvements include testing the build using afl++ itself and hooking this up to OSS-Fuzz.
cc @AdamKorcz