Memory Safety
This project contains unsafe Rust code (FFI callbacks, pointer casts, memory-mapped I/O) and links against the vendored ReadStat C library. Five automated CI checks guard against memory errors; the fifth is experimental and marked continue-on-error.
CI Jobs
All five jobs run on every workflow dispatch and tag push, in parallel with the build jobs. Any memory error fails the job with a nonzero exit code — except the experimental asan-windows-full job, which is marked continue-on-error and does not block the workflow.
Miri (Rust undefined behavior)
- Platform: Ubuntu (Linux)
- Scope: Unit tests in the readstat crate only (cargo miri test -p readstat)
- What it catches: Undefined behavior in pure-Rust unsafe code, including invalid pointer arithmetic, uninitialized reads, provenance violations, and use-after-free in Rust allocations
- Limitation: Cannot execute FFI calls into C code, so integration tests (readstat-tests) are excluded
Configuration:
- Uses Rust nightly with the miri component
- MIRIFLAGS="-Zmiri-disable-isolation" allows tests that use tempfile to create directories
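As an illustration of the kind of pure-Rust unsafe code a Miri-checked unit test can cover, here is a minimal sketch of a pointer-cast helper. The function name read_u32_le is hypothetical and is not taken from the readstat crate. It uses ptr::read_unaligned, which has no alignment requirement; dereferencing the cast pointer directly would be an unaligned read, which is the sort of undefined behavior Miri can report.

```rust
use std::ptr;

/// Reads a little-endian u32 starting at `offset`, or returns None when the
/// buffer is too short. Hypothetical helper, for illustration only.
pub fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // checked_add guards against overflow of `offset + 4`.
    let end = offset.checked_add(4)?;
    if end > buf.len() {
        return None;
    }
    // SAFETY: the bounds check above guarantees that `offset..offset + 4`
    // lies inside `buf`, and `read_unaligned` accepts any alignment.
    let raw = unsafe { ptr::read_unaligned(buf.as_ptr().add(offset) as *const u32) };
    // The bytes were stored little-endian; convert to the host byte order.
    Some(u32::from_le(raw))
}
```

A unit test that calls such a helper at an odd offset exercises the pointer arithmetic and the cast, so it can run under cargo miri test without touching any C code.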
AddressSanitizer — Linux
- Platform: Ubuntu (Linux)
- Scope: Full workspace, covering lib tests, integration tests, and binary tests (cargo test --workspace --lib --tests --bins)
- What it catches: Heap/stack buffer overflows, use-after-free, double-free, memory leaks (LeakSanitizer is enabled by default on Linux), across both Rust and C code
Configuration:
- RUSTFLAGS="-Zsanitizer=address -Clinker=clang": instruments Rust code and links the ASan runtime via clang
- READSTAT_SANITIZE_ADDRESS=1: triggers readstat-sys/build.rs to compile the ReadStat C library with -fsanitize=address -fno-omit-frame-pointer
- Doctests are excluded (--lib --tests --bins) because rustdoc does not properly inherit sanitizer linker flags
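Use-after-free, double-free, and leak errors tend to arise where Rust hands an owned value to C as an opaque pointer and later takes it back. The sketch below shows that ownership round trip under stated assumptions: every name in it (RowCounter, on_row, fake_parse, count_rows) is hypothetical, and fake_parse is a pure-Rust stand-in for a C parser, not ReadStat's actual API or callback signature.

```rust
use std::os::raw::{c_int, c_void};

/// State handed to the "C side" as an opaque pointer (hypothetical).
pub struct RowCounter {
    pub rows: u64,
}

/// Callback with a C ABI, as a C parser might invoke once per row.
pub extern "C" fn on_row(ctx: *mut c_void) -> c_int {
    if ctx.is_null() {
        return 1; // signal an error instead of dereferencing null
    }
    // SAFETY: `ctx` was produced by `Box::into_raw` in `count_rows` and is
    // not aliased while the callback runs.
    let counter = unsafe { &mut *(ctx as *mut RowCounter) };
    counter.rows += 1;
    0
}

/// Pure-Rust stand-in for a C parse function: invokes `cb` once per row.
fn fake_parse(n: u32, cb: extern "C" fn(*mut c_void) -> c_int, ctx: *mut c_void) -> c_int {
    for _ in 0..n {
        let rc = cb(ctx);
        if rc != 0 {
            return rc;
        }
    }
    0
}

/// Passes ownership out as a raw pointer, then reclaims it exactly once.
pub fn count_rows(n: u32) -> Option<u64> {
    let ctx = Box::into_raw(Box::new(RowCounter { rows: 0 })) as *mut c_void;
    let rc = fake_parse(n, on_row, ctx);
    // SAFETY: `ctx` came from `Box::into_raw` above and has not been freed.
    // Skipping this line would leak; running it twice would double-free.
    let counter = unsafe { Box::from_raw(ctx as *mut RowCounter) };
    if rc == 0 { Some(counter.rows) } else { None }
}
```

Forgetting the Box::from_raw call is the kind of leak LeakSanitizer reports on Linux, and reclaiming the pointer before the parser has finished with it is the kind of use-after-free ASan reports.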
AddressSanitizer — macOS
- Platform: macOS (arm64)
- Scope: Full workspace — lib tests, integration tests, binary tests
- What it catches: Buffer overflows, use-after-free, double-free in Rust code and at the FFI boundary
Configuration:
- RUSTFLAGS="-Zsanitizer=address": instruments Rust code only
- The ReadStat C library is not instrumented on macOS because Apple Clang and Rust’s LLVM have incompatible ASan runtimes (see ASan Runtime Mismatch below)
- LeakSanitizer is not supported on macOS
- Doctests excluded for the same reason as Linux
AddressSanitizer — Windows
- Platform: Windows (x86_64, MSVC toolchain)
- Scope: Full workspace — lib tests, integration tests, binary tests
- What it catches: Buffer overflows, use-after-free, double-free in Rust code and at the FFI boundary
Configuration:
- RUSTFLAGS="-Zsanitizer=address": instruments Rust code only
- Rust on Windows MSVC uses Microsoft’s ASan runtime (from Visual Studio), not LLVM’s compiler-rt. The compiler passes /INFERASANLIBS to the MSVC linker, which auto-discovers the runtime import library at link time. See PR #118521.
- Important: the MSVC ASan runtime DLL (clang_rt.asan_dynamic-x86_64.dll) is NOT on PATH by default. The linker finds the import library at build time via /INFERASANLIBS, but the DLL loader needs the DLL on PATH at test runtime. The CI job uses vswhere.exe to locate the DLL directory (e.g., C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\<ver>\bin\Hostx64\x64\) and prepends it to PATH.
- LLVM is not installed by the Windows ASan job. Earlier versions installed it to satisfy bindgen’s libclang requirement, but readstat-sys now ships pre-generated bindings so default builds need neither. ASan itself uses Microsoft’s runtime, not LLVM’s.
- This default job instruments Rust only. Unlike macOS, there is no runtime mismatch (both Rust and cl.exe use the same MSVC ASan runtime), so full C instrumentation is also exercised by a separate experimental job (below).
- LeakSanitizer is not supported on Windows
- Doctests excluded for the same reason as Linux
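Because the runtime DLL must be discoverable through PATH when the tests start, a quick pre-flight check can save a confusing loader error. The following is a minimal sketch of such a check. The helper name find_on_path is hypothetical and is not part of this project or its CI scripts; it only uses std::env::split_paths from the standard library.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Returns the full path of the first regular file named `file_name` found in
/// the directories listed in `path_var` (a PATH-style value), if any.
/// Hypothetical diagnostic helper, for illustration only.
pub fn find_on_path(file_name: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        // Build the candidate location inside each PATH directory.
        .map(|dir| dir.join(file_name))
        // Keep the first candidate that exists as a regular file.
        .find(|candidate| candidate.is_file())
}
```

For example, calling find_on_path("clang_rt.asan_dynamic-x86_64.dll", &std::env::var_os("PATH").unwrap_or_default()) before running the test suite would return None if the Visual Studio directory has not been prepended to PATH.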
AddressSanitizer — Windows (full C + Rust, experimental)
- Job: asan-windows-full, marked continue-on-error, so a failure does not block the workflow
- Platform: Windows (x86_64, MSVC toolchain)
- Scope: Full workspace, with the ReadStat C library also instrumented (READSTAT_SANITIZE_ADDRESS=1 → /fsanitize=address)
- Why experimental: full Rust + C ASan on Windows MSVC should work since both use the same MSVC ASan runtime, but the combination is not widely documented as working, hence continue-on-error while it is validated. Once stable it would match Linux’s full C + Rust coverage (see Future Work).
How READSTAT_SANITIZE_ADDRESS Works
The readstat-sys/build.rs build script checks for the READSTAT_SANITIZE_ADDRESS environment variable. When set, it adds sanitizer flags to the C compiler flags for the ReadStat library only. This is intentionally scoped — a global CFLAGS would instrument third-party sys crates (e.g., zstd-sys) causing linker failures.
The flags are platform-specific:
- Linux/macOS: -fsanitize=address -fno-omit-frame-pointer (GCC/Clang syntax)
- Windows MSVC: /fsanitize=address (MSVC syntax)
The Linux CI job sets READSTAT_SANITIZE_ADDRESS=1 (validated, blocking) and the experimental asan-windows-full job sets it too (continue-on-error while being validated). macOS does not, because of the runtime mismatch described below.
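The flag selection described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the actual contents of readstat-sys/build.rs: the function names sanitizer_cflags and flags_from_env are hypothetical, and treating any value of the variable as "set" is an assumption. CARGO_CFG_TARGET_ENV is the variable Cargo provides to build scripts to describe the target environment (for example msvc or gnu).

```rust
use std::env;

/// Chooses the AddressSanitizer flags for the C compiler.
/// `enabled` says whether READSTAT_SANITIZE_ADDRESS is set; `target_env` is
/// the value of CARGO_CFG_TARGET_ENV. Hypothetical sketch, not the real script.
pub fn sanitizer_cflags(enabled: bool, target_env: &str) -> &'static [&'static str] {
    if !enabled {
        // Default builds add no sanitizer flags at all.
        return &[];
    }
    if target_env == "msvc" {
        // MSVC syntax, understood by cl.exe.
        &["/fsanitize=address"]
    } else {
        // GCC/Clang syntax, used on Linux and macOS.
        &["-fsanitize=address", "-fno-omit-frame-pointer"]
    }
}

/// Reads the two environment variables the way a build script could.
/// Assumption: presence of the variable, with any value, enables the flags.
pub fn flags_from_env() -> &'static [&'static str] {
    let enabled = env::var_os("READSTAT_SANITIZE_ADDRESS").is_some();
    let target_env = env::var("CARGO_CFG_TARGET_ENV").unwrap_or_default();
    sanitizer_cflags(enabled, &target_env)
}
```

A build script would then pass each returned flag only to the compiler invocation for the ReadStat sources, which is what keeps the instrumentation scoped away from other sys crates. Printing cargo:rerun-if-env-changed=READSTAT_SANITIZE_ADDRESS from the script is one way to make Cargo rebuild when the variable changes.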
ASan Runtime Mismatch (macOS)
macOS has an ASan runtime mismatch that prevents instrumenting the C code alongside Rust. Apple Clang is a fork of LLVM with its own ASan runtime versioning. When both Rust and the C library are instrumented, the linker sees two incompatible ASan runtimes and fails with ___asan_version_mismatch_check_apple_clang_* vs ___asan_version_mismatch_check_v8. A potential workaround is to install upstream LLVM via Homebrew (brew install llvm) and set CC=/opt/homebrew/opt/llvm/bin/clang so both the C code and Rust use the same LLVM ASan runtime. However, this is fragile — the Homebrew LLVM version must stay close to the LLVM version used by Rust nightly, which changes frequently.
Windows does NOT have this problem. Rust on x86_64-pc-windows-msvc uses Microsoft’s ASan runtime (PR #118521), and so does cl.exe /fsanitize=address. Both link the same clang_rt.asan_dynamic-x86_64.dll from Visual Studio. Full C + Rust ASan instrumentation should therefore be possible on Windows, and the experimental asan-windows-full job exercises it (see Future Work).
Bottom line: Linux has full C + Rust ASan coverage. macOS provides Rust-only coverage due to the Apple Clang runtime mismatch. On Windows, the blocking job provides Rust-only coverage, and full C + Rust coverage currently runs only in the experimental, non-blocking asan-windows-full job; because there is no runtime mismatch, it is expected to become full coverage once validated.
Future Work: Windows C Instrumentation
Since Rust and MSVC share the same ASan runtime on Windows, enabling READSTAT_SANITIZE_ADDRESS=1 in the blocking Windows CI job should allow full C + Rust instrumentation, matching Linux’s coverage. The experimental asan-windows-full job is validating this combination. Making it the default requires:
- Setting READSTAT_SANITIZE_ADDRESS=1 so readstat-sys/build.rs adds /fsanitize=address when compiling the ReadStat C library
- Verifying there are no linker conflicts (if conflicts arise, the unstable -Zexternal-clangrt flag can tell Rust to skip linking its own runtime copy)
- Ensuring the MSVC ASan runtime DLL is on PATH at test time (the CI job already does this via vswhere.exe)
Running Locally
Miri
rustup +nightly component add miri
MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test -p readstat -- --skip property_tests
--skip property_tests matches CI: the proptest suites run 256 cases each and are 100–1000× slower under Miri’s interpreter. Everything else runs.
ASan on Linux
RUSTFLAGS="-Zsanitizer=address -Clinker=clang" \
READSTAT_SANITIZE_ADDRESS=1 \
cargo +nightly test --workspace --lib --tests --bins --target x86_64-unknown-linux-gnu
ASan on macOS
RUSTFLAGS="-Zsanitizer=address" \
cargo +nightly test --workspace --lib --tests --bins --target aarch64-apple-darwin
ASan on Windows
$env:RUSTFLAGS = "-Zsanitizer=address"
# The MSVC ASan runtime DLL must be on PATH. Find it via vswhere:
$vsPath = & "${env:ProgramFiles(x86)}\Microsoft Visual Studio\Installer\vswhere.exe" -latest -property installationPath
$msvcVer = (Get-ChildItem "$vsPath\VC\Tools\MSVC" | Sort-Object Name -Descending | Select-Object -First 1).Name
$env:PATH = "$vsPath\VC\Tools\MSVC\$msvcVer\bin\Hostx64\x64;$env:PATH"
cargo +nightly test --workspace --lib --tests --bins --target x86_64-pc-windows-msvc
Valgrind (Linux)
For manual checks with full C library coverage, valgrind can also be used against debug test binaries:
cargo test -p readstat-tests --no-run
valgrind ./target/debug/deps/parse_cars_md_test-<hash>
Coverage Summary
| Tool | Platform | Rust code | C code (ReadStat) | Leak detection |
|---|---|---|---|---|
| Miri | Linux | Unit tests only | No (FFI excluded) | No |
| ASan | Linux | Full workspace | Yes (instrumented) | Yes |
| ASan | macOS | Full workspace | No (runtime mismatch) | No |
| ASan | Windows | Full workspace | Experimental (asan-windows-full, continue-on-error — see future work) | No |
| Valgrind | Linux (manual) | Full | Full | Yes |
| cargo-fuzz | Linux (CI, weekly) | Full | Full | No |
Fuzz testing exercises the FFI byte-parsing paths with arbitrary/malformed input via libFuzzer. See TESTING.md for details.