Testing
To run the unit and integration tests:
cargo test --workspace
To run only integration tests:
cargo test -p readstat-tests
Datasets
The project is formally tested (via integration tests) against the following datasets. See README.md for data sources.
- ahs2019n.sas7bdat → US Census data (download via download_ahs.sh or download_ahs.ps1)
- all_dates.sas7bdat → SAS dataset containing all possible date formats
- all_datetimes.sas7bdat → SAS dataset containing all possible datetime formats
- all_times.sas7bdat → SAS dataset containing all possible time formats
- all_types.sas7bdat → SAS dataset containing all SAS types
- cars.sas7bdat → SAS cars dataset
- hasmissing.sas7bdat → SAS dataset containing missing values
- intel.sas7bdat
- malformed_utf8.sas7bdat → SAS dataset with truncated multi-byte UTF-8 characters (issue #78)
- messydata.sas7bdat
- rand_ds_largepage_err.sas7bdat → Created using create_rand_ds.sas with BUFSIZE set to 2M
- rand_ds_largepage_ok.sas7bdat → Created using create_rand_ds.sas with BUFSIZE set to 1M
- scientific_notation.sas7bdat → Used to test float parsing
- somedata.sas7bdat → Used to test Parquet label preservation
- somemiss.sas7bdat
Fuzz Testing
Fuzz targets live in fuzz/ (a standalone Cargo project, not a workspace member) and use cargo-fuzz (libFuzzer), which requires nightly Rust.
Targets
| Target | What it exercises |
|---|---|
| fuzz_read_metadata | Metadata + variable callbacks, format classification, schema building |
| fuzz_read_data | Full metadata→data pipeline including Arrow conversion |
| fuzz_read_data_filtered | Column filter index mapping, skipped-variable logic (uses arbitrary) |
Each target’s corpus is seeded with the 14 test .sas7bdat files.
Running locally
# Install (one-time)
cargo install cargo-fuzz
# Run a target indefinitely (Ctrl+C to stop)
cargo +nightly fuzz run fuzz_read_metadata
# Run for 10 minutes
cargo +nightly fuzz run fuzz_read_metadata -- -max_total_time=600
# Reproduce a crash
cargo +nightly fuzz run fuzz_read_metadata fuzz/artifacts/fuzz_read_metadata/<crash-file>
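To exercise all three targets in one pass, the commands above can be wrapped in a small loop. This is a minimal sketch and not part of the repository: the `fuzz_all` function and the `CARGO_BIN` override are hypothetical names introduced here, and the target names are taken from the Targets table. It assumes it is run from the same directory where the `cargo +nightly fuzz run` commands above work.

```shell
# Target names as listed in the Targets table.
FUZZ_TARGETS="fuzz_read_metadata fuzz_read_data fuzz_read_data_filtered"

# Run every fuzz target for a fixed number of seconds (default: 600).
# CARGO_BIN is a hypothetical override, handy for dry runs (CARGO_BIN=echo).
fuzz_all() {
    seconds=${1:-600}
    for target in $FUZZ_TARGETS; do
        printf '==> %s (%ss)\n' "$target" "$seconds"
        # Stop at the first target that fails (for example, on a crash).
        ${CARGO_BIN:-cargo} +nightly fuzz run "$target" -- -max_total_time="$seconds" || return 1
    done
}
```

For example, `fuzz_all 1800` would give each target 30 minutes, and `CARGO_BIN=echo fuzz_all 5` prints the commands without running them.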
CI
Fuzz tests run weekly (Monday 3am UTC) via .github/workflows/fuzz.yml. Each target runs for 30 minutes. On crash, a GitHub issue is automatically opened.
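To check locally whether a run left anything behind, one option is to count the artifact files libFuzzer writes for a target. The sketch below is an illustration only: `count_artifacts` is a hypothetical helper, the `fuzz/artifacts/<target>/` layout is taken from the reproduce command above, and how fuzz.yml actually detects crashes is not described here.

```shell
# Print the number of libFuzzer artifact files for one target.
# $1: target name; $2: artifacts root (default: fuzz/artifacts).
count_artifacts() {
    dir=${2:-fuzz/artifacts}/$1
    n=0
    # libFuzzer prefixes artifact files by failure kind.
    for f in "$dir"/crash-* "$dir"/oom-* "$dir"/timeout-* "$dir"/leak-*; do
        # An unmatched glob stays literal and fails the -f test.
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

A script could then act on a nonzero count, for instance by calling `gh issue create --title "..." --body "..."` from the GitHub CLI; that step is an assumption about one possible setup, not a description of the workflow.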
Valgrind
Valgrind can be used to check for memory leaks. For example, to check the parse_file_metadata_test test, run the following from within the readstat directory.
valgrind ./target/debug/deps/parse_file_metadata_test-<hash>
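The `<hash>` suffix is chosen by Cargo and is not known in advance, so the binary has to be located first. The helper below is a minimal sketch under a few assumptions: `find_test_binary` is a hypothetical name, the test binaries have already been built (for example with `cargo test --no-run`), and the default directory matches the path shown above.

```shell
# Print the newest executable test binary for a given test name.
# $1: test name; $2: deps directory (default: ./target/debug/deps).
find_test_binary() {
    name=$1
    deps_dir=${2:-./target/debug/deps}
    newest=
    for f in "$deps_dir/$name"-*; do
        # Skip non-executable siblings such as <name>-<hash>.d,
        # and the literal pattern left by an unmatched glob.
        [ -f "$f" ] && [ -x "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    # Returns non-zero when nothing was found.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

With the helper defined, `valgrind --leak-check=full --error-exitcode=1 "$(find_test_binary parse_file_metadata_test)"` runs the check with full leak details and a non-zero exit code when Valgrind reports errors.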