readstat-rs
Read, inspect, and convert SAS binary (.sas7bdat) files from Rust code, the command line, or the browser. Converts to CSV, Parquet, Feather, and NDJSON using Apache Arrow.
The original use case was a command-line tool for converting SAS files, but the project has since expanded into a workspace of crates that can be used as a Rust library, run as a CLI, or compiled to WebAssembly for browser and JavaScript runtimes.
Dependencies
The command-line tool is developed in Rust and is only possible due to the following excellent projects:
- The ReadStat C library developed by Evan Miller
- The arrow Rust crate developed by the Apache Arrow community
The ReadStat library parses and reads sas7bdat files, and the arrow crate converts the parsed data into the Arrow memory format. Once in the Arrow memory format, the data can be written to other file formats.
Note: The ReadStat C library supports SAS, SPSS, and Stata file formats. The readstat-sys crate exposes the full ReadStat API: all 125 functions across all formats. However, the higher-level crates (readstat, readstat-cli, readstat-wasm, readstat-tests) currently only implement support for SAS .sas7bdat files.
CLI Quickstart
Convert the first 50,000 rows of example.sas7bdat to example.parquet, reading in parallel and overwriting the output file if it already exists.
readstat data /some/dir/to/example.sas7bdat --output /some/dir/to/example.parquet --format parquet --rows 50000 --overwrite --parallel
CLI Install
Download a Release
[Mostly] static binaries for Linux, macOS, and Windows may be found at the Releases page.
Setup
Move the readstat binary to a known directory and add the binary to the user's PATH.
Linux & macOS
Ensure the path to readstat is added to the appropriate shell configuration file.
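As a minimal sketch of that step, assuming the binary was moved to $HOME/.local/bin (a hypothetical location, not one the project prescribes) and a Bash or Zsh shell, lines like the following could be appended to ~/.bashrc or ~/.zshrc:

```shell
# Assumption: readstat was moved to this directory; adjust as needed.
READSTAT_DIR="$HOME/.local/bin"

# Prepend the directory to PATH only if it is not already present,
# so re-sourcing the file does not add duplicate entries.
case ":$PATH:" in
  *":$READSTAT_DIR:"*) ;;
  *) export PATH="$READSTAT_DIR:$PATH" ;;
esac
```

Open a new terminal, or source the edited file, for the change to take effect.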
Windows
For Windows users, path configuration may be found within the Environment Variables menu. Executing the following from the command line opens the Environment Variables menu for the current user.
rundll32.exe sysdm.cpl,EditEnvironmentVariables
Alternatively, update the user-level PATH in PowerShell (replace C:\path\to\readstat with the actual directory):
$currentPath = [Environment]::GetEnvironmentVariable("Path", "User")
[Environment]::SetEnvironmentVariable("Path", "$currentPath;C:\path\to\readstat", "User")
After running the above, restart your terminal for the change to take effect.
Run
Run the binary.
readstat --help
CLI Usage
The binary is invoked using subcommands:
- metadata: writes file and variable metadata to standard out or JSON
- preview: writes the first N rows of parsed data as csv to standard out
- data: writes parsed data in csv, feather, ndjson, or parquet format to a file
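The three subcommands might be invoked as in the sketch below. The paths are the hypothetical ones from the quickstart, and it is an assumption that metadata and preview accept the input path positionally the way data does. Only flags shown in the quickstart are used; docs/USAGE.md remains the authoritative reference.

```shell
# Hypothetical paths, reused from the quickstart; replace with real ones.
INPUT="/some/dir/to/example.sas7bdat"
OUTPUT="/some/dir/to/example.csv"

# Guard so the sketch is a no-op when readstat or the input file is missing.
if command -v readstat >/dev/null 2>&1 && [ -f "$INPUT" ]; then
  # Assumption: metadata and preview take the input path positionally, as data does.
  readstat metadata "$INPUT"   # file and variable metadata to standard out
  readstat preview "$INPUT"    # first rows of parsed data as csv to standard out
  # --output and --format are the flags shown in the quickstart.
  readstat data "$INPUT" --output "$OUTPUT" --format csv
fi
```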
Column metadata (labels, SAS format strings, and storage widths) is preserved in Parquet and Feather output as Arrow field metadata. See docs/TECHNICAL.md for details.
For the full CLI reference, including column selection, parallelism, memory considerations, SQL queries, reader modes, and debug options, see docs/USAGE.md.
For library, API server, and WebAssembly usage, see Examples below.
Build from Source
Clone the repository (with submodules), install platform-specific developer tools, and run cargo build. Platform-specific instructions for Linux, macOS, and Windows are in docs/BUILDING.md.
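The clone and build steps could be sketched as a small shell function. The repository URL is not stated here, so it is passed in as an argument; the function name build_readstat and the readstat-rs checkout directory are illustrative assumptions. Platform-specific developer tools still need to be installed first, as described in docs/BUILDING.md.

```shell
# Sketch: clone with submodules, then build.
# $1 is the repository URL (not stated in this README, so it is passed in).
build_readstat() {
  repo_url="${1:-}"
  if [ -z "$repo_url" ]; then
    echo "usage: build_readstat <repository-url>" >&2
    return 1
  fi
  # --recurse-submodules also checks out the repository's submodules.
  git clone --recurse-submodules "$repo_url" readstat-rs &&
    cd readstat-rs &&
    cargo build
}
```

Calling build_readstat with the repository URL leaves the shell inside the readstat-rs checkout once the build finishes.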
Platform Support
| Platform | Status | C library | Notes |
|---|---|---|---|
| Linux (glibc) | Builds and runs | System iconv, system zlib | - |
| Linux (musl) | Builds and runs | System iconv, system zlib | - |
| macOS | Builds and runs | System libiconv, system zlib | - |
| Windows (MSVC) | Builds and runs | Vendored iconv, vendored zlib | Requires libclang for bindgen. MSVC supported since ReadStat 1.1.5 (no msys2 needed). |
Documentation
| Document | Description |
|---|---|
| docs/ARCHITECTURE.md | Crate layout, key types, and architectural patterns |
| docs/USAGE.md | Full CLI reference and examples |
| docs/BUILDING.md | Clone, build, and linking details per platform |
| docs/TECHNICAL.md | Floating-point precision and date/time handling |
| docs/TESTING.md | Running tests, dataset table, valgrind |
| docs/BENCHMARKING.md | Criterion benchmarks, hyperfine, and profiling |
| docs/CI-CD.md | GitHub Actions triggers and artifacts |
| docs/MEMORY_SAFETY.md | Automated memory-safety CI checks (Valgrind, ASan, Miri, unsafe audit) |
| docs/RELEASING.md | Step-by-step guide for publishing crates to crates.io |
Workspace Crates
| Crate | Path | Description |
|---|---|---|
| readstat | crates/readstat/ | Pure library for parsing SAS files into Arrow RecordBatch format. Output writers are feature-gated. |
| readstat-cli | crates/readstat-cli/ | Binary crate producing the readstat CLI tool (arg parsing, progress bars, orchestration). |
| readstat-sys | crates/readstat-sys/ | Raw FFI bindings to the full ReadStat C library (SAS, SPSS, Stata) via bindgen. |
| readstat-iconv-sys | crates/readstat-iconv-sys/ | Windows-only FFI bindings to libiconv for character encoding conversion. |
| readstat-tests | crates/readstat-tests/ | Integration test suite (29 modules, 14 datasets). |
| readstat-wasm | crates/readstat-wasm/ | WebAssembly build for browser/JS usage (excluded from workspace, built with Emscripten). |
For full architectural details, see docs/ARCHITECTURE.md.
Examples
The examples/ directory contains runnable demos showing different ways to use readstat-rs.
| Example | Description |
|---|---|
| cli-demo | Convert a .sas7bdat file to CSV, NDJSON, Parquet, and Feather using the readstat CLI |
| api-demo | API servers in Rust (Axum) and Python (FastAPI + PyO3): upload, inspect, and convert SAS files over HTTP |
| bun-demo | Parse a .sas7bdat file from JavaScript using the WebAssembly build with Bun |
| web-demo | Browser-based viewer and converter: upload, preview, and export entirely client-side via WASM |
| sql-explorer | Browser-based SQL explorer: upload a .sas7bdat file and query it interactively with SQL via AlaSQL |
To use readstat as a library in your own Rust project, add the readstat crate as a dependency.
Resources
The following have been incredibly helpful while developing!
- How to not RiiR
- Making a *-sys crate
- Rust Closures in FFI
- Rust FFI: Microsoft Flight Simulator SDK
- Stack Overflow answers by Jake Goulding
- ReadStat pull request to add MSVC/Windows support
- jamovi-readstat appveyor.yml file to build ReadStat on Windows
- Arrow documentation for utilizing ArrayBuilders