Getting Started
1. Install
   cargo install readstat-cli
2. Inspect
   readstat metadata file.sas7bdat
3. Preview
   readstat preview file.sas7bdat
4. Convert
   readstat data file.sas7bdat -o out.parquet -f parquet
Metadata
readstat metadata FILE    Display metadata
--as-json                 Output as JSON
--skip-row-count          Skip row count (faster)
--no-progress             Hide progress bar
Includes: row/var counts, table name & label, encoding, format version, bitness, compression, byte order, variable names/types/labels/formats, Arrow types.
Preview
readstat preview FILE      First 10 rows as CSV
--rows 100                 First N rows
--columns A,B,C            Select columns
--columns-file cols.txt    Columns from file
--no-progress              Hide progress bar
Output: always CSV on stdout. Pipe it to head or to column -t -s, or redirect it to a file.
Data Conversion
readstat data FILE -o OUT    Convert (default: CSV)
-f csv                       CSV (default)
-f feather                   Feather / Arrow IPC
-f ndjson                    Newline-delimited JSON
-f parquet                   Apache Parquet
--rows 1000                  Limit rows
--overwrite                  Overwrite existing output
--columns A,B,C              Select columns
--columns-file cols.txt      Columns from file
Parquet Compression
--compression snappy          Fast, moderate ratio
--compression zstd            Best balance
--compression gzip            Wide compatibility
--compression brotli          High ratio
--compression lz4-raw         Fastest decompression
--compression uncompressed    No compression
--compression-level N         Level (codec-specific)
Example: readstat data f.sas7bdat -o f.parquet -f parquet --compression zstd --compression-level 3
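To pick a codec for a particular dataset, it can help to write one Parquet file per codec and compare sizes. The following is a minimal sketch that uses only the flags listed above; the function name compare_codecs and the output naming scheme are hypothetical, and the function is only defined here, not run.

```shell
# Codec names as listed in the table above.
codecs="snappy zstd gzip brotli lz4-raw uncompressed"

# Hypothetical helper: convert one input file once per codec,
# then list the results so their sizes can be compared.
compare_codecs() {
  src=$1
  for codec in $codecs; do
    readstat data "$src" -o "${src%.sas7bdat}.$codec.parquet" -f parquet \
      --compression "$codec" --overwrite
  done
  ls -l "${src%.sas7bdat}".*.parquet
}
```

Usage, assuming readstat is on PATH: compare_codecs f.sas7bdat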
Parallelism
--parallel                      Read chunks in parallel
--parallel-write                Write in parallel (Parquet)
--parallel-write-buffer-mb N    Buffer before spill (default 100)
Note: --parallel increases memory use because all chunks are held in memory at once. --parallel-write currently supports Parquet only. Row order is preserved.
Reader Modes
--reader stream        Default; chunked reads
--reader mem           Read all into memory
--stream-rows 10000    Chunk size in rows (default 10k)
stream keeps memory low for large files. mem is useful for benchmarking. Lower --stream-rows for wide/string-heavy datasets.
Column Selection
Discover columns
readstat metadata FILE --as-json | jq '.vars | to_entries[] | .value.var_name'
Select inline
--columns Brand,Model,EngineSize
Select from file
--columns-file columns.txt    One column name per line; # starts a comment
Works with: both preview and data subcommands.
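A columns file can be written by hand or generated. The sketch below creates one with the column names from the inline example and counts the entries as a sanity check. It assumes comment lines begin with # and occupy their own line; the file name columns.txt and the input name in the final comment are placeholders.

```shell
# Write a columns file: one column name per line, '#' lines are comments.
cat > columns.txt <<'EOF'
# vehicle identity
Brand
Model
# engine
EngineSize
EOF

# Count the column names (skip comment and blank lines).
n_cols=$(grep -v -e '^#' -e '^[[:space:]]*$' columns.txt | wc -l | tr -d ' ')
echo "columns selected: $n_cols"

# With readstat on PATH, pass the file to either subcommand, for example:
# readstat preview cars.sas7bdat --columns-file columns.txt
```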
Metadata Preservation
label            Variable label
sas_format       SAS format string
storage_width    Storage bytes
display_width    Display width hint
table_label      File label (schema-level)
Parquet & Feather preserve SAS metadata as Arrow field metadata. Read with pyarrow.parquet.read_schema() or R's arrow::read_parquet().
Common Workflows
Quick data exploration
readstat metadata data.sas7bdat                   Schema overview
readstat metadata data.sas7bdat --as-json | jq    Programmatic metadata
readstat preview data.sas7bdat --rows 20          Eyeball sample rows
readstat preview data.sas7bdat --columns Name,Age Subset columns
Production conversion
readstat data big.sas7bdat -o big.parquet -f parquet --compression zstd
... --parallel --parallel-write    Max throughput
... --columns-file keep.txt        Only needed columns
... --rows 100000                  Partial extract
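The production command above extends naturally to a whole directory. This is a hedged sketch rather than a shipped script: the helper names parquet_name and convert_dir are hypothetical, and only flags documented above are used. The functions are defined but not invoked, so nothing is converted until convert_dir is called.

```shell
# Hypothetical helper: map an input path to its Parquet output path.
parquet_name() {
  printf '%s\n' "${1%.sas7bdat}.parquet"
}

# Hypothetical helper: convert every .sas7bdat file in a directory.
convert_dir() {
  dir=$1
  for f in "$dir"/*.sas7bdat; do
    # Skip the unexpanded pattern when the directory has no matches.
    [ -e "$f" ] || continue
    readstat data "$f" -o "$(parquet_name "$f")" -f parquet \
      --compression zstd --overwrite
  done
}
```

Usage, assuming readstat is on PATH and ./exports is a directory of SAS files: convert_dir ./exports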
More Examples
web-demo    Browser-based viewer & converter (WASM)
api-demo    REST API servers (Rust/Axum + Python/FastAPI)
bun-demo    Read .sas7bdat from JavaScript via WASM
cli-demo    Shell scripts for batch conversion
Debug & Help
RUST_LOG=debug readstat ...    Verbose debug output
readstat --version             Show version
readstat --help                Top-level help
readstat metadata --help       Subcommand help
Warning: RUST_LOG=debug prints info for every single value, so output is extremely verbose with preview or data!