Concepts
fsspec-db maps database structure onto the fsspec filesystem model. Schemas, tables,
views, columns, indexes, and constraints become paths. Table data becomes files whose
extension selects the transfer format.
Layers
The implementation has three layers:
- A Rust Database trait describes database primitives: list schemas, list relations, inspect metadata, run queries, and insert Arrow batches.
- Rust DatabaseFs<D> turns a Database implementation into an fsspec_rs::FileSystem. It owns path parsing, metadata shaping, SQL generation for table reads, and format encoding/decoding.
- Python filesystem classes subclass fsspec.AbstractFileSystem and delegate primitives to the PyO3 bridge. Python gets normal fsspec behavior while Rust remains the source of truth for database path semantics.
The native SQLite, PostgreSQL, and MySQL backends use sqlx. Python-defined databases
implement AbstractDatabase; the reverse bridge lets Rust call that Python object through
the same DatabaseFs path.
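
The layering can be pictured with a small pure-Python toy. This is only a sketch of the delegation pattern, not the library's API: ToyDatabase, InMemoryDatabase, ToyDatabaseFs, and the /schema/relation path layout are all hypothetical names and assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class ToyDatabase(ABC):
    """Hypothetical stand-in for the database primitive layer."""

    @abstractmethod
    def list_schemas(self) -> list[str]: ...

    @abstractmethod
    def list_relations(self, schema: str) -> list[str]: ...


class InMemoryDatabase(ToyDatabase):
    """Toy backend: a dict mapping schema name to relation names."""

    def __init__(self, catalog: dict[str, list[str]]) -> None:
        self._catalog = catalog

    def list_schemas(self) -> list[str]:
        return sorted(self._catalog)

    def list_relations(self, schema: str) -> list[str]:
        return sorted(self._catalog[schema])


class ToyDatabaseFs:
    """Owns path parsing and delegates every lookup to a ToyDatabase."""

    def __init__(self, db: ToyDatabase) -> None:
        self._db = db

    def ls(self, path: str) -> list[str]:
        # Assumed layout for this sketch: /<schema>/<relation>
        parts = [p for p in path.split("/") if p]
        if not parts:
            # Root lists one entry per schema.
            return ["/" + s for s in self._db.list_schemas()]
        if len(parts) == 1:
            # A schema lists one entry per relation.
            schema = parts[0]
            return [f"/{schema}/{r}" for r in self._db.list_relations(schema)]
        raise FileNotFoundError(path)


fs = ToyDatabaseFs(InMemoryDatabase({"main": ["users", "orders"]}))
print(fs.ls("/"))
print(fs.ls("/main"))
```

Because the filesystem wrapper only talks to the abstract primitives, swapping the backend does not touch the path logic, which is the property the three layers are built around.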
Data Model
info() and ls(detail=True) return ordinary fsspec dictionaries with these common keys:
| Key | Meaning |
|---|---|
| | Absolute fsspec-db path, without protocol. |
| | |
| | Byte size when known. Materialized table reads usually learn size only after encoding. |
| | fsspec-db object kind, such as |
Extra metadata is stored directly in the same dictionary:
| Path | Extra fields |
|---|---|
| Relation directory | |
| Column item | |
| Index item | |
| Constraint item | |
| Data file | |
Reads
Reading a data path runs a generated SELECT against the relation, converts rows to Arrow,
then encodes the result based on the path extension:
| Extension | Bytes returned |
|---|---|
| | Arrow IPC stream |
| | Parquet |
| | CSV with a header |
| | Arrow JSON line-delimited records |
| | DDL or view definition text |
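
The idea that the path extension selects the encoder can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: it skips the Arrow step and works on plain tuples, covers only the two text formats, and the extension names .csv and .jsonl as well as the helper names are hypothetical.

```python
import csv
import io
import json
from pathlib import PurePosixPath


def encode_csv(columns, rows):
    """Encode rows as CSV bytes with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def encode_json_lines(columns, rows):
    """Encode rows as one JSON object per line."""
    lines = [json.dumps(dict(zip(columns, row))) for row in rows]
    return ("\n".join(lines) + "\n").encode("utf-8")


# Hypothetical extension names chosen for this sketch.
ENCODERS = {".csv": encode_csv, ".jsonl": encode_json_lines}


def read_data_path(path, columns, rows):
    """Pick an encoder from the path extension and return the encoded bytes."""
    suffix = PurePosixPath(path).suffix
    try:
        encoder = ENCODERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported extension: {suffix!r}") from None
    return encoder(columns, rows)


columns = ["id", "name"]
rows = [(1, "ada"), (2, "grace")]
print(read_data_path("/main/users/data.csv", columns, rows).decode())
print(read_data_path("/main/users/data.jsonl", columns, rows).decode())
```

Keeping the dispatch in one table means a new format only adds one encoder function and one dictionary entry.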
fs.query(sql, params=None) is intentionally separate from path reads. It accepts raw SQL,
binds parameters, and returns a pyarrow.Table.
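
As a rough analogy for what a raw query with bound parameters looks like, the sketch below uses the standard sqlite3 module directly. It is not fs.query itself: the query helper here is hypothetical and returns plain column names and row tuples rather than a pyarrow.Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])


def query(sql, params=None):
    """Run a raw SELECT with bound parameters; return (columns, rows)."""
    cursor = conn.execute(sql, params or ())
    columns = [d[0] for d in cursor.description]
    return columns, cursor.fetchall()


# The value is bound by the driver, never spliced into the SQL text.
columns, rows = query("SELECT name FROM users WHERE id = ?", (2,))
print(columns, rows)
```

Binding keeps user-supplied values out of the SQL string, which is why a raw-SQL entry point takes params as a separate argument.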
Writes
Writes decode incoming Arrow-compatible bytes and call Database.insert():
| Operation | Insert mode |
|---|---|
| | truncate relation, then insert rows |
| | append rows |
| | truncate by default |
| | append |
| | truncate by default |
| | append |
DDL writes are deliberately not part of the early surface. Creating or dropping tables will be a guarded later feature.
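
The difference between the two insert modes can be shown against SQLite with the standard library. This is a sketch only: the insert helper, its mode argument, and the choice to run truncate and insert in a single transaction are assumptions of this example, not documented behavior of Database.insert().

```python
import sqlite3


def insert(conn, table, rows, mode="truncate"):
    """Insert rows into a two-column table, truncating first unless appending."""
    if mode not in ("truncate", "append"):
        raise ValueError(f"unknown insert mode: {mode!r}")
    # The table name is interpolated, so it must come from trusted code.
    with conn:  # one transaction: the delete and the inserts commit together
        if mode == "truncate":
            conn.execute(f'DELETE FROM "{table}"')
        conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

insert(conn, "users", [(1, "ada")])                   # truncate, then insert
insert(conn, "users", [(2, "grace")], mode="append")  # keep existing rows
after_append = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()

insert(conn, "users", [(3, "edsger")])                # truncate again
after_truncate = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()

print(after_append)
print(after_truncate)
```

Running the truncate and the insert in one transaction means a failed insert does not leave the relation empty, which is one reasonable way to make a truncating write safe.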
Boundaries
Current native support is SQLite, PostgreSQL, and MySQL. These backends handle common Arrow scalar types: booleans, integers, floats, UTF-8 strings, binary values, and all-null columns. SQLite also binds temporal arrays as integer epoch values. Decimal, temporal, and specialized PostgreSQL/MySQL types should be cast in SQL until richer Arrow mappings land.
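
Casting in SQL means shaping the column into a supported scalar type before it leaves the database. The sketch below shows the idea with the standard sqlite3 module; the orders table and its columns are hypothetical, and the exact cast syntax differs between SQLite, PostgreSQL, and MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (price DECIMAL(10, 2), created_at TEXT)")
conn.execute("INSERT INTO orders VALUES (19.99, '2024-01-01 00:00:00')")

# Cast the decimal to a string and the timestamp to integer epoch seconds,
# so the result only contains UTF-8 strings and integers.
row = conn.execute(
    """
    SELECT CAST(price AS TEXT) AS price_text,
           CAST(strftime('%s', created_at) AS INTEGER) AS created_epoch
    FROM orders
    """
).fetchone()
print(row)
```

A statement like this can be passed as raw SQL, so the cast happens in the database and the result contains only the common scalar types listed above.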