Concepts¶

fsspec-db maps database structure onto the fsspec filesystem model. Schemas, tables, views, columns, indexes, and constraints become paths. Table data becomes files whose extension selects the transfer format.

Layers¶

The implementation has three layers:

A Rust Database trait describes database primitives: list schemas, list relations, inspect metadata, run queries, and insert Arrow batches.
Rust DatabaseFs<D> turns a Database implementation into an fsspec_rs::FileSystem. It owns path parsing, metadata shaping, SQL generation for table reads, and format encoding/decoding.
Python filesystem classes subclass fsspec.AbstractFileSystem and delegate primitives to the PyO3 bridge. Python gets normal fsspec behavior while Rust remains the source of truth for database path semantics.

The native SQLite, PostgreSQL, and MySQL backends use sqlx. Python-defined databases implement AbstractDatabase; the reverse bridge lets Rust call that Python object through the same DatabaseFs path.

Data Model¶

info() and ls(detail=True) return ordinary fsspec dictionaries with these common keys:

Key	Meaning
`name`	Absolute fsspec-db path, without protocol.
`type`	`"directory"` for schemas, relations, and facets; `"file"` for metadata items and materialized data.
`size`	Byte size when known. Materialized table reads usually learn size only after encoding.
`kind`	fsspec-db object kind, such as `schema`, `table`, `view`, `column`, `index`, or `constraint`.

Extra metadata is stored directly in the same dictionary:

Path	Extra fields
Relation directory	`kind`, optional `row_count`, optional `size_bytes`.
Column item	`data_type`, `nullable`, `ordinal`, `primary_key`, optional `default`.
Index item	`columns`, `unique`, optional `method`.
Constraint item	`kind`, `columns`, optional `references`, optional `expr`.
Data file	`format`, `dialect`, `size_known`.

Reads¶

Reading a data path runs a generated SELECT against the relation, converts rows to Arrow, then encodes the result based on the path extension:

Extension	Bytes returned
`.arrow`	Arrow IPC stream
`.parquet`	Parquet
`.csv`	CSV with a header
`.jsonl`	Arrow JSON line-delimited records
`.sql`	DDL or view definition text

fs.query(sql, params=None) is intentionally separate from path reads. It accepts raw SQL, binds parameters, and returns a pyarrow.Table.

Writes¶

Writes decode incoming Arrow-compatible bytes and call Database.insert():

Operation	Insert mode
`open(path, "wb")`	truncate relation, then insert rows
`open(path, "ab")`	append rows
`pipe_file(path, bytes)`	truncate by default
`pipe_file(path, bytes, mode="append")`	append
`put_file(local, path)`	truncate by default
`put_file(local, path, mode="append")`	append

DDL writes are deliberately not part of the early surface. Creating or dropping tables will be a guarded later feature.

Boundaries¶

Current native support is SQLite, PostgreSQL, and MySQL. These backends handle common Arrow scalar types: booleans, integers, floats, UTF-8 strings, binary values, and all-null columns. SQLite also binds temporal arrays as integer epoch values. Decimal, temporal, and specialized PostgreSQL/MySQL types should be cast in SQL until richer Arrow mappings land.