SQLite¶

SQLite is the first native backend. It is implemented in Rust with sqlx and exposed to Python as the db+sqlite fsspec protocol.

Construction¶

import fsspec

fs = fsspec.filesystem("db+sqlite", database="app.db")

database can be a filesystem path, :memory:, or a SQLite URL such as sqlite:///tmp/app.db.

You can also construct the class directly:

from fsspec_db import SQLiteDatabaseFileSystem

fs = SQLiteDatabaseFileSystem(database="app.db")

Reading Metadata¶

fs.ls("/", detail=False)
fs.ls("/main", detail=False)
fs.info("/main/users")
fs.info("/main/users/columns/id")

The backend introspects:

schemas from PRAGMA database_list;
tables and views from sqlite_master;
columns from PRAGMA table_info;
indexes from PRAGMA index_list and PRAGMA index_info;
primary keys and foreign keys from PRAGMA table_info and PRAGMA foreign_key_list.

Relation lookups check only the requested relation and count rows only for that relation. Schema listings do not count every table.

Reading Data¶

import pyarrow.ipc as ipc

with ipc.open_stream(fs.cat_file("/main/users.arrow")) as reader:
    table = reader.read_all()

table = fs.query("SELECT id, name FROM users WHERE id > ?", [0])

The native path releases the Python GIL while SQLite I/O is running.

Writing Arrow IPC¶

Warning

Overwrite writes replace the table contents. open(path, "wb"), default pipe_file, and default put_file delete existing rows inside the write transaction before inserting the incoming rows. Use "ab" or mode="append" to preserve existing rows.

import pyarrow as pa
import pyarrow.ipc as ipc

table = pa.table({"name": ["ada"], "score": [1.0]})
sink = pa.BufferOutputStream()
with ipc.new_stream(sink, table.schema) as writer:
    writer.write_table(table)

with fs.open("/main/users.arrow", "ab") as file:
    file.write(sink.getvalue().to_pybytes())

"ab" appends rows. "wb" truncates the table with DELETE FROM inside the same transaction, then inserts rows.

Writing Parquet¶

import pyarrow as pa
import pyarrow.parquet as pq

pq.write_table(pa.table({"name": ["grace"], "score": [2.0]}), "rows.parquet")
fs.put_file("rows.parquet", "/main/users.parquet")

put_file truncates by default. Pass mode="append" to append:

fs.put_file("rows.parquet", "/main/users.parquet", mode="append")

Type Handling¶

SQLite values are converted into Arrow arrays with a small affinity mapper. Query results use declared column metadata plus observed non-null runtime value types, so NULL-first and mixed numeric expression columns can still produce numeric Arrow arrays.

SQLite value/type	Arrow type
boolean-like	`bool`
integer-like	`int64`
real/float/double	`float64`
blob/binary	`binary`
everything else	`utf8`

For writes, Arrow Null columns bind SQLite NULL. Temporal arrays are bound as integer epoch values. When a query produces mixed SQLite runtime types in one expression column, cast explicitly in SQL for predictable Arrow output.