PostgreSQL and MySQL¶
PostgreSQL and MySQL are native Rust backends implemented with sqlx. They use the same
filesystem path model as SQLite and are exposed as db+postgresql, db+postgres, and
db+mysql.
Construction¶
import fsspec
pg = fsspec.filesystem(
"db+postgresql",
dsn="postgresql://user:password@localhost:5432/app",
)
mysql = fsspec.filesystem(
"db+mysql",
dsn="mysql://user:password@localhost:3306/app",
)
URL-style construction is also supported:
pg, _ = fsspec.core.url_to_fs("db+postgresql://user:password@localhost:5432/app")
mysql, _ = fsspec.core.url_to_fs("db+mysql://user:password@localhost:3306/app")
Both backends also accept individual connection options:
pg = fsspec.filesystem(
"db+postgresql",
host="localhost",
port=5432,
database="app",
user="user",
password="password",
sslmode="require",
)
mysql = fsspec.filesystem(
"db+mysql",
host="localhost",
port=3306,
database="app",
user="user",
password="password",
ssl_mode="REQUIRED",
charset="utf8mb4",
)
Pool sizing can be set for both networked backends with min_connections and
max_connections:
pg = fsspec.filesystem(
"db+postgresql",
dsn="postgresql://user:password@localhost:5432/app",
min_connections=1,
max_connections=8,
)
These options also work through fsspec config files. For example, a config section named
db+postgresql or db+mysql can provide the same host, user, password,
database, min_connections, and max_connections keys; explicit constructor
arguments still win. If dsn or url is passed together with connection fields such
as database or user, the explicit source is used and those connection fields are
consumed instead of being passed through as generic fsspec storage options.
Metadata¶
PostgreSQL introspection uses information_schema plus pg_catalog for indexes,
constraints, estimated row counts, and view definitions. MySQL uses information_schema
for schemata, tables, columns, constraints, views, and statistics index metadata.
fs.ls("/", detail=False)
fs.ls("/public", detail=False)
fs.info("/public/users")
fs.info("/public/users/columns/id")
fs.info("/public/users/indexes/users_pkey")
MySQL database names appear as top-level schemas. PostgreSQL excludes pg_catalog,
information_schema, and toast schemas from root listings. MySQL excludes
information_schema, mysql, performance_schema, and sys.
Reads and Queries¶
Path reads generate dialect-aware SELECT statements:
data = pg.cat_file("/public/users.parquet")
table = mysql.query("SELECT id, name FROM users WHERE id > ?", [0])
Raw query() SQL uses the target database driver’s placeholder syntax:
Backend |
Placeholder example |
|---|---|
PostgreSQL |
|
MySQL |
|
SQLite |
|
Writes¶
Writes decode Arrow IPC, Parquet, CSV, or JSONL bytes and insert Arrow batches into the target relation.
with pg.open("/public/users.arrow", "ab") as file:
file.write(arrow_ipc_bytes)
mysql.put_file("rows.parquet", "/app/users.parquet", mode="append")
Overwrite writes replace table contents before inserting incoming rows. PostgreSQL uses
TRUNCATE TABLE for overwrite mode. MySQL uses DELETE FROM so the write remains in the
same transaction as the insert.
Type Handling¶
The sqlx backends convert common database scalar types into Arrow:
Backend |
Database type family |
Arrow type |
|---|---|---|
PostgreSQL |
|
|
PostgreSQL |
|
|
PostgreSQL |
|
|
PostgreSQL |
|
|
PostgreSQL |
|
|
MySQL |
|
|
MySQL |
signed and unsigned integer families |
|
MySQL |
|
|
MySQL |
|
|
MySQL |
|
|
Unsigned MySQL integers that do not fit in Arrow int64 return an error. Decimal,
temporal, geometry, and other specialized types should be cast in SQL until richer Arrow
type mappings land.
Arrow Extraction Boundary¶
All native backends expose query results through Database::query() -> RecordBatchStream.
The current sqlx drivers build Arrow from rows. The Rust trait also records an
ArrowExtraction strategy so future native-Arrow readers such as connector-x or
arrow-odbc can be added as optional driver implementations without changing fsspec path
semantics.