Persistence
TongGraph can run entirely in memory or against a local SQLite file. The SQLite backend is the source of truth for metadata and records, while compacted compute segments are stored in sidecar files next to the database.
What SQLite Stores
SQLite tables are initialized in src/sqlite.rs and cover:
- graph nodes and edges
- node and edge property rows
- property key/value catalogs
- operation log entries
- full-text index definitions and SQLite FTS5 derived rows
- vector index definitions and little-endian float32 vector BLOBs
- variables and ordered variable states
- factor metadata and factor tables
- latest posteriors
- evidence and traces
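As an illustration of the little-endian float32 vector BLOBs mentioned in the list above, the following minimal sketch packs and unpacks a vector with Python's standard struct module. The helper names encode_vector and decode_vector are hypothetical and are not part of TongGraph's API, and the length check is only an assumed example of the kind of validation a reader might apply.

```python
import struct


def encode_vector(values):
    """Pack floats into a little-endian float32 BLOB (4 bytes per component)."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_vector(blob):
    """Unpack a little-endian float32 BLOB back into a list of Python floats."""
    if len(blob) % 4 != 0:
        # Assumed sanity check: a float32 BLOB must be a whole number of 4-byte words.
        raise ValueError("BLOB length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


blob = encode_vector([1.0, 0.5, -2.0])
```

The "<" prefix in the format string forces little-endian byte order regardless of the host platform, which is what makes such a BLOB portable between machines.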
Storage format marker
The metadata table records storage_format = tonggraph-sqlite-v1.
Pre-v1 storage compatibility
TongGraph does not promise stable SQLite tables or segment formats before
v1. Treat local databases and .segments/ directories as disposable
0.x artifacts unless your application owns its own export path.
Segment Sidecars
Compacted adjacency segments are stored under:
The manifest includes the segment format, node count, edge count, and segment file name. When a graph reopens, TongGraph checks that the sidecar matches the expected node and edge counts before loading it. If no usable segment exists, or the manifest/file is corrupt, the core rebuilds one from SQLite records and re-saves a valid sidecar.
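The reopen check described above could look roughly like the sketch below. It is not the actual implementation (which lives in the Rust core). The "key=value" manifest layout and the field names nodes, edges, and file are assumptions made for illustration, as is the helper name usable_segment.

```python
from pathlib import Path


def parse_manifest(text):
    """Parse hypothetical 'key=value' manifest lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def usable_segment(segments_dir, expected_nodes, expected_edges):
    """Return the segment path if the sidecar matches the expected counts, else None."""
    manifest_path = Path(segments_dir) / "manifest.txt"
    try:
        fields = parse_manifest(manifest_path.read_text())
        nodes = int(fields["nodes"])
        edges = int(fields["edges"])
        segment_path = Path(segments_dir) / fields["file"]
    except (OSError, KeyError, ValueError):
        return None  # missing or corrupt manifest: caller rebuilds from SQLite
    if (nodes, edges) != (expected_nodes, expected_edges):
        return None  # stale sidecar: counts no longer match the database
    if not segment_path.is_file():
        return None  # manifest points at a segment file that does not exist
    return segment_path
```

A caller would treat a None result as the signal to rebuild the segment from SQLite records and write a fresh sidecar.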
Write Flow
sequenceDiagram
participant Py as Python Graph
participant Core as GraphCore
participant SQL as SQLite
participant Seg as Segment sidecar
Py->>Core: add_node / add_edge / add_variable
Core->>Core: validate IDs and scalar properties
Core->>SQL: insert record and append op_log
Core->>Core: update indexes and delta adjacency
Py->>Core: compact()
Core->>Seg: write segment-v1.bin and manifest.txt
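The "insert record and append op_log" step in the diagram can be sketched with Python's sqlite3 module. The table layout, column names, and the add_node helper below are hypothetical and do not reflect the schema in src/sqlite.rs. The point is only that the record and its log entry are written in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE op_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                         op TEXT NOT NULL, entity_id INTEGER NOT NULL);
    """
)


def add_node(conn, node_id, label):
    """Validate input, then insert the record and its op_log entry atomically."""
    if not isinstance(node_id, int) or isinstance(node_id, bool) or node_id < 0:
        raise ValueError("node_id must be a non-negative int")
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("INSERT INTO nodes (id, label) VALUES (?, ?)", (node_id, label))
        conn.execute(
            "INSERT INTO op_log (op, entity_id) VALUES ('add_node', ?)", (node_id,)
        )
```

Because both statements run inside the same transaction, a failed insert leaves neither a record nor a log entry behind.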
Auto-Compaction
SQLite-backed graphs keep recent edge writes in a mutable delta overlay.
Compaction can be triggered manually through Graph.compact().
The core also auto-compacts when the delta overlay grows beyond the current
thresholds in src/core/lifecycle.rs.
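The idea of a mutable delta overlay with threshold-based compaction can be illustrated with a toy model. The class below is a simplified sketch, not the core's data structure, and the threshold value is made up. The real thresholds are defined in src/core/lifecycle.rs.

```python
class DeltaAdjacency:
    """Toy adjacency store: a compacted base plus a mutable delta overlay."""

    def __init__(self, compact_threshold=4):
        self.base = {}        # compacted adjacency: node -> tuple of neighbours
        self.delta = {}       # recent edge writes: node -> list of neighbours
        self.delta_edges = 0  # number of edges currently held in the overlay
        self.compact_threshold = compact_threshold  # hypothetical value

    def add_edge(self, src, dst):
        self.delta.setdefault(src, []).append(dst)
        self.delta_edges += 1
        if self.delta_edges > self.compact_threshold:
            self.compact()  # auto-compaction once the overlay grows too large

    def neighbors(self, node):
        # Reads see the compacted base first, then the uncompacted delta.
        return list(self.base.get(node, ())) + self.delta.get(node, [])

    def compact(self):
        """Fold the delta overlay into the base and clear it."""
        for node, extra in self.delta.items():
            self.base[node] = tuple(self.base.get(node, ())) + tuple(extra)
        self.delta.clear()
        self.delta_edges = 0
```

Reads return the same neighbours before and after compaction. Only the place where the edges are held changes.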
Operational Notes
- SQLite uses WAL journal mode, full synchronous mode, foreign keys, and a busy timeout for local ACID-oriented writes.
- A Graph handle is a single live writer view. If another handle appends to the same SQLite database, stale handles raise a refresh-required error before writing. Call Graph.refresh() to reload from SQLite.
- add_nodes() and add_edges() use one SQLite transaction per batch.
- Node and edge updates/deletions commit as one SQLite change set before the in-memory state is published. Property catalogs are rebuilt inside that transaction so removed values do not remain indexed.
- Monotonic next-node and next-edge IDs are stored in metadata so deleted IDs are not reused after reopening a graph.
- Full-text definitions are durable. FTS5 rows are synchronized inside graph mutation transactions and rebuilt from graph records whenever a graph opens.
- Vector definitions and embeddings are durable. Vector batches are atomic, entity deletion removes associated vectors in the same transaction, and graph open validates persisted BLOBs before publishing state.
- Properties are limited to Python-compatible scalar values: bool, int, finite float, and str.
- Local .db, .db-shm, .db-wal, and .segments/ artifacts are ignored by the repository.
- There is no network service, authentication layer, or distributed storage mode in the current codebase.
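For readers who want to inspect a database file with comparable settings, the connection options named in the first note (WAL journal mode, full synchronous mode, foreign keys, and a busy timeout) map onto standard SQLite PRAGMA statements. The sketch below uses Python's sqlite3 module. The open_connection helper and the 5000 ms timeout are assumptions for illustration, not values taken from TongGraph.

```python
import sqlite3


def open_connection(path, busy_timeout_ms=5000):
    """Open a SQLite connection with WAL, FULL sync, foreign keys, and a busy timeout."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode = WAL")    # write-ahead log (.db-wal, .db-shm files)
    conn.execute("PRAGMA synchronous = FULL")    # sync on every commit
    conn.execute("PRAGMA foreign_keys = ON")     # enforce foreign key constraints
    conn.execute(f"PRAGMA busy_timeout = {int(busy_timeout_ms)}")  # wait on locks, in ms
    return conn
```

WAL mode only applies to file-backed databases. An in-memory connection reports its journal mode as memory instead.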