Ghee 0.6 - the tastiest way to manage your data
Introducing Ghee 0.6, the latest version of the tastiest way to manage your data!
Ghee is an experiment in leveraging modern filesystem features to implement a data management system, providing a key-value database, Git-style commits, and extensive tools for manipulation of extended attributes (xattrs).
The focus of this release is the introduction of the commit-management subcommands commit, log, restore, and reset. These are modeled after their Git equivalents, but use Btrfs copy-on-write filesystem semantics, including read-only snapshots, to efficiently track changes.
Using Btrfs generalizes change tracking, efficiently handling not only text files, but arbitrary binary blobs as well.
It is hoped that this could lead to a version control system that handles large files in an integrated manner, whereas large file support in Git is tacked on separately - in the case of Git-LFS, requiring an additional server to implement.
When using Ghee as a database, the commit, restore, and reset commands provide transaction-like functionality, allowing modifications to be built incrementally but finalized all at once - and rolled back when mistakes are made.
The main question about what Ghee will be good for is how efficiently it can handle actual database workloads. I suspect the answer will be: not very well, judging from Michael Sproul's experience with Butter DB, which is built on a similar architecture.
But for many workflows, it's not necessary to serve large numbers of queries: Ghee would be well-suited for data scientists developing datasets and statistical models, for example.
It will be interesting to see what happens here at the meeting-place of databases, filesystems, and version control systems. Imagine adding WHERE clauses to your merge command, a la ghee merge origin annotations-20231002 -w age>45.
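As a rough illustration of what a WHERE-filtered merge could mean, here is a small Python sketch. Everything in it is an assumption made for illustration: the merge command above is imagined rather than implemented, the parse_where and merge_where functions are hypothetical, and records are modeled as plain dictionaries rather than files with xattrs.

```python
import operator
import re

# Comparison operators a clause may use.
OPS = {
    ">=": operator.ge, "<=": operator.le, "!=": operator.ne,
    ">": operator.gt, "<": operator.lt, "=": operator.eq,
}


def parse_where(clause):
    """Turn a clause such as 'age>45' into a predicate over records."""
    m = re.fullmatch(r"\s*(\w+)\s*(>=|<=|!=|>|<|=)\s*(.+?)\s*", clause)
    if m is None:
        raise ValueError(f"unsupported WHERE clause: {clause!r}")
    field, op, raw = m.groups()
    try:
        value = float(raw)      # compare numerically when possible
    except ValueError:
        value = raw             # otherwise compare as a string
    compare = OPS[op]

    def predicate(record):
        # Records lacking the field never match.
        return field in record and compare(record[field], value)

    return predicate


def merge_where(dest, source, clause):
    """Return a copy of dest with the matching records from source merged in."""
    keep = parse_where(clause)
    merged = dict(dest)
    for key, record in source.items():
        if keep(record):
            merged[key] = record    # incoming record wins on key conflicts
    return merged


local = {"ann": {"age": 30}}
remote = {"bob": {"age": 50}, "cy": {"age": 40}, "ann": {"age": 46}}
result = merge_where(local, remote, "age>45")
print(sorted(result))
```

Only records satisfying the clause cross over from the remote side; how conflicts should really be resolved is an open design question, and "incoming wins" is just the simplest choice for the sketch.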
Now hosted at Codeberg... check it out and, as always, send bug reports my way!