Bundled the tiles into SQLite (I was inspired by seeing Dr. Hipp speak at a conference) and voilà: something easy both to move and to checksum. Tiles were identified by X and Y offset at a given (Z)oom level, which made for super easy indexing in a relational DB like SQLite. On the iPad, it was then easy to give map bundles an application icon, an associated datatype from the file extension, metadata in a table, etc. At the time, I was fairly intimidated by the idea of creating a file format, but databases I knew. And making some CLI tools for working with the files, in any language, was trivial after that.
There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.
What problem does normalization solve? You don't have to parse and walk a tree every time you're looking for data. You would, however, need to rebuild the tree through self-joins or other references in some cases, I suppose. It depends how far you break down your data. I understand that we all see data structures a bit differently, though.
SQLite at least provides functions to make the “querying” part of that straightforward: https://sqlite.org/json1.html
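As a rough sketch of what those JSON1 functions buy you (assuming a SQLite build with the JSON functions enabled, which has been the default for years; table and field names here are made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
db.execute("INSERT INTO docs (body) VALUES (?)", ('{"user": "alice", "age": 30}',))
db.execute("INSERT INTO docs (body) VALUES (?)", ('{"user": "bob", "age": 25}',))

# json_extract pulls a value out of the stored JSON text
rows = db.execute(
    "SELECT json_extract(body, '$.user') FROM docs "
    "WHERE json_extract(body, '$.age') > 28"
).fetchall()
print(rows)  # [('alice',)]

# An expression index makes that predicate cheap, partially answering
# the "you lose indexing" objection for scalar fields
db.execute("CREATE INDEX docs_age ON docs (json_extract(body, '$.age'))")
```

Scalar fields inside the JSON stay queryable and indexable this way; migrations on blob contents are still on you.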
An incomplete list of benefits of using SQLite:
- Runtime config changes for free
- Type safety
- Strong migration support
- Incorrect configurations can be unrepresentable (or at least enforced with CHECK constraints)
- Interactable from text-based interfaces, with strong off-the-shelf GUI support
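The "unrepresentable configurations" point can be sketched with a CHECK constraint (the `config` table and the `log_level` rule are invented for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# A hypothetical config table where an invalid value simply cannot be inserted
db.execute("""
    CREATE TABLE config (
        key   TEXT PRIMARY KEY,
        value TEXT NOT NULL,
        CHECK (key != 'log_level' OR value IN ('debug', 'info', 'warn', 'error'))
    )
""")
db.execute("INSERT INTO config VALUES ('log_level', 'info')")  # accepted
try:
    db.execute("INSERT INTO config VALUES ('log_level', 'verbose')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the constraint fires before bad config can land
```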
Loose type checks (e.g. NOT NULL columns of "usually" text) are loose only compared to typical SQL table definitions. Compared to the leap forward from using text or byte buffers and making arbitrary changes to using abstract tables and changing them with abstract SQL, enforcing data types on columns would be only a marginal further improvement.
Naively, most data looks hierarchical, and the instinctive reaction is to make your file format match. But think of it as a set of documents stacked on top of each other: if you take the data as a bunch of 90-degree slices down through the stack, your data is now relational. You lose the nice native hierarchical format, but you gain all sorts of interesting analysis and extraction options.
It is too bad relational data types tend to be so poorly represented in our programming languages; generally everything has to be mapped back to a hierarchical type.
Maybe a very simple document oriented db would have been better?
My biggest gripe is that the SQLite CLI is brutally minimal (which makes sense given its design); I probably should have been using a nicer CLI.
My issue with SQLite's JSON implementation is that it cannot index arrays. SQLite indexes can only contain a single value per row (except for fulltext indexes but that's not what I want most of the time). SQLite has nothing like GIN indexes in Postgres.
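One common workaround, for what it's worth, is to flatten the array into a side table with `json_each` and index that instead (schema and data here are made up; this is manual bookkeeping that GIN would do for you in Postgres):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
db.execute("CREATE TABLE doc_tags (doc_id INTEGER, tag TEXT)")
db.execute("CREATE INDEX doc_tags_tag ON doc_tags (tag)")

doc = '{"title": "hello", "tags": ["sqlite", "zip"]}'
cur = db.execute("INSERT INTO docs (body) VALUES (?)", (doc,))
doc_id = cur.lastrowid

# json_each flattens the array so each element becomes an indexable row
db.execute(
    "INSERT INTO doc_tags SELECT ?, value FROM json_each(?, '$.tags')",
    (doc_id, doc),
)

# Array-membership lookup now hits a normal single-value index
rows = db.execute("SELECT doc_id FROM doc_tags WHERE tag = 'sqlite'").fetchall()
```

You have to keep `doc_tags` in sync yourself (triggers can help), which is exactly the chore the comment is complaining about.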
- https://github.com/rumca-js/Internet-Places-Database
For the UI I use HTML, because Bootstrap already provides components, and everybody can use it without installing any software.
All data comes from a single SQLite file that is easy to read and serves the data.
My database is really big, so it takes time to browse; I wanted to provide a more meaningful way to limit the scope of searching.
SQLite is very simple, yet very reliable and powerful. Using SQLite as file format might be the best decision an engineer can take when it comes to future-proofing preservation of data.
- files can be individually extracted, in any order, from the archive
- thousands of implementations available, in every language and every architecture. no more than 32KiB RAM needed for decompression
- absolutely no possibility of patent challenges
Any multi-file archive format would do, but ZIP is very portable and supports random access.
If you don't need any table/relational data and are always happy to rewrite the entire file on every save, ZIP is a perfectly fine choice.
It's easier than e.g. a SQLite file with a bunch of individually gzipped blobs.
Alternatively, he could mean that, for the purposes of archiving, ZIP is very far behind the state of the art (no solid compression, old algorithms, small windows, file size limits without the ZIP64 extensions, and so on, most of which are not relevant to using ZIP as a container format).
I’ve reached for ZIP for application containers because it’s really easy, not because of design choices that affect me. Typically the compression is a convenient byproduct but not a requirement, and file size limits could perhaps be an issue, but aren’t something I’ve ever hit when using ZIP for application data. File size limits are something I’ve hit when trying to archive lots of files.
Using ZIP for build pipelines that produce a large number of small files is handy, since it’s often faster than direct file I/O, even on SSDs. In the past it was much faster than spinning media, especially DVDs. These days in Python you can unzip to RAM and treat it like a small file system, and for that, file size limits aren’t an issue in practice.
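The "ZIP in RAM" trick is just the stdlib `zipfile` over a `BytesIO` (file names here are invented):

```python
import io
import zipfile

# Build a ZIP entirely in RAM...
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("assets/config.txt", "color=blue\n")
    zf.writestr("assets/data.txt", "1,2,3\n")

# ...then random-access any member without touching the disk
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    config = zf.read("assets/config.txt")
```

No temp directory, no cleanup, and member reads are seek-based rather than streaming through the whole archive.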
ZIP is used for Java JAR files, OpenOffice documents, and other cases.
The benefit is that individual files in the archive can be accessed individually. A tgz file is a stream which (without extra trickery) can only be extracted from beginning to end, with no seeking to a specific record and no way to easily replace a single file without rewriting everything.
tgz is good enough for distributing packages that are supposed to be extracted all at once (a software distribution).
Also, another use case is exporting data from production to UAT for testing some scenarios; it can be easily encoded in a SQLite file.
CREATE TABLE functions (name TEXT, arch TEXT, body BLOB);
The advantage would be that binaries could be partially fattened: every function would have at least one implementation in some cross-platform bytecode (like WASM), some functions would get compiled to machine code as necessary, and the really performance-dependent functions would have extra rows for different combinations of CPU extensions or compiler optimization levels or whatever, so you could store all of these in the same executable instead of having a bunch of executables for each target.

As a bonus, it'd be possible to embed functions' source code into the executable directly this way, whether for development purposes (kinda like how things are sometimes done in the old-school Smalltalk and Lisp worlds) or for debugging purposes (e.g. when printing stack traces).
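A sketch of how a loader might resolve against that `functions` table, preferring a native body and falling back to the portable bytecode (the function names, arch strings, and byte payloads are all made up):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE functions (name TEXT, arch TEXT, body BLOB)")
db.executemany(
    "INSERT INTO functions VALUES (?, ?, ?)",
    [
        ("hot_loop", "wasm",   b"\x00wasm-bytecode"),
        ("hot_loop", "x86_64", b"\x90native-code"),
        ("helper",   "wasm",   b"\x00wasm-bytecode"),
    ],
)

def resolve(name, host_arch):
    # Prefer a row matching the host arch; otherwise take the wasm fallback
    return db.execute(
        """SELECT arch, body FROM functions
           WHERE name = ? AND arch IN (?, 'wasm')
           ORDER BY (arch = ?) DESC LIMIT 1""",
        (name, host_arch, host_arch),
    ).fetchone()

native = resolve("hot_loop", "x86_64")   # native body wins when present
fallback = resolve("helper", "x86_64")   # only the wasm row exists
```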
Or if your files are large and composed of lots of blobs, then compress those blobs individually.
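Per-blob compression is just a `zlib.compress` on the way in and a `zlib.decompress` on the way out (the table and payload are illustrative):

```python
import sqlite3
import zlib

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE blobs (name TEXT PRIMARY KEY, data BLOB)")

payload = b"some large, repetitive payload " * 1000
# Compress each blob individually on insert...
db.execute("INSERT INTO blobs VALUES (?, ?)", ("big", zlib.compress(payload)))

# ...and decompress on read; the rest of the schema stays plain and queryable
(stored,) = db.execute("SELECT data FROM blobs WHERE name = 'big'").fetchone()
assert zlib.decompress(stored) == payload
```

Only the blob column pays the compression cost; indexes and scalar columns remain uncompressed and fast to query.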
Whereas if your files are large and truly database-y, made of tabular data like integers and floats and small strings, then compression isn't really viable. You usually want speed of lookup, which generally isn't compatible with compression.
> An "SQLite Archive" is a file container similar to a ZIP archive or Tarball but based on an SQLite database.
Your parent comment said that when you're using SQLite as an application format, the content in the database doesn't get compressed. These two things have nothing to do with each other.
On a Mac, you'd e.g. use and mount a disk image if you wanted to create a filesystem inside of a file. Windows has virtual hard drives, and you can do that kind of thing on Linux too.
I don't understand why you'd ever want to use a relational database for that. It's a completely different paradigm.
Although I also don't really understand why you're worried about cluttering up a directory. And if it's transient, isn't that what temp dirs are for?
Well, it might be a relational DB, or else a zipfile. Why couldn't I encapsulate a file tree in a single file? Maybe it's tens of thousands of quite small files.
So why would you want to use SQLite for that is my question? Mounting a database or a table as a filesystem doesn't make much sense to me. There's a very poor fit between the two paradigms. What does a subdirectory mean in a database? What does a foreign key or set of columns mean in a filesystem?
> On a Mac, you'd e.g. use and mount a disk image if you wanted to create a filesystem inside of a file. Windows has virtual hard drives, and you can do that kind of thing on Linux too.
So why wouldn't you use one of these if you need mounting? They're literally made for it.
I continue to not understand why you would want to mount a SQLite database instead of using one of these.
CREATE TABLE image_attributes (name TEXT, value BLOB);
CREATE TABLE layers (id TEXT, parent_id TEXT, sequence INTEGER, uti TEXT, name TEXT, data BLOB);
CREATE TABLE layer_attributes (id TEXT, name TEXT, value BLOB);
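With a schema like that, compositing order falls straight out of an ORDER BY (the layer rows below are invented, and the pixel data is a placeholder):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE layers (id TEXT, parent_id TEXT, sequence INTEGER, "
    "uti TEXT, name TEXT, data BLOB)"
)
db.executemany(
    "INSERT INTO layers VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("l1", None, 0, "public.png", "background", b"<pixels>"),
        ("l2", None, 1, "public.png", "foreground", b"<pixels>"),
    ],
)

# Walk the layers bottom-to-top, as a renderer would
names = [row[0] for row in db.execute("SELECT name FROM layers ORDER BY sequence")]
```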
Also, document-based apps that use Apple's Core Data framework (kinda an ORM) usually use SQLite files for storage.

> and is backwards compatible to its inception in 2004 and which promises to continue to be compatible in decades to come.
That is pretty amazing. You could do a lot worse.
Doesn't mean that whatever the app stores inside will remain backward compatible which is the harder problem to solve.
I always marvel when people can sell ideas or products so effectively.