Columnar Storage

From MemCP
Jump to navigation Jump to search

The advantages for columnar storages over row based storages are the ability for good in-memory compression (low memory usage) and cache locality when accessing only few columns (performance).

How we designed the Interface

When designing an interface for a storage engine for an in-memory database, a lot of considerations have to be made. Here are the design goals:

  • The interface must be simple so that using it does not require implementing all kinds of corner cases (especially with datatypes)
  • The interface must allow fast value fetches for random data access
  • The interface must allow analyzing the data and finding the perfect storage and compression format

Here’s what we came up with:

type ColumnStorage interface {
        GetValue(uint) any // read function
        // buildup functions 1) prepare 2) scan, 3) proposeCompression(), if != nil repeat at 1, 4) init, 5) build; all values are passed through twice
        // analyze
        scan(uint, any)
        proposeCompression() ColumnStorage

        // store
        build(uint, any)

        // serialization

The read interface is pretty simple: GetValue(i) will read the column value at recordId i and return the scmer value. scmer is a kind of „any“ datatype that is used in our scheme interpreter to further process the data.

Data Analysis on Columnar Storages

The write interface is a bit more complicated and basically works in two passes:

  1. analyze phase:
    • prepare() is called
    • every future value that shall be stored will be passed to scan(recordId, value)
    • the columnar storage will analyze the data. If it knows a better storage format than its own class, it will return an other column container when calling proposeCompression()
    • when proposeCompression() returns another container, repeat step 1 analyze phase with the new container
    • otherwise proceed with the store phase
  2. store phase
    • init(size) is called where the storage can allocate the memory for its columns
    • for each value, build(recordId, value) is called to write the data to memory
    • for cleanup of possible reverse dictionaries that are only needed in the store phase, finish() is called

During analyze phase, statistics about the data can be collected. This allows a storage engine to check how wide the integers are or if there are lots of string duplicates.

The classes implementing that interface are then structured as follows:

  • StorageSCMER is a universal value storage; fast but needs lot of RAM
  • StorageSCMER analyzes a column whether it better fits the StorageString or StorageInt scheme; otherwise (i.e. for float64 values, stay at StorageSCMER)
  • StorageInt is a bit compressed integer storage. When analyzed values do not exceed i.e. 15, the storage will only eat up 4 bits per stored integer. The numbers are stored in an array of 64 bit integers which are shifted when reading and writing
  • StorageString is a dictionary compressed string storage using golang slices

With this interface we have built a powerful interface to implement any columnar storage compression format. We can support bit compressed integers as well as dictionaries.


We loaded a bunch of 55MiB of .jsonl files into our storage. This is our memory usage using the storage interface:

As you see, RAM compressed columnar storage needs only about half the RAM compared to disk based InnoDB storage or .json files.

Storage Memory Usage
MySQL InnoDB original storage 55 MiB Disk Space
Export from MySQL in .jsonl format 55 MiB Disk Space
Imported .jsonl as []map[string]any into delta storage 233 MiB RAM
Moved to main storage only using StorageSCMER 81 MiB RAM
Using StorageInt for integer columns 46 MiB RAM
Using StorageString for string columns 58 MiB RAM
Using StorageInt and StorageString 23 MiB RAM

For more info about compression, read the article about Compression