Persistency and Performance Guarantees

From MemCP

Latest revision as of 22:23, 8 October 2024

MemCP gives the user several guarantees for persistency and performance. They are described below.

Performance Guarantees

MemCP guarantees that:

  • Inserts and updates that need no unique key check or foreign key check will scale across shards
  • Aggregate functions like SUM and AVG will scale with the number of CPU cores

With these guarantees, you can add more CPU cores as your data grows to keep query times the same.
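
The reason aggregates can scale over shards is that SUM and AVG can be computed as independent partial results that are combined at the end. The following is a minimal sketch of that pattern, not MemCP's implementation: the function names, the shard layout and the thread pool are assumptions made for illustration. Python threads only show the structure here; they do not give real multi-core speedup for this workload.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(values, shard_count):
    """Split a list into contiguous, roughly equal shards (hypothetical layout)."""
    if not values:
        return []
    size = max(1, -(-len(values) // shard_count))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def sharded_sum(values, shard_count=4):
    """Sum each shard independently, then combine the partial sums."""
    shards = split_into_shards(values, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        partial_sums = list(pool.map(sum, shards))
    return sum(partial_sums)


def sharded_avg(values, shard_count=4):
    """AVG needs two partial results per shard: a sum and a row count."""
    if not values:
        raise ValueError("cannot average an empty column")
    shards = split_into_shards(values, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        partials = list(pool.map(lambda s: (sum(s), len(s)), shards))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count
```

No shard needs to see another shard's rows, so the per-shard work can be handed to separate cores and only the small partial results have to be merged.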

Keep in mind that memory bandwidth can still be a bottleneck. Even modern 128-core CPUs do not scale better than 8 cores on uncompressed data (usually about 40 GiB/s of bandwidth on tested machines). MemCP's Columnar Storage scales beyond that limit.
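
The bandwidth limit can be made concrete with a little arithmetic. The sketch below estimates how long one pass over a column takes when memory bandwidth is the only limit. The 40 GiB/s default comes from the text above; the compression ratio is a hypothetical parameter, not a measured MemCP figure, and the model assumes decompression itself costs nothing.

```python
def scan_seconds(data_gib, bandwidth_gib_s=40.0, compression_ratio=1.0):
    """Estimate the time to stream data_gib of logical data from memory once.

    compression_ratio is logical size divided by stored size (assumed value).
    """
    if bandwidth_gib_s <= 0 or compression_ratio <= 0:
        raise ValueError("bandwidth and compression ratio must be positive")
    stored_gib = data_gib / compression_ratio  # bytes that actually cross the bus
    return stored_gib / bandwidth_gib_s


uncompressed = scan_seconds(80)                       # 80 GiB at 40 GiB/s
compressed = scan_seconds(80, compression_ratio=4.0)  # assumed 4:1 ratio
```

Under these assumptions, scanning 80 GiB takes 2 seconds uncompressed no matter how many cores are added, while a column stored at a quarter of its size moves only 20 GiB and takes 0.5 seconds.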

Persistency Guarantees

There are four persistency modes, selectable per table:

  • ENGINE = memory
  • ENGINE = sloppy
  • ENGINE = logged
  • ENGINE = safe

In contrast to other RDBMSs, MemCP lets you choose this setting per table. MySQL/MariaDB, for instance, only allow a global setting, which is sometimes not what you want.
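
Because the mode is chosen per table, each table can be matched to its own durability needs. The helper below is a hypothetical decision aid, not part of MemCP. It only encodes the usage advice given in the engine sections of this page, and its parameter names are assumptions.

```python
def choose_engine(recreatable, tolerates_recent_loss, must_survive_power_loss):
    """Map a table's durability needs to one of the four engine names.

    recreatable: the software can rebuild the data (sessions, caches)
    tolerates_recent_loss: losing the last minutes of changes is acceptable
    must_survive_power_loss: no acknowledged write may ever be lost
    """
    if recreatable:
        return "memory"
    if tolerates_recent_loss:
        return "sloppy"
    if must_survive_power_loss:
        return "safe"
    return "logged"


# Hypothetical tables in one database, each with its own mode.
table_engines = {
    "sessions": choose_engine(True, True, False),
    "sensor_readings": choose_engine(False, True, False),
    "documents": choose_engine(False, False, False),
    "ledger": choose_engine(False, False, True),
}
```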

ENGINE = memory

  • all data is held in memory and only in memory
  • in case of a crash, all data is gone
  • the schema is saved on disk
  • after a recovery, the table starts empty
  • fastest way to store data
  • use it for session data, observer handles, caches and other data that can be recreated by the software

ENGINE = sloppy

  • all data is held in memory
  • the main storage is mirrored on disk
  • the delta storage is RAM-only
  • a main storage rebuild is triggered every 15 minutes, so data older than 15 minutes is guaranteed to be persistent
  • in case of a crash, the delta storage is gone, the main storage is recovered
  • after a recovery, some inserts or deletions that happened in the last 15 minutes before the crash may be gone
  • extremely fast way to store data without the fear of losing it in normal operation
  • use it for frequently updated tables with unimportant data like usage statistics or sensor data
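
The interplay of main storage and delta storage in this mode can be pictured with a toy model. The class below is a sketch under stated assumptions, not MemCP code: the class and method names are invented, plain lists stand in for the on-disk main storage and the RAM-only delta, and deletions are left out for brevity.

```python
class SloppyTableModel:
    """Toy model of a persisted main storage plus a RAM-only delta storage."""

    def __init__(self):
        self.main_on_disk = []  # stands in for the main storage mirrored on disk
        self.delta_in_ram = []  # recent changes, lost on a crash

    def insert(self, row):
        """New rows land in the delta storage only."""
        self.delta_in_ram.append(row)

    def rows(self):
        """Queries see main storage and delta storage together."""
        return self.main_on_disk + self.delta_in_ram

    def rebuild_main(self):
        """Merge the delta into main storage (every 15 minutes per the text)."""
        self.main_on_disk = self.main_on_disk + self.delta_in_ram
        self.delta_in_ram = []

    def crash_and_recover(self):
        """After a crash only the main storage comes back."""
        self.delta_in_ram = []
```

In this model a row inserted before the last rebuild survives a crash, while a row inserted after it does not, which is the 15-minute window described in the list above.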

ENGINE = logged

  • all data is held in memory
  • the main storage is mirrored on disk
  • changes to the delta storage are logged on disk files
  • an operation succeeds even if data is not synced to disk permanently yet
  • in case of a crash, recovery depends on the kind of crash
    • in case of a crash of the process, all data will be recovered
    • in case of a power outage or kernel crash, data might be lost
  • allows buffering of the log file in return for the risk of data loss
  • use it for data where you need high update performance but cannot afford to lose the last 15 minutes of your work
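
The difference between a process crash and a power outage in this mode can be illustrated with a toy model. It assumes that a log write reaches the operating system immediately but is synced to the physical disk only later; that is a common buffering scheme, not a confirmed detail of MemCP. All names are hypothetical, and main storage rebuilds are left out.

```python
class LoggedTableModel:
    """Toy model of a delta log that is buffered before it reaches the disk."""

    def __init__(self):
        self.delta_in_ram = []       # lost whenever the process dies
        self.log_in_os_buffer = []   # written, but not yet synced (assumption)
        self.log_on_disk = []        # synced, survives a power outage

    def insert(self, row):
        """The operation succeeds without waiting for the disk."""
        self.delta_in_ram.append(row)
        self.log_in_os_buffer.append(row)

    def background_sync(self):
        """The buffered log entries eventually reach the disk."""
        self.log_on_disk += self.log_in_os_buffer
        self.log_in_os_buffer = []

    def recover(self, kernel_survived):
        """Replay the log after a crash to rebuild the delta storage."""
        if kernel_survived:
            self.background_sync()      # the OS still flushes its buffers
        else:
            self.log_in_os_buffer = []  # unsynced entries are lost
        self.delta_in_ram = list(self.log_on_disk)

    def rows(self):
        return list(self.delta_in_ram)
```

Replaying the log restores every row after a process crash, but after a power outage only the rows that had already been synced come back.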

ENGINE = safe

  • all data is held in memory
  • the main storage is mirrored on disk
  • changes to the delta storage are logged on disk files
  • an operation only succeeds after the data is synced to disk permanently
  • in case of a crash, all data will be recovered
  • introduces delays for each transaction since the system has to wait for a write fence
  • IO bound (limited to ~1,700 write operations per second)
  • use it for accounting data or any data that must not be lost
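
The cost of waiting for the disk can be sketched with ordinary file operations. This is an illustration of the general technique of syncing a log before acknowledging a write, not MemCP's code: the function names and the file path are assumptions, and os.fsync stands in for whatever write fence MemCP actually uses.

```python
import os


def append_record(path, record, sync):
    """Append one log record; with sync=True, return only after it is on disk."""
    with open(path, "ab") as log:
        log.write(record.encode("utf-8") + b"\n")
        if sync:
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # ask the OS to push it to the device


def read_records(path):
    """Read the log back, as a recovery step would."""
    with open(path, "rb") as log:
        return [line.decode("utf-8").rstrip("\n") for line in log]


def seconds_per_synced_write(writes_per_second=1700):
    """Average time budget per write at the IO limit quoted above."""
    return 1.0 / writes_per_second
```

At roughly 1,700 synced writes per second, each write has a budget of about 0.59 milliseconds, which is why this mode is IO bound while the buffered mode is not.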