Current Status and Open Issues

Latest revision as of 19:38, 20 November 2024

MemCP still has several open TODOs, categorized as follows:

Storage Engine

  • Allow ALTER TABLE ENGINE = ...
  • Garbage Collection with an LRU policy on temporary columns
  • Triggers and change hooks on computed columns
  • Respect Foreign Keys
  • Serialize and Deserialize into MMapped big files (these bigfiles must be organized as key-value stores)
  • iterateIndex: sort delta storage and merge it with main, so that correct order is always guaranteed
  • merge join: (scan_star schema tbls[] joincols[] filtercols[][] filterfn mapcols[][] mapfn reduce neutral)
  • processes: implement a kill switch in sync.go and set a correct context in mysql.go; a process ID concept is also missing in the MySQL connection library that MemCP depends on
  • transactions: map[shard]{deletionOverlay, insertionOverlay NonBlockingBitmap}
    • inserts are stored as deleted in the main view; the insertionOverlay records that the deletion is reversed after commit
    • during commit, all shards are locked at the same time
    • if deletions & deletionOverlay != 0, abort the transaction (write-after-write conflict)
    • deletions = (deletions | deletionOverlay) & (^insertionOverlay) to apply the changes
    • during scans, the overlays must be respected
  • Memory-Mapped Serialize & Deserialize
    • import "github.com/edsrzf/mmap-go" (source: https://github.com/edsrzf/mmap-go/blob/main/mmap.go)
    • mmap.Map(file, mmap.RDWR, 0)
    • map blocks in 100 GiB chunks per database
    • encode a map[string]blob into these blocks
    • type MMapReader interface { io.Reader; MMap(size int) []byte }
    • mutex: only one WriteCloser is able to append to that file
  • Indexes for LIKE queries
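
The overlay rules from the transactions item above can be sketched in Go as follows. This is only an illustration under assumptions, not MemCP's implementation: Bitmap is a plain []uint64 standing in for NonBlockingBitmap, shardTx, commit, visible and ErrWriteConflict are hypothetical names, only a single shard is handled, and the locking of all shards during commit is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Bitmap is a simplified stand-in for NonBlockingBitmap (assumption): one bit per row.
type Bitmap []uint64

// shardTx holds one transaction's overlays for a single shard (hypothetical layout).
type shardTx struct {
	deletionOverlay  Bitmap
	insertionOverlay Bitmap
}

// ErrWriteConflict signals a write-after-write conflict at commit time.
var ErrWriteConflict = errors.New("write-after-write conflict")

// commit first checks deletions & deletionOverlay, then applies
// deletions = (deletions | deletionOverlay) &^ insertionOverlay.
// The caller is assumed to hold the shard lock; all bitmaps have equal length.
func commit(deletions Bitmap, tx shardTx) error {
	for i := range deletions {
		if deletions[i]&tx.deletionOverlay[i] != 0 {
			return ErrWriteConflict // a row we delete is already deleted
		}
	}
	for i := range deletions {
		deletions[i] = (deletions[i] | tx.deletionOverlay[i]) &^ tx.insertionOverlay[i]
	}
	return nil
}

// visible reports whether a row is visible to the transaction during a scan,
// i.e. the overlays are respected before the commit has happened.
func visible(deletions Bitmap, tx shardTx, row uint) bool {
	w, bit := row/64, uint64(1)<<(row%64)
	deleted := (deletions[w]|tx.deletionOverlay[w])&^tx.insertionOverlay[w]&bit != 0
	return !deleted
}

func main() {
	// Row 2 was inserted by the transaction, so it is stored as deleted in the main view.
	deletions := Bitmap{0b0100}
	tx := shardTx{deletionOverlay: Bitmap{0b0001}, insertionOverlay: Bitmap{0b0100}}

	fmt.Println(visible(deletions, tx, 0), visible(deletions, tx, 2))
	fmt.Println(commit(deletions, tx), deletions)
	fmt.Println(commit(deletions, tx)) // committing the same deletion again conflicts
}
```

In this sketch the transaction sees its own insert (row 2) and no longer sees the row it deleted (row 0), while other readers of the main view would still see row 2 as deleted until commit clears the bit.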

RDF Frontend

  • INSERT {triples} after a SELECT
  • DELETE {triples} after a SELECT
  • WHERE { OPTIONAL {} } syntax which is translated to an outer join

SQL Frontend

  • AUTO_INCREMENT
  • Convert Subqueries into LEFT JOIN (https://cs.emis.de/LNI/Proceedings/Proceedings241/383.pdf)
  • GROUP: force sharding of the grouptbl to be the same sharding schema as the group keys
  • Prejoin complex query plans (group on inter-table conditions)
  • system.grant table to restrict users to databases
  • Test tools like DBeaver and phpMyAdmin, and extend the parsed syntax and supported metadata tables accordingly

Scheme language

  • JIT engine: specialize code either on bitcode level or on machine code level according to https://cs.emis.de/LNI/Proceedings/Proceedings241/363.pdf

Infrastructure

  • Plugin concept e.g. for AI or external C++ libraries (GPU BLAS or something like that)