Distributed Databases


A High Level Overview

Distribution involves separate machines connected via TCP. Each machine typically runs several PicoLisp database processes. They exchange objects via 'id' or 'ext' and, more importantly, can make remote calls (via 'pr', 'rd' etc., i.e. the PLIO protocol) and run remote queries (see also 'remote/2').

Direct remote DB operations involve only read accesses (queries). Changes to the individual DBs have to be made the normal way (e.g. with the 'put>' family of methods), with each application (PicoLisp process family) maintaining its own DB.
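
As an illustration of such a remote call, here is a minimal sketch modeled on the example in the reference entry for '*Ext'. The port number 4040 and the helper name 'remoteEval' are assumptions made for illustration only. The serving side accepts a connection, forks a child, reads expressions with 'rd', evaluates them and writes the results back with 'pr'. The calling side sends one expression and reads one reply.

```picolisp
### Serving process (inside the running application) ###
(task (port 4040)                   # Listen on an assumed port
   (let? Sock (accept @)            # Accept a connection
      (unless (fork)                # In the child process
         (in Sock
            (while (rd)             # Read requests until the peer closes
               (sync)               # Wait for pending family data
               (tell)               # Let the parent grant sync to others
               (out Sock
                  (pr (eval @)) ) ) )  # Evaluate the request, send the result
         (bye) )                    # Exit the child process
      (close Sock) ) )              # Parent closes its copy of the socket

### Calling process (possibly on another machine) ###
(de remoteEval (Host Port Exe)      # Hypothetical helper, not a built-in
   (let? Sock (connect Host Port)
      (out Sock (pr Exe))           # Send the expression in PLIO format
      (prog1
         (in Sock (rd))             # Read the reply
         (close Sock) ) ) )
```

A call might look like (remoteEval "10.10.12.1" 4040 '(+ 1 2)), where the address is again only a placeholder. Note that a server of this kind evaluates whatever it receives, so it should only be reachable from trusted machines.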

The several PicoLisp database processes running on one machine form a "PicoLisp process family", which is considered one application with one database.

There may be several such "families" on a single machine, or spread across many.

A single application, operating on a single database, consists of a parent process with an arbitrary number of child processes. This structure is necessary because synchronization of all processes that access a given database must go through a common parent (family IPC uses simple pipes).

Typically each application is a complete class hierarchy in itself, independent from (but knowing about) the other DBs.

"A single database" usually means a single directory containing all files of that database. Theoretically, a database may consist of at most 65536 files, but this doesn't make sense in a typical Unix environment, because of too many file descriptors and other resource problems. A single file can contain at most 4 Tera objects (42-bit object ID).
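
The "4 Tera" figure follows directly from the 42-bit object ID. A quick check of the arithmetic (reading "Tera" as 2 to the 40th is an assumption about the intended unit):

```picolisp
(println (** 2 42))                 # Object IDs that fit in 42 bits: 4398046511104
(println (/ (** 2 42) (** 2 40)))   # The same in units of 2^40: 4
```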

It makes sense to run several applications (= databases) on a single machine (or spread across many) to get better load distribution. There is no general rule; optimal tuning requires some experimentation. It depends mostly on the number of CPU cores and the amount of available RAM (file buffer cache).

For the program logic (how those applications communicate with each other), it doesn't matter which application is running on which machine, as long as everything is properly configured (see (doc '*Ext) and (doc 'remote/2)).
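
As a hedged sketch of what such a configuration might look like, modeled on the example in the reference entry for '*Ext': each entry pairs an external symbol offset with a function that fetches an object's value and property list from the process owning it. The host address, the port 4040 and the offset 20 are assumptions for illustration, and the remote process is assumed to read expressions with 'rd', evaluate them and reply with 'pr'.

```picolisp
(setq *Ext
   (list
      (cons 20                          # Assumed offset for the remote DB
         '((Obj)
            (let? Sock (connect "10.10.12.1" 4040)  # Assumed host and port
               (ext 20                  # Translate external symbols by the same offset
                  (out Sock
                     (pr (cons 'qsym Obj)) )  # Ask for value and property list
                  (prog1
                     (in Sock (rd))     # Reply: value in the CAR, properties in the CDR
                     (close Sock) ) ) ) ) ) ) )
```

This version opens a new connection for every object access to keep the sketch short. A real setup would more likely keep the socket open between calls, and list one entry per remote database with offsets in ascending order.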

Many designs are surely possible. The emphasis is on making such structures easy to design, not on a given framework. The philosophy of PicoLisp has always been to take a vertical approach, with easy access from the lowest to the highest levels.

Relevant Functions

'task', 'pr', 'rd', 'tell', 'hear', 'sync', etc.

https://picolisp.com/wiki/?distributeddb

19mar16    erik