The machine and the system
This document defines Kueea Abstract Machine Version 1, a service in the context of Kueea Network [NETWORK].
A Kueea AMv1 is implemented by a set of multiple computing machines, which function together as if they were a single machine. The Kueea Abstract Machine Version 1 abstraction is created when the machines execute compatible system software which implements and supports the Kueea AMv1 protocol.
Such a system must have full control over the hardware it executes on. What Kueea AMv1 abstracts is access to hardware which is physically distant from a CPU, by mapping that hardware into the CPU's virtual memory. The end result is that, from the point of view of a program, there is no distinction between "remote" and "local" hardware, as both are accessible via virtual memory in exactly the same way.
In response to messages received via the protocol, the system creates a task.
A task is a system construct, partially defined by this document, which handles the execution of the requested [function].
A function can load values from memory into registers, do computations on values stored in registers and store values back into system memory. These operations are all functions themselves. A function can also call other functions.
Once a task finishes, the system sends the result computed by the task back to the requester, at which point the task construct is no longer needed and is removed from the system.
A register is a temporary storage unit for a value. Registers are only used as input or output parameters of a function.
Each register stores one value of one register type. Register types are defined by their range of permitted values.
There is no limit to the number of registers.
Unsigned integers (binary)
Signed integers (binary, two's complement)
Real numbers (binary floating-point)
Real numbers (decimal floating-point)
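To illustrate what "defined by their range of permitted values" means for the two integer register types, the following sketch computes those ranges. The bit widths used here are assumptions chosen for illustration only; this section does not fix any register widths, and the function names are hypothetical.

```python
def unsigned_range(bits):
    """Permitted values of an unsigned binary integer register of the given width."""
    return (0, 2**bits - 1)


def signed_range(bits):
    """Permitted values of a two's-complement signed integer register."""
    return (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)


def fits(value, value_range):
    """Return True if the value lies within the permitted range (inclusive)."""
    lowest, highest = value_range
    return lowest <= value <= highest
```

Under these assumptions, a value of -1 fits a signed register of any width but never an unsigned one.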
There are three distinct memory (address) spaces: system memory, virtual memory and physical memory.
An octet is an 8-bit byte (a sequence of exactly 8 bits) with no associated meaning.
Physical memory is a sequence of bytes. These bytes are mapped into a sequence of octets by an algorithm that is out of scope of this document. From the point of view of Kueea AMv1, physical memory is a sequence of octets.
A Kueea AMv1 Node is a physical or virtual machine, which exposes up to 2^64 octets of its physical memory. Each octet is assigned a node-local physical address, in sequence order, beginning with 0.
Every allocated physical address is connected with some device, providing an input and/or output channel from/to the device.
Note: One physical machine may function as more than one node.
System memory is a segmented memory space, composed of 2^128 distinct non-overlapping segments, each being a sequence of 2^64 octets. Each octet is assigned a unique 192-bit system address.
The most significant 128 bits of a system address uniquely identify the segment, which corresponds to a node in a given Abstract Machine. The least significant 64 bits reference an octet exposed by the node.
Every allocated system address translates to a physical address.
Segment/node 0 is reserved for the empty-address value.
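The layout of a system address can be sketched as follows: a 128-bit segment (node) identifier in the upper bits and a 64-bit node-local offset in the lower bits. The function names are hypothetical and serve only to illustrate the bit layout described above.

```python
SEGMENT_BITS = 128                    # most significant bits: the segment (node)
OFFSET_BITS = 64                      # least significant bits: octet within the node
OFFSET_MASK = 2**OFFSET_BITS - 1


def make_system_address(segment, offset):
    """Combine a segment identifier and an offset into a 192-bit system address."""
    if not 0 <= segment < 2**SEGMENT_BITS:
        raise ValueError("segment identifier does not fit in 128 bits")
    if not 0 <= offset < 2**OFFSET_BITS:
        raise ValueError("offset does not fit in 64 bits")
    return (segment << OFFSET_BITS) | offset


def split_system_address(address):
    """Return the (segment, offset) pair encoded in a system address."""
    return address >> OFFSET_BITS, address & OFFSET_MASK


def in_reserved_segment(address):
    """Return True if the address lies in segment 0, which is reserved."""
    return (address >> OFFSET_BITS) == 0
```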
Each task in the system has its own virtual memory.
Virtual memory is a sequence of 2^64 bytes. Each byte has 8 or more bits and is assigned a 64-bit virtual address, in sequence order, beginning with 0.
All bits above the lowest 8 are assumed to be cleared when loading values to registers and are cleared when saving/storing them to memory.
Every allocated virtual address translates to a system address.
Virtual address 0 is never allocated.
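The treatment of bytes wider than 8 bits can be sketched as below. The dictionary standing in for virtual memory and the function names are assumptions made for illustration; real virtual memory is provided by the underlying hardware and system.

```python
OCTET_MASK = 0xFF  # selects the lowest 8 bits of a byte


def store_byte(memory, virtual_address, value):
    """Store a value into a byte, clearing all bits above the lowest 8."""
    if virtual_address == 0:
        raise ValueError("virtual address 0 is never allocated")
    memory[virtual_address] = value & OCTET_MASK


def load_byte(memory, virtual_address):
    """Load a byte into a register; its upper bits are assumed to be cleared."""
    if virtual_address == 0:
        raise ValueError("virtual address 0 is never allocated")
    return memory[virtual_address]
```

Because every store clears the upper bits, a load can rely on them being clear without masking again.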
Objects give meaning to octets stored in memory.
An object is composed of at least:
- object data;
- class of object data, also called the object's type;
- access counter.
Object data is a non-empty sequence of octets stored at a contiguous ascending sequence of system addresses which must all be part of the same segment. The data is identified by the system address of the first octet in the sequence and its length. Any octet within object data can be referenced.
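The constraints on object data (non-empty, contiguous, confined to one segment) can be checked as in the following sketch. It assumes that a system address is represented as an integer with the segment in the upper 128 bits and the offset in the lower 64 bits; the function name is hypothetical.

```python
OFFSET_BITS = 64  # the low 64 bits of a system address are the in-segment offset


def object_data_range(first_address, length):
    """Return the (first, last) system addresses occupied by object data.

    Raises ValueError if the data would be empty or would cross
    a segment boundary.
    """
    if length < 1:
        raise ValueError("object data must be a non-empty sequence of octets")
    last_address = first_address + length - 1
    if (first_address >> OFFSET_BITS) != (last_address >> OFFSET_BITS):
        raise ValueError("object data must lie within a single segment")
    return first_address, last_address
```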
An access counter is a non-negative integer value, which indicates the current number of the object's users.
The core functionality of a Kueea AMv1 system is the sharing of objects between tasks.
Objects are accessible via one of three access contexts:
- task context, which contains objects accessible only to a given task;
- module context, which contains objects accessible only to a given module; and
- kernel context, which contains objects accessible only when a task is in kernel mode.
What exactly kernel mode is depends on the system. In most cases, it is a privileged mode of the underlying hardware.
When a task, a module or the kernel gains access to an object, its object data becomes accessible via, respectively, the task’s task context, the module’s module context or the kernel context and the object’s access counter is increased.
When a task, a module or the kernel loses access to an object, its object data becomes inaccessible via, respectively, the task’s task context, the module’s module context or the kernel context and the object’s access counter is decreased.
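The relation between access contexts and the access counter can be sketched as follows. The class and method names are hypothetical; the sketch models only the bookkeeping described above, not any memory mapping.

```python
class SharedObject:
    """An object with object data and an access counter."""

    def __init__(self, data):
        self.data = data
        self.access_counter = 0


class AccessContext:
    """A task, module or kernel context holding accessible objects."""

    def __init__(self):
        self._objects = []

    def gain(self, obj):
        """Make the object accessible via this context and count the new user."""
        self._objects.append(obj)
        obj.access_counter += 1

    def lose(self, obj):
        """Make the object inaccessible via this context and drop one user."""
        self._objects.remove(obj)  # raises ValueError if access was never gained
        obj.access_counter -= 1

    def can_access(self, obj):
        return any(held is obj for held in self._objects)
```

In this model, an object shared between a task context and a module context has an access counter of 2 until one of them loses access.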
Each system task has associated allocator parameters. Whenever a task allocates memory, the system uses the current allocator parameters of a task in order to allocate required memory.
The parameters define the location and kind of memory to allocate. The amount of memory is not included in the parameters.
The function that sets the allocator parameters is different from the function that requests a portion of memory for an object.
This is because allocator parameters are outside the scope of an object allocator, which only provides the length. The task sets where to allocate the object beforehand.
In order to allocate an object, the system reserves a consecutive sequence of octets from system memory and assigns it as object data. The associated area of memory becomes unavailable for reservation.
In order to deallocate an object, the system makes the system memory area corresponding to the object’s object data available for reservation.
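The split between setting allocator parameters and requesting memory can be sketched as below. Everything here is an assumption made for illustration: the names, the "kind" value, and the first-fit reservation strategy are not defined by this document.

```python
OFFSET_BITS = 64


class Task:
    def __init__(self):
        self.allocator_params = None

    def set_allocator_params(self, segment, kind):
        """Choose where (segment) and what kind of memory later allocations use."""
        self.allocator_params = (segment, kind)


class System:
    def __init__(self):
        self.reserved = {}  # segment -> list of (offset, length) reservations

    def allocate(self, task, length):
        """Reserve `length` consecutive octets using the task's current parameters."""
        if task.allocator_params is None:
            raise RuntimeError("allocator parameters have not been set")
        segment, _kind = task.allocator_params
        offset = 0
        # First-fit search over existing reservations (illustrative strategy only).
        for start, size in sorted(self.reserved.get(segment, [])):
            if offset + length <= start:
                break
            offset = max(offset, start + size)
        if offset + length > 2**OFFSET_BITS:
            raise MemoryError("segment exhausted")
        self.reserved.setdefault(segment, []).append((offset, length))
        return (segment << OFFSET_BITS) | offset

    def deallocate(self, address, length):
        """Make the area available for reservation again."""
        segment = address >> OFFSET_BITS
        offset = address & (2**OFFSET_BITS - 1)
        self.reserved[segment].remove((offset, length))
```

Note that allocate receives only the length; the location and kind come from whatever parameters the task set earlier.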
A constructor of an object is a procedure that initializes all octets of object data to some initial state.
A destructor of an object is a procedure that prepares the object for deallocation.
A preconstructor of an object is a procedure that initializes only those octets of object data which are required for a subsequent successful invocation of the destructor.
For a task to create an object:
- The task requests the system to allocate the object.
- The task gains access to the object.
- The task invokes the object’s preconstructor.
- The task invokes the object’s constructor.
- If the constructor fails, the task loses access to the object.
When an object’s access counter is decreased to zero:
- The system invokes the object’s destructor in a new task.
- The system deallocates the object, regardless of whether the destructor succeeded or not.
This poses a question: What should the system do when a destructor fails? The preferred answer is to log the event and continue.
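The creation and destruction steps above can be sketched as follows. This is a simplified model under stated assumptions: all names are hypothetical, the destructor runs synchronously rather than in a new task, and a failing procedure is modeled as a raised exception.

```python
class ExampleClass:
    """Hypothetical class whose procedures can be made to fail on demand."""

    def __init__(self, construct_fails=False, destruct_fails=False):
        self.construct_fails = construct_fails
        self.destruct_fails = destruct_fails

    def preconstruct(self, obj):
        obj.state = "preconstructed"  # enough state for the destructor to run

    def construct(self, obj):
        if self.construct_fails:
            raise RuntimeError("constructor failed")
        obj.state = "constructed"

    def destruct(self, obj):
        if self.destruct_fails:
            raise RuntimeError("destructor failed")
        obj.state = "destructed"


class Obj:
    def __init__(self, cls):
        self.cls = cls
        self.access_counter = 0
        self.state = "allocated"


def release(obj, log):
    """Lose access; at zero users, destruct and then deallocate regardless."""
    obj.access_counter -= 1
    if obj.access_counter == 0:
        try:
            obj.cls.destruct(obj)
        except RuntimeError as error:
            log.append(str(error))  # preferred answer: log the event and continue
        obj.state = "deallocated"


def create_object(cls, log):
    obj = Obj(cls)            # the system allocates the object
    obj.access_counter += 1   # the task gains access
    cls.preconstruct(obj)     # preconstructor
    try:
        cls.construct(obj)    # constructor
    except RuntimeError:
        release(obj, log)     # on failure the task loses access
        return None
    return obj
```

The sketch also shows why the preconstructor exists: when the constructor fails, the access counter drops to zero and the destructor still runs on a well-defined state.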
A Kueea OSv1 Module is a special object. At most one instance of a module exists per node.
Nodes maintain a module state for each module. The state is initially destructed (no instance). When destructed, the module's constructor may be called. If successful, the module state becomes constructed. Once constructed, module functions of the module can be called.
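The module state described above behaves like a small two-state machine, sketched below. The names are hypothetical, and the constructor is modeled as a callable returning True on success, which is an assumption of this sketch.

```python
class ModuleState:
    """Per-node state of one module: 'destructed' or 'constructed'."""

    def __init__(self):
        self.state = "destructed"  # initial state: no instance exists

    def construct(self, constructor):
        """Call the module's constructor; only permitted while destructed."""
        if self.state != "destructed":
            raise RuntimeError("module is already constructed")
        if constructor():
            self.state = "constructed"
        return self.state == "constructed"

    def call(self, function, *args):
        """Call a module function; only permitted once constructed."""
        if self.state != "constructed":
            raise RuntimeError("module is not constructed")
        return function(*args)
```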
A module is defined by a Kueea OSv1 Module Declaration Document [MDOC]. It consists of declarations of classes and paths.
Every object is an instance of a class.
A class defines the semantics and structure of object data and the associated functions. It is identified by its class identifier - a 128-bit value, unique in the scope of an Abstract Machine.
A class identifier with all of its bits cleared is invalid.
A sequence of data members defines the structure of object data of objects that are instances of the class.
Members are grouped into class levels. An implementation of a class at level n contains all members declared by levels [0,n] of the class.
Once a level is declared as final, all members declared at that level never change.
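As an illustration of class levels, the following sketch lists the members visible to an implementation at level n. The representation of levels as a list of lists, and the member names in the usage example, are assumptions made for illustration.

```python
def members_at_level(levels, n):
    """Return all members declared by levels 0 through n, in declaration order.

    `levels[i]` is the list of member names declared at level i.
    """
    if not 0 <= n < len(levels):
        raise ValueError("no such class level")
    members = []
    for level in levels[: n + 1]:
        members.extend(level)
    return members
```

For a hypothetical class with levels [["id"], ["name", "size"], ["flags"]], an implementation at level 1 contains "id", "name" and "size", but not "flags".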
There is only one fundamental class - an octet. It has no members. The length of its object data is 1 octet.
An interface is a special class which cannot be allocated.
A class descriptor is a read-only object containing information on the class.
A sequence of descriptor members defines the structure of object data associated with the interface. The members are usually system addresses of functions.
Descriptor members are included in the class descriptor of the class that implements the interface.
Tasks may access class descriptors, which allows them to execute interface-defined functions without any prior knowledge about a given object. Programs are written to either manipulate an object based on its class or based on its interfaces.
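Calling an interface-defined function through a class descriptor can be sketched as below. In this model, Python callables stand in for the system addresses of functions, and the identifiers and member names are hypothetical.

```python
from types import MappingProxyType


def make_descriptor(class_id, interfaces):
    """Build a read-only class descriptor.

    `interfaces` maps an interface identifier to its descriptor members,
    here a mapping from member name to a callable.
    """
    frozen = {
        interface_id: MappingProxyType(dict(members))
        for interface_id, members in interfaces.items()
    }
    return MappingProxyType(
        {"class_id": class_id, "interfaces": MappingProxyType(frozen)}
    )


def call_interface(obj, interface_id, member, *args):
    """Invoke an interface function knowing only the object's descriptor."""
    functions = obj["descriptor"]["interfaces"][interface_id]
    return functions[member](obj, *args)
```

The caller needs no knowledge of the object's class; it only looks up the interface in the descriptor and calls the member found there.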
A parameter is a register or an object.
An input parameter is a parameter whose value is stored at a predefined hardware location on function call. On function return, the value at the location is undefined.
An output parameter is a parameter whose value is stored at a predefined hardware location on function return. On function call, the value at the location is undefined.
These predefined hardware locations differ per CPU architecture. They are defined in architecture-specific ABI specification documents.
A function is an object whose object data is a program routine. The routine is an "operation" that the Abstract Machine may execute. It receives a list of input parameters, performs its work, prepares a list of output parameters and returns.
Functions are categorized by the memory they access:
- task functions, which access their input parameters only;
- module functions, which additionally access objects in the module’s module context; and
- system functions, which additionally access objects in the system context.
Functions always have access to the task context. A function may be simultaneously a module function and a system function.
An error code is a special output parameter. It is an unsigned integer value which references a human-readable message describing an unexpected situation. A value of zero means that the function has nothing to report.
When a function succeeds, it returns an error code equal to zero.
Each task has a status stack, a stack of metadata explaining why a function has failed.
When a function fails, it returns an error code not equal to zero and pushes metadata about the failure onto the status stack. The innermost function clears the stack beforehand.
If a function does not define any error codes, it either succeeds or causes the task to fail.
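The interplay of error codes and the status stack can be sketched as follows. The error code values, function names and metadata fields are all hypothetical; only the convention (zero on success, non-zero plus pushed metadata on failure, innermost function clears the stack first) comes from the text above.

```python
E_NOT_FOUND = 1     # hypothetical error code
E_LOAD_FAILED = 2   # hypothetical error code


class Task:
    def __init__(self):
        self.status_stack = []


def read_resource(task, name):
    """Innermost function: on failure, clears the stack before pushing."""
    if name != "known":
        task.status_stack.clear()
        task.status_stack.append({"function": "read_resource", "name": name})
        return E_NOT_FOUND
    return 0


def load_config(task, name):
    """Outer function: on failure, pushes its own metadata on top."""
    code = read_resource(task, name)
    if code != 0:
        task.status_stack.append({"function": "load_config", "cause": code})
        return E_LOAD_FAILED
    return 0
```

After a failed call, the stack reads from the innermost failure at the bottom to the outermost at the top, with no stale entries left over from earlier failures.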
A path is an identifier for a named resource. (Data such as configuration data, assets used by a game engine, style sheets, generally data stored in non-volatile memory.)
There are four kinds of resources named by paths:
- read-only resources, which are distributed alongside M-Build images,
- node-specific resources, which are stored at the node executing the task,
- synchronized resources, which are synchronized against a synchronization point that ensures the resource state is the same for any node at any given point in time,
- user-provided resources, which serve as input from and output to the user; paths are not declared for these, the user provides them.
An M-Build is an implementation of functions of a given module.
An M-Build image is a data object that contains one or more M-Builds. It is a program image interpreted and executed by system software. The system is not dependent on any particular image format, although there always exists a preferred format for these images.
Each M-Build image is uniquely identified based on the M-Builds (machine code and data) it contains.
This section defines the Kueea OSv1 Remote Procedure Call Protocol, which is identified by the external reference [NETWORK].