Introduction

This document defines the syntax and semantics of a Kueea Abstract Machine Version 1 Module Declaration Document, shortened throughout this document to just "Document".

One Document declares one Kueea AMv1 Module. The syntax is designed to be human-readable and easy to write using even the simplest text editors.

Document Processors are programs which take Documents as input. Their primary output is source files in a given programming language. Other outputs include module documentation in HTML or other formats. They are part of toolchains that generate Kueea AMv1 M-Build images.

Keywords

The key words ‘MUST,’ ‘MUST NOT,’ ‘REQUIRED,’ ‘SHALL,’ ‘SHALL NOT,’ ‘SHOULD,’ ‘SHOULD NOT,’ ‘RECOMMENDED,’ ‘NOT RECOMMENDED,’ ‘MAY,’ and ‘OPTIONAL’ in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Document

A Document is a sequence of Unicode characters. The encoding of characters MUST be UTF-8.

Lines are sequences of characters separated by a sequence of two characters (in order): U+000D CARRIAGE RETURN and U+000A LINE FEED.

Whitespace is either U+0020 SPACE or U+0009 HORIZONTAL TAB.

Documents are processed line by line. The maximum length of a line is 1024 code units (bytes), including the line separator.

Three types of data may appear on a line: instructions, comments and text.

Instructions

An instruction is a line directed at the Document Processor. It begins with optional whitespace followed by one U+002E FULL STOP character, the instruction name and its arguments. Each argument is preceded by at least one whitespace character.

Instruction names and arguments are case-sensitive.

Instruction names are sequences of four small Latin letters.

Instruction character set is limited to the range [U+0000, U+007F].

Examples:

.inst arg1 arg2
   .inst arg1

Comments

A comment is a line which is discarded. There are two kinds of comments.

A one-line comment begins with optional whitespace followed by exactly one U+0023 NUMBER SIGN character, that is, the character after it is not another NUMBER SIGN.

Examples:

# This is a one-line comment.
  # This is a one-line comment.
THIS IS NOT A COMMENT.
# # # This is a one-line comment.

A multi-line comment begins and ends with a line beginning with optional whitespace followed by 2 consecutive U+0023 NUMBER SIGN characters.

Examples:

## This is the first line of a multi-line comment.
This is a comment.
.This is a comment.
# # This is inside a multi-line comment.
  ## This is the last line of a multi-line comment.
THIS IS NOT A COMMENT.
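
The comment rules above can be sketched as a small filter. This is a minimal, non-normative illustration in Python; the function name strip_comments is hypothetical, and the input is assumed to be a list of lines that have already been split and no longer carry their line separators.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB


def strip_comments(lines):
    """Return the lines that are not comments (hypothetical helper)."""
    kept = []
    in_multiline = False
    for line in lines:
        body = line.lstrip(WHITESPACE)
        if body.startswith("##"):
            # A "##" line opens a multi-line comment or closes the open one.
            in_multiline = not in_multiline
            continue
        if in_multiline:
            continue  # everything inside a multi-line comment is discarded
        if body.startswith("#"):
            continue  # one-line comment
        kept.append(line)
    return kept
```

Applied to either of the two example listings above, the sketch keeps only the line THIS IS NOT A COMMENT.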

Text

Any other line is text - secondary data associated with the current item, stored in a named buffer.

It is OPTIONAL for a Document Processor to process text.

Syntax and semantics of text are out of scope of this document.

By default, text is a human-readable textual description of the associated item, in Markdown. [MARKDOWN]

Examples:

.item example1
Description of the example1 item.

Description of the example1 item.

.item example2

Description of the example2 item.

Line indentation

The leading whitespace of an instruction sets the amount of leading whitespace that is ignored in the text lines that follow it.

Each whitespace character, whether a space or a tab, counts as one. Authors should pick one indentation character and use it throughout the document. U+0020 SPACE is RECOMMENDED because the visual presentation of tabs varies.

Consider the following example:

   .inst first
   Line 1-1.
     Line 1-2.
  Line 1-3.
.inst second
   Line 2-1.
     Line 2-2.
Line 2-3.

The first line is an instruction indented by 3 whitespace characters. The ignored indentation becomes 3.

The second line is text indented by 3 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has no preceding whitespace characters.

The third line is text indented by 5 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has 2 preceding whitespace characters.

The fourth line is text indented by 2 whitespace characters. The parser would remove the first 3 whitespace characters. In this case, the line has fewer than that, so all of its whitespace is removed. The resulting line has 0 preceding whitespace characters.

Text for the first item is thus:

Line 1-1.
  Line 1-2.
Line 1-3.

The fifth line is an instruction indented by 0 whitespace characters. The ignored indentation becomes 0.

The sixth line is text indented by 3 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 3 preceding whitespace characters.

The seventh line is text indented by 5 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 5 preceding whitespace characters.

The eighth line is text indented by 0 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 0 preceding whitespace characters.

Text for the second item is thus:

   Line 2-1.
     Line 2-2.
Line 2-3.
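
The walk-through above can be reproduced with a short sketch. It is only an illustration under stated assumptions: the helper names leading_whitespace and unindent are hypothetical, comments and the \ escape are not handled, and every line is assumed to be either an instruction or text.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB


def leading_whitespace(line):
    """Count the whitespace characters at the start of line."""
    return len(line) - len(line.lstrip(WHITESPACE))


def unindent(lines):
    """Strip the ignored indentation from text lines (hypothetical helper).

    Returns (kind, content) pairs, where kind is "inst" or "text".
    """
    skip = 0
    out = []
    for line in lines:
        wslv = leading_whitespace(line)
        if line[wslv:].startswith("."):
            skip = wslv  # an instruction sets the ignored indentation
            out.append(("inst", line.strip(WHITESPACE)))
        else:
            # Remove at most skip characters, and never more than exist.
            out.append(("text", line[min(skip, wslv):]))
    return out
```

Feeding the eight example lines to unindent yields the two text blocks shown above, with the relative indentation of each block preserved.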

Formal syntax (ABNF)

The following MDOC rule expresses the syntax in ABNF.

MDOC  = k1md CRLF *( line CRLF )
; k1md defined in another section
line  = comm / inst / text

comm  = cmul / cone
cmul  = *WSP "##" *OCTET CRLF *WSP "##" *OCTET
cone  = *WSP  "#" *OCTET

inst  = *WSP  "." *( WSP / VCHAR )
text  = [ *WSP "\" ] *OCTET

ABNF rules are referenced in prose like this: <rule>.

The first non-whitespace character of <text> MAY be a U+005C REVERSE SOLIDUS. If so, the \ character is removed before further processing of the line. This is an escape mechanism in case the line begins with # or . (a number sign or a full stop).

Objects

An object is a set of variables. Variables of an object are referenced in prose like this: object.variable (using a dot separator).

Objects are categorized as items and non-items.

Non-items

This section defines non-item objects.

These objects are part of items.

Characters

A character is a Unicode code point.

Object reference

An object reference references another object.

When an object reference refers to nothing, it means that the reference is set to a value that does not reference any objects.

Containers

Some objects are stored in a container object.

A list is an ordered container, which means that position of elements in a list is significant.

A set is an unordered container.

A pair is a set of two elements.

Class identifier

Classes are identified by 128-bit values. These values are globally unique and are treated as opaque data.

It is expected that the value is a Universally Unique Identifier. The value SHOULD be a valid UUID, but it is not required to be one.

The nil UUID value is reserved and means no value (invalid value). This value is called nil.

Class level

Members of a class are grouped under its levels.

A class level is an 8-bit unsigned binary integer.

An instance of a class at level n contains all members of the class declared at levels [0, n].

Memory alignment

An alignment exponent is an integer within the range [0, 31]. It is the exponent e in the expression 2^e, the result of which is the associated memory alignment in octets. Any other value is outside of a memory alignment's value range.
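
As a quick illustration, the conversion from an alignment exponent to an alignment in octets could look like this. The function name alignment_in_octets is hypothetical.

```python
def alignment_in_octets(exponent):
    """Return 2**exponent for an alignment exponent in [0, 31]."""
    if not isinstance(exponent, int) or not 0 <= exponent <= 31:
        raise ValueError("alignment exponent out of range")
    return 1 << exponent
```

For example, an exponent of 3 corresponds to an alignment of 8 octets.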

Tag

A tag is a list of characters.

The minimum length of the list is 1.

The maximum length of the list is 16.

Name

A name is a list of characters.

The minimum length of the list is 1.

The maximum length of the list is 64.

Most items have an associated name. Names are local in scope to the document.

Member reference

A member reference is a list of names.

Member references are relative to a class. The list designates the class's member by following the referenced names.

Names are dereferenced in order and there is no way to reference a previously dereferenced name, so infinite recursion never happens.

Item reference

An item reference is a pair of a name and a member reference.

The name contains a module's class identifier, a user-defined module alias or a predefined name.

When the name is empty, the item reference is an unknown item reference.

The member reference is relative to the module's class.

Type reference

A type reference is an object which consists of:

  • ref: item reference to a class.
  • clv: class level of the referenced class.
  • hnd: type of reference.

Possible values of hnd are:

  • null: No value set. Type reference is empty.
  • regt: Register type.
  • creg: Register type of the class.
  • cmem: Instance of the class.
  • none: Handle with no access rights; just the address. Use to avoid unnecessary duplication of rights.
  • read: Handle with read rights.
  • rdex: Handle with read and execute rights.
  • rdwr: Handle with read and write rights.
  • rwex: Handle with read, write and execute rights.

A handle is a system address associated with access rights to the referenced object.

The system gives access as many times as there are handles in a class. A class only needs to have one handle with specific rights to an object. Other handles to the same object SHOULD be of the no-access type in order to avoid unnecessary computations when passing objects around.

If ref is an unknown item reference and hnd designates a handle, the handle is to an unspecified object.

Array length

An array length is an object which consists of:

  • ref: member reference to the data member which stores the amount of elements.
  • min: minimum amount of elements.
  • max: maximum amount of elements.

Amount of elements is a 32-bit unsigned binary integer.

The min and max values are used in calculation of the minimum length and the possible maximum length of a class instance.

If max is zero, an array is not declared. If it is equal to 4294967295 (2^32 - 1), the value represents the maximum possible amount of elements; otherwise, the value is the given number.

The data member referenced by ref MUST be an instance of a register class with an unsigned integer register type. This member's memory address MUST be lower than the array's, i.e. it must be declared before the array. Both min and max MUST NOT be greater than the maximum value representable by the referenced member's class.

The maximum possible amount of elements is either 4294967295 divided by the maximum class length of an element, rounded down, or the maximum value representable by the type of the data member referenced by ref if ref is not empty.

An invalid array length is one of which either:

  • min is greater than max.
  • min is equal to max and ref is not empty.
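
The two invalidity rules can be sketched as follows. This is a minimal illustration: the ArrayLength dataclass and the is_invalid function are hypothetical names, and a member reference is modelled as a tuple of names, with the empty tuple standing for an empty reference.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ArrayLength:
    """Hypothetical in-memory form of an array length object."""
    ref: Tuple[str, ...] = ()  # member reference; empty tuple means no ref
    min: int = 0
    max: int = 0


def is_invalid(alen):
    """Apply the two invalidity rules listed above."""
    if alen.min > alen.max:
        return True
    if alen.min == alen.max and alen.ref:
        return True
    return False
```

A fixed-size array such as min = 10, max = 10 with an empty ref is valid, whereas the same bounds with a non-empty ref are not, since there would be nothing for the referenced member to vary.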

Value

Representation of a value is implementation-dependent. This document only defines the syntax of a value in an instruction.

A value may be the undefined value. This state represents an omitted parameter. (There is no syntax for representing it in a Document.)

Condition

A condition is an object which consists of:

  • expr: expression of the condition.
  • name: name of the condition.

An expression is an object which consists of:

  • exop: operator of the expression.
  • data: data of the expression determined by exop.

Each expression evaluates to either true or false.

A condition holds when its expression evaluates to true.

Logical expression

data for exop values 'any is true', 'any is false', 'all are true', and 'all are false' is a logical expression, which is an object that consists of:
  • exp: list of expressions, evaluated in list order.

An empty list evaluates to true.
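
A sketch of how a logical expression might be evaluated is given below. The reading of the four operators is an assumption derived from their names, and the function name eval_logical is hypothetical. For brevity the sub-expressions are assumed to be already evaluated to booleans, in list order.

```python
def eval_logical(exop, results):
    """Evaluate a logical expression over already-evaluated sub-expressions.

    results is a list of booleans in list order (an assumption of this
    sketch; a real processor would evaluate nested expression objects).
    """
    if not results:
        return True  # an empty list evaluates to true
    if exop == "any is true":
        return any(results)
    if exop == "any is false":
        return any(not r for r in results)
    if exop == "all are true":
        return all(results)
    if exop == "all are false":
        return all(not r for r in results)
    raise ValueError("not a logical operator: " + exop)
```

The empty-list check comes first because the rule above applies regardless of the operator.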

Comparison

data for exop values 'is equal to', 'is not equal to', 'is less than', 'is less than or is equal to', 'is greater than', and 'is greater than or is equal to' is a comparison, which is an object that consists of:
  • ref: member reference to the data member which is compared with the value,
  • val: the value.

The type of the referenced member SHOULD be a register class. Implementations are not required to support values that are arrays or instances of non-register classes.

Condition reference

data for exop values 'condition holds' and 'condition does not hold' is a condition reference, which is an object that consists of:
  • cnd: name of a condition.

Path component

A path component is a list of characters.

The minimum length of the list is 7.

The maximum length of the list is 1024.

Path components name external (out-of-memory) resources.

Interface reference

An interface reference is an object which consists of:

  • type: data member type of the interface.
  • mref: member reference to a data member.
  • clv: class level of the associated class.

Module import

A module import is an object which consists of:

  • mid: a module's class identifier.
  • mlv: minimum class level of the module.
  • name: an alias, name for the module.

Module

A module is an object which consists of:

  • this: object reference to the module's class.
  • types: set of classes.
  • paths: set of path components.
  • mload: set of module imports.

Function identifier

A function identifier (FID) is a 64-bit unsigned binary integer.

There MUST NOT be two functions with the same identifier within a module.

By default, the value is the result of passing a character string, constructed from the module's class identifier and relevant item names, to the 64-bit FNV-1a (Fowler-Noll-Vo) hash function:

hash = 0xCBF29CE484222325
for each octet_of_data to be hashed
  hash = hash XOR octet_of_data
  hash = hash * 0x100000001B3
# BEG DOCUMENT SPECIFIC
if hash == 0 then hash = 0xFFFFFFFFFFFFFFFF
# END DOCUMENT SPECIFIC
return hash

The value 0 is invalid; it can be used as an undefined value.

Examples:

Input string          Output value
module_func           0x0F7E93E1AF686350
class$00$function     0x2862790D0CE9E837

Function identifiers should only be explicitly declared in case of a hash collision. The encoding of the string is the same as for the document - UTF-8, although it is practically limited to just the first 127 code points.
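
The pseudocode above leaves the 64-bit wrap-around of the multiplication implicit. In a language with arbitrary-precision integers it has to be made explicit, as in the following Python sketch. The names fnv1a_64 and default_fid are hypothetical.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data):
    """Plain 64-bit FNV-1a over a bytes object."""
    h = FNV64_OFFSET_BASIS
    for octet in data:
        h ^= octet
        h = (h * FNV64_PRIME) & MASK64  # keep the product within 64 bits
    return h


def default_fid(text):
    """FNV-1a of the UTF-8 string, with 0 remapped as in the pseudocode."""
    h = fnv1a_64(text.encode("utf-8"))
    return MASK64 if h == 0 else h
```

The remapping of 0 keeps the default value clear of the invalid identifier; an explicitly declared FID is not affected by this function.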

The input character string for the default FID of a function depends on the function's location within the item tree. If the function is a member of the module's class, the FID is computed over the name of the function only. Otherwise, the input string is computed as follows:

  1. Let s be an empty character string.
  2. Let c be the class.
  3. Let f be the function member.
  4. Append c.name to s.
  5. Append a U+0024 DOLLAR SIGN character to s.
  6. If c.clv is less than 16, append a U+0030 DIGIT ZERO to s.
  7. Append the hexadecimal representation of c.clv, using character ranges [U+0030,U+0039] and [U+0041,U+0046] (0-9 and A-F) for the values [0,15], to s.
  8. Append a U+0024 DOLLAR SIGN character to s.
  9. Append f.name to s.
  10. Return s as the input character string.
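
The steps above amount to joining the class name, a two-digit uppercase hexadecimal class level and the function name with dollar signs. A minimal sketch, with the hypothetical function name fid_input_string:

```python
def fid_input_string(class_name, class_level, function_name):
    """Build the hash input for a function outside the module's class."""
    if not 0 <= class_level <= 255:
        raise ValueError("class level is an 8-bit unsigned integer")
    # "%02X" yields two uppercase hexadecimal digits, zero-padded below 16.
    return "%s$%02X$%s" % (class_name, class_level, function_name)
```

For a class named class at level 0 and a function named function, this produces class$00$function, the second input string in the examples above.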

Items

This section defines item objects.

All items contain a text variable, which is a list of objects, which each consists of:

  • data: list of characters,
  • format: name; format of the data.

When appending an object n to text, where o is the last object in text: if n.format is equal to o.format, n.data MAY be appended to o.data with a preceding line separator instead of appending a new object.

Data member

A data member is an item which consists of:

  • mlv: class level of the associated module's class.
  • clv: class level of the associated class.
  • tags: set of tags.
  • type: type reference.
  • alen: array length.
  • align: memory alignment or zero.
  • value: default value.
  • cond: name of a condition.
  • name: name of the object.

The alignment of data member d is a 32-bit unsigned binary integer computed as follows:

  1. Return the default alignment of class referenced by type.ref at level type.clv if d.align is equal to zero.
  2. Return d.align.

The minimum length of data member d is a 32-bit unsigned binary integer computed as follows:

  1. Let ltype be a 32-bit unsigned binary integer.
  2. Set ltype to the minimum class length of class referenced by d.type.ref at level d.type.clv.
  3. Return ltype if d.alen.max is equal to zero.
  4. Return zero if d.alen.min is equal to zero.
  5. Let len be a 34-bit unsigned binary integer.
  6. Let off be a 34-bit unsigned binary integer.
  7. Let rem be a 32-bit unsigned binary integer.
  8. Set len to ltype.
  9. Set rem to d.alen.min.
  10. Set off to the alignment of d.
  11. Subtract one from off.
  12. While rem is greater than one:
    1. Add off to len.
    2. Set len to the binary AND of len and the binary NOT of off.
    3. Add ltype to len.
    4. Abort the program if len is greater than 4294967295.
    5. Subtract one from rem.
  13. Return len as a 32-bit unsigned binary integer.

The maximum length of data member d is a 32-bit unsigned binary integer computed as follows:

  1. Let ltype be a 32-bit unsigned binary integer.
  2. Set ltype to the maximum class length of class referenced by d.type.ref at level d.type.clv.
  3. Return ltype if d.alen.max is equal to zero.
  4. Let len be a 34-bit unsigned binary integer.
  5. Let off be a 34-bit unsigned binary integer.
  6. Let rem be a 32-bit unsigned binary integer.
  7. Set len to ltype.
  8. Set rem to d.alen.max.
  9. Set off to the alignment of d.
  10. Subtract one from off.
  11. While rem is greater than one:
    1. Add off to len.
    2. Set len to the binary AND of len and the binary NOT of off.
    3. Add ltype to len.
    4. Set len to 4294967295 if len is greater than 4294967295.
    5. Subtract one from rem.
  12. Return len as a 32-bit unsigned binary integer.
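
The minimum-length and maximum-length procedures share the same loop and differ only in how they react to overflow. The sketch below factors that loop out. It is an illustration under stated assumptions: all function and parameter names are hypothetical, the class lengths and the alignment are passed in as plain integers, and the alignment is assumed to be a power of two.

```python
U32_MAX = 4294967295


def array_span(ltype, count, alignment, saturate):
    """Length of count elements of ltype octets, each aligned to alignment.

    With saturate=False an overflow raises (the "abort" of the
    minimum-length steps); with saturate=True the result is clamped
    (the maximum-length steps).
    """
    length = ltype
    off = alignment - 1
    rem = count
    while rem > 1:
        length = (length + off) & ~off  # round up to the next aligned offset
        length += ltype
        if length > U32_MAX:
            if not saturate:
                raise OverflowError("length exceeds 32 bits")
            length = U32_MAX
            break  # further iterations would stay clamped
        rem -= 1
    return length


def minimum_length(ltype_min, alen_min, alen_max, alignment):
    if alen_max == 0:
        return ltype_min  # not an array
    if alen_min == 0:
        return 0
    return array_span(ltype_min, alen_min, alignment, saturate=False)


def maximum_length(ltype_max, alen_max, alignment):
    if alen_max == 0:
        return ltype_max  # not an array
    return array_span(ltype_max, alen_max, alignment, saturate=True)
```

The early exit after clamping is a shortcut taken by this sketch: once the length has reached 4294967295, every later iteration of the loop as specified clamps it to the same value again.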

The allocated space of a data member is greater than or equal to its minimum length and less than or equal to its maximum length. This is the amount of octets that an instance occupies in memory. The value is a run-time variable if alen.ref is not empty.

The following tags are recognized in tags:

  • sameaddr: Member is a non-canonical member of a data union.
  • sametext: Item description is shared with the previous member.

A data union is a range in a list of data members, which begins with a member a, inclusive, and ends with a member b, exclusive, such that a.tags does not contain sameaddr and b.tags does not contain sameaddr. The member a is the canonical member of the union.

A non-canonical member of a data union is any data member d, d.tags of which contains sameaddr.

The canonical member of a data union defines the allocated space of the data union as a whole as well as its memory alignment.

The maximum length of a non-canonical member of a data union MUST NOT be greater than the maximum length of the canonical member.

If a non-canonical member d is such that d.alen.max represents the maximum possible amount of elements and d.alen.ref is empty, then the maximum possible amount of elements is the maximum length of the canonical member divided by the maximum length of an element, rounded down.

If alignment of a non-canonical member is greater than the alignment of the canonical member, then the non-canonical member is aligned farther into the union in order to match its memory alignment. This reduces the remaining allocated space available to the member.

If cond is not empty, the member exists only when the referenced condition holds. In case of a canonical member of a data union, the condition applies to the data union in whole.

All non-canonical members d of a data union, d.cond of which is not empty, are mutually exclusive with each other and the canonical member. The canonical member exists if the data union exists and none of conditions referenced by d.cond hold.

All non-canonical members d of a data union, d.cond of which is empty, always exist. These members have their destructors and accessors ignored. Thus, types of such members SHOULD NOT have these.

Function member

A function member is an item which consists of:

  • mlv: class level of the associated module's class.
  • clv: class level of the associated class.
  • fid: function identifier.
  • tags: set of tags.
  • params: list of function parameters.
  • codes: list of error codes.
  • name: name of the function.

Function parameter

A function parameter is an item which consists of:

  • tags: set of tags.
  • type: type reference.
  • name: name of the parameter.

Error code

An error code is an item which consists of:

  • name: name of a message function.
  • fid: identifier of the function.

Error codes are declared in a predefined module.

Named value

A named value is an item which consists of:

  • mlv: class level of the associated module's class.
  • clv: class level of the associated class.
  • value: value.
  • name: name of the value.

Named reference

A named reference is an item which consists of:

  • mlv: class level of the associated module's class.
  • clv: class level of the associated class.
  • iref: item reference.
  • name: name of the reference.

Path

A path is an item which consists of:

  • mlv: class level of the associated module's class.
  • path: path component.

Class

A class is an item which consists of:

  • cid: class identifier.
  • clv: class level.
  • regt: type reference.
  • rclv: class level (for regt).
  • desc: list of data members (descriptor).
  • data: list of data members (instance).
  • cond: set of conditions.
  • func: set of function members.
  • nref: set of named references.
  • nval: set of named values.
  • tags: set of tags.
  • ifaces: set of interface references.
  • name: name of the class.

A class, regt of which is non-empty, is called a register class. It has an implicit reserved member called regt which is a named reference to the register type referenced by regt.

The default alignment of class c at level lv is computed as follows:

  1. Let cur be a 32-bit unsigned binary integer.
  2. Let max be a 32-bit unsigned binary integer.
  3. Set max to zero.
  4. For each data member d in c.data:
    1. Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
    2. Set cur to the alignment of d.
    3. Advance to the next element if cur is less than or equal to max.
    4. Set max to cur.
  5. Return max.
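
In other words, the default alignment is the largest alignment among the members visible at the given level, ignoring non-canonical union members. A minimal sketch, in which each data member is reduced to a hypothetical (clv, tags, alignment) tuple with its alignment already computed:

```python
def default_alignment(members, lv):
    """members: iterable of (clv, tags, alignment) tuples (assumed shape)."""
    highest = 0
    for clv, tags, alignment in members:
        if clv > lv or "sameaddr" in tags:
            continue  # not part of this level, or a non-canonical member
        if alignment > highest:
            highest = alignment
    return highest
```

A class without any qualifying data member gets a default alignment of zero under this reading.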

The minimum class length of class c at level lv is computed as follows:

  1. Let cur be a 32-bit unsigned binary integer.
  2. Let len be a 33-bit unsigned binary integer.
  3. Set len to zero.
  4. For each data member d in c.data:
    1. Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
    2. If d.alen.max is not equal to zero and d.alen.ref is empty and d is not the last element in c.data, set cur to the maximum length of d; otherwise, set cur to the minimum length of d.
    3. Add cur to len.
    4. Abort the program if len is greater than 4294967295.
  5. Return len.

The maximum class length of class c at level lv is computed as follows:

  1. Let cur be a 32-bit unsigned binary integer.
  2. Let len be a 33-bit unsigned binary integer.
  3. Set len to zero.
  4. For each data member d in c.data:
    1. Advance to the next element if d.clv is greater than lv.
    2. Set cur to the maximum length of d.
    3. Add cur to len.
    4. Set len to 4294967295 if len is greater than 4294967295.
  5. Return len.
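
The two class-length procedures can be sketched as below. This is an illustration only: the Member dataclass is a hypothetical, flattened view of a data member in which the member's own minimum and maximum lengths are assumed to be precomputed. The sketch follows the steps as written, including the fact that only the minimum-length procedure skips sameaddr members.

```python
from dataclasses import dataclass, field

U32_MAX = 4294967295


@dataclass
class Member:  # hypothetical, flattened view of a data member
    clv: int
    min_len: int           # minimum length of the member
    max_len: int           # maximum length of the member
    alen_max: int = 0      # d.alen.max
    has_ref: bool = False  # True when d.alen.ref is not empty
    tags: set = field(default_factory=set)


def minimum_class_length(data, lv):
    total = 0
    for index, d in enumerate(data):
        if d.clv > lv or "sameaddr" in d.tags:
            continue
        is_last = index == len(data) - 1
        # A fixed-bound array that is not the last member always
        # occupies its maximum length.
        if d.alen_max != 0 and not d.has_ref and not is_last:
            cur = d.max_len
        else:
            cur = d.min_len
        total += cur
        if total > U32_MAX:
            raise OverflowError("class length exceeds 32 bits")
    return total


def maximum_class_length(data, lv):
    total = 0
    for d in data:
        if d.clv > lv:
            continue
        total = min(total + d.max_len, U32_MAX)  # saturate at 2^32 - 1
    return total
```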

Document Processor

A Document Processor is a program that uses a Parser to fill a set of loaded modules and then does something with them.

Parser

A Parser consists of a line parser, an instruction parser and an instruction processor.

The state of a Parser consists of the following objects:

  • mset: set of loaded modules.
  • mcur: object reference to the current module.
  • text: object reference to the current item description.
  • format: name of the current text format.

The mset is supplied by the caller as an input/output parameter. Additionally, the parser takes four objects as input parameters:

  • mcid: the target module identifier.
  • mclv: the target module's class level.
  • ignoreText: boolean, if true, text lines are ignored.
  • fetch: an external function for fetching Documents; input is the target module identifier and level.

The parser loads a module using the following algorithm and then postprocesses the set of loaded modules. Postprocessing MAY be done as part of loading or as a separate step. The result is either success or failure.

  1. Search mset for a module m such that m.this.cid is equal to mcid.
  2. If found and m.this.clv is no less than mclv, return success.
  3. If found and m.this.clv is less than mclv, remove m from mset.
  4. Obtain an octet stream doc by calling fetch, passing mcid and mclv as arguments.
  5. If fetch has failed, return failure.
  6. Set mcur to a newly created module.
  7. Set mcur.this to a newly created class.
  8. Insert mcur.this into mcur.types.
  9. Set mcur.this.cid to mcid.
  10. Set text to nothing.
  11. Set format to markdown.
  12. Invoke the line parser passing doc and ignoreText.
  13. If the parser failed, delete mcur and return failure.
  14. If the mcur.this.clv is less than mclv, delete mcur and return failure.
  15. Insert mcur into mset.
  16. Iterate mcur.mload and recursively call this algorithm with the arguments from the elements; if any of the referenced modules failed to load, return failure.
  17. Return success.
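
The control flow of the loading algorithm can be sketched as follows. This is a heavily simplified illustration, not an implementation: modules are plain dictionaries, mset is a dictionary keyed by class identifier, and the parse callback is a hypothetical stand-in for the line parser. The text and format state, ignoreText and postprocessing are left out.

```python
def load_module(mset, mcid, mclv, fetch, parse):
    """Load module mcid at level mclv into mset; return True on success.

    mset: dict mapping class identifier -> module dict (assumed shape).
    fetch(mcid, mclv): returns the document, or None on failure.
    parse(doc, module): fills module and returns True on success.
    """
    found = mset.get(mcid)
    if found is not None:
        if found["clv"] >= mclv:
            return True  # already loaded at a sufficient level
        del mset[mcid]   # loaded, but at too low a level
    doc = fetch(mcid, mclv)
    if doc is None:
        return False
    module = {"cid": mcid, "clv": 0, "mload": []}
    if not parse(doc, module):
        return False
    if module["clv"] < mclv:
        return False
    mset[mcid] = module
    # The module is inserted before its imports are loaded, so an import
    # cycle finds it in mset instead of recursing forever.
    for imp in module["mload"]:
        if not load_module(mset, imp["mid"], imp["mlv"], fetch, parse):
            return False
    return True
```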

Line parser

State of the line parser consists of the following objects:

  • line: current line (list of characters).
  • wslv: current line indentation (integer).
  • skip: ignored line indentation (integer).

The line parser takes two objects as input parameters:

  • input: byte stream.
  • ignoreText: boolean.

The parser loads lines until the end of input. Each loaded line is then processed.

Line loading

To load a line, the parser follows this algorithm:

  1. Let cr be a boolean.
  2. Set line to the empty list.
  3. Set cr to false.
  4. While input is not empty:
    1. Decode the next character c from input.
    2. If line is longer than 1024 characters, return failure.
    3. Append c to line.
    4. If cr is true and c is U+000A LINE FEED, remove the last character in line and return success.
    5. If c is U+000D CARRIAGE RETURN, set cr to true; otherwise, set cr to false.
  5. Return success.
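
A literal transcription of these steps into Python might look like the sketch below. It assumes the input has already been decoded into an iterator of characters; the name load_line and the (ok, line) return shape are hypothetical.

```python
MAX_LINE = 1024


def load_line(chars):
    """Load one line from an iterator of characters.

    Returns (ok, line); ok is False when the line is too long.
    """
    line = []
    cr = False
    for c in chars:
        if len(line) > MAX_LINE:
            return False, line
        line.append(c)
        if cr and c == "\n":
            # As written, only the LINE FEED is removed here; the
            # CARRIAGE RETURN stays in line and is left to the caller.
            line.pop()
            return True, line
        cr = (c == "\r")
    return True, line
```

Because the function consumes a shared iterator, calling it repeatedly on the same iterator yields successive lines.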

Line processing

The line parser counts the amount of whitespace at the beginning of line and stores the resulting value in wslv.

The first line in the document MUST begin with the sequence of eight bytes with values as defined by <k1os>.

Further processing depends on the first character after the whitespace.

If the line is a one-line comment, the line is ignored.

If the line is a multi-line comment, the parser loads subsequent lines until another multi-line comment is encountered. All of these lines are ignored.

If the line is an instruction, skip is set to wslv. Whitespace at the beginning and end of line is removed. The U+002E FULL STOP at the beginning is removed. The instruction parser is then invoked on line. In case of failure, the Parser MUST immediately return failure.

The only remaining possibility is that the line is text. If ignoreText is true, the line is ignored. Otherwise, up to skip whitespace characters are removed from the beginning of line. If the first character after the removal is \, the character is removed. If text references an object, an object o, with o.data equal to line and o.format equal to format, is created and appended to text.

Instruction parser

The instruction parser converts the loaded line into a list of typed arguments and invokes the named instruction processor's function.

Instructions begin with a four-letter function name, followed by argument tokens, each preceded by whitespace, as per the <cmd0> rule.

cmd0  = fun0 *( 1*WSP arg0 )
fun0  = 4LOALPHA
arg0  = TAGS / FID / FIDN / ALEN / VAL / CID / CREF
arg0 /= EXPR / UINT / ARGT / REGT / MEMT / IREF / MREF / NAME

The parser MUST return failure on any unrecognized function or argument, or when parsed arguments do not match the function's arguments.

Argument tokens are defined such that the parser can determine the type of an argument from the first few characters. The order of alternatives in <arg0> is the recommended order of tests.

Argument rules use capital letters by convention.

Class identifier

The <CID> rule represents a class identifier in hexadecimal notation.

CID   = "!" ( id16 / %s"NOID" )
id16  = 2HEXDIGIT 15( [ "-" ] 2HEXDIGIT )

The keyword NOID is the no-value ID (every octet is equal to zero).
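
A <CID> token could be recognised with a regular expression, as in the following sketch. The name parse_cid is hypothetical, and accepting both upper-case and lower-case hexadecimal digits is an assumption of this sketch.

```python
import re

# Two hex digits, then fifteen more pairs, each optionally preceded by "-".
_ID16 = re.compile(r"[0-9A-Fa-f]{2}(?:-?[0-9A-Fa-f]{2}){15}")


def parse_cid(token):
    """Return the 16 octets of a <CID> token, or None if it does not match."""
    if not token.startswith("!"):
        return None
    body = token[1:]
    if body == "NOID":
        return bytes(16)  # every octet is equal to zero
    if _ID16.fullmatch(body) is None:
        return None
    return bytes.fromhex(body.replace("-", ""))
```

Since hyphens are optional between any two octets, both the usual UUID grouping and an unbroken run of 32 hexadecimal digits are accepted.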

Unsigned integer

The <UINT> rule represents a 64-bit unsigned binary integer value, in decimal or hexadecimal notation.

UINT  = udec / uhex
udec  = 1*20DIGIT
uhex  = "0x" 1*16HEXDIGIT

Integer arguments are unsigned and at most 64 bits long. They are written in either decimal or hexadecimal notation.
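
Note that 20 decimal digits can express values above 2^64 - 1, so matching the <udec> rule alone does not guarantee a 64-bit value. The sketch below adds an explicit range check. The name parse_uint is hypothetical, and treating the 0x prefix as lower-case only is an assumption of this sketch.

```python
import re

_UDEC = re.compile(r"[0-9]{1,20}")
_UHEX = re.compile(r"0x[0-9A-Fa-f]{1,16}")


def parse_uint(token):
    """Return the value of a <UINT> token, or None if it is not valid."""
    if _UHEX.fullmatch(token):
        value = int(token[2:], 16)
    elif _UDEC.fullmatch(token):
        value = int(token, 10)
    else:
        return None
    # Reject decimal values that do not fit into 64 bits.
    return value if value <= 0xFFFFFFFFFFFFFFFF else None
```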

Function identifier

The <FID> rule represents a function identifier. The <FIDN> rule represents a pair of a name and a function identifier.

FID    = "#" UINT
FIDN   = "#" NAME "#" UINT

Because there are instructions with more than one function identifier, there are two variants: one with a name and one without.

The value of an FID, when specified as an argument, MUST be non-zero.

Tags

The <TAGS> rule represents a list of tags.

TAGS = tag *( 1*WSP tag )
tag  = "+" 1*16LOALPHA

Tags are character strings of up to 16 characters, each preceded by a U+002B PLUS SIGN character.

They are parsed into a list of strings.

Functions test for the presence of a tag in the list.
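
A minimal sketch of tag parsing follows. It assumes the argument tokens have already been split on whitespace; the name parse_tags is hypothetical.

```python
import re

_TAG = re.compile(r"\+[a-z]{1,16}")


def parse_tags(tokens):
    """Parse tag tokens into a list of strings, or return None on error."""
    tags = []
    for token in tokens:
        if _TAG.fullmatch(token) is None:
            return None
        tags.append(token[1:])  # drop the leading PLUS SIGN
    return tags
```

An instruction function can then test for a tag with an ordinary membership check, for example "sameaddr" in tags.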

Name

The <NAME> rule represents a name.

NAME = LOALPHA *63( LOALPHA / DIGIT / "_" )

Items are given a human-readable name for reference. All letters in names MUST be small.

It is RECOMMENDED that names of functions be composed in subject-object-verb order, for example object_units_replace. The recommendation is for consistency and grouping of members. Note that one can generate aliases in camelCase, too, if needed. Source code could be converted back and forth.
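
The <NAME> rule and the camelCase remark can be illustrated with the sketch below. Both function names are hypothetical, and the alias scheme shown is only one possible convention, not something this document prescribes.

```python
import re

# One small letter, then up to 63 small letters, digits or underscores.
_NAME = re.compile(r"[a-z][a-z0-9_]{0,63}")


def is_name(token):
    """Check a token against the <NAME> rule."""
    return _NAME.fullmatch(token) is not None


def camel_case_alias(name):
    """Derive a camelCase alias from an underscore-separated name."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)
```

With this scheme, object_units_replace becomes objectUnitsReplace.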

Member reference

The <MREF> rule represents a member reference.

MREF = 1*( "." NAME )

Item reference

The <IREF> rule represents an item reference.

IREF = [ CID / NAME ] MREF

An item reference begins with a module reference, followed by a member reference.

The module reference may be omitted as a shorthand for referencing the module declared by the currently parsed document.

References are resolved after all modules are loaded. A referenced item MAY be declared later in a document.

Memory type

The <MEMT> rule represents a type reference associated with a data member.

MEMT = href / oref

href = hndr ":" ( oref / "?" )
hndr = %s"none" / %s"read" / %s"rdex" / %s"rdwr" / %s"rwex"

oref = %s"mem:" NAME / UINT ":" IREF

A reference to a class instance begins with the class level, followed by a colon and an item reference to the class.

Reference to a predefined class instance begins with mem:, followed by the predefined class name.

Predefined classes handle, iface and class can only be referenced via a handle; they cannot be used as a memory type.

Handles begin with the access rights associated with the handle (<hndr>), followed by a class reference after a colon.

Handles may reference objects of undefined (?) type.

For example, read:0:stream.buffer means a read-only handle to an instance of class buffer at level 0 from module stream.

Register type

The <REGT> rule represents a type reference to a register type.

REGT = %s"reg:" ( oref / NAME )

<NAME> MUST be a register type name as defined in [MACHINE], except the name of a memory reference. (Handles are for that.)

Parameter type

The <ARGT> rule represents a type reference associated with a function parameter.

ARGT = REGT / MEMT

Array length

ALEN = "[" [ MREF ":" ] arrl [ ":" arrl ] "]"
arrl = UINT / %s"MAX"

The value obtained by parsing <arrl> into an unsigned integer MUST NOT be greater than 2^32 - 1.

The special keyword MAX is an alias for 2^32 - 1.

Parsing examples:

         [10] => min =  10, max =  10, var = ()
       [1:20] => min =   1, max =  20, var = ()
      [2:MAX] => min =   2, max = MAX, var = ()
  [len:4:255] => min =   4, max = 255, var = ("len")
[obj.len:MAX] => min =   0, max = MAX, var = ("obj", "len")

Value

VAL  = "=" vval
vval = vreg / vref / vobj / varr / CID
vreg = valf / vali / valu / valc / valb
vref = "&" IREF
vobj = "{" [ NAME "=" vval *( "," NAME "=" vval ) ] "}"
varr = "[" [ vval ] *( "," [ vval ] ) "]"

sign = "+" / "-"
vhex = "0x" 1*HEXDIGIT
vdec = 1*DIGIT
fdec = vdec [ "."    1*DIGIT ] [ "e" [ sign ] vdec ]
fbin = vhex [ "." 1*HEXDIGIT ] [ "p" [ sign ] vdec ]

valu = vhex / vdec
vali = sign valu
valf = [ sign ] ( "NaN" / "INF" / fdec / fbin )
valc = %s"ce" / %s"lt" / %s"eq" / %s"gt"
valb = %s"true" / %s"false"

Values specify values of octets on given positions of an object. They can also be assigned to a name, creating a named value for reference.

The <VAL> rule is the most complex one; it contains recursion. Parsers MUST verify that all value lists are correctly terminated. Please study the rules calmly and thoroughly.

A register value can only be assigned to an instance of a register class; it is one of:

  • <valf>: real number.
  • <vali>: (signed) integer.
  • <valu>: unsigned integer.
  • <valc>: comparison result value.
  • <valb>: boolean.

<vref> references a named value.

<varr> is an array of values. The target object MUST also be an array. The values in the array MUST be valid elements of the object. If the target array is longer, the values of the excess elements are undefined. An element MAY be omitted, in which case its value is undefined.

<CID> is a more readable notation for an array of 16 unsigned integers. Target objects of type id16 are expected, although, technically, any array of 16 or more integers is valid for this value.

<vobj> is a set of name-value pairs. The names reference a data member of the target object. It is an error if there is no data member with a matching name. Unreferenced members have undefined values.
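
As an illustration of the <valf> rule, the sketch below converts a real-number literal into a Python float. It is only a sketch: it assumes that the literals are spelled exactly as written in the grammar (lowercase 0x, e and p; NaN and INF as shown), that <HEXDIGIT> accepts both letter cases, and it ignores the sign of NaN. The function name parse_valf is hypothetical.

```python
import math
import re

# fdec = vdec [ "." 1*DIGIT ] [ "e" [ sign ] vdec ]
_FDEC = re.compile(r"[0-9]+(\.[0-9]+)?(e[+-]?[0-9]+)?")
# fbin = vhex [ "." 1*HEXDIGIT ] [ "p" [ sign ] vdec ]
_FBIN = re.compile(r"0x[0-9A-Fa-f]+(\.[0-9A-Fa-f]+)?(p[+-]?[0-9]+)?")


def parse_valf(text):
    """Convert a <valf> literal into a float; raise ValueError otherwise."""
    sign = 1.0
    body = text
    if body[:1] in ("+", "-"):
        sign = -1.0 if body[0] == "-" else 1.0
        body = body[1:]
    if body == "NaN":
        return math.nan  # the sign of NaN is not preserved in this sketch
    if body == "INF":
        return sign * math.inf
    if _FBIN.fullmatch(body):
        # float.fromhex understands the hexadecimal notation used by <fbin>.
        return sign * float.fromhex(body)
    if _FDEC.fullmatch(body):
        return sign * float(body)
    raise ValueError("not a <valf> literal: %r" % text)
```

Note that a plain digit string such as 10 matches both <valu> and <fdec>; a full parser would have to choose between them based on the register type of the target object.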

Condition reference

The <CREF> rule represents a condition reference.

CREF = "?" NAME

A condition reference begins with a question mark, followed by the name of a condition.

Expression

The <EXPR> rule represents an expression.

EXPR = "(" ( elog / eref / ecmp ) ")"

elog = elop *( 0*WSP EXPR )
elop = "all=1" / "all=0" / "any=1" / "any=0"

eref = erop 0*WSP CREF
erop = "is=1" / "is=0"

ecmp = ecop 0*WSP MREF 0*WSP VAL
ecop = "eq" / "ne" / "lt" / "le" / "gt" / "ge"

An expression begins with a left parenthesis, followed by the expression operator and its arguments, followed by a right parenthesis.

Input text   Expression type and operator name
all=1        logical expression; 'all are true'
all=0        logical expression; 'all are false'
any=1        logical expression; 'any is true'
any=0        logical expression; 'any is false'
eq           comparison; 'is equal to'
ne           comparison; 'is not equal to'
lt           comparison; 'is less than'
le           comparison; 'is less than or equal to'
gt           comparison; 'is greater than'
ge           comparison; 'is greater than or equal to'
is=1         condition reference; 'condition holds'
is=0         condition reference; 'condition does not hold'
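
The recursive structure of <EXPR> can be sketched in Python as follows. The sketch covers only logical expressions and condition references; comparisons are left out because they depend on <MREF> and <VAL>. It assumes that a condition name extends up to the next whitespace or parenthesis, and that a logical operator with no operands evaluates like Python's all() and any() on an empty list. All function names are hypothetical.

```python
LOGICAL = ("all=1", "all=0", "any=1", "any=0")
REFERENCE = ("is=1", "is=0")


def _skip_ws(text, pos):
    while pos < len(text) and text[pos] in " \t":
        pos += 1
    return pos


def parse_expr(text, pos=0):
    """Parse one <EXPR> at text[pos]; return ((op, arg), next_pos)."""
    if text[pos:pos + 1] != "(":
        raise ValueError("expected '('")
    pos += 1
    op = next((o for o in LOGICAL + REFERENCE if text.startswith(o, pos)), None)
    if op is None:
        raise ValueError("operator not handled by this sketch")
    pos += len(op)
    if op in LOGICAL:
        arg = []
        while True:
            nxt = _skip_ws(text, pos)
            if text[nxt:nxt + 1] != "(":
                break
            node, pos = parse_expr(text, nxt)  # recursion on nested <EXPR>
            arg.append(node)
    else:
        pos = _skip_ws(text, pos)
        if text[pos:pos + 1] != "?":
            raise ValueError("expected a condition reference")
        end = pos + 1
        while end < len(text) and text[end] not in " \t()":
            end += 1
        arg = text[pos + 1:end]
        if not arg:
            raise ValueError("missing condition name")
        pos = end
    if text[pos:pos + 1] != ")":
        raise ValueError("expected ')'")
    return (op, arg), pos + 1


def evaluate(tree, conds):
    """Evaluate a parsed expression against a dict of condition values."""
    op, arg = tree
    want = op.endswith("1")
    if op in REFERENCE:
        return conds[arg] == want
    results = [evaluate(node, conds) == want for node in arg]
    return all(results) if op.startswith("all") else any(results)
```

For example, (all=1 (is=1 ?a)(any=0 (is=1 ?b) (is=1 ?c))) holds when a holds and at least one of b and c does not.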

Path

PATH = "/" ppfx "/" 1*pchar ; pchar from RFC 3986
ppfx = %s"data" / %s"node" / %s"sync"

Path to an external resource.

Paths MUST begin with a predefined prefix and conform to the rules of URI [RFC3986] path components.

Path prefixes correspond to:

  • /data/: read-only resources,
  • /node/: node-specific resources,
  • /sync/: synchronized resources,
  • /user/: user-provided resources; cannot be declared; these are named by the user, not the module.

Paths are case-sensitive.

The full URI is kueea1am://<id16><PATH>.
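
A minimal Python sketch of a <PATH> check follows. It takes the <pchar> character set from RFC 3986 (unreserved characters, percent-encoded octets, sub-delimiters, ":" and "@") and follows the grammar literally, so only a single path segment after the prefix is accepted. The function name is hypothetical.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"  (RFC 3986)
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
# PATH = "/" ppfx "/" 1*pchar, with the prefixes that may be declared
_PATH = re.compile(r"/(data|node|sync)/" + _PCHAR + r"+")


def is_declarable_path(path):
    """Return True if path matches the <PATH> rule (case-sensitive)."""
    return _PATH.fullmatch(path) is not None
```

The /user/ prefix is rejected on purpose, since such paths cannot be declared.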

Instruction processor

Functions are defined by listing their parameters in ascending order, describing the function and formally specifying its outcome.

State of the instruction processor consists of the following variables:

  • mfin: level finalization state (boolean).
  • clvl: current class level (integer).
  • ccur: current class (reference).
  • fcur: current function (reference).

An item name collision between a class c and a name name occurs when any of the following is true:

  • name is regt.
  • Any data member d in c.data is such that: d.name is equal to name.
  • Any function member f in c.func is such that: f.name is equal to name.
  • Any named value v in c.nval is such that: v.name is equal to name.
  • Any named reference i in c.nref is such that: i.name is equal to name.

Additionally, if c is mcur.this:

  • Any class c in mcur.types is such that: c.name is equal to name.

An FID collision occurs against a function identifier fid when any class c in mcur.types is such that: any function member f in c.func is such that: f.fid is non-zero and equal to fid.

A class level violation occurs in list data when data is not an empty list and its last element d is such that: d.clv is equal to or greater than ccur.clv and d.mlv is less than mcur.this.clv.
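
The item name collision test defined above can be sketched in Python. The data model here (the Item and ClassDecl classes and the way mcur.this and mcur.types are passed in) is purely hypothetical and exists only to make the rule concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str


@dataclass
class ClassDecl:
    name: str
    data: list = field(default_factory=list)  # data members
    func: list = field(default_factory=list)  # function members
    nval: list = field(default_factory=list)  # named values
    nref: list = field(default_factory=list)  # named references


def item_name_collision(c, name, module_this=None, module_types=()):
    """Return True if name collides with an item of class c."""
    if name == "regt":
        return True
    for members in (c.data, c.func, c.nval, c.nref):
        if any(m.name == name for m in members):
            return True
    # Only the module's own class is also checked against class names.
    if c is module_this:
        return any(t.name == name for t in module_types)
    return False
```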

The k1md function

k1md  = %s".k1md" 2SP CID
; The first 8 bytes are: 2E 6B 31 6D 64 20 20 21

Verifies input as a Document and begins declaration of a module (at level 0, finalized).

<CID> mcid
Class identifier of the declared module.

The instruction MUST be the first line of a Document. It is special in that it has no preceding whitespace. It MUST NOT appear on any subsequent line.

Modules are singleton objects. Nodes have at most one instance of a given module, which is shared by all module implementations on that node. The instance is stored in volatile memory of the module's context.

Each module has a set of predefined function members. These functions are defined in another section of this document.

A module declaration declares a class. Thus, modules also have predefined functions of a class.

Implementations MUST return failure if any of the following is true:

  • version is higher (more) than 0.
  • mcur.this.cid is not mcid.

Implementations MUST modify the state as follows:

  1. Set mcur.this.clv to 0.
  2. Set mcur.this.name to this.
  3. Set mfin to true.
  4. Set ccur to mcur.this.
  5. Set fcur to nothing.
  6. Set text to ccur.text.
  7. Insert predefined functions to mcur.this.func.

The text function

Updates the active text buffer(s).

<NAME> name
Buffer name.
<TAGS> tags
Tags.

The default buffer is named markdown.

Example:

This will go into the `markdown` buffer.
.text html
<p>This will go into the <code>html</code> buffer.</p>
.text ia32
.text amd64 +multi
This will go into both "ia32" and "amd64" buffers.

Implementations MUST modify the state as follows:

  1. Set format to type.

The load function

References an external module.

<CID> mid
Module identifier.
<UINT> mlv
Required minimum level of the module.
<NAME> name (optional)
Alias for the module.

Modules reference items declared in other modules. A reference to a module that has not been loaded is invalid.

Item references are processed after all modules are loaded. This instruction does not have to appear before a reference. It is RECOMMENDED that it appear at the beginning of a document.

Implementations MUST return failure if any of the following is true:

  • mid is nil.
  • mlv is equal to or more than 2^8.
  • name was given and any module import i in mcur.mload is such that: i.name is equal to name.

Implementations MUST modify the state as follows:

  1. Find a module import i in mcur.mload, where i.mid is equal to mid.
  2. If found and i.mlv is less than mlv:
    1. Set i.mlv to mlv.
    Otherwise (if not found):
    1. Create a new module import i.
    2. Set i.mid to mid.
    3. Set i.mlv to mlv.
    4. Set i.name to name.
    5. Insert i into mcur.mload.
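
The failure conditions and state changes of this function can be sketched in Python as shown below. The sketch makes several assumptions: module identifiers are modelled as uuid.UUID values with the all-zero UUID standing for nil, the upper bound on levels is read as 2^8, and mcur.mload is a plain list. The ModuleImport class and the function signature are hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

LEVEL_LIMIT = 2 ** 8     # assumption: the level bound in the text is 2^8
NIL = uuid.UUID(int=0)   # assumption: the nil identifier is the all-zero UUID


@dataclass
class ModuleImport:
    mid: uuid.UUID
    mlv: int
    name: Optional[str]


def load(mload, mid, mlv, name=None):
    """Return False on failure, True after updating mload in place."""
    if mid == NIL or mlv >= LEVEL_LIMIT:
        return False
    if name is not None and any(i.name == name for i in mload):
        return False
    found = next((i for i in mload if i.mid == mid), None)
    if found is None:
        mload.append(ModuleImport(mid, mlv, name))
    elif found.mlv < mlv:
        found.mlv = mlv  # only ever raise the required minimum level
    return True
```

Loading the same module twice therefore keeps a single import entry whose mlv is the highest level requested.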

The mlvl function

Increases the current level of the module.

<UINT> mlv
Module level.
<TAGS> tags
Tags associated with the level.

The new level applies to all items declared afterward. Module levels may only be increased.

tags contains either +final or +draft. If the level is final, no changes to it (and lower ones) will ever be made. This rule refers to the resulting tree of declared items; descriptions of the items are not considered.

What this means is that if a Document declares the same module as another, already known Document, then the declared items in both of these documents MUST be the same, except for their textual descriptions.

Implementations MUST return failure if any of the following is true:

  • mlv is equal to or greater than 2^8.
  • mlv is less than mcur.this.clv.
  • mlv is equal to 0, mcur.this.clv is equal to 0 and any of:
    • mcur.this.desc is not empty.
    • mcur.this.data is not empty.
    • mcur.this.func is not empty.
    • mcur.this.nref is not empty.
    • mcur.this.nval is not empty.
    • mcur.this.ifaces is not empty.
    • mcur.types has more than 1 element.
  • tags contains both +final and +draft.
  • tags contains neither +final nor +draft.
  • tags contains +final and mfin is false.

Implementations MUST modify the state as follows:

  1. Set ccur to mcur.this.
  2. Set fcur to nothing.
  3. Set text to ccur.text.
  4. Set mfin to false if +draft in tags.
  5. Set ccur.clv to mlv.

The cbeg function

Begins a class declaration.

<NAME> name
Name of the class.
<TAGS> tags
Type of the class.
<CID> cid (optional)
Identifier of the class.

By default, cid is set to a version 5 UUID (namespace, SHA-1) with the namespace being the module's class identifier and the name being name (encoded in UTF-8).
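
In Python, the default identifier can be derived with the standard uuid module, whose uuid5 function implements namespace-based SHA-1 UUIDs and encodes a string name in UTF-8. This is a sketch that assumes class identifiers can be represented as uuid.UUID values; the function name default_cid and the sample identifier are hypothetical.

```python
import uuid


def default_cid(module_cid: uuid.UUID, name: str) -> uuid.UUID:
    """Version 5 UUID: namespace is the module's class identifier."""
    return uuid.uuid5(module_cid, name)


# Hypothetical module identifier, used only for illustration.
module_cid = uuid.UUID("12345678-1234-5678-1234-567812345678")
buffer_cid = default_cid(module_cid, "buffer")
```

The result is deterministic: the same module identifier and class name always give the same cid.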

If cid is nil, the class will not have a descriptor. It is not possible to allocate an instance of such a class. These classes are only used as an object type.

If tags contains +iface, the class is an interface. Interfaces require cid for their identification. It is not possible to allocate an instance of an interface. These classes can be used as an object type.

Each class has a set of predefined function members. These functions are defined in another section of this document.

Implementations MUST return failure if any of the following is true:

  • tags contains +iface and cid is nil.
  • Root name collision occurs between mcur and name.
  • Any class c in mcur.types is such that: c.name is equal to name and c.cid is not equal to cid.
  • cid is not nil and any class c in mcur.types is such that: c.cid is equal to cid and c.name is not equal to name.

Implementations MUST modify the state as follows:

  1. Find the class c in mcur.types, where c.name is equal to name.
  2. If c was not found:
    1. Set c to a newly created class.
    2. Insert c into mcur.types.
    3. Set c.cid to cid.
    4. Set c.clv to 0.
    5. Set c.name to name.
    6. Set c.tags to tags.
  3. Set ccur to c.
  4. Set fcur to nothing.
  5. Set text to c.text.
  6. Insert predefined functions to ccur.func.

The cend function

Explicitly ends a class declaration.

No parameters.

Implementations MUST modify the state as follows:

  1. Set ccur to mcur.this.
  2. Set fcur to nothing.
  3. Set text to mcur.this.text.

The clvl function

Modifies the current level of the current class.

<UINT> level
New level.
<TAGS> tags (optional)
Tags.

The presence of +fini in tags specifies that the instance at this class level has a destructor. The destructor is also implied whenever any of the data members of the class at the level has a destructor because the members' destructors are called from the class destructor.

Implementations MUST return failure if any of the following is true:

  • ccur refers to mcur.this.
  • level is equal to or more than 2^8.
  • tags contains +fini and class level violation occurs in ccur.data.
  • tags contains +fini and any function member f in ccur.func is such that: f.clv is equal to level and f.name is equal to _fini.

Implementations MUST modify the state as follows:

  1. Set ccur.clv to level.
  2. Set fcur to nothing.
  3. Set text to ccur.text.
  4. If tags does not contain +fini, finish.
  5. Let f be a new function member representation.
  6. Set f.mlv to mcur.this.clv.
  7. Set f.clv to level.
  8. Set f.name to _fini.
  9. Let fid be the default FID of f.
  10. Verify that FID collision does not occur against fid.
  11. Set f.fid to fid.
  12. Insert f into ccur.func.

The creg function

Assigns a register type to a class.

<REGT> type
Register type.

This instruction declares that the class is a register class.

The class is assigned a register type and gains two predefined function members for transferring a value from a register to object data and vice-versa, called save and load respectively. Otherwise, there is no mapping between registers and object data. Object data is nothing more than an array of opaque octets.

Any class MAY define functions with these names. Their parameters and semantics are predefined only for register classes.

Implementations MUST modify the state as follows:

  1. Return failure if any of the following is true:
    • ccur.regt is not empty.
    • class level violation occurs in ccur.data.
    • item name collision occurs between ccur and save.
    • item name collision occurs between ccur and load.
  2. Set ccur.regt to type.
  3. Set ccur.rclv to ccur.clv.
  4. Create a new function member save.
  5. Set save.name to save.
  6. Set save.fid to the default FID of save.
  7. Return failure if FID collision occurs against save.fid.
  8. Create a new function parameter ireg.
  9. Set ireg.type to (a copy of) type.
  10. Set ireg.name to reg.
  11. Append ireg to save.params.
  12. Insert save into ccur.func.
  13. Create a new function member load.
  14. Insert +read into load.tags.
  15. Set load.name to load.
  16. Set load.fid to the default FID of load.
  17. Return failure if FID collision occurs against load.fid.
  18. Create a new function parameter oreg.
  19. Set oreg.type to (a copy of) type.
  20. Set oreg.name to reg.
  21. Insert +output into oreg.tags.
  22. Append oreg to load.params.
  23. Insert load into ccur.func.

The cond function

Declares a named condition for the current class.

<NAME> name
Name of the condition.
<EXPR> expr
Expression of the condition.

Implementations MUST modify the state as follows:

  1. Return failure if any condition c in ccur.cond is such that c.name is equal to name.
  2. Create a new condition c.
  3. Set c.name to name.
  4. Set c.expr to expr.
  5. Insert c into ccur.cond.

The desc and data functions

Declares the next data member, in memory order, of an interface descriptor (desc) / of an instance (data).

Classes which are interfaces have an additional object besides the usual instance of the class, called an interface descriptor. A class that implements an interface includes the interface descriptor as part of its class descriptor and declares an instance of the (interface) class as one of its data members.

Interface descriptors are data used by programs which manipulate instances of unknown classes via their set of implemented interfaces.

All members of a class descriptor are read-only, constant values. The values of an interface descriptor vary by module implementation. Data in a class descriptor is valid for every instance of the class.

<TYPE> type
Type of the object.
<NAME> name
Name of the object.
<ALEN> alen (optional)
Length of an array.
<VAL> value (optional, only in data)
Default value.
<UINT> align (optional)
Memory alignment.
<CREF> cond (optional)
Condition reference. The member exists if the condition holds.
<TAGS> tags (optional)
Tags.
<NAME> uref (optional)
Reference to a previously declared union.

When omitted, alen.max is zero.

When omitted, value is the undefined value.

When omitted, align is zero.

When omitted, cond is empty.

When omitted, tags is empty.

When omitted, uref is empty.

If alen.min is less than alen.max and alen.ref is empty, then the currently declared data member SHOULD be the last data member declared at the current class level. The minimum amount is only a hint to the programmer in this case.

If alen.ref is not empty, then the allocated space of the declared data member is known only at program run time. Thus, memory addresses of subsequent data members are variables, too.

The current union of list data is the data union which begins from the last element d in data, such that d.tags does not contain +sameaddr.

Implementations MUST modify the state as follows:

  1. Let data be an object reference.
  2. Let prev be an object reference.
  3. If in the desc function:
    1. Return failure if ccur.tags does not contain +iface.
    2. Return failure if any data member d in ccur.desc is such that d.name is equal to name.
    3. Make data refer to ccur.desc.
  4. If in the data function:
    1. Return failure if item name collision occurs between ccur and name.
    2. Make data refer to ccur.data.
  5. Return failure if any of the following is true:
    • align is neither zero nor a memory alignment.
    • alen.max is non-zero and alen is an invalid array length.
    • tags contains +sameaddr and any of:
      • data is empty.
      • cond is empty, alen.max is non-zero and alen.ref is not empty.
      • value is not the undefined value.
    • tags does not contain +sameaddr and uref is not empty.
    • tags contains +sametext and any of:
      • data is empty.
      • uref is not empty.
    • Class level violation occurs in data.
  6. Create a new data member d.
  7. Set d.mlv to mcur.this.clv.
  8. Set d.clv to ccur.clv.
  9. Set d.tags to tags.
  10. Set d.type to type.
  11. Set d.alen to alen.
  12. Set d.align to align.
  13. Set d.value to value.
  14. Set d.cond to cond.
  15. Set d.name to name.
  16. If uref is not empty:
    1. Find a data member u in data, u.name of which is equal to uref.
    2. Return failure if not found.
    3. Return failure if u.tags contains +sameaddr.
    4. Insert d after the last element of the data union that begins with u.
    5. Make prev refer to u.
  17. Otherwise:
    1. Make prev refer to the last element in data.
    2. Append d to data.
  18. Make fcur refer to nothing.
  19. Make text refer to prev.text if tags contains +sametext. Otherwise, make text refer to d.text.

The nval function

Declares a named value.

<NAME> name
Name of the value.
<VAL> value
Named value.

Implementations MUST return failure if any of the following is true:

  • item name collision occurs between ccur and name.

Implementations MUST modify the state as follows:

  1. Create a new named value v.
  2. Set v.name to name.
  3. Set v.value to value.
  4. Append v to ccur.nval.
  5. Set fcur to nothing.
  6. Set text to v.text.

The nref function

Declares a name for an item reference (an alias).

<NAME> name
Name for an item.
<IREF> iref
Reference to the item.

Implementations MUST return failure if any of the following is true:

  • item name collision occurs between ccur and name.

Implementations MUST modify the state as follows:

  1. Create a new named reference r.
  2. Set r.name to name.
  3. Set r.iref to iref.
  4. Insert r into ccur.nref.
  5. Set fcur to nothing.
  6. Set text to r.text.

The fbeg function

Begins declaration of a function member.

<NAME> name
Name of the function.
<TAGS> tags (optional)
Tags.
<FID> fid (optional)
Identifier of the function.
<FIDN> fidn1 (optional)
Identifier of the first autogenerated function.
<FIDN> fidn2 (optional)
Identifier of the second autogenerated function.

The following tags may be specified in tags:

+message
Message function declaration.
+proto
Function prototype declaration.
+event
Event declaration.
+init
Class constructor declaration.
+static
Function is independent of an instance.
+read
Function does not write to an instance.
+module
Function requires access to module memory.
+kernel
Function requires access to kernel memory.
+more
Function expects more parameters than declared.

The tags +message, +proto, +event and +init are mutually exclusive.
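
The mutual exclusion can be checked with a set intersection, as in this Python sketch. The function name declaration_kind is hypothetical, and tags are assumed to be stored as strings that include the leading plus sign.

```python
KIND_TAGS = {"+message", "+proto", "+event", "+init"}


def declaration_kind(tags):
    """Return the single kind tag present in tags, or None for a plain function."""
    kinds = KIND_TAGS & set(tags)
    if len(kinds) > 1:
        raise ValueError("mutually exclusive tags: " + ", ".join(sorted(kinds)))
    return next(iter(kinds), None)
```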

If tags contains +message, a function map is declared, from a pair of a language tag and a character coding to a function, which takes the declared parameters and returns a human-readable message. A module implementation SHOULD implement only one function per language. The kernel automatically converts the encoding to the requested one. The functions are local to an implementation and MAY NOT be exported. The FID of the function identifies the array of implemented functions. These functions are accessed through the kernel via a special call.

If tags contains +proto, only a function type is declared. No function is declared and no FID is allocated. A class may reference the prototype in order to declare one.

If tags contains +init, two functions are declared. The FID fid is assigned to the constructor. It is named name, which SHOULD begin with the word init. The function returns a status boolean and takes an implied read-write handle to an instance as the first parameter, called this. The FID named create is assigned to the creator. The name of the function is name with $create appended. The creator takes an implied set of parameters inserted at the front, which determine the location and type of allocated memory, where an instance will be constructed by the constructor. The just-constructed object is then returned to the caller.

If tags contains +event, three items are declared: a function prototype for an event handler, a handler installer function and a handler uninstaller function. The FID fid is assigned to the event prototype. The event prototype has an implicit first parameter: a read-write handle to an undefined, previously installed object. The prototype is named name. The FID named install is assigned to the event installer. It returns a status boolean and takes two parameters: a function reference to a handler and a read-write handle to an object. The object is passed as the first argument to the handler. The function is named name with $install appended. The FID named uninstall is assigned to the event uninstaller. It returns a status boolean and takes one parameter: a function reference to a handler to be uninstalled. The function is named name with $uninstall appended.

Implementations MUST conform to the following algorithm:

  1. Return failure if any of the following is true:
    • tags contains both +static and +read.
    • ccur refers to mcur.this and tags contains +read.
    • item name collision occurs between ccur and name.
  2. If tags contains +message:
    1. If fid was omitted, set fid to the default FID of a member named name of ccur.
    2. Return failure if any of the following is true:
      • tags contains +proto, +event or +init.
      • Either fidn1 or fidn2 was not omitted.
      • FID collision occurs against fid.
  3. If tags contains +proto:
    1. Return failure if any of the following is true:
      • tags contains +message, +event or +init.
      • tags contains +module or +kernel.
      • Either fid, fidn1 or fidn2 was not omitted.
  4. If tags contains +init:
    1. Let nameC be a copy of name.
    2. Append $create to nameC.
    3. If fid was omitted, set fid to the default FID of a member named name of ccur.
    4. If fidn1 was not omitted, return failure if the associated name is not equal to create.
    5. If fidn1 was omitted, set fidn1 to the default FID of a member named nameC of ccur.
    6. Return failure if any of the following is true:
      • tags contains +message, +proto or +event.
      • fidn2 was not omitted.
      • FID collision occurs against fid or fidn1.
  5. If tags contains +event:
    1. Let fidI be an invalid function identifier (zero).
    2. Let fidU be an invalid function identifier (zero).
    3. Let nameI be a copy of name.
    4. Let nameU be a copy of name.
    5. Append $install to nameI.
    6. Append $uninstall to nameU.
    7. If fidn1 was not omitted and its associated name is equal to install, set fidI to fidn1.
    8. If fidI is invalid and fidn2 was not omitted and its associated name is equal to install, set fidI to fidn2.
    9. If fidI is invalid, set fidI to the default FID of a member named nameI of ccur.
    10. If fidn1 was not omitted and its associated name is equal to uninstall, set fidU to fidn1.
    11. If fidU is invalid and fidn2 was not omitted and its associated name is equal to uninstall, set fidU to fidn2.
    12. If fidU is invalid, set fidU to the default FID of a member named nameU of ccur.
    13. Return failure if any of the following is true:
      • tags contains +message, +proto or +init.
      • tags contains +read.
      • ccur does not refer to mcur.this and tags contains +static but neither +module nor +kernel.
      • fid was not omitted.
      • fidn1 is named neither install nor uninstall.
      • fidn2 is named neither install nor uninstall.
      • Both fidn1 and fidn2 are named install.
      • Both fidn1 and fidn2 are named uninstall.
      • FID collision occurs against fidI or fidU.
    14. Create a new function member fi.
    15. Set fi.tags to a copy of tags.
    16. Insert +$install into fi.tags.
    17. Remove +more from fi.tags.
    18. Set fi.mlv to mcur.this.clv.
    19. Set fi.clv to ccur.clv.
    20. Set fi.fid to fidI.
    21. Set fi.name to nameI.
    22. Insert fi into ccur.func.
    23. Create a new function member fu.
    24. Set fu.tags to a copy of tags.
    25. Insert +$uninstall into fu.tags.
    26. Remove +more from fu.tags.
    27. Set fu.mlv to mcur.this.clv.
    28. Set fu.clv to ccur.clv.
    29. Set fu.fid to fidU.
    30. Set fu.name to nameU.
    31. Insert fu into ccur.func.
    32. Create a new function parameter p.
    33. Set p.type to read.
    34. Set p.name to handler.
    35. Append a copy of p to fi.params.
    36. Append a copy of p to fu.params.
    37. Set p.type to rdwr.
    38. Set p.name to userdata.
    39. Append p to fi.params.
    40. Insert +proto into tags.
    41. Remove +module from tags.
    42. Remove +kernel from tags.
  6. If tags contains +proto, set fid to zero.
  7. If ccur refers to mcur.this, insert +static into tags.
  8. Create a new function member f.
  9. Set f.mlv to mcur.this.clv.
  10. Set f.clv to ccur.clv.
  11. Set f.fid to fid.
  12. Set f.tags to tags.
  13. Set f.name to name.
  14. Insert f into ccur.func.
  15. If tags contains +message:
    1. Create a new function parameter p.
    2. Insert +output into p.tags.
    3. Set p.type to rdwr.
    4. Set p.name to message.
    5. Append p to f.params.
    6. Create a new function parameter p.
    7. Set p.type to FID.
    8. Set p.name to enc_and_lang.
    9. Append p to f.params.
  16. Set fcur to f.
  17. Set text to f.text.
  18. Return success.
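
The way the installer and uninstaller identifiers of an +event declaration are chosen can be sketched in Python. The sketch assumes that <FIDN> can be modelled as a pair of an associated name and a numeric identifier, that zero is the invalid identifier, and that the default FID computation (defined elsewhere) is supplied by the caller as the hypothetical default_fid callable.

```python
def resolve_event_fids(name, default_fid, fidn1=None, fidn2=None):
    """Return {member name: FID} for the installer and the uninstaller.

    fidn1 and fidn2 are optional (associated name, fid) pairs.
    """
    given = [f for f in (fidn1, fidn2) if f is not None]
    labels = [label for label, _ in given]
    if any(label not in ("install", "uninstall") for label in labels):
        raise ValueError("FIDN must be named install or uninstall")
    if len(set(labels)) != len(labels):
        raise ValueError("install or uninstall given twice")
    explicit = dict(given)
    name_i = name + "$install"
    name_u = name + "$uninstall"
    # Fall back to the default FID when no valid (non-zero) FID was given.
    fid_i = explicit.get("install", 0) or default_fid(name_i)
    fid_u = explicit.get("uninstall", 0) or default_fid(name_u)
    return {name_i: fid_i, name_u: fid_u}
```

The order in which install and uninstall are given does not matter; each explicit identifier is matched by its associated name.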

The fend function

Explicitly ends a function declaration.

No parameters.

Implementations MUST modify the state as follows:

  1. Set fcur to nothing.
  2. Set text to ccur.text.

The ferr function

Declares an error code that the current function may return.

<NAME> name
Name of the error message.
<FID> fid (optional)
Function identifier of the error message.

If the function does not return an error code, then the document simply does not declare any error codes.

Implementations MUST conform to the following algorithm:

  1. If fid is invalid or was omitted, set fid to the default FID of a member named name of a module.
  2. Return failure if any of the following is true:
    • fcur is nothing.
    • fcur.tags contains +message.
    • fcur.tags contains +event.
  3. For each error code code in fcur.codes: Return failure if code.fid is equal to fid.
  4. Create a new error code code.
  5. Set code.name to name.
  6. Set code.fid to fid.
  7. Append code to fcur.codes.
  8. Set text to f.text.
  9. Return success.

The fpar function

Declares the next in-order parameter of the current function.

<ARGT> type
Type of the parameter.
<NAME> name
Name of the parameter.
<TAGS> tags (optional)
Parameter tags.

Function arguments are objects passed by either handle or value.

Objects passed by value are specified by simply writing a class reference.

.fpar module.class:0 by_value

If a class is a register class, the value is passed via one register. Otherwise, a reference to a local copy of the object is the value. Objects passed by value MUST NOT be longer than 4096 octets.

Objects passed by handle are written by specifying the handle.

.fpar read<module.class:0> by_handle

By default, a parameter is an input parameter. In order to declare an output parameter, include +output in tags. No other tag is recognized.

.fpar module.class:0 output_value +output

The order in which parameters are declared is significant. It is RECOMMENDED to declare output parameters first, then input parameters.

Implementations MUST conform to the following algorithm:

  1. Return failure if any of the following is true:
    • fcur is nothing.
    • name is this.
  2. For each function parameter p in fcur.params: Return failure if p.name is equal to name.
  3. Create a new function parameter p.
  4. Set p.tags to tags.
  5. Set p.type to type.
  6. Set p.name to name.
  7. Append p to fcur.params.
  8. Set text to p.text.
  9. Return success.
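
The algorithm above can be sketched in Python as follows. The Param and Function classes are a hypothetical data model, the text buffer handling of step 8 is omitted, and failure is reported by returning False.

```python
from dataclasses import dataclass, field


@dataclass
class Param:
    type: str
    name: str
    tags: frozenset = frozenset()


@dataclass
class Function:
    name: str
    params: list = field(default_factory=list)


def fpar(fcur, type_, name, tags=()):
    """Append a parameter to the current function; return False on failure."""
    if fcur is None or name == "this":
        return False
    if any(p.name == name for p in fcur.params):
        return False
    fcur.params.append(Param(type_, name, frozenset(tags)))
    return True
```

Because parameters are appended in declaration order, the list index of a parameter is its position in the function signature.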

The impf function

Declares an implementation of a prototyped function.

<IREF> proto
Reference to a prototype function.
<NAME> name
Name of the function (the implementation).
<TAGS> tags (optional)
Tags.
<FID> fid (optional)
Identifier of the function.

Prototype implementations are marked with a special tag. The referenced prototype is stored as the return value type.

The following tags may be specified in tags:

+static
Function is independent of an instance.
+module
Function requires access to module memory.
+kernel
Function requires access to kernel memory.

Implementations MUST return failure if any of the following is true:

  • tags contains +proto.
  • tags contains +event.
  • tags contains +message.
  • tags contains +init.
  • tags contains +read.
  • tags contains +more.
  • item name collision occurs between ccur and name.
  • FID collision occurs against fid.

Implementations MUST modify the state as follows:

  1. Insert +$protoref into tags.
  2. Create a new function member f.
  3. Set f.mlv to mcur.this.clv.
  4. Set f.clv to ccur.clv.
  5. Set f.fid to fid.
  6. Set f.tags to tags.
  7. Set f.name to name.
  8. Create a new function parameter p.
  9. Set p.type to proto.
  10. Append p to f.params.
  11. Insert f into ccur.func.
  12. Set fcur to nothing.
  13. Set text to f.text.

The impc function

Declares that the current class implements an interface.

<TYPE> type
Reference to the implemented interface.
<MEMB> mref (optional)
Reference to the data member associated with the interface.

An interface object is an instance of type.

If mref is omitted, then the class does not declare any data member as the interface object of the interface. This is only permitted if type has no data members.

The interface object MUST NOT be preceded by a variable-length member. The offset must be the same value for every instance of the class.

Module implementations set the values of interface descriptor fields via the implementation's class definition document. A RECOMMENDED syntax for these documents is proposed in another section.

Implementations MUST return failure if any of the following is true:

  • ccur.tags contains +iface.

Implementations MUST modify the state as follows:

  1. Create a new interface reference iface.
  2. Set iface.type to type.
  3. Set iface.mref to mref.
  4. Set iface.clv to ccur.clv.
  5. Insert iface into ccur.ifaces.
  6. Set fcur to nothing.
  7. Set text to ccur.text.

The path function

Declares a path to an external resource.

<PATH> path
Path to the resource.

Implementations MUST return failure if any of the following is true:

  • Any path p in mcur.paths is such that: p.path is equal to path.

Implementations MUST modify the state as follows:

  1. Create a new path p.
  2. Set p.path to path and p.mlv to mcur.this.clv.
  3. Insert p into mcur.paths.
  4. Set cbeg to nothing.
  5. Set func to nothing.
  6. Set data to nothing.
  7. Set text to p.text.

Postprocessing

Check this section later when it's written.

Register names

Specials

boolean
A boolean value; either "true" or "false".
cmprval (comparison result value)
Result of a comparison; one of: "error", "less than", "same as" or "more than".

Unsigned integers

u8
Integer in the range [0, 2^8-1].
u16
Integer in the range [0, 2^16-1].
u32
Integer in the range [0, 2^32-1].
u64
Integer in the range [0, 2^64-1].
u128
Integer in the range [0, 2^128-1].

Signed integers

i8
Integer in the range [-2^7, 2^7-1].
i16
Integer in the range [-2^15, 2^15-1].
i32
Integer in the range [-2^31, 2^31-1].
i64
Integer in the range [-2^63, 2^63-1].
i128
Integer in the range [-2^127, 2^127-1].

Vectors of integers

PLACEHOLDER.

Binary floating-point numbers

f16
IEEE 754 arithmetic format with base=2, p=11, emax=15.
f32
IEEE 754 arithmetic format with base=2, p=24, emax=127.
f64
IEEE 754 arithmetic format with base=2, p=53, emax=1023.
f80x87
IEEE 754 arithmetic format with base=2, p=64, emax=16383.
f128
IEEE 754 arithmetic format with base=2, p=113, emax=16383.

Vectors of binary floating-point numbers

PLACEHOLDER.

Decimal floating-point numbers

d32
IEEE 754 arithmetic format with base=10, p=7, emax=96.
d64
IEEE 754 arithmetic format with base=10, p=16, emax=384.
d128
IEEE 754 arithmetic format with base=10, p=34, emax=6144.

Predefined classes

Predefined classes have no associated class level.

Their names are written with capital letters.

Octet

There is only one fundamental type: an OCTET. All classes are essentially arrays of octets.

An octet occupies one memory address, under which there are at least 8 bits. If there are more than 8 bits, the excess bits MUST be cleared. There is no meaning associated with the bits of an octet.

Length of a class is expressed in octets.

Its register type is a vector of 8 bits. It is mapped to an 8-bit unsigned integer in practice.

Boolean

A BOOLEAN is either true (non-zero) or false (zero).

The length of a boolean is 1 octet.

Its memory alignment is 1 octet.

Its register type is a 1-bit unsigned integer. It is mapped to an 8-bit unsigned integer in practice.

Status boolean

A STATUS is a special boolean. It is used as the return value of functions.

If false, it means that there is nothing to report (success). Otherwise (if true), it means that something was pushed onto the task's status stack. The caller should examine the stack before continuing.

The idea is that programs are written like this:

if function() returns true
{
  code when function reports something, usually failure
}
otherwise, continue

If the function returns a status boolean, it is implied that the code in the if-block is an unlikely branch, because functions are assumed to generally execute successfully.

A status code is returned via the status stack along with other, supplementary information.
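
The calling pattern described above can be sketched in C++ as follows. The Task structure, its status_stack member and the parse_digit function are hypothetical and exist only for this illustration. The actual contents of a status stack entry are not modelled here; a plain string stands in for them.

```cpp
#include <string>
#include <vector>

using STATUS = bool;  // false: nothing to report; true: status stack was pushed onto

// Hypothetical task with a status stack of plain strings.
struct Task {
    std::vector<std::string> status_stack;
};

// Hypothetical function that reports failure through the status stack.
STATUS parse_digit(Task& task, char c, int& out) {
    if (c < '0' || c > '9') {
        task.status_stack.push_back("not a digit");
        return true;  // something was reported
    }
    out = c - '0';
    return false;  // success, nothing to report
}

// Caller: the if-block is the unlikely branch.
int digit_or_default(Task& task, char c, int fallback) {
    int value = 0;
    if (parse_digit(task, c, value)) {
        // Examine and consume the reported status before continuing.
        task.status_stack.pop_back();
        return fallback;
    }
    return value;
}
```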

Comparison result

A CMPRVAL is the result of a comparison. It is used as the return value of functions.

The length of a comparison result is 1 octet.

Its memory alignment is 1 octet.

Its register type is a 2-bit signed integer. It is mapped to an 8-bit signed integer in practice.

Possible values of a comparison result, when comparing object LHS against object RHS, are:

0
The objects are equal.
1 (and greater)
LHS is greater than RHS.
-1
Comparison failed. Interpreted as a true status boolean.
-2 (and less)
LHS is less than RHS.

Simply put, one first tests for -1 (unsuccessful) and then compares with 0.

Functions that return this value also set associated CPU flags accordingly, so that a conditional jump may immediately follow the function call.
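
The test order described above (first -1, then a comparison with 0) can be sketched in C++ as follows. The Outcome enumeration and the classify function are hypothetical names used only for this illustration; the value is held in an 8-bit signed integer, as the register mapping suggests.

```cpp
#include <cstdint>

enum class Outcome { Failed, Less, Equal, Greater };

// Interprets a CMPRVAL held in an 8-bit signed integer.
Outcome classify(std::int8_t r) {
    if (r == -1) return Outcome::Failed;  // test for failure first
    if (r == 0) return Outcome::Equal;
    return r > 0 ? Outcome::Greater       // 1 and greater
                 : Outcome::Less;         // -2 and less
}
```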

Object length

The length of an objsize is 4 octets.

Its memory alignment is 4 octets.

Its register type is a 32-bit unsigned integer.

It represents a length of an object, in octets. Octets are ordered in increasing order of significance.
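
The octet order can be illustrated with a short C++ sketch. It reads "increasing order of significance" as least significant octet first; the load_objsize and save_objsize names are hypothetical.

```cpp
#include <cstdint>

// Reads a 4-octet length, least significant octet first.
inline std::uint32_t load_objsize(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Writes a 4-octet length, least significant octet first.
inline void save_objsize(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        p[i] = static_cast<std::uint8_t>(v >> (8 * i));
}
```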

Memory address

The length of an ADDRESS is 8 octets.

Its memory alignment is 8 octets.

Its register type is a 64-bit unsigned integer.

It represents a memory address. Octets are ordered in increasing order of significance.

Function identifier

The length of an FID is 8 octets.

Its memory alignment is 8 octets.

Its register type is a 64-bit unsigned integer.

It represents a function identifier. Octets are ordered in increasing order of significance.

16-octet identifier

The length of an ID16 is 16 octets.

Its memory alignment is 8 octets.

It has no register type.

It is an opaque array of 16 octets.

System memory reference

The length of a HANDLE is 32 octets.

Its memory alignment is 8 octets.

Its register type is a CPU-defined virtual memory reference.

Its data members are as follows:

.cbeg HANDLE !NOID
.data ADDRESS address
.data ID16    node_id
.data OCTET   nonce [8]
address
Lower bits of the system memory address.
node_id
Higher bits of the system memory address.
nonce
Random value associated with the referenced object.

Loading from and saving into a handle are kernel functions, which translate the system memory address in the handle from and to a virtual memory address of the calling task.

This is not an object identifier. The address may point to any octet within an object.
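
As an illustration only, the member layout above could be mirrored by the following C++ structure. It assumes that members are laid out in declaration order without padding; the Handle type name is hypothetical, and the static assertions merely confirm the length and alignment stated in this section.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory mirror of a HANDLE.
struct alignas(8) Handle {
    std::uint64_t address;      // ADDRESS address: lower bits of the system memory address
    std::uint8_t  node_id[16];  // ID16 node_id: higher bits of the system memory address
    std::uint8_t  nonce[8];     // OCTET nonce [8]: random value of the referenced object
};

static_assert(sizeof(Handle) == 32, "a HANDLE is 32 octets long");
static_assert(alignof(Handle) == 8, "a HANDLE is aligned to 8 octets");
static_assert(offsetof(Handle, node_id) == 8, "node_id follows address");
static_assert(offsetof(Handle, nonce) == 24, "nonce follows node_id");
```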

Module reference

The length of an MREF is 24 octets.

Its memory alignment is 8 octets.

It has no register type.

Its data members are as follows:

.cbeg MREF !NOID
.data ID16  mcid
.data OCTET mclv
.data OCTET mbid [8] +sameaddr
mcid
Class identifier of the referenced module.
mclv
8-bit unsigned integer. Minimum class level of the module.
mbid
Build identifier of the target M-Build.

A reference is either to a specific M-Build by its build identifier or to any M-Build which implements the given module at the given level.

A build identifier is an 8-octet (64-bit) value, which is computed from relevant parts of a program image. The exact way to compute it depends on the image format.

The class level is considered when the octets of the mbid array at positions [1,7] have all of their bits cleared. Build identifiers of that form are therefore out of range and invalid.
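
The test that decides between the two kinds of reference can be sketched in C++ as follows. The function name selects_by_level is hypothetical; position 0 of the array is deliberately not examined, as only positions [1,7] are named above.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// True if the reference is by class level, that is, if octets 1..7 of mbid are all zero.
bool selects_by_level(const std::array<std::uint8_t, 8>& mbid) {
    return std::all_of(mbid.begin() + 1, mbid.end(),
                       [](std::uint8_t octet) { return octet == 0; });
}
```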

Function reference

The length of an FREF is 32 octets.

Its memory alignment is 8 octets.

It has no register type.

Its data members are as follows:

.cbeg FREF !NOID
.data MREF  mref
.data FID   fid
mref
Module reference.
fid
Function identifier.

Interface descriptor

The length of an IFACE is variable. The minimum length is 24 octets.

Its memory alignment is 8 octets.

It has no register type.

Its data members are as follows:

.cbeg IFACE  !NOID
.data ID16    cid
.data objsize clv_len
.data objsize offset
.data OCTET   members [0:MAX]
cid
Class identifier of the interface.
clv_len
Class level and total length of the descriptor in octets. The level is in the most significant 8 bits. The length is in the remaining least significant 24 bits. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
offset
Offset from the beginning of a class instance, specifying the location of the associated interface object. If equal to 4294967295, then there is no such object.
members
Data members of the interface descriptor.
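
The packing of clv_len can be sketched in C++ as follows. All names here are hypothetical. The sketch assumes that the caller passes a length that still fits into 24 bits after rounding; it does not check this.

```cpp
#include <cstdint>

// Rounds a length up to the next multiple of 8.
constexpr std::uint32_t round_up8(std::uint32_t n) {
    return (n + 7u) & ~7u;
}

// Level in the most significant 8 bits, rounded length in the low 24 bits.
constexpr std::uint32_t pack_clv_len(std::uint8_t level, std::uint32_t length) {
    return (static_cast<std::uint32_t>(level) << 24)
         | (round_up8(length) & 0x00FFFFFFu);
}

constexpr std::uint8_t level_of(std::uint32_t clv_len) {
    return static_cast<std::uint8_t>(clv_len >> 24);
}

// Also the offset from this descriptor to the next one.
constexpr std::uint32_t length_of(std::uint32_t clv_len) {
    return clv_len & 0x00FFFFFFu;
}

// Value of the offset member when there is no interface object.
constexpr std::uint32_t NO_INTERFACE_OBJECT = 4294967295u;
```

For example, a level 2 descriptor with 27 octets of content is stored with a length of 32.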

Class descriptor

The length of a CLASS is variable. The minimum length is 32 octets.

Its memory alignment is 8 octets.

It has no register type.

It is a structure defined as follows:

.cbeg CLASS !NOID
.data ID16    cid
.data objsize len_dsc
.data objsize len_min
.data objsize len_max
.data OCTET   align
.data OCTET   clv
.data OCTET   flags
.data OCTET   ifaces_len
.data IFACE   ifaces [ifaces_len:MAX]

All OCTET-typed fields are interpreted as 8-bit unsigned integers.

cid
Class identifier.
len_dsc
Total length of the descriptor, in octets. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
len_min
Minimum length of an instance, in octets.
len_max
Maximum length of an instance, in octets.
align
Alignment exponent.
clv
Class level.
flags
Vector of 8 flag bits.
ifaces_len
Number of implemented interfaces.
ifaces
Descriptors of implemented interfaces.

Class descriptor for level n of a class is directly followed by the descriptor for level n-1.

The descriptor for level n contains only those interfaces that were introduced at level n.

The defined flag bits in flags are, counted from the least significant:

bit 0
The class has a destructor.
bit 1
The class contains handles. (It has an accessor.)
bits 2-7
Undefined. Must be cleared.
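
A reader of a class descriptor might test these bits as in the following C++ sketch. The constant and function names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint8_t FLAG_HAS_DESTRUCTOR = 0x01;  // bit 0
constexpr std::uint8_t FLAG_HAS_HANDLES    = 0x02;  // bit 1
constexpr std::uint8_t FLAG_UNDEFINED_MASK = 0xFC;  // bits 2-7

constexpr bool has_destructor(std::uint8_t flags) {
    return (flags & FLAG_HAS_DESTRUCTOR) != 0;
}

constexpr bool has_handles(std::uint8_t flags) {
    return (flags & FLAG_HAS_HANDLES) != 0;
}

// True if all undefined bits are cleared.
constexpr bool flags_well_formed(std::uint8_t flags) {
    return (flags & FLAG_UNDEFINED_MASK) == 0;
}
```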

Predefined functions

This section defines predefined functions.

Each predefined function name begins with a U+005F LOW LINE character (_), which cannot normally be part of an item name. The function is also marked with a tag named $predef. Register functions are additionally marked with a tag named $reg.

All functions except module functions operate on an instance. The first parameter (read-write handle) is omitted in definitions.

Register functions operate directly on memory of an object. They are defined in case the value does not need to be loaded. Most of them can be implemented with one or two CPU instructions.

Register types also have two or three kinds of corresponding functions that operate entirely on CPU registers and do not reference memory. Names of these do not begin with _ (as this distinction is unnecessary) and have a digit appended to the name:

  1. The set with 1 appended behaves the same as the memory function: it overwrites the value and returns the defined return value.
  2. The set with 2 appended returns the result instead of overwriting the value and discards the defined return value.
  3. The set with 3 appended returns the result instead of overwriting the value and accepts an additional parameter for the defined return value, if any.

In C++ syntax with u8 and u32 as register types:

u8  c;
u32 v, s;
c = v.add1(2);    // v = v + 2; c = carry bit;
s = v.add2(2);    // v = v    ;              ; s = v + 2
s = v.add3(2, c); // v = v    ; c = carry bit; s = v + 2
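
A compilable approximation of the three variants for a 32-bit unsigned integer is sketched below. The U32 wrapper structure is hypothetical and stands in for a register type; the carry is passed to add3 by reference, which is only one possible reading of "additional parameter".

```cpp
#include <cstdint>

// Hypothetical wrapper that stands in for the u32 register type.
struct U32 {
    std::uint32_t value;

    // add1: overwrites the value and returns the carry bit.
    std::uint8_t add1(std::uint32_t rhs) {
        const std::uint32_t sum = value + rhs;
        const std::uint8_t carry = sum < value ? 1 : 0;
        value = sum;
        return carry;
    }

    // add2: returns the sum, leaves the value intact, discards the carry bit.
    std::uint32_t add2(std::uint32_t rhs) const {
        return value + rhs;
    }

    // add3: returns the sum, leaves the value intact, stores the carry bit in carry.
    std::uint32_t add3(std::uint32_t rhs, std::uint8_t& carry) const {
        const std::uint32_t sum = value + rhs;
        carry = sum < value ? 1 : 0;
        return sum;
    }
};
```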

Module functions

Module functions can only be called by the kernel. They require access to the module's context (where the instance is kept).

Creator

Name
_create
Parameters
None.
Return value
No-access handle to the instance.

This function is called when there is no instance available and a function that requires access to the module's context is being called. An implementation at the lowest loaded level is chosen.

It creates a new instance of the module.

The instance becomes the current instance of the node. It remains in memory until the node shuts down or the instance is upgraded or downgraded.

Upgrador

Name
_upgrade
Parameters
No-access handle to the old instance.
Return value
No-access handle to the new instance.

This function is called after an implementation is unloaded and the lowest level of all module implementations is higher than the level of the current instance. An implementation at the lowest loaded level is chosen.

The current instance is locked before the call and unlocked after.

This function MAY fail, in which case a null handle is returned. Failure has no consequences.

Downgrador

Name
_downgrade
Parameters
No-access handle to the old instance.
Target class level.
Return value
No-access handle to the new instance.

This function is called before an implementation is loaded or unloaded and the highest level of all module implementations would be lower than the level of the current instance. An implementation at the current instance level or higher is chosen.

The current instance is locked before the call and unlocked after.

This function MAY fail, in which case a null handle is returned. Failure prevents the loading of a lower level implementation.

Class functions

Class functions can only be called by the kernel.

Destructor

Name
_destruct
Parameters
Read-write handle to the instance.
Return value
Status boolean.

The destructor is called when there are no more references from a memory context to the class instance.

The destructor is called in a new task.

The kernel deallocates memory of the instance after this function returns.

Mutual exclusion

Name
_lock
Parameters
None.
Return value
None.

Acquires a lock on the object for exclusive access.

If the caller (task) disappears, the object is unlocked.

Name
_unlock
Parameters
None.
Return value
None.

Releases a previously acquired exclusive access lock on the object.

The task fails if the object has not been locked by the caller.

Memory access updater

Name
_access
Parameters
Read-only handle to the instance.
Memory context change (unsigned integer).
Return value
Status boolean.

This function is called when a memory context gains or loses access to the instance; it propagates the call to relevant referenced objects.

This function executes within the same context as the initial caller. The instance is locked before the call and unlocked after.

This MUST be an automatically generated function which is exactly the same (machine code is the same) regardless of implementation. One reason is to be able to verify that the machine code is correct. The function is crucial for correct node operation.

Register functions

This section defines register functions.

Load and save

These functions apply to every register class.

Name
_load
Parameters
None.
Return value
Register type; current value.

This function loads the value from memory into CPU registers.

Name
_save
Parameters
Register type; new value.
Return value
None.

This function saves the value from CPU registers into memory.

Bitwise operations

These functions apply to every register class.

Name
_not
Parameters
None.
Return value
None.

Performs the bitwise NOT operation (logical negation on each bit) on the value, then saves the result into the object, overwriting the initial value.

Name
_and
Parameters
Second value; register type.
Return value
None.

Performs the bitwise AND operation (logical conjunction on each pair of corresponding bits) on the value and the second value, then saves the result into the object, overwriting the initial value.

Name
_xor
Parameters
Second value; register type.
Return value
None.

Performs the bitwise XOR operation (exclusive disjunction on each pair of corresponding bits) on the value and the second value, then saves the result into the object, overwriting the initial value.

Name
_set
Parameters
Second value; register type.
Return value
None.

Performs the bitwise OR operation (logical disjunction on each pair of corresponding bits) on the value and the second value, then saves the result into the object, overwriting the initial value.

Name
_clr
Parameters
Second value; register type.
Return value
None.

Performs the bitwise AND operation on the value and the second value, after performing the bitwise NOT operation on the second value, then saves the result into the object, overwriting the initial value.
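
The difference between _set and _clr can be illustrated with the following C++ sketch for an 8-bit value. The function names reg_set and reg_clr are hypothetical, and the object is modelled as a plain reference.

```cpp
#include <cstdint>

// _set: bitwise OR with the second value.
inline void reg_set(std::uint8_t& value, std::uint8_t second) {
    value = static_cast<std::uint8_t>(value | second);
}

// _clr: bitwise AND with the bitwise NOT of the second value.
inline void reg_clr(std::uint8_t& value, std::uint8_t second) {
    value = static_cast<std::uint8_t>(value & ~second);
}
```

Starting from 0x0A, setting 0x05 yields 0x0F, and clearing 0x06 from that yields 0x09.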

Logical bit shifts

These functions apply to every register class.

Name
_lsl
Parameters
Amount of bits; unsigned integer of width N.
Return value
None.

Performs a logical left shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.

Name
_lsr
Parameters
Amount of bits; unsigned integer of width N.
Return value
None.

Performs a logical right shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.

Circular bit shifts

These functions apply to every register class.

Name
_csl
Parameters
Amount of bits; unsigned integer of width N.
Return value
None.

Performs a circular left shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.

Name
_csr
Parameters
Amount of bits; unsigned integer of width N.
Return value
None.

Performs a circular right shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
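
Circular shifts move the bits that leave one end of the value back in at the other end. A minimal C++ sketch for 32-bit values follows; the function names are hypothetical, and reducing the amount modulo 32 is an assumption, as this section does not state how larger amounts are treated.

```cpp
#include <cstdint>

// Circular left shift of a 32-bit value.
inline std::uint32_t rotl32(std::uint32_t v, unsigned n) {
    n &= 31u;  // assumption: amount is taken modulo the width
    return (v << n) | (v >> ((32u - n) & 31u));
}

// Circular right shift of a 32-bit value.
inline std::uint32_t rotr32(std::uint32_t v, unsigned n) {
    n &= 31u;  // assumption: amount is taken modulo the width
    return (v >> n) | (v << ((32u - n) & 31u));
}
```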

Arithmetic shifts

These functions apply only to signed integers.

Name
_asl
Parameters
Amount of bits; unsigned integer of width N.
Return value
None.

Performs an arithmetic left shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.

Name
_asr
Parameters
Amount of bits; unsigned integer of width N.
Return value
None.

Performs an arithmetic right shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
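
The following C++ sketch contrasts an arithmetic right shift with a logical one. The function names are hypothetical and the amount is assumed to be less than 32. The arithmetic variant is written so that it does not depend on how a compiler shifts negative values.

```cpp
#include <cstdint>

// Logical right shift: vacated high bits are filled with zeros.
inline std::uint32_t lsr32(std::uint32_t v, unsigned n) {
    return v >> n;
}

// Arithmetic right shift: vacated high bits are filled with copies of the sign bit.
inline std::int32_t asr32(std::int32_t v, unsigned n) {
    // For negative v, ~v is non-negative, so shifting it is well defined.
    return v < 0 ? ~(~v >> n) : v >> n;
}
```

Shifting -8 right by one bit arithmetically gives -4, whereas the same bit pattern shifted logically gives 0x7FFFFFFC.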

Fundamental arithmetic

These functions apply to values that are single numbers.

Name
_neg
Parameters
None.
Return value
None.

Performs an arithmetic negation (sign inversion) on the value, then saves the result into the object, overwriting the initial value.

This function does nothing if the number is unsigned.

Name
_add
Parameters
Addend; register type.
Return value
For unsigned integers: carry bit; boolean.
For signed integers: overflow bit; boolean.

Performs an arithmetic addition on the value and the addend, then saves the sum into the object, overwriting the initial value.

Name
_sub
Parameters
Subtrahend; register type.
Return value
For unsigned integers: carry bit; boolean.
For signed integers: overflow bit; boolean.

Performs an arithmetic subtraction on the value (minuend) and the subtrahend, then saves the difference into the object, overwriting the initial value.
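
How the return values of _add and _sub could be derived is sketched below in C++ for 32-bit integers. The function names are hypothetical. The sketch assumes that the stored result wraps around on overflow and that the carry bit of a subtraction means that a borrow occurred; neither is stated in this section.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Signed addition: stores the wrapped sum, returns the overflow bit.
inline bool add_i32(std::int32_t& value, std::int32_t addend) {
    const std::int64_t wide = static_cast<std::int64_t>(value) + addend;
    const bool overflow = wide < std::numeric_limits<std::int32_t>::min()
                       || wide > std::numeric_limits<std::int32_t>::max();
    // Conversion to unsigned is modular; memcpy then reinterprets the bits.
    const std::uint32_t bits = static_cast<std::uint32_t>(wide);
    std::memcpy(&value, &bits, sizeof value);
    return overflow;
}

// Unsigned subtraction: stores the wrapped difference, returns the borrow.
inline bool sub_u32(std::uint32_t& value, std::uint32_t subtrahend) {
    const bool borrow = value < subtrahend;
    value -= subtrahend;
    return borrow;
}
```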

Name
_mul
Parameters
Multiplier; register type.
Return value
For integers: higher half bits of the product; register type.

Performs an arithmetic multiplication on the value and the multiplier, then saves the lower half bits of the product into the object, overwriting the initial value.

Name
_div
Parameters
Divisor; register type.
Return value
For integers: remainder; register type.

Performs an arithmetic division on the value (dividend) and the divisor, then saves the quotient into the object, overwriting the initial value.
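
For 32-bit unsigned integers, _mul and _div could behave as in the following C++ sketch. The function names are hypothetical. Division by zero is not addressed in this section, so the sketch simply requires a non-zero divisor.

```cpp
#include <cassert>
#include <cstdint>

// Stores the lower half of the product, returns the higher half.
inline std::uint32_t mul_u32(std::uint32_t& value, std::uint32_t multiplier) {
    const std::uint64_t product = static_cast<std::uint64_t>(value) * multiplier;
    value = static_cast<std::uint32_t>(product);
    return static_cast<std::uint32_t>(product >> 32);
}

// Stores the quotient, returns the remainder.
inline std::uint32_t div_u32(std::uint32_t& value, std::uint32_t divisor) {
    assert(divisor != 0);  // assumption: behaviour for zero is not defined here
    const std::uint32_t remainder = value % divisor;
    value /= divisor;
    return remainder;
}
```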

Document identification

This section contains information on how to identify and mark Documents as such in their respective systems.

Data Format Descriptor

Data Format Descriptor for Documents is TO-BE-DEFINED.

Documents have the class #Document.

Internet Media Type

Media type of Documents is text/prs.kagomeko.k1os.

The charset parameter MUST be included with the value UTF-8.

References

Hyperlinks in the document will point here in a later revision.

Development considerations

Modules ought to follow the KISS (Keep It Simple, Stupid) rule. They are to be narrow in scope so that their development ends one day, the module becomes finalized and no more levels are ever added to it. If it does not need revision, it means it’s good and can be safely used.

Having too much functionality in a module makes it more unstable. Stability is the most important trait every module author ought to pursue. Modules that are constantly revised are broken by design. Such modules ought to be scrapped, made obsolete, and then redesigned as new modules (with a new identifier).

The scale of a module should ideally be small enough for one person to not become mentally exhausted (burned out) while implementing it alone. There should be many implementations available that a user can choose from.

One should differentiate between a module and a software project. The module ought to be a part of the project, not the project itself. For example, if a Python interpreter were to be a module, then Python 2 and Python 3 interpreters would be different modules, which might be developed as part of the same project.

They are different because the most crucial parts (the parser and the interpreter) are different for version 2 and for version 3. The other reason is that Python 2 is being phased out in favour of Python 3. Keeping both 2 and 3 in the same module is counter-productive. Remember that items, once defined, cannot be removed from a module.

Some modules are published and standardized for the general public. Other modules may be known only to a select few, or be created as part of development or a user session and have a lifetime of only a few hours.