DRAFT: Kueea AMv1: Module Declaration Document

Introduction

This document defines the syntax and semantics of a Kueea Abstract Machine Version 1 Module Declaration Document, shortened throughout this document to just "Document".

One Document declares one Kueea AMv1 Module. The syntax is designed to be human-readable and easy to write with even the simplest text editors.

Document Processors are programs which take Documents as input. The primary output is source files in a given programming language. Other outputs include module documentation in HTML or other formats. Document Processors are part of toolchains that generate Kueea AMv1 M-Build images.

Keywords

The key words ‘MUST,’ ‘MUST NOT,’ ‘REQUIRED,’ ‘SHALL,’ ‘SHALL NOT,’ ‘SHOULD,’ ‘SHOULD NOT,’ ‘RECOMMENDED,’ ‘NOT RECOMMENDED,’ ‘MAY,’ and ‘OPTIONAL’ in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Document

A Document is a sequence of Unicode characters. The encoding of characters MUST be UTF-8.

Lines are sequences of characters separated by a sequence of two characters (in order): U+000D CARRIAGE RETURN and U+000A LINE FEED.

Whitespace is either U+0020 SPACE or U+0009 HORIZONTAL TAB.

Documents are processed line by line. The maximum length of a line is 1024 code units (bytes), including the line separator.

Three types of data may appear on a line: instructions, comments and text.

Instructions

An instruction is a line directed at the Document Processor. It begins with optional whitespace followed by one U+002E FULL STOP character, the instruction name and its arguments. Each argument is preceded by at least one whitespace character.

Instruction names and arguments are case-sensitive.

Instruction names are sequences of four small Latin letters.

The characters of an instruction are limited to the range [U+0000, U+007F].

Examples:

.inst arg1 arg2
   .inst arg1

Comments

A comment is a line which is discarded. There are two kinds of comments.

A one-line comment begins with optional whitespace followed by exactly one U+0023 NUMBER SIGN character, that is, a NUMBER SIGN that is not immediately followed by another NUMBER SIGN.

Examples:

# This is a one-line comment.
  # This is a one-line comment.
THIS IS NOT A COMMENT.
# # # This is a one-line comment.

A multi-line comment begins and ends with a line beginning with optional whitespace followed by 2 consecutive U+0023 NUMBER SIGN characters.

Examples:

## This is the first line of a multi-line comment.
This is a comment.
.This is a comment.
# # This is inside a multi-line comment.
  ## This is the last line of a multi-line comment.
THIS IS NOT A COMMENT.
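
The two comment forms can be sketched as a small filter. This is an illustrative, non-normative sketch: the function name `strip_comments` is hypothetical, and it assumes the input has already been split into lines on the CRLF separator.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB


def strip_comments(lines):
    """Return the lines that are not part of any comment."""
    kept = []
    in_multiline = False
    for line in lines:
        body = line.lstrip(WHITESPACE)
        if body.startswith("##"):
            # A "##" line opens a multi-line comment or closes the open one.
            in_multiline = not in_multiline
            continue
        if in_multiline:
            continue  # everything inside a multi-line comment is discarded
        if body.startswith("#"):
            continue  # one-line comment
        kept.append(line)
    return kept
```

Applied to the two example listings above, only the lines reading THIS IS NOT A COMMENT. survive.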

Text

Any other line is text - secondary data associated with the current item, stored in a named buffer.

It is OPTIONAL for a Document Processor to process text.

Syntax and semantics of text are out of scope of this document.

By default, text is a human-readable textual description of the associated item, in Markdown [MARKDOWN].

Examples:

.item example1
Description of the example1 item.

Description of the example1 item.

.item example2

Description of the example2 item.

Line indentation

The leading whitespace on an instruction sets the amount of leading whitespace that is ignored on the text lines that come after it.

Each whitespace character counts as one, whether it is a space or a tab. Choose a single indentation character and use it consistently throughout a Document. U+0020 SPACE is RECOMMENDED because the visual presentation of tabs varies.

Consider the following example:

   .inst first
   Line 1-1.
     Line 1-2.
  Line 1-3.
.inst second
   Line 2-1.
     Line 2-2.
Line 2-3.

The first line is an instruction indented by 3 whitespace characters. The ignored indentation becomes 3.

The second line is text indented by 3 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has no leading whitespace characters.

The third line is text indented by 5 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has 2 leading whitespace characters.

The fourth line is text indented by 2 whitespace characters. The parser removes up to 3 whitespace characters. In this case, the line has fewer, so all of them are removed. The resulting line has 0 leading whitespace characters.

Text for the first item is thus:

Line 1-1.
  Line 1-2.
Line 1-3.

The fifth line is an instruction indented by 0 whitespace characters. The ignored indentation becomes 0.

The sixth line is text indented by 3 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 3 leading whitespace characters.

The seventh line is text indented by 5 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 5 leading whitespace characters.

The eighth line is text indented by 0 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 0 leading whitespace characters.

Text for the second item is thus:

   Line 2-1.
     Line 2-2.
Line 2-3.
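
The worked example above can be reproduced with a short sketch. It is illustrative only: the names `leading_whitespace` and `collect_text` are hypothetical, comments are not handled, and the input is assumed to be a list of lines already split on CRLF.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB


def leading_whitespace(line):
    """Count leading whitespace characters; a space and a tab each count as one."""
    return len(line) - len(line.lstrip(WHITESPACE))


def collect_text(lines):
    """Pair each instruction line with the text lines that follow it."""
    items = []  # list of (instruction, [text lines])
    skip = 0    # ignored indentation, set by the most recent instruction
    for line in lines:
        indent = leading_whitespace(line)
        if line[indent:].startswith("."):
            skip = indent
            items.append((line.strip(WHITESPACE), []))
        elif items:
            # Remove at most `skip` leading whitespace characters.
            items[-1][1].append(line[min(indent, skip):])
    return items
```

Feeding the eight example lines to `collect_text` yields the two text blocks shown above, one per instruction.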

Formal syntax (ABNF)

The following MDOC rule expresses the syntax in ABNF.

MDOC  = k1md CRLF *( line CRLF )
; k1md defined in another section
line  = comm / inst / text

comm  = cmul / cone
cmul  = *WSP "##" *OCTET CRLF *WSP "##" *OCTET
cone  = *WSP  "#" *OCTET

inst  = *WSP  "." *( WSP / VCHAR )
text  = [ *WSP "\" ] *OCTET

ABNF rules are referenced in prose like this: <rule>.

The first non-whitespace character of <text> MAY be a U+005C REVERSE SOLIDUS. If so, the \ character is removed before further processing of the line. This is an escape mechanism for text lines that would otherwise begin with "#" or ".".

Objects

An object is a set of variables. Variables of an object are referenced in prose like this: object.variable (using a dot separator).

Objects are categorized as items and non-items.

Non-items

This section defines non-item objects.

These objects are part of items.

Characters

A character is a Unicode code point.

Object reference

An object reference references another object.

When an object reference refers to nothing, it means that the reference is set to a value that does not reference any objects.

Containers

Some objects are stored in a container object.

A list is an ordered container, which means that position of elements in a list is significant.

A set is an unordered container.

A pair is a set of two elements.

Class identifier

A class identifier is an object composed of:

mid: module identifier.
cno: class number.

A module identifier is a 120-bit value.

A module identifier is nil when all of its bits are cleared; this value is reserved.

If the first (most significant) bit of a module identifier is set, then the module is a standard module. Standard modules are the same on all Abstract Machines.

If the bit is cleared, the module is non-standard. Such modules are local to a given Abstract Machine.

The remaining 119 bits are randomly generated.
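
A minimal sketch of these rules follows. It assumes a module identifier is held as a non-negative Python integer whose bit 119 is the first (most significant) bit; the function names are hypothetical and not part of this document.

```python
import secrets

MID_BITS = 120
STANDARD_BIT = 1 << (MID_BITS - 1)  # first (most significant) bit


def is_nil(mid):
    """The nil identifier has all bits cleared and is reserved."""
    return mid == 0


def is_standard(mid):
    """A module is standard when the most significant bit is set."""
    return bool(mid & STANDARD_BIT)


def new_module_identifier(standard):
    """Generate an identifier: the flag bit plus 119 random bits."""
    while True:
        mid = secrets.randbits(MID_BITS - 1)
        if standard:
            mid |= STANDARD_BIT
        if not is_nil(mid):  # never hand out the reserved nil value
            return mid
```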

A class number is an integer in the range [0,255]. It identifies a class within the module.

Class numbers are assigned by name or explicitly by module authors. The exception is the module class (self) which has class number 0.

Class level

Members of a class are grouped under its levels.

A class level is an integer in the range [0,255].

An instance of a class at level n contains all members of the class declared at levels [0, n].

Memory alignment

An alignment exponent is an integer within the range [0, 31]. It is the exponent e in the expression 2^e, result of which is the associated memory alignment in octets. Any other value is outside of a memory alignment's value range.
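
As a small illustration, the conversion from exponent to alignment can be written as follows. The function name is hypothetical, and raising an exception for out-of-range values is an assumption of this sketch.

```python
def alignment_from_exponent(e):
    """Return the memory alignment in octets (2 to the power e) for exponent e."""
    if not isinstance(e, int) or not 0 <= e <= 31:
        raise ValueError("alignment exponent must be an integer in [0, 31]")
    return 1 << e
```

For instance, an exponent of 3 corresponds to an alignment of 8 octets.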

Tag

A tag is a list of characters.

The minimum length of the list is 1.

The maximum length of the list is 16.

Name

A name is a list of characters.

The minimum length of the list is 1.

The maximum length of the list is 64.

Most items have an associated name. Names are local in scope to the document.

Member reference

A member reference is a list of names.

Member references are relative to a class. The list designates the class's member by following the referenced names.

Names are dereferenced in order, and there is no way to reference a previously dereferenced name, so infinite recursion cannot occur.

Item reference

An item reference is a pair of a name and a member reference.

The name contains a module's class identifier, a user-defined module alias or a predefined name.

When the name is empty, the item reference is an unknown item reference.

The member reference is relative to the module's class.

Type reference

A type reference is an object which consists of:

ref: item reference to a class.
clv: class level of the referenced class.
hnd: type of reference.

Possible values of hnd are:

null: No value set. Type reference is empty.
regt: Register type.
creg: Register type of the class.
cmem: Instance of the class.
none: Handle with no access rights; just the address. Use to avoid unnecessary duplication of rights.
read: Handle with read rights.
rdex: Handle with read and execute rights.
rdwr: Handle with read and write rights.
rwex: Handle with read, write and execute rights.

A handle is a system address associated with access rights to the referenced object.

The system gives access as many times as there are handles in a class. A class only needs to have one handle with specific rights to an object. Other handles to the same object SHOULD be of the no-access type in order to avoid unnecessary computations when passing objects around.

If ref is an unknown item reference and hnd designates a handle, the handle is to an unspecified object.

Array length

An array length is an object which consists of:

ref: member reference to the data member which stores the amount of elements.
min: minimum amount of elements.
max: maximum amount of elements.

Amount of elements is a 32-bit unsigned binary integer.

The min and max values are used in calculation of the minimum length and the possible maximum length of a class instance.

If max is zero, an array is not declared. If it is equal to 4294967295 (2³²-1), the value represents the maximum possible amount of elements; otherwise, the value is the given number.

The data member referenced by ref MUST be an instance of a register class with an unsigned integer register type. This member's memory address MUST be lower than the array's, i.e. it must be declared before the array. Both min and max MUST NOT be greater than the maximum value representable by the referenced member's class.

The maximum possible amount of elements is either 4294967295 divided by the maximum class length of an element, rounded down, or the maximum value representable by the type of the data member referenced by ref if ref is not empty.

An array length is invalid if either of the following holds:

min is greater than max.
min is equal to max and ref is not empty.
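
The two validity rules can be sketched as below. The `ArrayLength` class is a hypothetical stand-in for the object described in this section, with ref modelled as a tuple of names (empty when there is no reference). The sketch checks only the two listed rules, not the constraints on the referenced data member.

```python
from dataclasses import dataclass

MAX_ELEMENTS = 0xFFFFFFFF  # 4294967295: "maximum possible amount of elements"


@dataclass
class ArrayLength:
    min: int = 0
    max: int = 0
    ref: tuple = ()  # member reference as a tuple of names; empty if unset


def is_array(alen):
    """An array is declared only when max is not zero."""
    return alen.max != 0


def is_valid(alen):
    """Apply the two invalidity rules listed above."""
    if alen.min > alen.max:
        return False
    if alen.min == alen.max and alen.ref:
        return False  # a fixed length needs no run-time length member
    return True
```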

Value

Representation of a value is implementation-dependent. This document only defines the syntax of a value in an instruction.

A value may be the undefined value. This state represents an omitted parameter. (There is no syntax for representing it in a Document.)

Condition

A condition is an object which consists of:

expr: expression of the condition.
name: name of the condition.

An expression is an object which consists of:

exop: operator of the expression.
data: data of the expression determined by exop.

Each expression evaluates to either true or false.

A condition holds when its expression evaluates to true.

Logical expression

data for exop values 'any is true', 'any is false', 'all are true', and 'all are false' is a logical expression, which is an object that consists of:

exp: list of expressions, evaluated in list order.

An empty list evaluates to true.

Comparison

data for exop values 'is equal to', 'is not equal to', 'is less than', 'is less than or is equal to', 'is greater than', and 'is greater than or is equal to' is a comparison, which is an object that consists of:

ref: member reference to the data member which is compared with the value,
val: the value.

The type of the referenced member SHOULD be a register class. Implementations are not required to support values that are arrays or instances of non-register classes.

Condition reference

data for exop values 'condition holds' and 'condition does not hold' is a condition reference, which is an object that consists of:

cnd: name of a condition.

Path component

A path component is a list of characters.

The minimum length of the list is 7.

The maximum length of the list is 1024.

Path components name external (out-of-memory) resources.

Interface reference

An interface reference is an object which consists of:

type: data member type of the interface.
mref: member reference to a data member.
clv: class level of the associated class.

Module import

A module import is an object which consists of:

mid: a module identifier.
mclv: minimum class level of the module class.
name: an alias, name for the module.

Module

A module is an object which consists of:

self: object reference to the module's class.
types: set of classes.
paths: set of path components.
mload: set of module imports.

Function number

A function number is a 64-bit unsigned binary integer.

There MUST NOT be two functions with the same number within a module.

By default, the value is the result of passing a character string, constructed from the module's class identifier and relevant item names, to the 64-bit FNV-1a (Fowler-Noll-Vo) hash function:

hash = 0xCBF29CE484222325
for each octet_of_data to be hashed
  hash = hash XOR octet_of_data
  hash = hash * 0x100000001B3
# BEG DOCUMENT SPECIFIC
if hash == 0 then hash = 0xFFFFFFFFFFFFFFFF
# END DOCUMENT SPECIFIC
return hash

The value 0 is invalid; it can be used as an undefined value.
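
The pseudocode above translates directly into Python. The sketch below is illustrative; the function name `function_number` is hypothetical. Because Python integers are unbounded, the product is masked to 64 bits, which the pseudocode leaves implicit.

```python
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def function_number(text):
    """64-bit FNV-1a over the UTF-8 octets of text, with zero remapped."""
    h = FNV_OFFSET_BASIS
    for octet in text.encode("utf-8"):
        h ^= octet
        h = (h * FNV_PRIME) & MASK64  # keep the product within 64 bits
    if h == 0:
        h = MASK64  # document-specific step: 0 is never a valid function number
    return h
```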

Examples:

Input string	Output value
`module_func`	`0x0F7E93E1AF686350`
`class$00$function`	`0x2862790D0CE9E837`

Function numbers should only be explicitly declared in case of a hash collision. The encoding of the string is the same as for the document, UTF-8, although in practice it is limited to ASCII characters.

The input character string for the default function number of a function depends on the function's location within the item tree. If the function is a member of the module's class, the function number is computed over the name of the function only. Otherwise, the input string is computed as follows:

Let s be an empty character string.
Let c be the class.
Let f be the function member.
Append c.name to s.
Append a U+0024 DOLLAR SIGN character to s.
If c.clv is less than 16, append a U+0030 DIGIT ZERO to s.
Append the hexadecimal representation of c.clv, using character ranges [U+0030,U+0039] and [U+0041,U+0046] (0-9 and A-F) for the values [0,15], to s.
Append a U+0024 DOLLAR SIGN character to s.
Append f.name to s.
Return s as the input character string.
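
The steps above can be sketched as a single function. The function and parameter names are hypothetical; `in_module_class` is an assumed flag that selects the simpler rule for members of the module's class.

```python
def function_number_input(class_name, class_level, function_name,
                          in_module_class=False):
    """Build the string that is hashed to obtain a default function number."""
    if in_module_class:
        return function_name  # members of the module's class use the bare name
    if not 0 <= class_level <= 255:
        raise ValueError("class level must be in [0, 255]")
    # Two uppercase hexadecimal digits, zero-padded for levels below 16.
    return f"{class_name}${class_level:02X}${function_name}"
```

For a class named class at level 0 and a function named function, this produces the string class$00$function.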

Items

This section defines item objects.

All items contain a text variable, which is a list of objects, each of which consists of:

data: list of characters,
format: name; format of the data.

When appending an object n to text, where o is the last object in text: if n.format is equal to o.format, data in n.data MAY be appended to o.data with a preceding line separator instead of appending a new object.

Data member

A data member is an item which consists of:

mlv: class level of the associated module's class.
clv: class level of the associated class.
tags: set of tags.
type: type reference.
alen: array length.
align: memory alignment or zero.
value: default value.
cond: name of a condition.
name: name of the object.

The alignment of data member d is a 32-bit unsigned binary integer computed as follows:

Return the default alignment of class referenced by type.ref at level type.clv if d.align is equal to zero.
Return d.align.

The minimum length of data member d is a 32-bit unsigned binary integer computed as follows:

Let ltype be a 32-bit unsigned binary integer.
Set ltype to the minimum class length of class referenced by d.type.ref at level d.type.clv.
Return ltype if d.alen.max is equal to zero.
Return zero if d.alen.min is equal to zero.
Let len be a 34-bit unsigned binary integer.
Let off be a 34-bit unsigned binary integer.
Let rem be a 32-bit unsigned binary integer.
Set len to ltype.
Set rem to d.alen.min.
Set off to the alignment of d.
Subtract one from off.
While rem is greater than one:
1. Add off to len.
2. Set len to the binary AND of len and the binary NOT of off.
3. Add ltype to len.
4. Abort the program if len is greater than 4294967295.
5. Subtract one from rem.
Return len as a 32-bit unsigned binary integer.

The maximum length of data member d is a 32-bit unsigned binary integer computed as follows:

Let ltype be a 32-bit unsigned binary integer.
Set ltype to the maximum class length of class referenced by d.type.ref at level d.type.clv.
Return ltype if d.alen.max is equal to zero.
Let len be a 34-bit unsigned binary integer.
Let off be a 34-bit unsigned binary integer.
Let rem be a 32-bit unsigned binary integer.
Set len to ltype.
Set rem to d.alen.max.
Set off to the alignment of d.
Subtract one from off.
While rem is greater than one:
1. Add off to len.
2. Set len to the binary AND of len and the binary NOT of off.
3. Add ltype to len.
4. Set len to 4294967295 if len is greater than 4294967295.
5. Subtract one from rem.
Return len as a 32-bit unsigned binary integer.
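
The two length algorithms above differ only in their inputs and in how they treat overflow, so they can share one helper. The sketch below is illustrative and makes several assumptions: the names are hypothetical, the class lengths and the alignment of d are passed in as precomputed integers, the alignment is a power of two of at least one octet, and "abort the program" is modelled by raising an exception.

```python
U32_MAX = 0xFFFFFFFF  # 4294967295


def _array_span(ltype, count, alignment, saturate):
    """Length of count elements of ltype octets, each placed at an aligned offset."""
    if alignment < 1:
        raise ValueError("alignment must be at least one octet")
    off = alignment - 1
    length = ltype
    for _ in range(count - 1):          # "While rem is greater than one"
        length = (length + off) & ~off  # round up to the next aligned offset
        length += ltype
        if length > U32_MAX:
            if not saturate:
                raise OverflowError("minimum length does not fit in 32 bits")
            return U32_MAX              # a saturated value stays saturated
    return length


def minimum_length(min_type_len, alen_min, alen_max, alignment):
    if alen_max == 0:
        return min_type_len  # not an array
    if alen_min == 0:
        return 0             # the array may be empty
    return _array_span(min_type_len, alen_min, alignment, saturate=False)


def maximum_length(max_type_len, alen_max, alignment):
    if alen_max == 0:
        return max_type_len  # not an array
    return _array_span(max_type_len, alen_max, alignment, saturate=True)
```

For example, three elements of 3 octets aligned to 4 octets start at offsets 0, 4 and 8, giving a length of 11 octets. The loop mirrors the step list and therefore runs once per element; a production implementation would likely want a closed form for large arrays.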

The allocated space of a data member is greater than or equal to its minimum length and less than or equal to its maximum length. This is the amount of octets that an instance occupies in memory. The value is a run-time variable if alen.ref is not empty.

The following tags are recognized in tags:

sameaddr: Member is a non-canonical member of a data union.
sametext: Item description is shared with the previous member.

A data union is a range in a list of data members, which begins with a member a, inclusive, and ends with a member b, exclusive, such that a.tags does not contain sameaddr and b.tags does not contain sameaddr. The member a is the canonical member of the union.

A non-canonical member of a data union is any data member d, d.tags of which contains sameaddr.

The canonical member of a data union defines the allocated space of the data union as a whole as well as its memory alignment.

The maximum length of a non-canonical member of a data union MUST NOT be greater than the maximum length of the canonical member.

If a non-canonical member d is such that d.alen.max represents the maximum possible amount of elements and d.alen.ref is empty, then the maximum possible amount of elements is the maximum length of the canonical member divided by the maximum length of an element, rounded down.

If alignment of a non-canonical member is greater than the alignment of the canonical member, then the non-canonical member is aligned farther into the union in order to match its memory alignment. This reduces the remaining allocated space available to the member.

If cond is not empty, the member exists only when the referenced condition holds. In case of a canonical member of a data union, the condition applies to the data union in whole.

All non-canonical members d of a data union, d.cond of which is not empty, are mutually exclusive with each other and the canonical member. The canonical member exists if the data union exists and none of conditions referenced by d.cond hold.

All non-canonical members d of a data union, d.cond of which is empty, always exist. These members have their destructors and accessors ignored. Thus, types of such members SHOULD NOT have these.

Function member

A function member is an item which consists of:

mlv: class level of the associated module's class.
clv: class level of the associated class.
fno: function number.
tags: set of tags.
params: list of function parameters.
codes: list of error codes.
name: name of the function.

Function parameter

A function parameter is an item which consists of:

tags: set of tags.
type: type reference.
name: name of the parameter.

Error code

An error code is an item which consists of:

name: name of a message function.
fno: number of the function.

Error codes are declared in a predefined module.

Named value

A named value is an item which consists of:

mlv: class level of the associated module's class.
clv: class level of the associated class.
value: value.
name: name of the value.

Named reference

A named reference is an item which consists of:

mlv: class level of the associated module's class.
clv: class level of the associated class.
iref: item reference.
name: name of the reference.

Path

A path is an item which consists of:

mlv: class level of the associated module's class.
path: path component.

Class

A class is an item which consists of:

cid: class identifier.
clv: class level.
regt: type reference.
rclv: class level (for regt).
desc: list of data members (descriptor).
data: list of data members (instance).
cond: set of conditions.
func: set of function members.
nref: set of named references.
nval: set of named values.
tags: set of tags.
ifaces: set of interface references.
name: name of the class.

A class whose regt is non-empty is called a register class. It has an implicit reserved member called regt, which is a named reference to the register type referenced by regt.

The default alignment of class c at level lv is computed as follows:

Let cur be a 32-bit unsigned binary integer.
Let max be a 32-bit unsigned binary integer.
Set max to zero.
For each data member d in c.data:
1. Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
2. Set cur to the alignment of d.
3. Advance to the next element if cur is less than or equal to max.
4. Set max to cur.
Return max.
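
A short sketch of the default alignment computation follows. The `Member` class is a hypothetical stand-in for a data member, and its `alignment` field is assumed to already hold the result of "the alignment of d".

```python
from dataclasses import dataclass


@dataclass
class Member:
    clv: int                       # class level of the member
    alignment: int                 # precomputed alignment of the member, in octets
    tags: frozenset = frozenset()


def default_alignment(data, lv):
    """Largest alignment among members visible at level lv, ignoring sameaddr members."""
    best = 0
    for d in data:
        if d.clv > lv or "sameaddr" in d.tags:
            continue
        best = max(best, d.alignment)
    return best
```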

The minimum class length of class c at level lv is computed as follows:

Let cur be a 32-bit unsigned binary integer.
Let len be a 33-bit unsigned binary integer.
Set len to zero.
For each data member d in c.data:
1. Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
2. If d.alen.max is not equal to zero and d.alen.ref is empty and d is not the last element in c.data, set cur to the maximum length of d; otherwise, set cur to the minimum length of d.
3. Add cur to len.
4. Abort the program if len is greater than 4294967295.
Return len.

The maximum class length of class c at level lv is computed as follows:

Let cur be a 32-bit unsigned binary integer.
Let len be a 33-bit unsigned binary integer.
Set len to zero.
For each data member d in c.data:
1. Advance to the next element if d.clv is greater than lv.
2. Set cur to the maximum length of d.
3. Add cur to len.
4. Set len to 4294967295 if len is greater than 4294967295.
Return len.
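
The two class length algorithms can be sketched as follows. This is illustrative only: the `Member` class is hypothetical, member lengths are assumed to be precomputed, and "abort the program" is modelled by raising an exception. The sketch follows the steps as written, so the maximum length tests only the class level.

```python
from dataclasses import dataclass

U32_MAX = 0xFFFFFFFF  # 4294967295


@dataclass
class Member:
    clv: int                       # class level of the member
    min_len: int                   # precomputed minimum length of the member
    max_len: int                   # precomputed maximum length of the member
    alen_max: int = 0              # zero when the member is not an array
    alen_ref: tuple = ()           # empty when there is no length member
    tags: frozenset = frozenset()


def minimum_class_length(data, lv):
    total = 0
    for i, d in enumerate(data):
        if d.clv > lv or "sameaddr" in d.tags:
            continue
        is_last = i == len(data) - 1
        # An array without a length member that is not last takes its full size.
        full_size = d.alen_max != 0 and not d.alen_ref and not is_last
        total += d.max_len if full_size else d.min_len
        if total > U32_MAX:
            raise OverflowError("minimum class length does not fit in 32 bits")
    return total


def maximum_class_length(data, lv):
    total = 0
    for d in data:
        if d.clv > lv:
            continue
        total = min(total + d.max_len, U32_MAX)  # saturate instead of aborting
    return total
```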

Document Processor

A Document Processor is a program that uses a Parser to fill a set of loaded modules and then does something with them.

Parser

A Parser consists of a line parser, an instruction parser and an instruction processor.

The state of a Parser consists of the following objects:

mset: set of loaded modules.
mcur: object reference to the current module.
text: object reference to the current item description.
format: name of the current text format.

The mset is supplied by the caller as an input/output parameter. Additionally, the parser takes four objects as input parameters:

mcid: the target module identifier.
mclv: the target module's class level.
ignoreText: boolean, if true, text lines are ignored.
fetch: an external function for fetching Documents; input is the target module identifier and level.

The parser loads a module using the following algorithm and then postprocesses the set of loaded modules. Postprocessing MAY be done as part of loading or as a separate step. The result is either success or failure.

Search mset for a module m such that m.self.cid is equal to mcid.
If found and m.self.clv is no less than mclv, return success.
If found and m.self.clv is less than mclv, remove m from mset.
Obtain an octet stream doc by calling fetch, passing mcid and mclv as arguments.
If fetch has failed, return failure.
Set mcur to a newly created module.
Set mcur.self to a newly created class.
Insert mcur.self into mcur.types.
Set mcur.self.cid to mcid.
Set text to nothing.
Set format to markdown.
Invoke the line parser passing doc and ignoreText.
If the parser failed, delete mcur and return failure.
If mcur.self.clv is less than mclv, delete mcur and return failure.
Insert mcur into mset.
Iterate mcur.mload and recursively call this algorithm with the arguments from the elements; if any of the referenced modules failed to load, return failure.
Return success.

Line parser

State of the line parser consists of the following objects:

line: current line (list of characters).
wslv: current line indentation (integer).
skip: ignored line indentation (integer).

The line parser takes two objects as input parameters:

input: byte stream.
ignoreText: boolean.

The parser loads lines until the end of input. Each loaded line is then processed.

Line loading

To load a line, the parser follows this algorithm:

Let cr be a boolean.
Set line to the empty list.
Set cr to false.
While input is not empty:
1. Decode the next character c from input.
2. If line is longer than 1024 characters, return failure.
3. Append c to line.
4. If cr is true and c is U+000A LINE FEED, remove the last two characters in line (the line separator) and return success.
5. If c is U+000D CARRIAGE RETURN, set cr to true; otherwise, set cr to false.
Return success.
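
The line loading steps can be sketched as below. This is an illustrative sketch with assumptions: the function name is hypothetical, the input is an iterator over already decoded characters, failure is signalled by returning None, and both characters of the CRLF separator are stripped so that the returned line matches the <line> rule. The length test counts characters, mirroring step 2 as written.

```python
MAX_LINE = 1024


def load_line(chars):
    """Read one line from an iterator of characters; return None on failure."""
    line = []
    cr = False
    for c in chars:
        if len(line) > MAX_LINE:
            return None  # the line is too long
        line.append(c)
        if cr and c == "\n":
            del line[-2:]  # drop the CR LF separator
            return "".join(line)
        cr = c == "\r"
    return "".join(line)  # end of input reached without a separator
```

Note that a lone CARRIAGE RETURN or LINE FEED does not end a line; only the two-character sequence does.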

Line processing

The line parser counts the amount of whitespace at the beginning of line and stores the resulting value in wslv.

The first line in the document MUST begin with the sequence of eight bytes with values as defined by <k1os>.

Further processing depends on the first character after the whitespace.

If the line is a one-line comment, the line is ignored.

If the line is a multi-line comment, the parser loads subsequent lines until another multi-line comment is encountered. All of these lines are ignored.

If the line is an instruction, skip is set to wslv. Whitespace at the beginning and end of line is removed. The U+002E FULL STOP at the beginning is removed. The instruction parser is then invoked on line. In case of failure, the Parser MUST immediately return failure.

The only remaining possibility is that the line is text. If ignoreText is true, the line is ignored. Otherwise, up to skip whitespace characters are removed from the beginning of line. If the first character after the removal is \, that character is removed. If text references an object, an object o, with o.data equal to line and o.format equal to format, is created and appended to text.

Instruction parser

The instruction parser converts the loaded line into a list of typed arguments and invokes the named instruction processor's function.

Instructions begin with a four-letter function name, followed by argument tokens, each preceded by whitespace, as per the <cmd0> rule.

cmd0  = fun0 *( 1*WSP arg0 )
fun0  = 4LOALPHA
arg0  = TAGS / FNO1 / FNO2 / ALEN / LVAL / CREF
arg0 /= EXPR / UINT / VALT / MEMT / AREF / RREF / NAME

The parser MUST return failure on any unrecognized function or argument, or when parsed arguments do not match the function's arguments.

Argument tokens are defined such that the parser can determine the type of an argument from the first few characters. The order of alternatives in <arg0> is the recommended order of tests.

Argument rules use capital letters by convention.

Unsigned integer

UINT  = Udec / Uhex
Udec  = 1*20DIGIT
Uhex  = %s"0x" 1*16HEXDIGIT

The <UINT> rule represents a 64-bit unsigned binary integer value.

It is in decimal (<Udec>) or in hexadecimal (<Uhex>) notation.
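
A minimal sketch of a <UINT> parser is shown below. The function name is hypothetical, returning None on failure is a choice of this sketch, and HEXDIGIT is assumed to accept both upper-case and lower-case letters, since that rule is not defined in this section.

```python
import re

# Uhex is tried first so that "0x..." is not read as the decimal "0".
UINT_RE = re.compile(r"(?:0x[0-9A-Fa-f]{1,16}|[0-9]{1,20})\Z")


def parse_uint(token):
    """Return the 64-bit unsigned value of a <UINT> token, or None if invalid."""
    if not UINT_RE.match(token):
        return None
    value = int(token, 16) if token.startswith("0x") else int(token, 10)
    # Twenty decimal digits can exceed 64 bits, so the range is checked too.
    return value if value <= 0xFFFFFFFFFFFFFFFF else None
```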

Name

NAME = LOALPHA *63( LOALPHA / DIGIT / "_" )

The <NAME> rule represents an item name.

It is a character sequence of up to 64 characters. All letters in a name MUST be small.
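
The <NAME> rule maps onto a simple regular expression. In this sketch the function name is hypothetical and LOALPHA is assumed to mean the letters a to z.

```python
import re

# One small letter followed by up to 63 small letters, digits or underscores.
NAME_RE = re.compile(r"[a-z][a-z0-9_]{0,63}\Z")


def is_name(token):
    """Return True if token matches the <NAME> rule."""
    return NAME_RE.match(token) is not None
```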

Names are used in item references. References are resolved after all documents are fully loaded. The item being referenced MAY be declared in a document later on.

It is RECOMMENDED that names of functions be composed in subject-object-verb order, for example object_units_replace. The recommendation is for consistency and grouping of members. Note that one can generate aliases in camelCase, too, if needed. Source code could be converted back and forth.

Relative item reference

RREF = 1*( "." NAME )

The <RREF> rule represents a relative item reference.

Items are referenced in tree order by their name.

Absolute item reference

AREF  = ( Ahex / Atxt ) RREF
Ahex  = "!" 2HEXDIGIT 15( [ ( "-" / ":" ) ] 2HEXDIGIT )
Atxt  = "." [ NAME ]

The <AREF> rule represents an absolute item reference.

It begins with a reference to the module, either by identifier (<Ahex>) or by name (<Atxt>), and then an item reference (<RREF>) relative to the module.

If the name variant has an empty name, the named module is the currently declared module.

Memory type

TMEM  = [ Thnd ] Tobj
Thnd  = %s"none:" / %s"read:" / %s"rdex:" / %s"rdwr:" / %s"rwex:"
Tobj  = %s"OCTET" / ( %s"C" / %s"T" ) AREF

The <TMEM> rule represents a type reference to a memory type.

If the type of handle (<Thnd>) is specified, the memory type is a handle; the remaining portion specifies the class to which the handle refers.

The class can be either OCTET (an octet; in case of handles, a handle to an octet is a handle to any class), a reference to a class item (when preceded by a C), or a reference to a type definition item (when preceded by a T).

Value type

TVAL = %s"tval:" ( NAME / Tobj )

The <TVAL> rule represents a type reference to a value type.

The type is either given directly by <NAME> or it is derived from the referenced class (<Tobj>).

Parameter type

TPAR = TVAL / TMEM

The <TPAR> rule represents a type reference to any type.

It is defined for function parameters, which can be either memory types or value types.

Function number

The <FNO1> rule represents an unnamed function number.

The <FNO2> rule represents a named function number.

FNO1   = "#" UINT
FNO2   = "#" NAME "#" UINT

Because more than one function may be declared with one instruction, there are two variants: named and unnamed.

The value, when specified as an argument, MUST be non-zero.

Array length

ALEN = "[" [ RREF ":" ] arrl [ ":" arrl ] "]"
arrl = UINT / %s"MAX"

The value obtained by parsing <arrl> into an unsigned integer MUST NOT be greater than 2³¹.

The special keyword MAX means the maximum permitted value.

Parsing examples:

          [10] => min =  10, max =  10, var = ()
        [1:20] => min =   1, max =  20, var = ()
       [2:MAX] => min =   2, max = MAX, var = ()
  [.len:4:255] => min =   4, max = 255, var = ("len")
[.obj.len:MAX] => min =   0, max = MAX, var = ("obj", "len")
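
The parsing examples above can be reproduced with the following sketch. Several points are assumptions: the function names are hypothetical, MAX is represented by 4294967295, HEXDIGIT accepts both letter cases, and the handling of a single <arrl> (min equal to max without a reference, min equal to zero with one) is inferred from the examples rather than stated in the prose.

```python
import re

MAX = 0xFFFFFFFF  # assumed numeric representation of the MAX keyword
NAME = r"[a-z][a-z0-9_]{0,63}"
ARRL = r"(?:MAX|0x[0-9A-Fa-f]{1,16}|[0-9]{1,20})"
ALEN_RE = re.compile(
    rf"\[(?:(?P<ref>(?:\.{NAME})+):)?(?P<a>{ARRL})(?::(?P<b>{ARRL}))?\]\Z"
)


def _arrl(token):
    """Convert one <arrl> token into an integer."""
    if token == "MAX":
        return MAX
    value = int(token, 16) if token.startswith("0x") else int(token, 10)
    if value > 2**31:
        raise ValueError("array length value is too large")
    return value


def parse_alen(token):
    """Return (min, max, ref) for an <ALEN> token; ref is a tuple of names."""
    m = ALEN_RE.match(token)
    if m is None:
        raise ValueError("not an array length: " + token)
    ref = tuple(m["ref"][1:].split(".")) if m["ref"] else ()
    first = _arrl(m["a"])
    if m["b"] is not None:
        return first, _arrl(m["b"]), ref
    # Only one number given: fixed length without ref, upper bound with ref.
    return (0, first, ref) if ref else (first, first, ref)
```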

Literal value

LVAL = "=" vtxt
vtxt = vval / vref / vobj / varr
vval = valf / vali / valu / valc / valb
vref = "&" IREF
vobj = "{" [ NAME "=" vtxt *( "," NAME "=" vtxt ) ] "}"
varr = "[" [ vtxt ] *( "," [ vtxt ] ) "]"

sign = "+" / "-"
vhex = "0x" 1*HEXDIGIT
vdec = 1*DIGIT
fdec = vdec [ "."    1*DIGIT ] [ "e" [ sign ] vdec ]
fbin = vhex [ "." 1*HEXDIGIT ] [ "p" [ sign ] vdec ]

valu = vhex / vdec
vali = sign valu
valf = [ sign ] ( "NaN" / "INF" / fdec / fbin )
valc = %s"ce" / %s"lt" / %s"eq" / %s"gt"
valb = %s"true" / %s"false"

Values specify values of octets on given positions of an object. They can also be assigned to a name, creating a named value for reference.

The <LVAL> rule is the most complex one because it is recursive. Parsers MUST verify that all value lists are correctly terminated. Please study the rules calmly and thoroughly.

A register value can only be assigned to an instance of a register class; it is one of:

<valf>: real number.
<vali>: (signed) integer.
<valu>: unsigned integer.
<valc>: comparison result value.
<valb>: boolean.

<vref> references a named value.

<varr> is an array of values. The target object MUST also be an array. The values in the array MUST be valid elements of the object. If the target array is longer, the excess elements are undefined. An element MAY be omitted, in which case its value is undefined.

<vobj> is a set of name-value pairs. Each name references a data member of the target object. It is an error if there is no data member with a matching name. Unreferenced members have undefined values.
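
Because the value rules recurse, a small recursive-descent sketch may help to see how termination of lists can be verified. It is not a complete implementation: scalar values and "&" references are kept as raw text without further classification, <NAME> is assumed to be an identifier-like token, "[]" is treated as an empty array, and the function names are hypothetical.

```python
import re

_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # assumed NAME shape


def parse_lval(text):
    """Parse "=" <vtxt>; arrays become lists, objects dicts, omitted items None."""
    if not text.startswith("="):
        raise ValueError("a literal value starts with '='")
    value, pos = _vtxt(text, 1)
    if pos != len(text):
        raise ValueError("trailing characters at offset %d" % pos)
    return value


def _vtxt(text, pos):
    if text[pos:pos + 1] == "[":
        return _varr(text, pos + 1)
    if text[pos:pos + 1] == "{":
        return _vobj(text, pos + 1)
    end = pos
    while end < len(text) and text[end] not in ",]}":
        end += 1
    if end == pos:
        raise ValueError("expected a value at offset %d" % pos)
    return text[pos:end], end  # scalar or reference, kept as raw text


def _varr(text, pos):
    items = []
    if text[pos:pos + 1] == "]":
        return items, pos + 1  # "[]" is read as an empty array in this sketch
    while True:
        if text[pos:pos + 1] in (",", "]"):
            items.append(None)  # omitted element: undefined value
        else:
            value, pos = _vtxt(text, pos)
            items.append(value)
        pos = _separator(text, pos, "]")
        if text[pos - 1] == "]":
            return items, pos


def _vobj(text, pos):
    members = {}
    if text[pos:pos + 1] == "}":
        return members, pos + 1
    while True:
        eq = text.find("=", pos)
        name = text[pos:eq] if eq != -1 else ""
        if not _NAME.fullmatch(name):
            raise ValueError("expected NAME '=' at offset %d" % pos)
        members[name], pos = _vtxt(text, eq + 1)
        pos = _separator(text, pos, "}")
        if text[pos - 1] == "}":
            return members, pos


def _separator(text, pos, closer):
    """Consume "," or the closing bracket; fail on an unterminated list."""
    if pos >= len(text):
        raise ValueError("unterminated value list")
    if text[pos] not in (",", closer):
        raise ValueError("expected ',' or %r at offset %d" % (closer, pos))
    return pos + 1
```

In this sketch, `parse_lval("=[1,,3]")` returns `["1", None, "3"]`, and an input such as `"=[1,2"` is rejected with a `ValueError`.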

Condition reference

The <CREF> rule represents a condition reference.

CREF = "?" NAME

A condition reference begins with a question mark, followed by a name of a condition.

Expression

The <EXPR> rule represents an expression.

EXPR = "(" ( elog / eref / ecmp ) ")"

elog = elop *( 0*WSP EXPR )
elop = "all=1" / "all=0" / "any=1" / "any=0"

eref = erop 0*WSP CREF
erop = "is=1" / "is=0"

ecmp = ecop 0*WSP RREF 0*WSP VAL
ecop = "eq" / "ne" / "lt" / "le" / "gt" / "ge"

An expression begins with a left parenthesis, followed by the expression operator and its arguments, followed by a right parenthesis.

Input text	Expression type and operator name
`all=1`	logical expression; 'all are true'
`all=0`	logical expression; 'all are false'
`any=1`	logical expression; 'any is true'
`any=0`	logical expression; 'any is false'
`eq`	comparison; 'is equal to'
`ne`	comparison; 'is not equal to'
`lt`	comparison; 'is less than'
`le`	comparison; 'is less than or is equal to'
`gt`	comparison; 'is greater than'
`ge`	comparison; 'is greater than or is equal to'
`is=1`	condition reference; 'condition holds'
`is=0`	condition reference; 'condition does not hold'
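
The logical and condition-reference operators can be illustrated with a small evaluator. This is a sketch under assumptions: comparison expressions (<ecmp>) are not handled, <NAME> is assumed to be identifier-like, and conditions are supplied as a dictionary of already-known booleans rather than being resolved from declarations. The function name `evaluate` is hypothetical.

```python
import re

_OP = re.compile(r"all=1|all=0|any=1|any=0|is=1|is=0")
_CREF = re.compile(r"\?([A-Za-z_][A-Za-z0-9_]*)")  # assumed NAME shape


def _skip_wsp(text, pos):
    while pos < len(text) and text[pos] in " \t":
        pos += 1
    return pos


def evaluate(text, conditions, pos=0):
    """Evaluate the <EXPR> starting at text[pos]; return (result, next_pos)."""
    if text[pos:pos + 1] != "(":
        raise ValueError("expected '(' at offset %d" % pos)
    m = _OP.match(text, pos + 1)
    if m is None:
        raise ValueError("unsupported operator at offset %d" % (pos + 1))
    op, pos = m.group(), m.end()
    if op.startswith("is="):
        m = _CREF.match(text, _skip_wsp(text, pos))
        if m is None:
            raise ValueError("expected a condition reference")
        holds = bool(conditions[m.group(1)])
        result = holds if op == "is=1" else not holds
        pos = m.end()
    else:
        values = []
        while True:
            nxt = _skip_wsp(text, pos)
            if text[nxt:nxt + 1] != "(":
                break
            value, pos = evaluate(text, conditions, nxt)
            values.append(value)
        wanted = op.endswith("1")  # "=1" tests for true, "=0" for false
        matches = [v == wanted for v in values]
        result = all(matches) if op.startswith("all") else any(matches)
    if text[pos:pos + 1] != ")":
        raise ValueError("expected ')' at offset %d" % pos)
    return result, pos + 1
```

With `{"a": True, "b": False}` as the known conditions, the input `(all=1 (is=1 ?a) (any=0 (is=1 ?a) (is=1 ?b)))` evaluates to true in this sketch, since `a` holds and at least one of the inner references is false.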

Path

PATH = "/" ppfx "/" 1*pchar ; pchar from RFC 3986
ppfx = %s"data" / %s"node" / %s"sync"

Path to an external resource.

Paths MUST begin with a predefined prefix and conform to the rules of URI [RFC3986] path components.

Path prefixes correspond to:

/data/: read-only resources,
/node/: node-specific resources,
/sync/: synchronized resources,
/user/: user-provided resources; cannot be declared; these are named by the user, not the module.

Paths are case-sensitive.

The full URI is kueea1am://<Ahex><PATH>.
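
A declarable path can be checked with a regular expression derived from the <PATH> rule. The sketch follows the rule as written: the <pchar> set is taken from RFC 3986 (unreserved, percent-encoded, sub-delimiters, ":" and "@"), so a further "/" after the prefix is rejected here. The function name `is_declarable_path` is hypothetical.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"  (RFC 3986)
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
_PATH = re.compile(r"/(?:data|node|sync)/" + _PCHAR + r"+")


def is_declarable_path(text):
    """Return True if text matches the <PATH> rule (prefixes are case-sensitive)."""
    return _PATH.fullmatch(text) is not None
```

The /user/ prefix is rejected on purpose, because such paths cannot be declared by a module.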

Instruction processor

Functions are defined by listing their parameters in ascending order, describing the function and formally specifying its outcome.

State of the instruction processor consists of the following variables:

mfin: level finalization state (boolean).
clvl: current class level (integer).
ccur: current class (reference).
fcur: current function (reference).

An item name collision between a class c and a name name occurs when any of the following is true:

name is regt.
Any data member d in c.data is such that: d.name is equal to name.
Any function member f in c.func is such that: f.name is equal to name.
Any named value v in c.nval is such that: v.name is equal to name.
Any named reference i in c.nref is such that: i.name is equal to name.

An FID collision occurs against a function number fid when any class c in mcur.types is such that: any function member f in c.func is such that: f.fid is non-zero and equal to fid.
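
Both collision checks can be sketched as simple predicates. The data model is an assumption made for illustration: classes and their members are reduced to small records carrying only the fields that the definitions above mention, and the names `Item`, `Class`, `item_name_collision` and `fid_collision` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    fid: int = 0  # only meaningful for function members; zero means invalid


@dataclass
class Class:
    data: list = field(default_factory=list)
    func: list = field(default_factory=list)
    nval: list = field(default_factory=list)
    nref: list = field(default_factory=list)


def item_name_collision(c, name):
    """True if name is reserved or already used by any item of class c."""
    if name == "regt":
        return True
    members = c.data + c.func + c.nval + c.nref
    return any(m.name == name for m in members)


def fid_collision(types, fid):
    """True if any function member of any class already has the non-zero fid."""
    return any(f.fid != 0 and f.fid == fid for c in types for f in c.func)
```

Note that a zero FID never collides, because only non-zero identifiers are compared.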

The `k1md` function

k1md  = %s".k1md" SP %s"M" Ahex 1*WSP ( %s"+draft" / %s"+final" )
; The first 7 bytes are: 2E 6B 31 6D 64 20 4D

Verifies input as a Document and begins declaration of the 'self' class of the module.

<Mhex> mid: Module identifier of the declared module.
<TAGS> tags: Finalization state. Either the +draft or the +final tag.

The instruction MUST be the first line of a Document. It is special in that it has no preceding whitespace. It MUST NOT appear on any line other than the first.

Implementations MUST modify the state as follows:

Set mcur.self.name to self.
Set mfin to true if tags contains +final.
Set ccur to mcur.self.
Set fcur to nothing.
Set text to ccur.text.

The `text` function

Updates the active text buffer(s).

<NAME> name: Buffer name.
<TAGS> tags: Tags.

The default buffer is named markdown.

Example:

This will go into the `markdown` buffer.
.text html
<p>This will go into the <code>html</code> buffer.</p>
.text ia32
.text amd64 +multi
This will go into both "ia32" and "amd64" buffers.

Implementations MUST modify the state as follows:

Set format to type.

The `load` function

Loads another module.

<MID> mid: Module identifier.
<UINT> mlv: Required minimum level of the module class.
<NAME> name (optional): Name for the module.

Item references may reference items declared in other modules. A reference to a module that has not been loaded is invalid.

Item references are processed after all modules are loaded. This instruction does not have to appear before a reference. It is RECOMMENDED that it appear at the beginning of a Document.

Implementations MUST return failure if any of the following is true:

mid is nil.
mlv is equal to or more than 2⁸.
name was given and any module import i in mcur.mload is such that: i.name is equal to name.

Implementations MUST modify the state as follows:

Find a module import i in mcur.mload, where i.mid is equal to mid.
If found and i.mlv is less than mlv:
1. Set i.mlv to mlv.
Otherwise (if not found):
1. Create a new module import i.
2. Set i.mid to mid.
3. Set i.mlv to mlv.
4. Set i.name to name.
5. Insert i into mcur.mload.
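
The failure conditions and the state update of `load` can be sketched as follows. The representation is an assumption: a module import is a small record, the nil module identifier is modelled as an empty value, and failure is reported by returning False. The names `ModuleImport` and `load` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModuleImport:
    mid: str
    mlv: int
    name: Optional[str] = None


def load(mload, mid, mlv, name=None):
    """Apply a load instruction to the import list; return False on failure."""
    if not mid or mlv >= 2 ** 8:
        return False
    if name is not None and any(i.name == name for i in mload):
        return False
    for i in mload:
        if i.mid == mid:
            if i.mlv < mlv:
                i.mlv = mlv  # only ever raise the required minimum level
            return True
    mload.append(ModuleImport(mid, mlv, name))
    return True
```

Loading the same module twice therefore keeps a single import whose level is the highest one requested.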

The `mlvl` function

Increases the current level of the module.

<UINT> mlv: Module level.
<TAGS> tags: Tags associated with the level.

The new level applies to all items declared afterward. Module levels may only be increased.

tags contains either +final or +draft. If the level is final, no changes to it (and lower ones) will ever be made. This rule refers to the resulting tree of declared items; descriptions of the items are not considered.

What this means is that if a Document declares the same module as another, already known Document, then the declared items in both of these documents MUST be the same, except for their textual descriptions.

Implementations MUST return failure if any of the following is true:

mlv is equal to or greater than 2⁸.
mlv is less than mcur.self.clv.
mlv is equal to 0, mcur.self.clv is equal to 0 and any of the following is true:
- mcur.self.desc is not empty.
- mcur.self.data is not empty.
- mcur.self.func is not empty.
- mcur.self.nref is not empty.
- mcur.self.nval is not empty.
- mcur.self.ifaces is not empty.
- mcur.types has more than 1 element.
tags contains both +final and +draft.
tags contains neither +final nor +draft.
tags contains +final and mfin is false.

Implementations MUST modify the state as follows:

Set ccur to mcur.self.
Set fcur to nothing.
Set text to ccur.text.
Set mfin to false if tags contains +draft.
Set ccur.clv to mlv.

The `cbeg` function

Begins a class declaration.

<NAME> name: Name of the class.
<TAGS> tags: Type of the class.
<UINT> cno (optional): Class number.

By default, cno is set in the post-processing stage, by ordering classes by class level and name, and then assigning the numbers in ascending order, skipping those values which have been explicitly set.
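
The default numbering described above could look like the following sketch. Two details are assumptions, since the post-processing stage is not specified here: numbering starts at 1 (zero marks a class number that was not set), and exceeding 255 is treated as an error. Classes are represented as plain dictionaries and the function name is hypothetical.

```python
def assign_class_numbers(classes):
    """classes: list of dicts with "name", "clv" and "cno" (0 means not set)."""
    used = {c["cno"] for c in classes if c["cno"] != 0}
    pending = sorted(
        (c for c in classes if c["cno"] == 0),
        key=lambda c: (c["clv"], c["name"]),  # order by class level, then by name
    )
    next_cno = 1  # assumed first automatic number
    for c in pending:
        while next_cno in used:  # skip explicitly assigned numbers
            next_cno += 1
        if next_cno > 255:
            raise ValueError("no class number left for %r" % c["name"])
        c["cno"] = next_cno
        used.add(next_cno)
    return classes
```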

If tags contains +nodesc, the class will have neither a descriptor nor a class identifier. It is not possible to allocate an instance of such a class. These classes are only used as an object type.

If tags contains +iface, the class is an interface. Interfaces require cno for their identification. It is not possible to allocate an instance of an interface. These classes can be used as an object type.

Each class has a set of predefined function members. These functions are defined in another section of this document.

Implementations MUST modify the state as follows:

Return failure if cno is greater than 255.
Return failure if tags contains both +iface and +nodesc.
Look for a class c in mcur.types, such that c.name is equal to name.
If c was found:
1. Return failure if cno is not zero.
2. Return failure if tags is not empty.
3. Return failure if c.clv equals 255.
4. Increase c.clv by 1.
If c was not found:
1. Let c be a newly created class.
2. Set c.cid to mcur.self.cid.
3. Set c.cid.cno to cno.
4. Set c.clv to 0.
5. Set c.name to name.
6. Set c.tags to tags.
7. If cno is zero (not set), insert +$autocno into c.tags.
8. Insert c into mcur.types.
Set ccur to c.
Set fcur to nothing.
Set text to c.text.
Insert predefined functions into ccur.func.

The `cend` function

Explicitly ends a class declaration.

No parameters.

Implementations MUST modify the state as follows:

Set ccur to mcur.self.
Set fcur to nothing.
Set text to mcur.self.text.

The `clvl` function

Modifies the current level of the current class.

<UINT> level: New level.
<TAGS> tags (optional): Tags.

The presence of +fini in tags specifies that the instance at this class level has a destructor. The destructor is also implied whenever any of the data members of the class at the level has a destructor because the members' destructors are called from the class destructor.

Implementations MUST return failure if any of the following is true:

ccur refers to mcur.self.
level is equal to or more than 2⁸.
tags contains +fini and any function member f in ccur.func is such that: f.clv is equal to level and f.name is equal to _fini.

Implementations MUST modify the state as follows:

Set ccur.clv to level.
Set fcur to nothing.
Set text to ccur.text.
If tags does not contain +fini, finish.
Let f be a new function member representation.
Set f.mlv to mcur.self.clv.
Set f.clv to level.
Set f.name to _fini.
Let fid be the default FID of f.
Verify that FID collision does not occur against fid.
Set f.fid to fid.
Insert f into ccur.func.

The `creg` function

Assigns a register type to a class.

<REGT> type: Register type.

This instruction declares that the class is a register class.

The class is assigned a register type and gains two predefined function members for transferring a value from a register to object data and vice-versa, called save and load respectively. Otherwise, there is no mapping between registers and object data. Object data is nothing more than an array of opaque octets.

Any class MAY define functions with these names. Their parameters and semantics are predefined only for register classes.

Implementations MUST modify the state as follows:

Return failure if any of the following is true:
- ccur.regt is not empty.
- item name collision occurs between ccur and save.
- item name collision occurs between ccur and load.
Set ccur.regt to type.
Set ccur.rclv to ccur.clv.
Create a new function member save.
Set save.name to save.
Set save.fid to the default FID of save.
Return failure if FID collision occurs against save.fid.
Create a new function parameter ireg.
Set ireg.type to (a copy of) type.
Set ireg.name to reg.
Append ireg to save.params.
Insert save into ccur.func.
Create a new function member load.
Insert +read into load.tags.
Set load.name to load.
Set load.fid to the default FID of load.
Return failure if FID collision occurs against load.fid.
Create a new function parameter oreg.
Set oreg.type to (a copy of) type.
Set oreg.name to reg.
Insert +output into oreg.tags.
Append oreg to load.params.
Insert load into ccur.func.

The `cond` function

Declares a named condition for the current class.

<NAME> name: Name of the condition.
<EXPR> expr: Expression of the condition.

Implementations MUST modify the state as follows:

Return failure if any condition c in ccur.cond is such that c.name is equal to name.
Create a new condition c.
Set c.name to name.
Set c.expr to expr.
Insert c into ccur.cond.

The `desc` and `data` functions

Declares the next data member, in memory order, of an interface descriptor (desc) / of an instance (data).

Classes which are interfaces have an additional object, besides the usual instance of the class, called an interface descriptor. A class that implements an interface includes the interface descriptor as part of its class descriptor and declares an instance of the (interface) class as one of its data members.

Interface descriptors are data used by programs which manipulate instances of unknown classes via their set of implemented interfaces.

All members of a class descriptor are read-only, constant values. The values of an interface descriptor vary by module implementation. Data in a class descriptor is valid for every instance of the class.

<TYPE> type: Type of the object.
<NAME> name: Name of the object.
<ALEN> alen (optional): Length of an array.
<VAL> value (optional, only in data): Default value.
<UINT> align (optional): Memory alignment.
<CREF> cond (optional): Condition reference. The member exists if the condition holds.
<TAGS> tags (optional): Tags.
<NAME> uref (optional): Reference to a previously declared union.

When omitted, alen.max is zero.

When omitted, value is the undefined value.

When omitted, align is zero.

When omitted, cond is empty.

When omitted, tags is empty.

When omitted, uref is empty.

If alen.min is less than alen.max and alen.ref is empty, then the currently declared data member SHOULD be the last data member declared at the current class level. The minimum amount is only a hint to the programmer in this case.

If alen.ref is not empty, then the allocated space of the declared data member is known only at program run time. Thus, memory addresses of subsequent data members are variables, too.

The current union of list data is the data union which begins from the last element d in data, such that d.tags does not contain +sameaddr.
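
As an illustration, the current union can be found by walking backwards over the list until a member without the +sameaddr tag is reached. The sketch assumes that data members are dictionaries with a "tags" set; the function name is hypothetical.

```python
def current_union(data):
    """Return the members that form the current union of the list `data`."""
    start = len(data) - 1
    while start >= 0 and "+sameaddr" in data[start]["tags"]:
        start -= 1  # members tagged +sameaddr belong to an earlier union start
    return data[max(start, 0):]
```

When the last declared member carries no +sameaddr tag, the current union consists of that member alone.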

Implementations MUST modify the state as follows:

Let data be an object reference.
Let prev be an object reference.
If in the desc function:
1. Return failure if ccur.tags does not contain +iface.
2. Return failure if any data member d in ccur.desc is such that d.name is equal to name.
3. Make data refer to ccur.desc.
If in the data function:
1. Return failure if item name collision occurs between ccur and name.
2. Make data refer to ccur.data.
Return failure if any of the following is true:
- align is neither zero nor a memory alignment.
- alen.max is non-zero and alen is an invalid array length.
- tags contains +sameaddr and any of:
  - data is empty.
  - cond is empty, alen.max is non-zero and alen.ref is not empty.
  - value is not the undefined value.
- tags does not contain +sameaddr and uref is not empty.
- tags contains +sametext and any of:
  - data is empty.
  - uref is not empty.
- Class level violation occurs in data.
Create a new data member d.
Set d.mlv to mcur.self.clv.
Set d.clv to ccur.clv.
Set d.tags to tags.
Set d.type to type.
Set d.alen to alen.
Set d.align to align.
Set d.value to value.
Set d.cond to cond.
Set d.name to name.
If uref is not empty:
1. Find a data member u in data, u.name of which is equal to uref.
2. Return failure if not found.
3. Return failure if u.tags contains +sameaddr.
4. Insert d after the last element of the data union that begins with u.
5. Make prev refer to u.
Otherwise:
1. Make prev refer to the last element in data.
2. Append d to data.
Make fcur refer to nothing.
Make text refer to prev.text if tags contains +sametext. Otherwise, make text refer to d.text.

The `nval` function

Declares a named value.

<NAME> name: Name of the value.
<VAL> value: Named value.

Implementations MUST return failure if any of the following is true:

item name collision occurs between ccur and name.

Implementations MUST modify the state as follows:

Create a new named value v.
Set v.name to name.
Set v.value to value.
Append v to ccur.nval.
Set fcur to nothing.
Set text to v.text.

The `nref` function

Declares a name for an item reference (an alias).

<NAME> name: Name for an item.
<IREF> iref: Reference to the item.

Implementations MUST return failure if any of the following is true:

item name collision occurs between ccur and name.

Implementations MUST modify the state as follows:

Create a new named reference r.
Set r.name to name.
Set r.iref to iref.
Insert r into ccur.nref.
Set fcur to nothing.
Set text to r.text.

The `fbeg` function

Begins declaration of a function member.

<NAME> name: Name of the function.
<TAGS> tags (optional): Tags.
<FID> fid (optional): Identifier of the function.
<FIDN> fidn1 (optional): Identifier of the first autogenerated function.
<FIDN> fidn2 (optional): Identifier of the second autogenerated function.

The following tags may be specified in tags:

+message: Message function declaration.
+proto: Function prototype declaration.
+event: Event declaration.
+init: Class constructor declaration.
+static: Function is independent of an instance.
+read: Function does not write to an instance.
+module: Function requires access to module memory.
+kernel: Function requires access to kernel memory.
+more: Function expects more parameters than declared.

The tags +message, +proto, +event and +init are mutually exclusive.

If tags contains +message, a function map is declared, from a pair of a language tag and a character coding to a function, which takes the declared parameters and returns a human-readable message. A module implementation SHOULD implement only one function per language. The kernel automatically converts the encoding to the requested one. The functions are local to an implementation and MAY NOT be exported. The FID of the function identifies the array of implemented functions. These functions are accessed through the kernel via a special call.

If tags contains +proto, only a function type is declared. No function is declared and no FID is allocated. A class may reference the prototype in order to declare one.

If tags contains +init, two functions are declared. The FID fid is assigned to the constructor. It is named name, which SHOULD begin with the word init. The function returns a status boolean and takes an implied read-write handle to an instance as the first parameter, called this. The FID named create is assigned to the creator. The name of the function is name with $create appended. The creator takes an implied set of parameters inserted at the front, which determine the location and type of allocated memory, where an instance will be constructed by the constructor. The just-constructed object is then returned to the caller.

If tags contains +event, three items are declared: a function prototype for an event handler, a handler installer function and a handler uninstaller function. The FID fid is assigned to the event prototype. The event prototype has an implicit first parameter: a read-write handle to an undefined, previously installed object. The prototype is named name. The FID named install is assigned to the event installer. It returns a status boolean and takes two parameters: a function reference to a handler and a read-write handle to an object. The object is passed as the first argument to the handler. The function is named name with $install appended. The FID named uninstall is assigned to the event uninstaller. It returns a status boolean and takes one parameter: a function reference to a handler to be uninstalled. The function is named name with $uninstall appended.

Implementations MUST conform to the following algorithm:

Return failure if any of the following is true:
- tags contains both +static and +read.
- ccur refers to mcur.self and tags contains +read.
- item name collision occurs between ccur and name.
If tags contains +message:
1. If fid was omitted, set fid to the default FID of a member named name of ccur.
2. Return failure if any of the following is true:
  - tags contains +proto, +event or +init.
  - Either fidn1 or fidn2 was not omitted.
  - FID collision occurs against fid.
If tags contains +proto:
1. Return failure if any of the following is true:
  - tags contains +message, +event or +init.
  - tags contains +module or +kernel.
  - Either fid, fidn1 or fidn2 was not omitted.
If tags contains +init:
1. Let nameC be a copy of name.
2. Append $create to nameC.
3. If fid was omitted, set fid to the default FID of a member named name of ccur.
4. If fidn1 was not omitted, return failure if the associated name is not equal to create.
5. If fidn1 was omitted, set fidn1 to the default FID of a member named nameC of ccur.
6. Return failure if any of the following is true:
  - tags contains +message, +proto or +event.
  - fidn2 was not omitted.
  - FID collision occurs against fid or fidn1.
If tags contains +event:
1. Let fidI be an invalid function identifier (zero).
2. Let fidU be an invalid function identifier (zero).
3. Let nameI be a copy of name.
4. Let nameU be a copy of name.
5. Append $install to nameI.
6. Append $uninstall to nameU.
7. If fidn1 was not omitted and its associated name is equal to install, set fidI to fidn1.
8. If fidI is invalid and fidn2 was not omitted and its associated name is equal to install, set fidI to fidn2.
9. If fidI is invalid, set fidI to the default FID of a member named nameI of ccur.
10. If fidn1 was not omitted and its associated name is equal to uninstall, set fidU to fidn1.
11. If fidU is invalid and fidn2 was not omitted and its associated name is equal to uninstall, set fidU to fidn2.
12. If fidU is invalid, set fidU to the default FID of a member named nameU of ccur.
13. Return failure if any of the following is true:
  - tags contains +message, +proto or +init.
  - tags contains +read.
  - ccur does not refer to mcur.self and tags contains +static but neither +module nor +kernel.
  - fid was not omitted.
  - fidn1 was not omitted and is named neither install nor uninstall.
  - fidn2 was not omitted and is named neither install nor uninstall.
  - Both fidn1 and fidn2 are named install.
  - Both fidn1 and fidn2 are named uninstall.
  - FID collision occurs against fidI or fidU.
14. Create a new function member fi.
15. Set fi.tags to a copy of tags.
16. Insert +$install into fi.tags.
17. Remove +more from fi.tags.
18. Set fi.mlv to mcur.self.clv.
19. Set fi.clv to ccur.clv.
20. Set fi.fid to fidI.
21. Set fi.name to nameI.
22. Insert fi into ccur.func.
23. Create a new function member fu.
24. Set fu.tags to a copy of tags.
25. Insert +$uninstall into fu.tags.
26. Remove +more from fu.tags.
27. Set fu.mlv to mcur.self.clv.
28. Set fu.clv to ccur.clv.
29. Set fu.fid to fidU.
30. Set fu.name to nameU.
31. Insert fu into ccur.func.
32. Create a new function parameter p.
33. Set p.type to read.
34. Set p.name to handler.
35. Append a copy of p to fi.params.
36. Append a copy of p to fu.params.
37. Set p.type to rdwr.
38. Set p.name to userdata.
39. Append p to fi.params.
40. Insert +proto into tags.
41. Remove +module from tags.
42. Remove +kernel from tags.
If tags contains +proto, set fid to zero.
If ccur refers to mcur.self, insert +static into tags.
Create a new function member f.
Set f.mlv to mcur.self.clv.
Set f.clv to ccur.clv.
Set f.fid to fid.
Set f.tags to tags.
Set f.name to name.
Insert f into ccur.func.
If tags contains +message:
1. Create a new function parameter p.
2. Insert +output into p.tags.
3. Set p.type to rdwr.
4. Set p.name to message.
5. Append p to f.params.
6. Create a new function parameter p.
7. Set p.type to FID.
8. Set p.name to enc_and_lang.
9. Append p to f.params.
Set fcur to f.
Set text to f.text.
Return success.
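
The part of the algorithm that derives the installer and uninstaller names and identifiers for +event can be sketched as a small function. It is a simplification under assumptions: named identifiers are passed as (name, fid) pairs, the default FID computation is supplied by the caller because it is defined elsewhere, and failures are raised as exceptions. The function name `resolve_event_fids` is hypothetical.

```python
def resolve_event_fids(name, default_fid, fidn1=None, fidn2=None):
    """Return (nameI, fidI, nameU, fidU) for an event named `name`.

    fidn1 and fidn2 are optional (label, fid) pairs; default_fid is a
    caller-supplied stand-in for the default FID computation.
    """
    name_i = name + "$install"
    name_u = name + "$uninstall"
    chosen = {}
    for fidn in (fidn1, fidn2):
        if fidn is None:
            continue
        label, fid = fidn
        if label not in ("install", "uninstall"):
            raise ValueError("FID name must be install or uninstall")
        if label in chosen:
            raise ValueError("FID name %r given twice" % label)
        chosen[label] = fid
    # A missing or zero (invalid) identifier falls back to the default FID.
    fid_i = chosen.get("install") or default_fid(name_i)
    fid_u = chosen.get("uninstall") or default_fid(name_u)
    return name_i, fid_i, name_u, fid_u
```

The FID collision check against both identifiers would follow this step; it is left out of the sketch.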

The `fend` function

Explicitly ends a function declaration.

No parameters.

Implementations MUST modify the state as follows:

Set fcur to nothing.
Set text to ccur.text.

The `ferr` function

Declares an error code that the current function may return.

<NAME> name: Name of the error message.
<FID> fid (optional): Function identifier of the error message.

If the function does not return an error code, then the document simply does not declare any error codes.

Implementations MUST conform to the following algorithm:

If fid is invalid or was omitted, set fid to the default FID of a member named name of a module.
Return failure if any of the following is true:
- fcur is nothing.
- fcur.tags contains +message.
- fcur.tags contains +event.
For each error code code in fcur.codes: Return failure if code.fid is equal to fid.
Create a new error code code.
Set code.name to name.
Set code.fid to fid.
Append code to fcur.codes.
Set text to code.text.
Return success.

The `fpar` function

Declares the next in-order parameter of the current function.

<ARGT> type: Type of the parameter.
<NAME> name: Name of the parameter.
<TAGS> tags (optional): Parameter tags.

Function arguments are objects passed by either handle or value.

Objects passed by value are specified by simply writing a class reference.

.fpar module.class:0 by_value

If a class is a register class, the value is passed via one register. Otherwise, a reference to a local copy of the object is the value. Objects passed by value MUST NOT be longer than 4096 octets.

Objects passed by handle are written by specifying the handle.

.fpar read<module.class:0> by_handle

By default, a parameter is an input parameter. In order to declare an output parameter, include +output in tags. No other tag is recognized.

.fpar module.class:0 output_value +output

The order in which parameters are declared is significant. It is RECOMMENDED to declare output parameters first, then input parameters.

Implementations MUST conform to the following algorithm:

Return failure if any of the following is true:
- fcur is nothing.
- name is this.
For each function parameter p in fcur.params: Return failure if p.name is equal to name.
Create a new function parameter p.
Set p.tags to tags.
Set p.type to type.
Set p.name to name.
Append p to fcur.params.
Set text to p.text.
Return success.

The `impf` function

Declares an implementation of a prototyped function.

<IREF> proto: Reference to a prototype function.
<NAME> name: Name of the function (the implementation).
<TAGS> tags (optional): Tags.
<FID> fid (optional): Identifier of the function.

Prototype implementations are marked with a special tag. The referenced prototype is stored as the return value type.

The following tags may be specified in tags:

+static: Function is independent of an instance.
+module: Function requires access to module memory.
+kernel: Function requires access to kernel memory.

Implementations MUST return failure if any of the following is true:

tags contains +proto.
tags contains +event.
tags contains +message.
tags contains +init.
tags contains +read.
tags contains +more.
item name collision occurs between ccur and name.
FID collision occurs against fid.

Implementations MUST modify the state as follows:

Insert +$protoref into tags.
Create a new function member f.
Set f.mlv to mcur.self.clv.
Set f.clv to ccur.clv.
Set f.fid to fid.
Set f.tags to tags.
Set f.name to name.
Create a new function parameter p.
Set p.type to proto.
Append p to f.params.
Insert f into ccur.func.
Set fcur to nothing.
Set text to f.text.

The `impc` function

Declares that the current class implements an interface.

<TYPE> type: Reference to the implemented interface.
<MEMB> mref (optional): Reference to the data member associated with the interface.

An interface object is an instance of type.

If mref is omitted, then the class does not declare any data member as the interface object of the interface. This is only permitted if type has no data members.

The interface object MUST NOT be preceded by a variable-length member. The offset must be the same value for every instance of the class.

Module implementations set the values of interface descriptor fields via the implementation's class definition document. A RECOMMENDED syntax for these documents is proposed in another section.

Implementations MUST return failure if any of the following is true:

ccur.tags contains +iface.

Implementations MUST modify the state as follows:

Create a new interface reference iface.
Set iface.type to type.
Set iface.mref to mref.
Set iface.clv to ccur.clv.
Insert iface into ccur.ifaces.
Set fcur to nothing.
Set text to ccur.text.

The `path` function

Declares a path to an external resource.

<PATH> path: Path to the resource.

Implementations MUST return failure if any of the following is true:

Any path p in mcur.paths is such that: p.path is equal to path.

Implementations MUST modify the state as follows:

Create a new path p.
Set p.path to path and p.mlv to mcur.self.clv.
Insert p into mcur.paths.
Set cbeg to nothing.
Set func to nothing.
Set data to nothing.
Set text to p.text.

Postprocessing

Check this section later when it's written.

Register names

Specials

boolean: A boolean value; either "true" or "false".
cmprval (comparison result value): Result of a comparison; one of: "error", "less than", "same as" or "more than".

Unsigned integers

u8: Integer in the range [0, 2⁸-1].
u16: Integer in the range [0, 2¹⁶-1].
u32: Integer in the range [0, 2³²-1].
u64: Integer in the range [0, 2⁶⁴-1].
u128: Integer in the range [0, 2¹²⁸-1].

Signed integers

i8: Integer in the range [-2⁷, 2⁷-1].
i16: Integer in the range [-2¹⁵, 2¹⁵-1].
i32: Integer in the range [-2³¹, 2³¹-1].
i64: Integer in the range [-2⁶³, 2⁶³-1].
i128: Integer in the range [-2¹²⁷, 2¹²⁷-1].

Vectors of integers

PLACEHOLDER.

Binary floating-point numbers

f16: IEEE 754 arithmetic format with base=2, p=11, e_max=15.
f32: IEEE 754 arithmetic format with base=2, p=24, e_max=127.
f64: IEEE 754 arithmetic format with base=2, p=53, e_max=1023.
f80x87: IEEE 754 arithmetic format with base=2, p=64, e_max=16383.
f128: IEEE 754 arithmetic format with base=2, p=113, e_max=16383.

Vectors of binary floating-point numbers

PLACEHOLDER.

Decimal floating-point numbers

d32: IEEE 754 arithmetic format with base=10, p=7, e_max=96.
d64: IEEE 754 arithmetic format with base=10, p=16, e_max=384.
d128: IEEE 754 arithmetic format with base=10, p=34, e_max=6144.

Predefined classes

Predefined classes have no associated class level.

Their names are written with capital letters.

Octet

There is only one fundamental type: an OCTET. All classes are essentially arrays of octets.

An octet occupies one memory address, under which there are at least 8 bits. If there are more than 8 bits, the excess bits MUST be cleared. There is no meaning associated with the bits of an octet.

Length of a class is expressed in octets.

Its register type is a vector of 8 bits. It is mapped to an 8-bit unsigned integer in practice.

Boolean

A BOOLEAN is either true (non-zero) or false (zero).

The length of a boolean is 1 octet.

Its memory alignment is 1 octet.

Its register type is a 1-bit unsigned integer. It is mapped to an 8-bit unsigned integer in practice.

Status boolean

A STATUS is a special boolean. It is used as the return value of functions.

If false, it means that there is nothing to report (success). Otherwise (if true), it means that something was pushed onto the task's status stack. The caller should examine the stack before continuing.

The idea is that programs are written like this:

if function() returns true
{
  code when function reports something, usually failure
}
otherwise, continue

If the function returns a status boolean, it is implied that the code in the if-block is an unlikely branch, because functions are assumed to generally execute successfully.

A status code is returned via the status stack along with other, supplementary information.

Comparison result

A CMPRVAL is the result of a comparison. It is used as the return value of functions.

The length of a comparison result is 1 octet.

Its memory alignment is 1 octet.

Its register type is a 2-bit signed integer. It is mapped to an 8-bit signed integer in practice.

Possible values of a comparison result, when comparing object LHS against object RHS, are:

0: The objects are equal.
1 (and greater): LHS is greater than RHS.
-1: Comparison failed. Interpreted as a true status boolean.
-2 (and less): LHS is less than RHS.

Simply put, one first tests for -1 (unsuccessful) and then compares with 0.

Functions that return this value also set associated CPU flags accordingly, so that a conditional jump may immediately follow the function call.
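
The test order described above (first check for -1, then compare with 0) can be written as a short sketch. The returned labels are the ones listed for the cmprval register type; the function name is hypothetical and the value is assumed to be available as a signed integer.

```python
def interpret_cmprval(value):
    """Classify a CMPRVAL given as a signed integer."""
    if value == -1:
        return "error"      # the comparison failed
    if value == 0:
        return "same as"
    return "more than" if value > 0 else "less than"
```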

Object length

The length of an OBJSIZE is 4 octets.

Its memory alignment is 4 octets.

Its register type is a 32-bit unsigned integer.

It represents a length of an object, in octets. Octets are ordered in increasing order of significance.

Memory address

The length of an ADDRESS is 8 octets.

Its memory alignment is 8 octets.

Its register type is a 64-bit unsigned integer.

It represents a memory address. Octets are ordered in increasing order of significance.

Function identifier

The length of an FID is 8 octets.

Its memory alignment is 8 octets.

Its register type is a 64-bit unsigned integer.

It represents a function identifier. Octets are ordered in increasing order of significance.
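As an illustration of the octet order shared by objsize, ADDRESS and FID, the following sketch uses Python's struct module. It assumes that "increasing order of significance" means the least significant octet comes first (little-endian); the helper names are hypothetical.

```python
import struct

# "<" selects little-endian order with standard sizes and no padding.
OBJSIZE = struct.Struct("<I")   # 4 octets, unsigned
ADDRESS = struct.Struct("<Q")   # 8 octets, unsigned
FID = struct.Struct("<Q")       # 8 octets, unsigned

def encode_objsize(n):
    """Return the 4-octet representation of an object length."""
    return OBJSIZE.pack(n)

def decode_address(octets):
    """Return the integer held in an 8-octet ADDRESS."""
    return ADDRESS.unpack(octets)[0]
```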

16-octet identifier

The length of an ID16 is 16 octets.

Its memory alignment is 8 octets.

It has no register type.

It is an opaque array of 16 octets.

System memory reference

The length of a HANDLE is 32 octets.

Its memory alignment is 8 octets.

Its register type is a CPU-defined virtual memory reference.

Its data members are as follows:

.cbeg HANDLE !NOID
.data ADDRESS address
.data ID16    node_id
.data OCTET   nonce [8]

address: Lower bits of the system memory address.
node_id: Higher bits of the system memory address.
nonce: Random value associated with the referenced object.

Loading from and saving into a handle are kernel functions, which translate the system memory address in the handle from and to a virtual memory address of the calling task.

This is not an object identifier. The address may point to any octet within an object.
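The 32-octet layout of a HANDLE can be sketched as below. The sketch assumes that members are stored in declaration order without padding (the member lengths add up to 32) and that address is little-endian. It only models the in-memory layout; loading and saving remain kernel functions. The function names are hypothetical.

```python
import struct

# address (8 octets), node_id (16 octets), nonce (8 octets)
HANDLE = struct.Struct("<Q16s8s")

def pack_handle(address, node_id, nonce):
    """Build the 32 octets of a HANDLE from its three members."""
    if len(node_id) != 16 or len(nonce) != 8:
        raise ValueError("node_id must be 16 octets and nonce 8 octets")
    return HANDLE.pack(address, node_id, nonce)

def unpack_handle(octets):
    """Split 32 octets into the members of a HANDLE."""
    address, node_id, nonce = HANDLE.unpack(octets)
    return {"address": address, "node_id": node_id, "nonce": nonce}
```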

Module reference

The length of an MREF is 24 octets.

Its memory alignment is 8 octets.

It has no register type.

Its data members are as follows:

.cbeg MREF !NOID
.data ID16  mcid
.data OCTET mclv
.data OCTET mbid [8] +sameaddr

mcid: Class identifier of the referenced module.
mclv: 8-bit unsigned integer. Minimum class level of the module.
mbid: Build identifier of the target M-Build.

A reference is either to a specific M-Build by its build identifier or to any M-Build which implements the given module at the given level.

A build identifier is an 8-octet (64-bit) value, which is computed from relevant parts of a program image. The exact way to compute it depends on the image format.

The class level is considered when octets of the mbid array at positions [1,7] have all of their bits cleared. Thus, such build identifiers are out of range and invalid.
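The rule above can be sketched as a small decoder. It assumes that +sameaddr places mbid at the same address as mclv, so that the first octet of mbid doubles as the class level; this reading matches the 24-octet length but is an assumption. The function name and the "kind" key are hypothetical.

```python
def decode_mref(octets):
    """Decode a 24-octet MREF into a level or a build reference."""
    if len(octets) != 24:
        raise ValueError("an MREF is 24 octets long")
    mcid = bytes(octets[0:16])
    mbid = bytes(octets[16:24])   # assumed to share its first octet with mclv
    if any(mbid[1:8]):
        # Some bit is set at positions [1,7]: a specific M-Build.
        return {"mcid": mcid, "kind": "build", "mbid": mbid}
    # Positions [1,7] are all zero: the class level applies.
    return {"mcid": mcid, "kind": "level", "mclv": mbid[0]}
```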

Function reference

The length of an FREF is 32 octets.

Its memory alignment is 8 octets.

It has no register type.

Its data members are as follows:

.cbeg FREF !NOID
.data MREF  mref
.data FID   fid

mref: Module reference.
fid: Function identifier.

Interface descriptor

The length of an IFACE is variable. The minimum length is 24 octets.

Its memory alignment is 8 octets.

It has no register type.

Its data members are as follows:

.cbeg IFACE  !NOID
.data ID16    cid
.data objsize clv_len
.data objsize offset
.data OCTET   members [0:MAX]

cid: Class identifier of the interface.
clv_len: Class level and total length of the descriptor in octets. The level is in the most significant 8 bits. The length is in the remaining least significant 24 bits. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
offset: Offset from the beginning of a class instance, specifying the location of the associated interface object. If equal to 4294967295, then there is no such object.
members: Data members of the interface descriptor.
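The packing of clv_len can be sketched with shifts and masks. The function names and the constant NO_OBJECT are hypothetical; the lower bound of 24 comes from the minimum IFACE length stated above.

```python
NO_OBJECT = 4294967295  # offset value meaning "no interface object"

def pack_clv_len(level, length):
    """Combine a class level and a descriptor length into clv_len."""
    rounded = (length + 7) & ~7   # round up to a multiple of 8
    if not 0 <= level <= 0xFF or not 24 <= rounded <= 0xFFFFFF:
        raise ValueError("level or length out of range")
    return (level << 24) | rounded

def unpack_clv_len(clv_len):
    """Split clv_len into (level, length)."""
    return clv_len >> 24, clv_len & 0xFFFFFF
```

Because the stored length is also the offset to the next descriptor, a reader can step through consecutive descriptors by adding the unpacked length to the current position.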

Class descriptor

The length of a CLASS is variable. The minimum length is 32 octets.

Its memory alignment is 8 octets.

It has no register type.

It is a structure defined as follows:

.cbeg CLASS !NOID
.data ID16    cid
.data objsize len_dsc
.data objsize len_min
.data objsize len_max
.data OCTET   align
.data OCTET   clv
.data OCTET   flags
.data OCTET   ifaces_len
.data IFACE   ifaces [ifaces_len:MAX]

All OCTET-typed fields are interpreted as an 8-bit unsigned integer.

cid: Class identifier.
len_dsc: Total length of the descriptor, in octets. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
len_min: Minimum length of an instance, in octets.
len_max: Maximum length of an instance, in octets.
align: Alignment exponent.
clv: Class level.
flags: Vector of 8 flag bits.
ifaces_len: Number of implemented interfaces.
ifaces: Descriptors of implemented interfaces.

The class descriptor for level n of a class is directly followed by the descriptor for level n-1.

The descriptor for level n contains only those interfaces that were introduced at level n.

The defined flag bits in flags are, counted from the least significant:

bit 0: The class has a destructor.
bit 1: The class contains handles. (It has an accessor.)
bits 2-7: Undefined. Must be cleared.
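The fixed part of a class descriptor (the first 32 octets) can be read as in the following sketch. It assumes declaration order without padding and little-endian objsize fields; the function name and the keys has_destructor and has_handles are hypothetical.

```python
import struct

# cid, len_dsc, len_min, len_max, align, clv, flags, ifaces_len
CLASS_HEADER = struct.Struct("<16sIIIBBBB")

def parse_class_header(octets):
    """Decode the first 32 octets of a CLASS descriptor."""
    (cid, len_dsc, len_min, len_max,
     align, clv, flags, ifaces_len) = CLASS_HEADER.unpack_from(octets, 0)
    if flags & 0b11111100:
        raise ValueError("undefined flag bits must be cleared")
    return {
        "cid": cid,
        "len_dsc": len_dsc,
        "len_min": len_min,
        "len_max": len_max,
        "align": align,                      # alignment exponent, as stored
        "clv": clv,
        "has_destructor": bool(flags & 1),   # bit 0
        "has_handles": bool(flags & 2),      # bit 1
        "ifaces_len": ifaces_len,
    }
```

The interface descriptors start directly after these 32 octets, and len_dsc gives the offset from this descriptor to the next one.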

Document identification

This section contains information on how to identify and mark Documents as such in their respective systems.

Data Format Descriptor

Data Format Descriptor for Documents is TO-BE-DEFINED.

Documents have the class #Document.

Internet Media Type

Media type of Documents is text/prs.kagomeko.k1os.

The charset parameter MUST be included with the value UTF-8.

References

Hyperlinks in the document will point here in a later revision.

Development considerations

Modules ought to follow the KISS (Keep It Simple, Stupid) rule. They are to be narrow in scope so that their development ends one day, the module becomes finalized and no more levels are ever added to it. If a module does not need revision, it means it is good and can be safely used.

Having too much functionality in a module makes it more unstable. Stability is the most important trait every module author ought to pursue. Modules that are constantly revised are broken by design. Such modules ought to be scrapped, made obsolete, and then redesigned as new modules (with a new identifier).

The scale of a module should ideally be small enough for one person to not become mentally exhausted (burned out) while implementing it alone. There should be many implementations available that a user can choose from.

One should differentiate between a module and a software project. The module ought to be a part of the project, not the project itself. For example, if a Python interpreter were to be a module, then Python 2 and Python 3 interpreters would be different modules, which might be developed as part of the same project.

They are different because the most crucial parts, the parser and the interpreter, differ between version 2 and version 3. The other reason is that Python 2 is being phased out in favour of 3. Keeping both 2 and 3 in the same module is counter-productive. Remember that items, once defined, cannot be removed from a module.

Some modules are published and standardized for the general public. Other modules may be known only to a select few, or be created as part of development or a user session and have a lifetime of only a few hours.
