Kueea AMv1:
Module Declaration Document
Draft; do not implement.
Introduction
This document defines the syntax and semantics of a Kueea Abstract Machine Version 1 Module Declaration Document, shortened throughout this document to just "Document".
One Document declares one Kueea AMv1 Module. The syntax is designed to be human-readable and easy to read and write using even the simplest text editors.
Document Processors are programs which take Documents as input. Their primary output is source files in a given programming language. Other outputs include module documentation in HTML or other formats. Document Processors are part of toolchains that generate Kueea AMv1 M-Build images.
Keywords
The key words ‘MUST,’ ‘MUST NOT,’ ‘REQUIRED,’ ‘SHALL,’ ‘SHALL NOT,’ ‘SHOULD,’ ‘SHOULD NOT,’ ‘RECOMMENDED,’ ‘NOT RECOMMENDED,’ ‘MAY,’ and ‘OPTIONAL’ in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Document
A Document is a sequence of Unicode characters. The encoding of characters MUST be UTF-8.
Lines are sequences of characters separated by a sequence of two characters (in order): U+000D CARRIAGE RETURN and U+000A LINE FEED.
Whitespace is either U+0020 SPACE or U+0009 HORIZONTAL TAB.
Documents are processed line by line. The maximum length of a line is 1024 code units (bytes), including the line separator.
Three types of data may appear on a line: instructions, comments and text.
Instructions
An instruction is a line directed at the Document Processor. It begins with optional whitespace followed by one U+002E FULL STOP character, the instruction name and its arguments. Each argument is preceded by at least one whitespace character.
Instruction names and arguments are case-sensitive.
Instruction names are sequences of four small Latin letters.
The character set of instructions is limited to the range [U+0000, U+007F].
Examples:
.inst arg1 arg2
.inst arg1
Comments
A comment is a line which is discarded. There are two kinds of comments.
A one-line comment begins with optional whitespace followed by exactly one U+0023 NUMBER SIGN character, that is, one that is not immediately followed by a second NUMBER SIGN.
Examples:
# This is a one-line comment.
# This is a one-line comment.
THIS IS NOT A COMMENT.
# # # This is a one-line comment.
A multi-line comment begins and ends with a line beginning with optional whitespace followed by 2 consecutive U+0023 NUMBER SIGN characters.
Examples:
## This is the first line of a multi-line comment.
This is a comment.
.This is a comment.
# # This is inside a multi-line comment.
## This is the last line of a multi-line comment.
THIS IS NOT A COMMENT.
Text
Any other line is text - secondary data associated with the current item, stored in a named buffer.
It is OPTIONAL for a Document Processor to process text.
Syntax and semantics of text are out of scope of this document.
By default, text is a human-readable textual description of the associated item, in Markdown. [MARKDOWN]
Examples:
.item example1
Description of the example1 item.
Description of the example1 item.
.item example2
Description of the example2 item.
Line indentation
The leading whitespace on an instruction sets the amount of leading whitespace that is ignored on the text lines that come after it.
Both whitespace characters count as one. Authors should choose a single indentation character for the whole document. U+0020 SPACE is RECOMMENDED because the visual presentation of tabs varies.
Consider the following example:
   .inst first
   Line 1-1.
     Line 1-2.
  Line 1-3.
.inst second
   Line 2-1.
     Line 2-2.
Line 2-3.
The first line is an instruction indented by 3 whitespace characters. The ignored indentation becomes 3.
The second line is text indented by 3 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has no leading whitespace characters.
The third line is text indented by 5 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has 2 leading whitespace characters.
The fourth line is text indented by 2 whitespace characters. The parser removes the first 3 whitespace characters. In this case, the line has fewer whitespace characters, so all of them are removed. The resulting line has 0 leading whitespace characters.
Text for the first item is thus:
Line 1-1.
  Line 1-2.
Line 1-3.
The fifth line is an instruction indented by 0 whitespace characters. The ignored indentation becomes 0.
The sixth line is text indented by 3 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 3 leading whitespace characters.
The seventh line is text indented by 5 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 5 leading whitespace characters.
The eighth line is text indented by 0 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 0 leading whitespace characters.
Text for the second item is thus:
   Line 2-1.
     Line 2-2.
Line 2-3.
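The walkthrough above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `strip_text_indentation` and the returned list-of-pairs shape are assumptions made for this example, and comment lines are deliberately not handled.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB


def strip_text_indentation(lines):
    """Group text lines under their instruction, removing ignored indentation.

    Returns a list of (instruction, [text lines]) pairs. Hypothetical helper;
    comments are not recognized in this sketch.
    """
    items = []
    skip = 0  # ignored indentation set by the most recent instruction
    for line in lines:
        body = line.lstrip(WHITESPACE)
        indent = len(line) - len(body)
        if body.startswith("."):
            skip = indent
            items.append((body, []))
        elif items:
            # Remove at most `skip` whitespace characters, never more than exist.
            items[-1][1].append(line[min(skip, indent):])
    return items
```

Feeding the eight example lines to this function yields the two text blocks shown above, with `Line 1-2.` keeping two leading spaces and the second item's lines left unchanged.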
Formal syntax (ABNF)
The following <MDOC> rule expresses the syntax in ABNF.
MDOC = k1md CRLF *( line CRLF )
       ; k1md defined in another section
line = comm / inst / text
comm = cmul / cone
cmul = *WSP "##" *OCTET CRLF *WSP "##" *OCTET
cone = *WSP "#" *OCTET
inst = *WSP "." *( WSP / VCHAR )
text = [ *WSP "\" ] *OCTET
ABNF rules are referenced in prose like this: <rule>.
The first non-whitespace character of <text> MAY be a U+005C REVERSE SOLIDUS. If so, the \ character is removed before further processing of the line. This is an escape mechanism in case the line begins with # or a full stop (.).
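As a rough illustration of how the three line types and the escape mechanism interact, the following Python sketch classifies already-split lines. The function name `classify_lines` and the `(kind, content)` tuples are assumptions made for this example only.

```python
def classify_lines(lines):
    """Label each line as 'comment', 'instruction' or 'text' (illustrative)."""
    result = []
    in_block = False  # True while inside a multi-line comment
    for line in lines:
        body = line.lstrip(" \t")
        if body.startswith("##"):
            # A "##" line opens or closes a multi-line comment.
            in_block = not in_block
            result.append(("comment", line))
        elif in_block or body.startswith("#"):
            result.append(("comment", line))
        elif body.startswith("."):
            result.append(("instruction", line))
        elif body.startswith("\\"):
            # Escape: drop the reverse solidus, keep the rest as text.
            indent = line[: len(line) - len(body)]
            result.append(("text", indent + body[1:]))
        else:
            result.append(("text", line))
    return result
```

A line such as `\.not an instruction` is therefore reported as text with the content `.not an instruction`.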
Objects
An object is a set of variables. Variables of an object are referenced in prose like this: object.variable (using a dot separator).
Objects are categorized as items and non-items.
Non-items
This section defines non-item objects.
These objects are part of items.
Characters
A character is a Unicode code point.
Object reference
An object reference references another object.
When an object reference refers to nothing, it means that the reference is set to a value that does not reference any objects.
Containers
Some objects are stored in a container object.
A list is an ordered container, which means that position of elements in a list is significant.
A set is an unordered container.
A pair is a set of two elements.
Class identifier
Classes are identified by 128-bit values. These values are globally unique and are treated as opaque data.
It is expected that the value is a Universally Unique Identifier. The value SHOULD be a valid UUID, but it is not required to be one.
The nil UUID value is reserved and means no value (invalid value). This value is called nil.
Class level
Members of a class are grouped under its levels.
A class level is an 8-bit unsigned binary integer.
An instance of a class at level n contains all members of the class declared at levels [0, n].
Memory alignment
An alignment exponent is an integer within the range [0, 31]. It is the exponent e in the expression 2^e, the result of which is the associated memory alignment in octets. Any other value is outside of a memory alignment's value range.
Tag
A tag is a list of characters.
The minimum length of the list is 1.
The maximum length of the list is 16.
Name
A name is a list of characters.
The minimum length of the list is 1.
The maximum length of the list is 64.
Most items have an associated name. Names are local in scope to the document.
Member reference
A member reference is a list of names.
Member references are relative to a class. The list designates the class's member by following the referenced names.
Names are dereferenced in order, and there is no way to reference a previously dereferenced name, so infinite recursion never happens.
Item reference
An item reference is a pair of a name and a member reference.
The name contains a module's class identifier, a user-defined module alias or a predefined name.
When the name is empty, the item reference is an unknown item reference.
The member reference is relative to the module's class.
Type reference
A type reference is an object which consists of:
- ref: item reference to a class.
- clv: class level of the referenced class.
- hnd: type of reference.
Possible values of hnd are:
- null: No value set. Type reference is empty.
- regt: Register type.
- creg: Register type of the class.
- cmem: Instance of the class.
- none: Handle with no access rights; just the address. Use to avoid unnecessary duplication of rights.
- read: Handle with read rights.
- rdex: Handle with read and execute rights.
- rdwr: Handle with read and write rights.
- rwex: Handle with read, write and execute rights.
A handle is a system address associated with access rights to the referenced object.
The system gives access as many times as there are handles in a class. A class only needs to have one handle with specific rights to an object. Other handles to the same object SHOULD be of the no-access type in order to avoid unnecessary computations when passing objects around.
If ref is an unknown item reference and hnd designates a handle, the handle is to an unspecified object.
Array length
An array length is an object which consists of:
- ref: member reference to the data member which stores the amount of elements.
- min: minimum amount of elements.
- max: maximum amount of elements.
Amount of elements is a 32-bit unsigned binary integer.
The min and max values are used in calculation of the minimum length and the possible maximum length of a class instance.
If max is zero, an array is not declared. If it is equal to 4294967295 (2^32 - 1), the value represents the maximum possible amount of elements; otherwise, the value is the given number.
The data member referenced by ref MUST be an instance of a register class with an unsigned integer register type. This member's memory address MUST be lower than the array's, i.e., it must be declared before the array. Both min and max MUST NOT be greater than the maximum value representable by the referenced member's class.
The maximum possible amount of elements is either 4294967295 divided by the maximum class length of an element, rounded down, or the maximum value representable by the type of the data member referenced by ref if ref is not empty.
An array length is invalid if either:
- min is greater than max.
- min is equal to max and ref is not empty.
Value
Representation of a value is implementation-dependent. This document only defines the syntax of a value in an instruction.
A value may be the undefined value. This state represents an omitted parameter. (There is no syntax for representing it in a Document.)
Condition
A condition is an object which consists of:
- expr: expression of the condition.
- name: name of the condition.
An expression is an object which consists of:
- exop: operator of the expression.
- data: data of the expression determined by exop.
Each expression evaluates to either true or false.
A condition holds when its expression evaluates to true.
Logical expression
data for exop values 'any is true', 'any is false', 'all are true', and 'all are false' is a logical expression, which is an object that consists of:
- exp: list of expressions, evaluated in list order.
An empty list evaluates to true.
Comparison
data for exop values 'is equal to', 'is not equal to', 'is less than', 'is less than or is equal to', 'is greater than', and 'is greater than or is equal to' is a comparison, which is an object that consists of:
- ref: member reference to the data member which is compared with the value,
- val: the value.
The type of the referenced member SHOULD be a register class. Support for values that are arrays or instances of non-register classes is OPTIONAL.
Condition reference
data for exop values 'condition holds' and 'condition does not hold' is a condition reference, which is an object that consists of:
- cnd: name of a condition.
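One possible reading of how the three expression kinds evaluate is sketched below in Python. The representation is an assumption made for this example: an expression is an `(exop, data)` tuple whose operator uses the input tokens of the <EXPR> rule, `members` maps a member name to its current value, and `conditions` maps a condition name to its expression. Treating an empty list as true for all four logical operators follows the statement above.

```python
import operator

COMPARE = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}
LOGICAL = ("all=1", "all=0", "any=1", "any=0")


def evaluate(expr, members, conditions):
    """Evaluate an (exop, data) tuple to True or False (illustrative only)."""
    op, data = expr
    if op in LOGICAL:
        if not data:
            return True  # an empty list evaluates to true
        want = op.endswith("1")
        results = (evaluate(e, members, conditions) for e in data)
        if op.startswith("all"):
            return all(r == want for r in results)
        return any(r == want for r in results)
    if op in ("is=1", "is=0"):
        holds = evaluate(conditions[data], members, conditions)
        return holds if op == "is=1" else not holds
    ref, val = data  # comparison: member reference and value
    return COMPARE[op](members[ref], val)
```

For example, with `members = {"size": 10}` and a condition `big` defined as `("gt", ("size", 8))`, the expression `("is=1", "big")` evaluates to true.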
Path component
A path component is a list of characters.
The minimum length of the list is 7.
The maximum length of the list is 1024.
Path components name external (out-of-memory) resources.
Interface reference
An interface reference is an object which consists of:
- type: data member type of the interface.
- mref: member reference to a data member.
- clv: class level of the associated class.
Module import
A module import is an object which consists of:
- mid: a module's class identifier.
- mlv: minimum class level of the module.
- name: an alias, name for the module.
Module
A module is an object which consists of:
- this: object reference to the module's class.
- types: set of classes.
- paths: set of path components.
- mload: set of module imports.
Function identifier
A function identifier (FID) is a 64-bit unsigned binary integer.
There MUST NOT be two functions with the same identifier within a module.
By default, the value is the result of passing a character string, constructed from the module's class identifier and relevant item names, to the 64-bit FNV-1a (Fowler-Noll-Vo) hash function:
hash = 0xCBF29CE484222325
for each octet_of_data to be hashed
    hash = hash XOR octet_of_data
    hash = hash * 0x100000001B3
# BEG DOCUMENT SPECIFIC
if hash == 0 then hash = 0xFFFFFFFFFFFFFFFF
# END DOCUMENT SPECIFIC
return hash
The value 0 is invalid; it can be used as an undefined value.
Examples:
Input string | Output value |
---|---|
module_func | 0x0F7E93E1AF686350 |
class$00$function | 0x2862790D0CE9E837 |
Function identifiers should only be explicitly declared in case of a hash collision. The encoding of the string is the same as for the document - UTF-8, although it is practically limited to just the first 127 code points.
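The hash described above can be written as a few lines of Python. This is a sketch under one stated assumption: the multiplication is reduced modulo 2^64, as is usual for the 64-bit FNV-1a variant. The function name `default_fid` is chosen for this example.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def default_fid(data: bytes) -> int:
    """64-bit FNV-1a over `data`, with the zero result remapped."""
    h = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
    for octet in data:
        h ^= octet
        h = (h * 0x100000001B3) & MASK64  # FNV prime, kept to 64 bits
    # Document-specific step: 0 is reserved as the invalid value.
    return h if h != 0 else MASK64
```

A caller would typically pass the UTF-8 encoding of the input string, for example `default_fid("module_func".encode("utf-8"))`.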
The input character string for the default FID of a function depends on the function's location within the item tree. If the function is a member of the module's class, the FID is computed over the name of the function only. Otherwise, the input string is computed as follows:
- Let s be an empty character string.
- Let c be the class.
- Let f be the function member.
- Append c.name to s.
- Append a U+0024 DOLLAR SIGN character to s.
- If c.clv is less than 16, append a U+0030 DIGIT ZERO to s.
- Append the hexadecimal representation of c.clv, using character ranges [U+0030,U+0039] and [U+0041,U+0046] (0-9 and A-F) for the values [0,15], to s.
- Append a U+0024 DOLLAR SIGN character to s.
- Append f.name to s.
- Return s as the input character string.
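The steps above amount to a short string-formatting routine. The Python sketch below uses a hypothetical function name and signature; passing no class name stands for the case where the function is a member of the module's class.

```python
def fid_input_string(func_name, class_name=None, clv=0):
    """Build the input string for the default FID (illustrative helper)."""
    if class_name is None:
        # Member of the module's class: the function name alone is hashed.
        return func_name
    if not 0 <= clv <= 255:
        raise ValueError("class level is an 8-bit unsigned integer")
    # Two upper-case hexadecimal digits, zero-padded below 16.
    return f"{class_name}${clv:02X}${func_name}"
```

With `class_name="class"`, `clv=0` and `func_name="function"` this produces `class$00$function`, matching the second input string in the examples table.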
Items
This section defines item objects.
All items contain a text variable, which is a list of objects, which each consists of:
- data: list of characters,
- format: name; format of the data.
When appending an object n to text, where o is the last object in text: if n.format is equal to o.format, data in n.data MAY be appended to o.data with a preceding line separator instead of appending a new object.
Data member
A data member is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- tags: set of tags.
- type: type reference.
- alen: array length.
- align: memory alignment or zero.
- value: default value.
- cond: name of a condition.
- name: name of the object.
The alignment of data member d is a 32-bit unsigned binary integer computed as follows:
- Return the default alignment of class referenced by type.ref at level type.clv if d.align is equal to zero.
- Return d.align.
The minimum length of data member d is a 32-bit unsigned binary integer computed as follows:
- Let ltype be a 32-bit unsigned binary integer.
- Set ltype to the minimum class length of class referenced by d.type.ref at level d.type.clv.
- Return ltype if d.alen.max is equal to zero.
- Return zero if d.alen.min is equal to zero.
- Let len be a 34-bit unsigned binary integer.
- Let off be a 34-bit unsigned binary integer.
- Let rem be a 32-bit unsigned binary integer.
- Set len to ltype.
- Set rem to d.alen.min.
- Set off to the alignment of d.
- Subtract one from off.
- While rem is greater than one:
  - Add off to len.
  - Set len to the binary AND of len and the binary NOT of off.
  - Add ltype to len.
  - Abort the program if len is greater than 4294967295.
  - Subtract one from rem.
- Return len as a 32-bit unsigned binary integer.
The maximum length of data member d is a 32-bit unsigned binary integer computed as follows:
- Let ltype be a 32-bit unsigned binary integer.
- Set ltype to the maximum class length of class referenced by d.type.ref at level d.type.clv.
- Return ltype if d.alen.max is equal to zero.
- Let len be a 34-bit unsigned binary integer.
- Let off be a 34-bit unsigned binary integer.
- Let rem be a 32-bit unsigned binary integer.
- Set len to ltype.
- Set rem to d.alen.max.
- Set off to the alignment of d.
- Subtract one from off.
- While rem is greater than one:
  - Add off to len.
  - Set len to the binary AND of len and the binary NOT of off.
  - Add ltype to len.
  - Set len to 4294967295 if len is greater than 4294967295.
  - Subtract one from rem.
- Return len as a 32-bit unsigned binary integer.
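The two length computations share the same loop and differ only in how overflow is handled. The Python sketch below is an illustration under stated assumptions: `alignment` is the alignment of the member in octets and a power of two, `ltype` is the already-computed class length of one element, and the "abort" step is modelled as an exception. All function names are hypothetical.

```python
U32_MAX = 0xFFFFFFFF


def _next_length(length, ltype, alignment):
    """Round `length` up to `alignment`, then add one more element."""
    off = alignment - 1
    return ((length + off) & ~off) + ltype


def member_min_length(ltype, alen_min, alen_max, alignment):
    if alen_max == 0:
        return ltype  # not an array
    if alen_min == 0:
        return 0
    length = ltype
    for _ in range(alen_min - 1):
        length = _next_length(length, ltype, alignment)
        if length > U32_MAX:
            raise OverflowError("minimum length exceeds 32 bits")
    return length


def member_max_length(ltype, alen_max, alignment):
    if alen_max == 0:
        return ltype  # not an array
    length = ltype
    for _ in range(alen_max - 1):
        length = _next_length(length, ltype, alignment)
        if length > U32_MAX:
            return U32_MAX  # saturated; further iterations cannot lower it
    return length
```

For a 3-octet element aligned to 4 octets, four elements occupy 3 + 3 × 4 = 15 octets, because every element after the first starts on a 4-octet boundary.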
The allocated space of a data member is greater than or equal to its minimum length and less than or equal to its maximum length. This is the amount of octets that an instance occupies in memory. The value is a run-time variable if alen.ref is not empty.
The following tags are recognized in tags:
sameaddr
- Member is a non-canonical member of a data union.
sametext
- Item description is shared with the previous member.
A data union is a range in a list of data members, which begins with a member a, inclusive, and ends with a member b, exclusive, such that a.tags does not contain sameaddr and b.tags does not contain sameaddr. The member a is the canonical member of the union.
A non-canonical member of a data union is any data member d, d.tags of which contains sameaddr.
The canonical member of a data union defines the allocated space of the data union as a whole as well as its memory alignment.
The maximum length of a non-canonical member of a data union MUST NOT be greater than the maximum length of the canonical member.
If a non-canonical member d is such that d.alen.max represents the maximum possible amount of elements and d.alen.ref is empty, then the maximum possible amount of elements is the maximum length of the canonical member divided by the maximum length of an element, rounded down.
If alignment of a non-canonical member is greater than the alignment of the canonical member, then the non-canonical member is aligned farther into the union in order to match its memory alignment. This reduces the remaining allocated space available to the member.
If cond is not empty, the member exists only when the referenced condition holds. In case of a canonical member of a data union, the condition applies to the data union in whole.
All non-canonical members d of a data union, d.cond of which is not empty, are mutually exclusive with each other and the canonical member. The canonical member exists if the data union exists and none of conditions referenced by d.cond hold.
All non-canonical members d of a data union, d.cond of which is empty, always exist. These members have their destructors and accessors ignored. Thus, types of such members SHOULD NOT have these.
Function member
A function member is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- fid: function identifier.
- tags: set of tags.
- params: list of function parameters.
- codes: list of error codes.
- name: name of the function.
Function parameter
A function parameter is an item which consists of:
- tags: set of tags.
- type: type reference.
- name: name of the parameter.
Error code
An error code is an item which consists of:
- name: name of a message function.
- fid: identifier of the function.
Error codes are declared in a predefined module.
Named value
A named value is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- value: value.
- name: name of the value.
Named reference
A named reference is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- iref: item reference.
- name: name of the reference.
Path
A path is an item which consists of:
- mlv: class level of the associated module's class.
- path: path component.
Class
A class is an item which consists of:
- cid: class identifier.
- clv: class level.
- regt: type reference.
- rclv: class level (for regt).
- desc: list of data members (descriptor).
- data: list of data members (instance).
- cond: set of conditions.
- func: set of function members.
- nref: set of named references.
- nval: set of named values.
- tags: set of tags.
- ifaces: set of interface references.
- name: name of the class.
A class, regt of which is non-empty, is called a register class. It has an implicit reserved member called regt, which is a named reference to the register type referenced by regt.
The default alignment of class c at level lv is computed as follows:
- Let cur be a 32-bit unsigned binary integer.
- Let max be a 32-bit unsigned binary integer.
- Set max to zero.
- For each data member d in c.data:
  - Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
  - Set cur to the alignment of d.
  - Advance to the next element if cur is less than or equal to max.
  - Set max to cur.
- Return max.
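The default alignment is simply the largest alignment among the members that are visible at the requested level and are not non-canonical union members. A minimal Python sketch follows; the `DataMember` record and its field names are assumptions made for this example and carry only what the computation needs.

```python
from dataclasses import dataclass


@dataclass
class DataMember:
    clv: int                      # class level at which the member is declared
    alignment: int                # already-computed alignment of the member
    tags: frozenset = frozenset()


def default_alignment(members, lv):
    """Largest member alignment at level `lv`, or zero if none applies."""
    best = 0
    for d in members:
        if d.clv > lv or "sameaddr" in d.tags:
            continue
        best = max(best, d.alignment)
    return best
```

A class with no qualifying data members gets a default alignment of zero under this reading.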
The minimum class length of class c at level lv is computed as follows:
- Let cur be a 32-bit unsigned binary integer.
- Let len be a 33-bit unsigned binary integer.
- Set len to zero.
- For each data member d in c.data:
  - Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
  - If d.alen.max is not equal to zero and d.alen.ref is empty and d is not the last element in c.data, set cur to the maximum length of d; otherwise, set cur to the minimum length of d.
  - Add cur to len.
  - Abort the program if len is greater than 4294967295.
- Return len.
The maximum class length of class c at level lv is computed as follows:
- Let cur be a 32-bit unsigned binary integer.
- Let len be a 33-bit unsigned binary integer.
- Set len to zero.
- For each data member d in c.data:
  - Advance to the next element if d.clv is greater than lv.
  - Set cur to the maximum length of d.
  - Add cur to len.
  - Set len to 4294967295 if len is greater than 4294967295.
- Return len.
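The two class-length computations can be sketched as follows, assuming the per-member minimum and maximum lengths have already been computed. The `Member` record, its field names and the `fixed_capacity_array` flag (standing for "alen.max is not zero and alen.ref is empty") are assumptions made for this illustration; the "abort" step is modelled as an exception.

```python
from typing import NamedTuple

U32_MAX = 0xFFFFFFFF


class Member(NamedTuple):
    clv: int
    min_len: int
    max_len: int
    sameaddr: bool = False
    fixed_capacity_array: bool = False


def min_class_length(members, lv):
    total = 0
    last = len(members) - 1
    for i, d in enumerate(members):
        if d.clv > lv or d.sameaddr:
            continue
        # A fixed-capacity array that is not last reserves its full length.
        use_max = d.fixed_capacity_array and i != last
        total += d.max_len if use_max else d.min_len
        if total > U32_MAX:
            raise OverflowError("minimum class length exceeds 32 bits")
    return total


def max_class_length(members, lv):
    total = 0
    for d in members:
        if d.clv > lv:
            continue
        total = min(total + d.max_len, U32_MAX)  # saturating addition
    return total
```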
Document Processor
A Document Processor is a program that uses a Parser to fill a set of loaded modules and then does something with them.
Parser
A Parser consists of a line parser, an instruction parser and an instruction processor.
The state of a Parser consists of the following objects:
- mset: set of loaded modules.
- mcur: object reference to the current module.
- text: object reference to the current item description.
- format: name of the current text format.
The mset is supplied by the caller as an input/output parameter. Additionally, the parser takes four objects as input parameters:
- mcid: the target module identifier.
- mclv: the target module's class level.
- ignoreText: boolean, if true, text lines are ignored.
- fetch: an external function for fetching Documents; input is the target module identifier and level.
The parser loads a module using the following algorithm and then postprocesses the set of loaded modules. Postprocessing MAY be done as part of loading or as a separate step. The result is either success or failure.
- Search mset for a module m such that m.this.cid is equal to mcid.
- If found and m.this.clv is no less than mclv, return success.
- If found and m.this.clv is less than mclv, remove m from mset.
- Obtain an octet stream doc by calling fetch, passing mcid and mclv as arguments.
- If fetch has failed, return failure.
- Set mcur to a newly created module.
- Set mcur.this to a newly created class.
- Insert mcur.this into mcur.types.
- Set mcur.this.cid to mcid.
- Set text to nothing.
- Set format to markdown.
- Invoke the line parser passing doc and ignoreText.
- If the parser failed, delete mcur and return failure.
- If mcur.this.clv is less than mclv, delete mcur and return failure.
- Insert mcur into mset.
- Iterate mcur.mload and recursively call this algorithm with the arguments from the elements; if any of the referenced modules failed to load, return failure.
- Return success.
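The control flow of the loading algorithm can be sketched in Python. Several simplifications are assumptions made for this example: `mset` is a dictionary keyed by class identifier, the `Module` record carries only a level and its imports, `fetch` returns `None` on failure, and `parse` is a hypothetical stand-in for the line parser that fills the module and reports success.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    cid: bytes
    clv: int = 0
    mload: list = field(default_factory=list)  # (cid, minimum level) pairs


def load_module(mset, mcid, mclv, fetch, parse):
    """Load module `mcid` at level >= `mclv` into `mset`; True on success."""
    found = mset.get(mcid)
    if found is not None:
        if found.clv >= mclv:
            return True
        del mset[mcid]  # loaded level is too low; reload
    doc = fetch(mcid, mclv)
    if doc is None:
        return False
    mcur = Module(cid=mcid)
    if not parse(doc, mcur):
        return False
    if mcur.clv < mclv:
        return False
    mset[mcid] = mcur
    # Recursively load imports; stops at the first failure.
    return all(load_module(mset, cid, lv, fetch, parse)
               for cid, lv in mcur.mload)
```

Because a module is inserted into `mset` before its imports are processed, two modules that import each other do not cause unbounded recursion in this sketch.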
Line parser
State of the line parser consists of the following objects:
- line: current line (list of characters).
- wslv: current line indentation (integer).
- skip: ignored line indentation (integer).
The line parser takes two objects as input parameters:
- input: byte stream.
- ignoreText: boolean.
The parser loads lines until the end of input. Each loaded line is then processed.
Line loading
To load a line, the parser follows this algorithm:
- Let cr be a boolean.
- Set line to the empty list.
- Set cr to false.
- While input is not empty:
  - Decode the next character c from input.
  - If line is longer than 1024 characters, return failure.
  - Append c to line.
  - If cr is true and c is U+000A LINE FEED, remove the last two characters in line (the line separator) and return success.
  - If c is U+000D CARRIAGE RETURN, set cr to true; otherwise, set cr to false.
- Return success.
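A compact way to picture line loading is a generator that yields one line per call to the algorithm. The Python sketch below makes several assumptions: the whole input is decoded as UTF-8 up front rather than character by character, failure is reported as `ValueError`, the length check counts characters exactly as the steps above do, and both characters of the CR LF separator are stripped so that the yielded line matches the definition of a line. The name `load_lines` is hypothetical.

```python
MAX_LINE = 1024


def load_lines(data: bytes):
    """Yield lines separated by CR LF (illustrative sketch)."""
    chars = iter(data.decode("utf-8"))
    while True:
        line = []
        cr = False
        for c in chars:
            if len(line) > MAX_LINE:
                raise ValueError("line too long")
            line.append(c)
            if cr and c == "\n":
                del line[-2:]  # drop the CR LF separator
                break
            cr = c == "\r"
        else:
            # Input exhausted: emit a final unterminated line, if any.
            if line:
                yield "".join(line)
            return
        yield "".join(line)
```

A lone LINE FEED or CARRIAGE RETURN does not end a line here; it stays part of the line content.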
Line processing
The line parser counts the amount of whitespace at the beginning of line and stores the resulting value in wslv.
The first line in the document MUST begin with the sequence of eight bytes with values as defined by <k1os>.
Further processing depends on the first character after the whitespace.
If the line is a one-line comment, the line is ignored.
If the line is a multi-line comment, the parser loads subsequent lines until another multi-line comment is encountered. All of these lines are ignored.
If the line is an instruction, skip is set to wslv. Whitespace at the beginning and end of line is removed. The U+002E FULL STOP at the beginning is removed. The instruction parser is then invoked on line. In case of failure, the Parser MUST immediately return failure.
The only remaining possibility is that the line is text. If ignoreText is true, the line is ignored. Otherwise, up to skip whitespace characters are removed from the beginning of line. If the first character after the removal is \, that character is removed. If text references an object, an object o with o.data equal to line and o.format equal to format is created and appended to text.
Instruction parser
The instruction parser converts the loaded line into a list of typed arguments and invokes the named instruction processor's function.
Instructions begin with a four-letter function name, followed by argument tokens, each preceded by whitespace, as per the <cmd0> rule.
cmd0 = fun0 *( 1*WSP arg0 )
fun0 = 4LOALPHA
arg0 = TAGS / FID / FIDN / ALEN / VAL / CID / CREF
arg0 =/ EXPR / UINT / ARGT / REGT / MEMT / IREF / MREF / NAME
The parser MUST return failure on any unrecognized function or argument, or when parsed arguments do not match the function's arguments.
Argument tokens are defined such that the parser can determine the type of an argument from the first few characters. The order of alternatives in <arg0> is the recommended order of tests.
Argument rules use capital letters by convention.
Class identifier
The <CID> rule represents a class identifier in hexadecimal notation.
CID  = "!" ( id16 / %s"NOID" )
id16 = 2HEXDIGIT 15( [ "-" ] 2HEXDIGIT )
The keyword NOID is the no-value ID (every octet is equal to zero).
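A <CID> token can be recognized with a single regular expression. The Python sketch below assumes that hexadecimal digits may be written in either case, which the rule above does not state; the function name `parse_cid` is chosen for this example.

```python
import re

# "!" followed by NOID, or 16 octets with an optional "-" before each
# octet after the first.
_CID = re.compile(r"!(?:NOID|([0-9A-Fa-f]{2}(?:-?[0-9A-Fa-f]{2}){15}))")


def parse_cid(token: str) -> bytes:
    """Return the 16-octet class identifier encoded by `token`."""
    m = _CID.fullmatch(token)
    if m is None:
        raise ValueError(f"not a class identifier: {token!r}")
    if m.group(1) is None:
        return bytes(16)  # NOID: every octet is zero
    return bytes.fromhex(m.group(1).replace("-", ""))
```

Both the conventional UUID grouping, such as `!00112233-4455-6677-8899-aabbccddeeff`, and an undashed run of 32 digits are accepted under this reading.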
Unsigned integer
The <UINT> rule represents a 64-bit unsigned binary integer value, in decimal or hexadecimal notation.
UINT = udec / uhex
udec = 1*20DIGIT
uhex = "0x" 1*16HEXDIGIT
Integer arguments are unsigned and may be at most 64-bit long. They are either in decimal or hexadecimal notation.
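A minimal Python sketch of <UINT> parsing follows. It assumes that the `0x` prefix and hexadecimal digits are accepted in either case, and it adds an explicit range check because twenty decimal digits can exceed 64 bits. The function name `parse_uint` is hypothetical.

```python
import re

_UDEC = re.compile(r"[0-9]{1,20}")
_UHEX = re.compile(r"0[xX][0-9A-Fa-f]{1,16}")
U64_MAX = 0xFFFFFFFFFFFFFFFF


def parse_uint(token: str) -> int:
    """Parse a decimal or hexadecimal 64-bit unsigned integer."""
    if _UHEX.fullmatch(token):
        value = int(token, 16)
    elif _UDEC.fullmatch(token):
        value = int(token, 10)
    else:
        raise ValueError(f"not an unsigned integer: {token!r}")
    if value > U64_MAX:
        raise ValueError("value does not fit in 64 bits")
    return value
```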
Function identifier
The <FID> rule represents a function identifier. The <FIDN> rule represents a pair of a name and a function identifier.
FID  = "#" UINT
FIDN = "#" NAME "#" UINT
Because there are instructions with more than one function identifier, there are two variants: one with a name and one without.
The value of an FID, when specified as an argument, MUST be non-zero.
Tags
The <TAGS> rule represents a list of tags.
TAGS = tag *( 1*WSP tag )
tag  = "+" 1*16LOALPHA
Tags are character strings of up to 16 characters, each preceded by a U+002B PLUS SIGN character.
They are parsed into a list of strings.
Functions test for the presence of a tag in the list.
Name
The <NAME> rule represents a name.
NAME = LOALPHA *63( LOALPHA / DIGIT / "_" )
Items are given a human-readable name for reference. All letters in names MUST be small.
It is RECOMMENDED that names of functions be composed in subject-object-verb order, for example object_units_replace. The recommendation is for consistency and grouping of members. Note that one can generate aliases in camelCase, too, if needed. Source code could be converted back and forth.
Member reference
The <MREF> rule represents a member reference.
MREF = 1*( "." NAME )
Item reference
The <IREF> rule represents an item reference.
IREF = [ CID / NAME ] MREF
An item reference begins with a module reference, followed by a member reference.
The module reference may be omitted as a shorthand for referencing the module declared by the currently parsed document.
References are resolved after all modules are loaded. A referenced item MAY be declared later in a document.
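The <NAME>, <MREF> and <IREF> rules compose naturally into one regular expression. The Python sketch below is illustrative: the class-identifier alternative is matched only loosely (its digits are not validated here), and the function name `parse_iref` and the returned `(module, members)` pair are assumptions made for this example.

```python
import re

_NAME = r"[a-z][a-z0-9_]{0,63}"  # one small letter, then up to 63 more
_IREF = re.compile(rf"(![0-9A-Za-z-]+|{_NAME})?((?:\.{_NAME})+)")


def parse_iref(token: str):
    """Split an item reference into (module or None, list of member names)."""
    m = _IREF.fullmatch(token)
    if m is None:
        raise ValueError(f"not an item reference: {token!r}")
    module = m.group(1)  # None when the module reference is omitted
    members = m.group(2).split(".")[1:]
    return module, members
```

An omitted module reference, as in `.foo.bar`, is reported as `None`, which a caller would resolve to the module declared by the current document.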
Memory type
The <MEMT> rule represents a type reference associated with a data member.
MEMT = href / oref
href = hndr ":" ( oref / "?" )
hndr = %s"none" / %s"read" / %s"rdex" / %s"rdwr" / %s"rwex"
oref = %s"mem:" NAME / UINT ":" IREF
A reference to a class instance begins with the class level, followed by a colon and an item reference to the class.
A reference to a predefined class instance begins with mem:, followed by the predefined class name.
Predefined classes handle, iface and class can only be referenced via a handle; they cannot be used as a memory type.
Handles begin with the access rights associated with the handle (<hndr>), followed by a colon and a class reference.
Handles may reference objects of undefined (?) type.
For example, read:0:stream.buffer means a read-only handle to an instance of class buffer at level 0 from module stream.
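Splitting a <MEMT> token into its parts can be sketched as below. This Python example is a simplification under stated assumptions: the class level is accepted in decimal notation only, the item reference is returned as an unparsed string, and the result is a plain dictionary whose keys mirror the type reference variables hnd, clv and ref. The function name `parse_memt` is hypothetical.

```python
HANDLE_RIGHTS = {"none", "read", "rdex", "rdwr", "rwex"}


def parse_memt(token: str) -> dict:
    """Split a memory type token into handle kind, level and reference."""
    hnd = "cmem"  # plain class instance unless a handle prefix is present
    head, sep, rest = token.partition(":")
    if sep and head in HANDLE_RIGHTS:
        hnd, token = head, rest
        if token == "?":
            # Handle to an object of undefined type.
            return {"hnd": hnd, "clv": None, "ref": None}
    if token.startswith("mem:"):
        # Predefined class; the name is kept with its prefix.
        return {"hnd": hnd, "clv": None, "ref": token}
    level, sep, iref = token.partition(":")
    if not sep or not iref or not (level.isascii() and level.isdigit()):
        raise ValueError(f"not a memory type: {token!r}")
    clv = int(level)
    if clv > 255:
        raise ValueError("class level is an 8-bit unsigned integer")
    return {"hnd": hnd, "clv": clv, "ref": iref}
```

Applied to `read:0:stream.buffer`, the sketch reports a `read` handle at level 0 with the reference `stream.buffer`.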
Register type
The <REGT> rule represents a type reference to a register type.
REGT = %s"reg:" ( oref / NAME )
<NAME> MUST be a register type name as defined in [MACHINE], except the name of a memory reference. (Handles are for that.)
Parameter type
The <ARGT> rule represents a type reference associated with a function parameter.
ARGT = REGT / MEMT
Array length
ALEN = "[" [ MREF ":" ] arrl [ ":" arrl ] "]"
arrl = UINT / %s"MAX"
The value obtained by parsing <arrl> into an unsigned integer MUST NOT be greater than 2^32 - 1.
The special keyword MAX is an alias for 2^32 - 1.
Parsing examples:
[10] => min = 10, max = 10, var = ()
[1:20] => min = 1, max = 20, var = ()
[2:MAX] => min = 2, max = MAX, var = ()
[len:4:255] => min = 4, max = 255, var = ("len")
[obj.len:MAX] => min = 0, max = MAX, var = ("obj", "len")
Value
VAL  = "=" vval
vval = vreg / vref / vobj / varr / CID
vreg = valf / vali / valu / valc / valb
vref = "&" IREF
vobj = "{" [ NAME "=" vval *( "," NAME "=" vval ) ] "}"
varr = "[" [ vval ] *( "," [ vval ] ) "]"
sign = "+" / "-"
vhex = "0x" 1*HEXDIGIT
vdec = 1*DIGIT
fdec = vdec [ "." 1*DIGIT ] [ "e" [ sign ] vdec ]
fbin = vhex [ "." 1*HEXDIGIT ] [ "p" [ sign ] vdec ]
valu = vhex / vdec
vali = sign valu
valf = [ sign ] ( "NaN" / "INF" / fdec / fbin )
valc = %s"ce" / %s"lt" / %s"eq" / %s"gt"
valb = %s"true" / %s"false"
Values specify values of octets on given positions of an object. They can also be assigned to a name, creating a named value for reference.
The <VAL> rule is the most complex one; it contains recursion. Parsers MUST verify that all value lists are correctly terminated. Please study the rules calmly and thoroughly.
A register value can only be assigned to an instance of a register class; it is one of:
- <valf>: real number.
- <vali>: (signed) integer.
- <valu>: unsigned integer.
- <valc>: comparison result value.
- <valb>: boolean.
<valn> references a named value.
<varr> is an array of values. The target object MUST also be an array. The values in the array MUST be valid elements of the object. If the target array is longer, the excess values are undefined. An element MAY be omitted, in which case its value is undefined.
<CID> is a more readable notation for an array of 16 unsigned integers. Target objects of type id16 are expected, although, technically, any array of 16 or more integers is valid for this value.
<vobj> is a set of name-value pairs. The names reference a data member of the target object. It is an error if there is no data member with a matching name. Unreferenced members have undefined values.
Condition reference
The <CREF> rule represents a condition reference.
CREF = "?" NAME
A condition reference begins with a question mark, followed by a name of a condition.
Expression
The <EXPR> rule represents an expression.
EXPR = "(" ( elog / eref / ecmp ) ")"
elog = elop *( 0*WSP EXPR )
elop = "all=1" / "all=0" / "any=1" / "any=0"
eref = erop 0*WSP CREF
erop = "is=1" / "is=0"
ecmp = ecop 0*WSP MREF 0*WSP VAL
ecop = "eq" / "ne" / "lt" / "le" / "gt" / "ge"
An expression begins with a left parenthesis, followed by the expression operator and its arguments, followed by a right parenthesis.
Input text | Expression type and operator name |
---|---|
all=1 | logical expression, 'all are true' |
all=0 | logical expression, 'all are false' |
any=1 | logical expression, 'any is true' |
any=0 | logical expression, 'any is false' |
eq | comparison; 'is equal to' |
ne | comparison; 'is not equal to' |
lt | comparison; 'is less than' |
le | comparison; 'is less than or is equal to' |
gt | comparison; 'is greater than' |
ge | comparison; 'is greater than or is equal to' |
is=1 | condition reference; 'condition holds' |
is=0 | condition reference; 'condition does not hold' |
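As a rough illustration of the recursion in <EXPR>, the following Python sketch parses and evaluates the logical and condition-reference forms only. Comparison expressions (<ecmp>) are left out because they depend on <MREF> and <VAL>. The helper names, the assumed <NAME> syntax and the result of a logical operator with no arguments are assumptions of this sketch, not part of the definition.

```python
import re

_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # assumed <NAME> syntax
_WSP = " \t"


def _skip_wsp(text, pos):
    while pos < len(text) and text[pos] in _WSP:
        pos += 1
    return pos


def _expect(text, pos, char):
    if not text.startswith(char, pos):
        raise ValueError("expected %r at offset %d" % (char, pos))
    return pos + 1


def parse_expr(text, pos=0):
    """Parse one <EXPR> at text[pos]; return (tree, offset after it)."""
    pos = _expect(text, pos, "(")
    operator = text[pos:pos + 5]
    if operator in ("all=1", "all=0", "any=1", "any=0"):
        pos += 5
        args = []
        while True:
            nxt = _skip_wsp(text, pos)
            if not text.startswith("(", nxt):
                break  # whitespace is only consumed in front of a nested EXPR
            arg, pos = parse_expr(text, nxt)
            args.append(arg)
        return (operator, args), _expect(text, pos, ")")
    operator = text[pos:pos + 4]
    if operator in ("is=1", "is=0"):
        pos = _expect(text, _skip_wsp(text, pos + 4), "?")
        name = _NAME.match(text, pos)
        if name is None:
            raise ValueError("expected a condition name")
        return (operator, name.group()), _expect(text, name.end(), ")")
    raise ValueError("unsupported operator at offset %d" % pos)


def evaluate(tree, conditions):
    """Evaluate a parsed tree against a mapping of condition name to bool."""
    operator, operand = tree
    if operator in ("is=1", "is=0"):
        return conditions[operand] == (operator == "is=1")
    results = [evaluate(arg, conditions) for arg in operand]
    wanted = operator.endswith("1")
    if operator.startswith("all"):
        return all(r == wanted for r in results)
    return any(r == wanted for r in results)
```

For instance, parsing "(all=1 (is=1 ?ready)(is=0 ?busy))" yields a tree that evaluates to true exactly when ready holds and busy does not.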
Path
PATH = "/" ppfx "/" 1*pchar ; pchar from RFC 3986
ppfx = %s"data" / %s"node" / %s"sync"
Path to an external resource.
Paths MUST begin with a predefined prefix and conform to the rules of URI [RFC3986] path components.
Path prefixes correspond to:
- /data/: read-only resources,
- /node/: node-specific resources,
- /sync/: synchronized resources,
- /user/: user-provided resources; cannot be declared; these are named by the user, not the module.
Paths are case-sensitive.
The full URI is kueea1am://<id16><PATH>.
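A minimal Python sketch of the <PATH> rule follows. It takes the grammar literally, so exactly one segment of <pchar> characters is accepted after the prefix. The textual form of <id16> inside the URI is not defined in this section, so the sketch takes it as an opaque string; the function names are hypothetical.

```python
import re

# pchar from RFC 3986: unreserved / pct-encoded / sub-delims / ":" / "@"
_PCHAR = r"(?:[A-Za-z0-9._~!$&'()*+,;=:@-]|%[0-9A-Fa-f]{2})"
_PATH = re.compile(r"/(?:data|node|sync)/" + _PCHAR + r"+")


def is_declarable_path(path):
    """True if path matches <PATH>; the /user/ prefix cannot be declared."""
    return _PATH.fullmatch(path) is not None


def full_uri(id16_text, path):
    """Build the full URI; id16_text is an assumed textual module identifier."""
    if not is_declarable_path(path):
        raise ValueError("not a declarable path: " + path)
    return "kueea1am://" + id16_text + path
```

Matching is case-sensitive, so a prefix such as /DATA/ is rejected.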
Instruction processor
Functions are defined by listing their parameters in ascending order, describing the function and formally specifying its outcome.
State of the instruction processor consists of the following variables:
- mfin: level finalization state (boolean).
- clvl: current class level (integer).
- ccur: current class (reference).
- fcur: current function (reference).
An item name collision between a class c and a name name occurs when any of the following is true:
- name is regt.
- Any data member d in c.data is such that: d.name is equal to name.
- Any function member f in c.func is such that: f.name is equal to name.
- Any named value v in c.nval is such that: v.name is equal to name.
- Any named reference i in c.nref is such that: i.name is equal to name.
Additionally, if c is mcur.this:
- Any class c in mcur.types is such that: c.name is equal to name.
An FID collision occurs against a function identifier fid when any class c in mcur.types is such that: any function member f in c.func is such that: f.fid is non-zero and equal to fid.
A class level violation occurs in list data when data is not an empty list and its last element d is such that: d.clv is equal to or greater than ccur.clv and d.mlv is less than mcur.this.clv.
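The three predicates above can be sketched in Python as follows. The Member and Class records are simplified stand-ins invented for this example; only the fields used by the predicates are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    fid: int = 0   # zero means "no FID"
    clv: int = 0   # class level at declaration
    mlv: int = 0   # module level at declaration


@dataclass
class Class:
    name: str
    clv: int = 0
    data: list = field(default_factory=list)
    func: list = field(default_factory=list)
    nval: list = field(default_factory=list)
    nref: list = field(default_factory=list)


def item_name_collision(c, name, module_types=()):
    """Pass mcur.types as module_types only when c is mcur.this."""
    if name == "regt":
        return True
    members = c.data + c.func + c.nval + c.nref
    if any(m.name == name for m in members):
        return True
    return any(t.name == name for t in module_types)


def fid_collision(types, fid):
    """True if any function member of any class already uses this non-zero FID."""
    return any(f.fid != 0 and f.fid == fid for c in types for f in c.func)


def class_level_violation(data, ccur, module):
    """Checks only the last element of the list, as in the definition above."""
    if not data:
        return False
    last = data[-1]
    return last.clv >= ccur.clv and last.mlv < module.clv
```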
The k1md function
k1md = %s".k1md" 2SP CID ; The first 8 bytes are: 2E 6B 31 61 6D 20 20 21
Verifies input as a Document and begins declaration of a module (at level 0, finalized).
- <CID> mcid
- Class identifier of the declared module.
The instruction MUST be the first line of a Document. It is special in that it has no preceding whitespace. It MUST NOT appear on any line other than the first.
Modules are singleton objects. Nodes have at most one instance of a given module, which is shared by all module implementations on that node. The instance is stored in volatile memory of the module's context.
Each module has a set of predefined function members. These functions are defined in another section of this document.
A module declaration declares a class. Thus, modules also have predefined functions of a class.
Implementations MUST return failure if any of the following is true:
- version is higher (more) than 0.
- mcur.this.cid is not mid.
Implementations MUST modify the state as follows:
- Set mcur.this.clv to 0.
- Set mcur.this.name to this.
- Set mfin to true.
- Set ccur to mcur.this.
- Set fcur to nothing.
- Set text to ccur.text.
- Insert predefined functions to mcur.this.func.
The text function
Updates the active text buffer(s).
- <NAME> name
- Buffer name.
- <TAGS> tags
- Tags.
The default buffer is named markdown.
Example:
This will go into the `markdown` buffer.
.text html
<p>This will go into the <code>html</code> buffer.</p>
.text ia32
.text amd64 +multi
This will go into both "ia32" and "amd64" buffers.
Implementations MUST modify the state as follows:
- Set format to type.
The load function
References an external module.
- <CID> mid
- Module identifier.
- <UINT> mlv
- Required minimum level of the module.
- <NAME> name (optional)
- Alias for the module.
Modules reference items declared in other modules. A reference to a module that has not been loaded is invalid.
Item references are processed after all modules are loaded. This instruction does not have to appear before a reference. It is RECOMMENDED that it appears at the beginning of a document.
Implementations MUST return failure if any of the following is true:
- mid is nil.
- mlv is equal to or greater than 2^8.
- name was given and any module import i in mcur.mload is such that: i.name is equal to name.
Implementations MUST modify the state as follows:
- Find a module import i in mcur.mload, where i.mid is equal to mid.
- If found and i.mlv is less than mlv:
- Set i.mlv to mlv.
- Create a new module import i.
- Set i.mid to mid.
- Set i.mlv to mlv.
- Set i.name to name.
- Insert i into mcur.mload.
The mlvl function
Increases the current level of the module.
- <UINT> mlv
- Module level.
- <TAGS> tags
- Tags associated with the level.
The new level applies to all items declared afterward. Module levels may only be increased.
tags contains either +final or +draft. If the level is final, no changes to it (and lower ones) will ever be made. This rule refers to the resulting tree of declared items; descriptions of the items are not considered.
What this means is that if a Document declares the same module as another, already known Document, then the declared items in both of these documents MUST be the same, except for their textual descriptions.
Implementations MUST return failure if any of the following is true:
- mlv is equal to or greater than 2^8.
- mlv is less than mcur.this.clv.
- mlv is equal to 0, mcur.this.clv is equal to 0 and any of the following is true: mcur.this.desc is not empty; mcur.this.data is not empty; mcur.this.func is not empty; mcur.this.nref is not empty; mcur.this.nval is not empty; mcur.this.ifaces is not empty; mcur.types has more than 1 element.
- tags contains both +final and +draft.
- tags contains neither +final nor +draft.
- tags contains +final and mfin is false.
Implementations MUST modify the state as follows:
- Set ccur to mcur.this.
- Set fcur to nothing.
- Set text to ccur.text.
- Set mfin to false if +draft is in tags.
- Set ccur.clv to mlv.
The cbeg function
Begins a class declaration.
- <NAME> name
- Name of the class.
- <TAGS> tags
- Type of the class.
- <CID> cid (optional)
- Identifier of the class.
By default, cid is set to a version 5 UUID (namespace, SHA-1) with the namespace being the module's class identifier and the name being name (encoded in UTF-8).
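The default identifier can be computed with Python's standard uuid module, whose uuid5 function implements the version 5 (SHA-1, namespaced) construction. The sketch below assumes that class identifiers are handled as ordinary UUID values; the function name and the module identifier in the usage note are made up for illustration.

```python
import uuid


def default_class_id(module_cid, class_name):
    """Version 5 UUID: namespace = the module's class identifier, name = class name."""
    # uuid5 encodes a str name as UTF-8 before hashing.
    return uuid.uuid5(module_cid, class_name)
```

Calling default_class_id(uuid.UUID("12345678-1234-5678-1234-567812345678"), "buffer") always returns the same identifier for that module and name, so two Documents declaring the same class agree on it without writing it out.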
If cid is nil, the class will not have a descriptor. It is not possible to allocate an instance of such a class. These classes are only used as an object type.
If tags contains +iface, the class is an interface. Interfaces require cid for their identification. It is not possible to allocate an instance of an interface. These classes can be used as an object type.
Each class has a set of predefined function members. These functions are defined in another section of this document.
Implementations MUST return failure if any of the following is true:
- tags contains +iface and cid is nil.
- Root name collision occurs between mcur and name.
- Any class c in mcur.types is such that: c.name is equal to name and c.cid is not equal to cid.
- cid is not nil and any class c in mcur.types is such that: c.cid is equal to cid and c.name is not equal to name.
Implementations MUST modify the state as follows:
- Find the class c in mcur.types, where c.name is equal to name.
- If c was not found:
- Set c to a newly created class.
- Insert c into mcur.types.
- Set c.cid to cid.
- Set c.clv to 0.
- Set c.name to name.
- Set c.tags to tags.
- Set ccur to c.
- Set fcur to nothing.
- Set text to c.text.
- Insert predefined functions to ccur.func.
The cend function
Explicitly ends a class declaration.
No parameters.
Implementations MUST modify the state as follows:
- Set ccur to mcur.this.
- Set fcur to nothing.
- Set text to mcur.this.text.
The clvl function
Modifies the current level of the current class.
- <UINT> level
- New level.
- <TAGS> tags (optional)
- Tags.
The presence of +fini in tags specifies that the instance at this class level has a destructor. The destructor is also implied whenever any of the data members of the class at the level has a destructor because the members' destructors are called from the class destructor.
Implementations MUST return failure if any of the following is true:
- ccur refers to mcur.this.
- level is equal to or greater than 2^8.
- tags contains +fini and class level violation occurs in ccur.data.
- tags contains +fini and any function member f in ccur.func is such that: f.clv is equal to level and f.name is equal to _fini.
Implementations MUST modify the state as follows:
- Set ccur.clv to level.
- Set fcur to nothing.
- Set text to ccur.text.
- If tags does not contain +fini, finish.
- Let f be a new function member representation.
- Set f.mlv to mcur.this.clv.
- Set f.clv to level.
- Set f.name to _fini.
- Let fid be the default FID of f.
- Verify that FID collision does not occur against fid.
- Set f.fid to fid.
- Insert f into ccur.func.
The creg function
Assigns a register type to a class.
- <REGT> type
- Register type.
This instruction declares that the class is a register class.
The class is assigned a register type and gains two predefined function members for transferring a value from a register to object data and vice-versa, called save and load respectively. Otherwise, there is no mapping between registers and object data. Object data is nothing more than an array of opaque octets.
Any class MAY define functions with these names. Their parameters and semantics are predefined only for register classes.
Implementations MUST modify the state as follows:
- Return failure if any of the following is true:
  - ccur.regt is not empty.
  - class level violation occurs in ccur.data.
  - item name collision occurs between ccur and save.
  - item name collision occurs between ccur and load.
- Set ccur.regt to type.
- Set ccur.rclv to ccur.clv.
- Create a new function member save.
- Set save.name to save.
- Set save.fid to the default FID of save.
- Return failure if FID collision occurs against save.fid.
- Create a new function parameter ireg.
- Set ireg.type to (a copy of) type.
- Set ireg.name to reg.
- Append ireg to save.params.
- Insert save into ccur.func.
- Create a new function member load.
- Insert +read into load.tags.
- Set load.name to load.
- Set load.fid to the default FID of load.
- Return failure if FID collision occurs against load.fid.
- Create a new function parameter oreg.
- Set oreg.type to (a copy of) type.
- Set oreg.name to reg.
- Insert +output into oreg.tags.
- Append oreg to load.params.
- Insert load into ccur.func.
The cond function
Declares a named condition for the current class.
- <NAME> name
- Name of the condition.
- <EXPR> expr
- Expression of the condition.
Implementations MUST modify the state as follows:
- Return failure if any condition c in ccur.cond is such that c.name is equal to name.
- Create a new condition c.
- Set c.name to name.
- Set c.expr to expr.
- Insert c into ccur.cond.
The desc and data functions
Declares the next data member, in memory order, of an interface descriptor (desc) or of an instance (data).
Classes which are interfaces have, in addition to the usual instance of the class, another object called an interface descriptor. A class that implements an interface includes the interface descriptor as part of its class descriptor and declares an instance of the (interface) class as one of its data members.
Interface descriptors are data used by programs which manipulate instances of unknown classes via their set of implemented interfaces.
All members of a class descriptor are read-only, constant values. The values of an interface descriptor vary by module implementation. Data in a class descriptor is valid for every instance of the class.
- <TYPE> type
- Type of the object.
- <NAME> name
- Name of the object.
- <ALEN> alen (optional)
- Length of an array.
- <VAL> value (optional, only in data)
- Default value.
- <UINT> align (optional)
- Memory alignment.
- <CREF> cond (optional)
- Condition reference. The member exists if the condition holds.
- <TAGS> tags (optional)
- Tags.
- <NAME> uref (optional)
- Reference to a previously declared union.
When omitted, alen.max is zero.
When omitted, value is the undefined value.
When omitted, align is zero.
When omitted, cond is empty.
When omitted, tags is empty.
When omitted, uref is empty.
If alen.min is less than alen.max and alen.ref is empty, then the currently declared data member SHOULD be the last data member declared at the current class level. The minimum amount is only a hint to the programmer in this case.
If alen.ref is not empty, then the allocated space of the declared data member is known only at program run time. Thus, memory addresses of subsequent data members are variables, too.
The current union of list data is the data union which begins from the last element d in data, such that d.tags does not contain sameaddr.
Implementations MUST modify the state as follows:
- Let data be an object reference.
- Let prev be an object reference.
- If in the desc function:
  - Return failure if ccur.tags does not contain iface.
  - Return failure if any data member d in ccur.desc is such that d.name is equal to name.
  - Make data refer to ccur.desc.
- If in the data function:
  - Return failure if item name collision occurs between ccur and name.
  - Make data refer to ccur.data.
- Return failure if any of the following is true:
  - align is neither zero nor a memory alignment.
  - alen.max is non-zero and alen is an invalid array length.
  - tags contains +sameaddr and any of:
    - data is empty.
    - cond is empty, alen.max is non-zero and alen.ref is not empty.
    - value is not the undefined value.
  - tags does not contain sameaddr and uref is not empty.
  - tags contains sametext and any of:
    - data is empty.
    - uref is not empty.
  - Class level violation occurs in data.
- Create a new data member d.
- Set d.mlv to mcur.this.clv.
- Set d.clv to ccur.clv.
- Set d.tags to tags.
- Set d.type to type.
- Set d.alen to alen.
- Set d.align to align.
- Set d.value to value.
- Set d.cond to cond.
- Set d.name to name.
- If uref is not empty:
  - Find a data member u in data, u.name of which is equal to uref.
  - Return failure if not found.
  - Return failure if u.tags contains sameaddr.
  - Insert d after the last element of the data union that begins with u.
  - Make prev refer to u.
- Otherwise:
  - Make prev refer to the last element in data.
  - Append d to data.
- Make fcur refer to nothing.
- Make text refer to prev.text if tags contains sametext. Otherwise, make text refer to d.text.
The nval function
Declares a named value.
- <NAME> name
- Name of the value.
- <VAL> value
- Named value.
Implementations MUST return failure if any of the following is true:
- item name collision occurs between ccur and name.
Implementations MUST modify the state as follows:
- Create a new named value v.
- Set v.name to name.
- Set v.value to value.
- Append v to ccur.nval.
- Set fcur to nothing.
- Set text to v.text.
The nref function
Declares a name for an item reference (an alias).
- <NAME> name
- Name for an item.
- <IREF> iref
- Reference to the item.
Implementations MUST return failure if any of the following is true:
- item name collision occurs between ccur and name.
Implementations MUST modify the state as follows:
- Create a new named reference r.
- Set r.name to name.
- Set r.iref to iref.
- Insert r into ccur.nref.
- Set fcur to nothing.
- Set text to r.text.
The fbeg function
Begins declaration of a function member.
- <NAME> name
- Name of the function.
- <TAGS> tags (optional)
- Tags.
- <FID> fid (optional)
- Identifier of the function.
- <FIDN> fidn1 (optional)
- Identifier of the first autogenerated function.
- <FIDN> fidn2 (optional)
- Identifier of the second autogenerated function.
The following tags may be specified in tags:
+message
- Message function declaration.
+proto
- Function prototype declaration.
+event
- Event declaration.
+init
- Class constructor declaration.
+static
- Function is independent of an instance.
+read
- Function does not write to an instance.
+module
- Function requires access to module memory.
+kernel
- Function requires access to kernel memory.
+more
- Function expects more parameters than declared.
The tags +message, +proto, +event and +init are mutually exclusive.
If tags contains +message, a function map is declared, from a pair of a language tag and a character coding to a function, which takes the declared parameters and returns a human-readable message. A module implementation SHOULD implement only one function per language. The kernel automatically converts the encoding to the requested one. The functions are local to an implementation and MAY NOT be exported. The FID of the function identifies the array of implemented functions. These functions are accessed through the kernel via a special call.
If tags contains +proto, only a function type is declared. No function is declared and no FID is allocated. A class may reference the prototype in order to declare one.
If tags contains +init, two functions are declared. The FID fid is assigned to the constructor. It is named name, which SHOULD begin with the word init. The function returns a status boolean and takes an implied read-write handle to an instance as the first parameter, called this. The FID named create is assigned to the creator. The name of the function is name with $create appended. The creator takes an implied set of parameters inserted at the front, which determine the location and type of allocated memory, where an instance will be constructed by the constructor. The just-constructed object is then returned to the caller.
If tags contains +event, three items are declared: a function prototype for an event handler, a handler installer function and a handler uninstaller function. The FID fid is assigned to the event prototype. The event prototype has an implicit first parameter: a read-write handle to an undefined, previously installed object. The prototype is named name. The FID named install is assigned to the event installer. It returns a status boolean and takes two parameters: a function reference to a handler and a read-write handle to an object. The object is passed as the first argument to the handler. The function is named name with $install appended. The FID named uninstall is assigned to the event uninstaller. It returns a status boolean and takes one parameter: a function reference to a handler to be uninstalled. The function is named name with $uninstall appended.
Implementations MUST conform to the following algorithm:
- Return failure if any of the following is true:
  - tags contains both +static and +read.
  - ccur refers to mcur.this and tags contains +read.
  - item name collision occurs between ccur and name.
- If tags contains +message:
  - If fid was omitted, set fid to the default FID of a member named name of ccur.
  - Return failure if any of the following is true:
    - tags contains +proto, +event or +init.
    - Either fidn1 or fidn2 was not omitted.
    - FID collision occurs against fid.
- If tags contains +proto:
  - Return failure if any of the following is true:
    - tags contains +message, +event or +init.
    - tags contains +module or +kernel.
    - Either fid, fidn1 or fidn2 was not omitted.
- If tags contains +init:
  - Let nameC be a copy of name.
  - Append $create to nameC.
  - If fid was omitted, set fid to the default FID of a member named name of ccur.
  - If fidn1 was not omitted, return failure if the associated name is not equal to create.
  - If fidn1 was omitted, set fidn1 to the default FID of a member named nameC of ccur.
  - Return failure if any of the following is true:
    - tags contains +message, +proto or +event.
    - fidn2 was not omitted.
    - FID collision occurs against fid or fidn1.
- If tags contains +event:
  - Let fidI be an invalid function identifier (zero).
  - Let fidU be an invalid function identifier (zero).
  - Let nameI be a copy of name.
  - Let nameU be a copy of name.
  - Append $install to nameI.
  - Append $uninstall to nameU.
  - If fidn1 was not omitted and its associated name is equal to install, set fidI to fidn1.
  - If fidI is invalid and fidn2 was not omitted and its associated name is equal to install, set fidI to fidn2.
  - If fidI is invalid, set fidI to the default FID of a member named nameI of ccur.
  - If fidn1 was not omitted and its associated name is equal to uninstall, set fidU to fidn1.
  - If fidU is invalid and fidn2 was not omitted and its associated name is equal to uninstall, set fidU to fidn2.
  - If fidU is invalid, set fidU to the default FID of a member named nameU of ccur.
  - Return failure if any of the following is true:
    - tags contains +message, +proto or +init.
    - tags contains +read.
    - ccur does not refer to mcur.this and tags contains +static but neither +module nor +kernel.
    - fid was not omitted.
    - fidn1 is named neither install nor uninstall.
    - fidn2 is named neither install nor uninstall.
    - Both fidn1 and fidn2 are named install.
    - Both fidn1 and fidn2 are named uninstall.
    - FID collision occurs against fidI or fidU.
  - Create a new function member fi.
  - Set fi.tags to a copy of tags.
  - Insert +$install into fi.tags.
  - Remove +more from fi.tags.
  - Set fi.mlv to mcur.this.clv.
  - Set fi.clv to ccur.clv.
  - Set fi.fid to fidI.
  - Set fi.name to nameI.
  - Insert fi into ccur.func.
  - Create a new function member fu.
  - Set fu.tags to a copy of tags.
  - Insert +$uninstall into fu.tags.
  - Remove +more from fu.tags.
  - Set fu.mlv to mcur.this.clv.
  - Set fu.clv to ccur.clv.
  - Set fu.fid to fidU.
  - Set fu.name to nameU.
  - Insert fu into ccur.func.
  - Create a new function parameter p.
  - Set p.type to read.
  - Set p.name to handler.
  - Append a copy of p to fi.params.
  - Append a copy of p to fu.params.
  - Set p.type to a rdwr>.
  - Set p.name to userdata.
  - Append p to fi.params.
  - Insert +proto into tags.
  - Remove +module from tags.
  - Remove +kernel from tags.
- If tags contains +proto, set fid to zero.
- If ccur refers to mcur.this, insert +static into tags.
- Create a new function member f.
- Set f.mlv to mcur.this.clv.
- Set f.clv to ccur.clv.
- Set f.fid to fid.
- Set f.tags to tags.
- Set f.name to name.
- Insert f into ccur.func.
- If tags contains +message:
  - Create a new function parameter p.
  - Insert +output into p.tags.
  - Set p.type to rdwr>.
  - Set p.name to message.
  - Append p to f.params.
  - Create a new function parameter p.
  - Set p.type to FID.
  - Set p.name to enc_and_lang.
  - Append p to f.params.
- Set fcur to f.
- Set text to f.text.
- Return success.
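The way named identifiers are matched to the installer and uninstaller of an event can be sketched in Python as below. This covers only the FID selection and the naming checks; FID collision checks and tag handling are omitted. Representing an <FIDN> as a (name, fid) pair and passing the default-FID computation in as a callable are assumptions of this sketch, since neither is defined in this section.

```python
def resolve_event_fids(name, fidn1, fidn2, default_fid):
    """Return (fidI, fidU) for an +event declaration.

    fidn1 and fidn2 are None when omitted, otherwise (fid_name, fid) pairs.
    default_fid is a hypothetical callable mapping a member name to its default FID.
    """
    given = [f for f in (fidn1, fidn2) if f is not None]
    names = [fid_name for fid_name, _ in given]
    if any(n not in ("install", "uninstall") for n in names):
        raise ValueError("event FIDs must be named install or uninstall")
    if len(set(names)) != len(names):
        raise ValueError("install or uninstall was named twice")
    explicit = dict(given)
    # Zero is the invalid FID, so it falls through to the default.
    fid_i = explicit.get("install", 0) or default_fid(name + "$install")
    fid_u = explicit.get("uninstall", 0) or default_fid(name + "$uninstall")
    return fid_i, fid_u
```

The order in which the two named identifiers are written does not matter: each is matched to its function by name.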
The fend function
Explicitly ends a function declaration.
No parameters.
Implementations MUST modify the state as follows:
- Set fcur to nothing.
- Set text to ccur.text.
The ferr function
Declares an error code that the current function may return.
- <NAME> name
- Name of the error message.
- <FID> sid (optional)
- Function identifier of the error message.
If the function does not return an error code, then the document simply does not declare any error codes.
Implementations MUST conform to the following algorithm:
- If fid is invalid or was omitted, set fid to the default FID of a member named name of a module.
- Return failure if any of the following is true:
- fcur is nothing.
- fcur.tags contains +message.
- fcur.tags contains +event.
- For each error code code in fcur.codes: Return failure if code.fid is equal to fid.
- Create a new error code code.
- Set code.name to name.
- Set code.fid to fid.
- Append code to fcur.codes.
- Set text to f.text.
- Return success.
The fpar function
Declares the next in-order parameter of the current function.
- <ARGT> type
- Type of the parameter.
- <NAME> name
- Name of the parameter.
- <TAGS> tags (optional)
- Parameter tags.
Function arguments are objects passed by either handle or value.
Objects passed by value are specified by simply writing a class reference.
.fpar module.class:0 by_value
If a class is a register class, the value is passed via one register. Otherwise, a reference to a local copy of the object is the value. Objects passed by value MUST NOT be longer than 4096 octets.
Objects passed by handle are written by specifying the handle.
.fpar read<module.class:0> by_handle
By default, a parameter is an input parameter. In order to declare an output parameter, include +output in tags. No other tag is recognized.
.fpar module.class:0 output_value +output
The order in which parameters are declared is significant. It is RECOMMENDED to declare output parameters first, then input parameters.
Implementations MUST conform to the following algorithm:
- Return failure if any of the following is true:
- fcur is nothing.
- name is this.
- For each function parameter p in fcur.params: Return failure if p.name is equal to name.
- Create a new function parameter p.
- Set p.tags to tags.
- Set p.type to type.
- Set p.name to name.
- Append p to fcur.params.
- Set text to p.text.
- Return success.
The impf function
Declares an implementation of a prototyped function.
- <IREF> proto
- Reference to a prototype function.
- <NAME> name
- Name of the function (the implementation).
- <TAGS> tags (optional)
- Tags.
- <FID> fid (optional)
- Identifier of the function.
Prototype implementations are marked with a special tag. The referenced prototype is stored as the return value type.
The following tags may be specified in tags:
+static
- Function is independent of an instance.
+module
- Function requires access to module memory.
+kernel
- Function requires access to kernel memory.
Implementations MUST return failure if any of the following is true:
- tags contains +proto.
- tags contains +event.
- tags contains +message.
- tags contains +init.
- tags contains +read.
- tags contains +more.
- item name collision occurs between ccur and name.
- FID collision occurs against fid.
Implementations MUST modify the state as follows:
- Insert +$protoref into tags.
- Create a new function member f.
- Set f.mlv to mcur.this.clv.
- Set f.clv to ccur.clv.
- Set f.fid to fid.
- Set f.tags to tags.
- Set f.name to name.
- Create a new function parameter p.
- Set p.type to proto.
- Append p into f.params.
- Insert f into ccur.func.
- Set fcur to nothing.
- Set text to f.text.
The impc function
Declares that the current class implements an interface.
- <TYPE> type
- Reference to the implemented interface.
- <MEMB> mref (optional)
- Reference to the data member associated with the interface.
An interface object is an instance of type.
If mref is omitted, then the class does not declare any data member as the interface object of the interface. This is only permitted if type has no data members.
The interface object MUST NOT be preceded by a variable-length member. The offset must be the same value for every instance of the class.
Module implementations set the values of interface descriptor fields via the implementation's class definition document. A RECOMMENDED syntax for these documents is proposed in another section.
Implementations MUST return failure if any of the following is true:
- ccur.tags contains +iface.
Implementations MUST modify the state as follows:
- Create a new interface reference iface.
- Set iface.type to type.
- Set iface.mref to mref.
- Set iface.clv to ccur.clv.
- Insert iface into ccur.ifaces.
- Set fcur to nothing.
- Set text to ccur.text.
The path function
Declares a path to an external resource.
- <PATH> path
- Path to the resource.
Implementations MUST return failure if any of the following is true:
- Any path p in mcur.paths is such that: p.path is equal to path.
Implementations MUST modify the state as follows:
- Create a new path p.
- Set p.path to path and p.mlv to mcur.this.clv.
- Insert p into mcur.paths.
- Set cbeg to nothing.
- Set func to nothing.
- Set data to nothing.
- Set text to p.text.
Postprocessing
Check this section later when it's written.
Register names
Specials
boolean
- A boolean value; either "true" or "false".
cmprval (comparison result value)
- Result of a comparison; one of: "error", "less than", "same as" or "more than".
Unsigned integers
u8
- Integer in the range [0, 2^8-1].
u16
- Integer in the range [0, 2^16-1].
u32
- Integer in the range [0, 2^32-1].
u64
- Integer in the range [0, 2^64-1].
u128
- Integer in the range [0, 2^128-1].
Signed integers
i8
- Integer in the range [-2^7, 2^7-1].
i16
- Integer in the range [-2^15, 2^15-1].
i32
- Integer in the range [-2^31, 2^31-1].
i64
- Integer in the range [-2^63, 2^63-1].
i128
- Integer in the range [-2^127, 2^127-1].
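The ranges of the integer register types follow directly from their bit widths. A small Python sketch, with hypothetical function names:

```python
def unsigned_range(bits):
    """Inclusive range of an unsigned integer register type of the given width."""
    return 0, 2**bits - 1


def signed_range(bits):
    """Inclusive range of a signed integer register type of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
```

For example, unsigned_range(8) gives (0, 255) for u8 and signed_range(8) gives (-128, 127) for i8.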
Vectors of integers
PLACEHOLDER.
Binary floating-point numbers
f16
- IEEE 754 arithmetic format with base=2, p=11, emax=15.
f32
- IEEE 754 arithmetic format with base=2, p=24, emax=127.
f64
- IEEE 754 arithmetic format with base=2, p=53, emax=1023.
f80x87
- IEEE 754 arithmetic format with base=2, p=64, emax=16383.
f128
- IEEE 754 arithmetic format with base=2, p=113, emax=16383.
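The parameters p and emax determine the remaining limits of a binary format: in IEEE 754, emin is 1 - emax, the largest finite value is (2 - 2^(1-p)) * 2^emax and the smallest positive normal value is 2^emin. The Python sketch below computes these exactly with fractions; the function name is hypothetical.

```python
from fractions import Fraction


def binary_format_limits(p, emax):
    """Return (emin, largest finite value, smallest positive normal value)."""
    emin = 1 - emax
    largest = (2 - Fraction(1, 2 ** (p - 1))) * 2**emax
    smallest_normal = Fraction(2) ** emin
    return emin, largest, smallest_normal
```

With the f64 parameters (p=53, emax=1023) the largest value converts to the same number as the largest finite Python float.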
Vectors of binary floating-point numbers
PLACEHOLDER.
Decimal floating-point numbers
d32
- IEEE 754 arithmetic format with base=10, p=7, emax=96.
d64
- IEEE 754 arithmetic format with base=10, p=16, emax=384.
d128
- IEEE 754 arithmetic format with base=10, p=34, emax=6144.
Predefined classes
Predefined classes have no associated class level.
Their names are written with capital letters.
Octet
There is only one fundamental type: an OCTET. All classes are essentially arrays of octets.
An octet occupies one memory address, under which there are at least 8 bits. If there are more than 8 bits, the excess bits MUST be cleared. There is no meaning associated with the bits of an octet.
Length of a class is expressed in octets.
Its register type is a vector of 8 bits. It is mapped to an 8-bit unsigned integer in practice.
Boolean
A BOOLEAN is either true (non-zero) or false (zero).
The length of a boolean is 1 octet.
Its memory alignment is 1 octet.
Its register type is a 1-bit unsigned integer. It is mapped to an 8-bit unsigned integer in practice.
Status boolean
A STATUS is a special boolean. It is used as the return value of functions.
If false, it means that there is nothing to report (success). Otherwise (if true), it means that the task's status stack was pushed onto. The caller should examine the stack before continuing.
The idea is that programs are written like this:
if function() returns true {
    code when function reports something, usually failure
}
otherwise, continue
If the function returns a status boolean, it is implied that the code in the if-block is an unlikely branch, because functions are assumed to generally execute successfully.
A status code is returned via the status stack along with other, supplementary information.
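The calling pattern can be sketched in C++ as follows. This is only an illustration under stated assumptions: the status stack is modelled as a plain vector of codes, and `status_stack`, `halve`, `halve_twice` and the status code 1 are hypothetical names and values, not part of this specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using STATUS = bool;  // false: nothing to report; true: the status stack was pushed onto

// Hypothetical stand-in for the task's status stack.
static std::vector<std::uint32_t> status_stack;

// Hypothetical function: halves an even value in place, reports an odd one.
STATUS halve(std::uint32_t& value) {
    if (value % 2 != 0) {
        status_stack.push_back(1);  // illustrative status code
        return true;
    }
    value /= 2;
    return false;
}

// Caller written in the pattern above.
// Returns 0 on success, otherwise the reported status code.
std::uint32_t halve_twice(std::uint32_t& value) {
    for (int i = 0; i < 2; ++i) {
        if (halve(value)) {
            // Unlikely branch: examine the stack before continuing.
            std::uint32_t code = status_stack.back();
            status_stack.pop_back();
            return code;
        }
    }
    return 0;
}
```

The successful path contains no stack access at all; the stack is touched only inside the branch taken when the function reports something.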
Comparison result
A CMPRVAL is the result of a comparison. It is used as the return value of functions.
The length of a comparison result is 1 octet.
Its memory alignment is 1 octet.
Its register type is a 2-bit signed integer. It is mapped to an 8-bit signed integer in practice.
Possible values of a comparison result, when comparing object LHS against object RHS, are:
- 0
- The objects are equal.
- 1 (and greater)
- LHS is greater than RHS.
- -1
- Comparison failed. Interpreted as a true status boolean.
- -2 (and less)
- LHS is less than RHS.
Simply put, one first tests for -1 (unsuccessful) and then compares with 0.
Functions that return this value also set associated CPU flags accordingly, so that a conditional jump may immediately follow the function call.
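The order of tests described above (first -1, then comparison with 0) might look like this in C++. It is a sketch, assuming the 8-bit signed mapping; `compare`, `classify` and `Order` are hypothetical names, and the null-pointer failure condition is invented for illustration only.

```cpp
#include <cassert>
#include <cstdint>

using CMPRVAL = std::int8_t;  // mapped to an 8-bit signed integer in practice

// Hypothetical comparison function; it fails when either operand is missing.
CMPRVAL compare(const int* lhs, const int* rhs) {
    if (lhs == nullptr || rhs == nullptr) return -1;  // comparison failed
    if (*lhs == *rhs) return 0;                       // equal
    return *lhs > *rhs ? 1 : -2;                      // greater : less
}

enum class Order { Failed, Less, Equal, Greater };

// Interprets a comparison result: test for -1 first, then compare with 0.
Order classify(CMPRVAL result) {
    if (result == -1) return Order::Failed;
    if (result == 0) return Order::Equal;
    return result > 0 ? Order::Greater : Order::Less;
}
```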
Object length
The length of an objsize is 4 octets.
Its memory alignment is 4 octets.
Its register type is a 32-bit unsigned integer.
It represents a length of an object, in octets. Octets are ordered in increasing order of significance.
Memory address
The length of an ADDRESS is 8 octets.
Its memory alignment is 8 octets.
Its register type is a 64-bit unsigned integer.
It represents a memory address. Octets are ordered in increasing order of significance.
Function identifier
The length of an FID is 8 octets.
Its memory alignment is 8 octets.
Its register type is a 64-bit unsigned integer.
It represents a function identifier. Octets are ordered in increasing order of significance.
16-octet identifier
The length of an ID16 is 16 octets.
Its memory alignment is 8 octets.
It has no register type.
It is an opaque array of 16 octets.
System memory reference
The length of a HANDLE is 32 octets.
Its memory alignment is 8 octets.
Its register type is a CPU-defined virtual memory reference.
Its data members are as follows:
.cbeg HANDLE !NOID
.data ADDRESS address
.data ID16 node_id
.data OCTET nonce [8]
address
- Lower bits of the system memory address.
node_id
- Higher bits of the system memory address.
nonce
- Random value associated with the referenced object.
Loading from and saving into a handle are kernel functions, which translate the system memory address in the handle from and to a virtual memory address of the calling task.
This is not an object identifier. The address may point to any octet within an object.
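The stated length and alignment follow from the member list, which a host-side mirror of the structure can verify at compile time. This is a sketch, not a normative definition: the struct name `Handle` is hypothetical, and storing `address` as a native 64-bit integer assumes a little-endian host (octets in increasing order of significance).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative host-side mirror of HANDLE.
struct alignas(8) Handle {
    std::uint64_t address;      // ADDRESS: lower bits of the system memory address
    std::uint8_t  node_id[16];  // ID16: higher bits of the system memory address
    std::uint8_t  nonce[8];     // random value associated with the referenced object
};

static_assert(sizeof(Handle) == 32, "HANDLE is 32 octets long");
static_assert(alignof(Handle) == 8, "HANDLE is aligned to 8 octets");
static_assert(offsetof(Handle, node_id) == 8, "node_id follows address");
static_assert(offsetof(Handle, nonce) == 24, "nonce follows node_id");
```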
Module reference
The length of an MREF is 24 octets.
Its memory alignment is 8 octets.
It has no register type.
Its data members are as follows:
.cbeg MREF !NOID
.data ID16 mcid
.data OCTET mclv
.data OCTET mbid [8] +sameaddr
mcid
- Class identifier of the referenced module.
mclv
- 8-bit unsigned integer. Minimum class level of the module.
mbid
- Build identifier of the target M-Build.
A reference is either to a specific M-Build by its build identifier or to any M-Build which implements the given module at the given level.
A build identifier is an 8-octet (64-bit) value, which is computed from relevant parts of a program image. The exact way to compute it depends on the image format.
The class level is considered when the octets of the mbid array at positions [1,7] have all of their bits cleared. Thus, such build identifiers are out of range and invalid.
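The distinction between the two kinds of reference can be sketched as follows. Two assumptions are made here and are not confirmed by the text: that +sameaddr places mclv at the same address as the first octet of mbid (which is consistent with the 24-octet length), and that [1,7] means positions 1 through 7 inclusive. The names `MRef`, `refers_by_level` and `min_class_level` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative host-side mirror of MREF.
struct alignas(8) MRef {
    std::uint8_t mcid[16];  // class identifier of the referenced module
    std::uint8_t mbid[8];   // build identifier; mclv is assumed to overlay mbid[0]
};

static_assert(sizeof(MRef) == 24, "MREF is 24 octets long");
static_assert(offsetof(MRef, mbid) == 16, "mbid follows mcid");

// True when the reference selects any M-Build at a minimum class level,
// i.e. when octets 1..7 of mbid have all of their bits cleared.
bool refers_by_level(const MRef& ref) {
    for (std::size_t i = 1; i < 8; ++i) {
        if (ref.mbid[i] != 0) return false;
    }
    return true;
}

// Minimum class level; meaningful only when refers_by_level(ref) is true.
std::uint8_t min_class_level(const MRef& ref) {
    return ref.mbid[0];
}
```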
Function reference
The length of an FREF is 32 octets.
Its memory alignment is 8 octets.
It has no register type.
Its data members are as follows:
.cbeg FREF !NOID
.data MREF mref
.data FID fid
mref
- Module reference.
fid
- Function identifier.
Interface descriptor
The length of an IFACE is variable. The minimum length is 24 octets.
Its memory alignment is 8 octets.
It has no register type.
Its data members are as follows:
.cbeg IFACE !NOID
.data ID16 cid
.data objsize clv_len
.data objsize offset
.data OCTET members [0:MAX]
cid
- Class identifier of the interface.
clv_len
- Class level and total length of the descriptor in octets. The level is in the most significant 8 bits. The length is in the remaining least significant 24 bits. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
offset
- Offset from the beginning of a class instance, specifying the location of the associated interface object. If equal to 4294967295, then there is no such object.
members
- Data members of the interface descriptor.
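Packing and unpacking clv_len, as described above, could be done like this. The sketch assumes the value is held in a 32-bit unsigned integer; the function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Value of offset that means "there is no associated interface object".
constexpr std::uint32_t NO_INTERFACE_OBJECT = 4294967295u;

// Packs a class level and a descriptor length into clv_len.
// The length is rounded up to a multiple of 8; it must fit in 24 bits after rounding.
constexpr std::uint32_t pack_clv_len(std::uint8_t level, std::uint32_t length) {
    return (std::uint32_t{level} << 24) | (((length + 7u) & ~std::uint32_t{7}) & 0x00FFFFFFu);
}

// Class level: the most significant 8 bits.
constexpr std::uint8_t iface_level(std::uint32_t clv_len) {
    return static_cast<std::uint8_t>(clv_len >> 24);
}

// Descriptor length in octets: the least significant 24 bits.
// Also the offset from this descriptor to the next one.
constexpr std::uint32_t iface_length(std::uint32_t clv_len) {
    return clv_len & 0x00FFFFFFu;
}

static_assert(pack_clv_len(3, 27) == 0x03000020u, "27 octets round up to 32");
```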
Class descriptor
The length of a CLASS is variable. The minimum length is 32 octets.
Its memory alignment is 8 octets.
It has no register type.
It is a structure defined as follows:
.cbeg CLASS !NOID
.data ID16 cid
.data objsize len_dsc
.data objsize len_min
.data objsize len_max
.data OCTET align
.data OCTET clv
.data OCTET flags
.data OCTET ifaces_len
.data IFACE ifaces [ifaces_len:MAX]
All OCTET-typed fields are interpreted as 8-bit unsigned integers.
cid
- Class identifier.
len_dsc
- Total length of the descriptor, in octets. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
len_min
- Minimum length of an instance, in octets.
len_max
- Maximum length of an instance, in octets.
align
- Alignment exponent.
clv
- Class level.
flags
- Vector of 8 flag bits.
ifaces_len
- Number of implemented interfaces.
ifaces
- Descriptors of implemented interfaces.
The class descriptor for level n of a class is directly followed by the descriptor for level n-1.
The descriptor for level n contains only those interfaces that were introduced at level n.
The defined flag bits in flags are, counted from the least significant:
- bit 0
- The class has a destructor.
- bit 1
- The class contains handles. (It has an accessor.)
- bits 2-7
- Undefined. Must be cleared.
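Decoding the flags octet might be sketched as below. The constant and function names are hypothetical. The helper for align additionally assumes that "alignment exponent" means an alignment of 2 raised to that exponent, in octets, which the text does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint8_t FLAG_HAS_DESTRUCTOR = 0x01;  // bit 0
constexpr std::uint8_t FLAG_HAS_HANDLES    = 0x02;  // bit 1
constexpr std::uint8_t FLAG_UNDEFINED_MASK = 0xFC;  // bits 2-7

constexpr bool has_destructor(std::uint8_t flags) {
    return (flags & FLAG_HAS_DESTRUCTOR) != 0;
}

// A class that contains handles has an accessor.
constexpr bool has_handles(std::uint8_t flags) {
    return (flags & FLAG_HAS_HANDLES) != 0;
}

// The undefined bits must be cleared.
constexpr bool flags_valid(std::uint8_t flags) {
    return (flags & FLAG_UNDEFINED_MASK) == 0;
}

// Assumption: alignment in octets is 2^align. Requires exponent < 64.
constexpr std::uint64_t alignment_in_octets(std::uint8_t exponent) {
    return std::uint64_t{1} << exponent;
}
```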
Predefined functions
This section defines predefined functions.
Each predefined function name begins with a U+005F LOW LINE character (_), which cannot normally be part of an item name. The function is also marked with a tag named $predef. Register functions are additionally marked with a tag named $reg.
All functions except module functions operate on an instance. The first parameter (read-write handle) is omitted in definitions.
Register functions operate directly on memory of an object. They are defined in case the value does not need to be loaded. Most of them can be implemented with one or two CPU instructions.
Register types also have two or three kinds of corresponding functions that operate entirely on CPU registers and do not reference memory. Names for these do not begin with _ (as this distinction is unnecessary) and have a digit appended to the name:
- Set with 1 appended, which behaves the same as the predefined function.
- Set with 2 appended, which returns the result instead of overwriting and discards the defined return value.
- Set with 3 appended, which returns the result instead of overwriting and accepts an additional parameter for the defined return value, if any.
In C++ syntax with u8 and u32 as register types:
u8 c; u32 v, s;
c = v.add1(2);    // v = v + 2; c = carry bit;
s = v.add2(2);    // v = v    ;              ; s = v + 2
s = v.add3(2, c); // v = v    ; c = carry bit; s = v + 2
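A compilable rendering of the three variants could look like the following. It is a sketch under assumptions: the wrapper type `U32` is hypothetical, and passing the carry as a `bool&` output parameter is one possible reading of "accepts an additional parameter for the defined return value".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper around a 32-bit unsigned register value.
struct U32 {
    std::uint32_t value;

    // add1: overwrites the value with the sum and returns the carry bit.
    bool add1(std::uint32_t addend) {
        const std::uint32_t sum = value + addend;  // wraps modulo 2^32
        const bool carry = sum < value;
        value = sum;
        return carry;
    }

    // add2: returns the sum, leaves the value unchanged, discards the carry bit.
    std::uint32_t add2(std::uint32_t addend) const {
        return value + addend;
    }

    // add3: returns the sum, leaves the value unchanged, stores the carry bit in `carry`.
    std::uint32_t add3(std::uint32_t addend, bool& carry) const {
        const std::uint32_t sum = value + addend;
        carry = sum < value;
        return sum;
    }
};
```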
Module functions
Module functions can only be called by the kernel. They require access to the module's context (where the instance is kept).
Creator
- Name
_create
- Parameters
- None.
- Return value
- No-access handle to the instance.
This function is called when there is no instance available and a function that requires access to the module's context is being called. An implementation at the lowest loaded level is chosen.
It creates a new instance of the module.
The instance becomes the current instance of the node. It remains in memory until the node shuts down or the instance is upgraded or downgraded.
Upgrader
- Name
_upgrade
- Parameters
- No-access handle to the old instance.
- Return value
- No-access handle to the new instance.
This function is called after an implementation is unloaded and the lowest level of all module implementations is higher than the level of the current instance. An implementation at the lowest loaded level is chosen.
The current instance is locked before the call and unlocked after.
This function MAY fail, in which case a null handle is returned. Failure has no consequences.
Downgrader
- Name
_downgrade
- Parameters
- No-access handle to the old instance.
- Target class level.
- Return value
- No-access handle to the new instance.
This function is called before an implementation is loaded or unloaded and the highest level of all module implementations would be lower than the level of the current instance. An implementation at the current instance level or higher is chosen.
The current instance is locked before the call and unlocked after.
This function MAY fail, in which case a null handle is returned. Failure prevents the loading of a lower level implementation.
Class functions
Class functions can only be called by the kernel.
Destructor
- Name
_destruct
- Parameters
- Read-write handle to the instance.
- Return value
- Status boolean.
The destructor is called when there are no more references from a memory context to the class instance.
The destructor is called in a new task.
The kernel deallocates memory of the instance after this function returns.
Mutual exclusion
- Name
_lock
- Parameters
- None.
- Return value
- None.
Acquires a lock on the object for exclusive access.
If the caller (task) disappears, the object is unlocked.
- Name
_unlock
- Parameters
- None.
- Return value
- None.
Releases a previously acquired exclusive access lock on the object.
The task fails if the object has not been locked by the caller.
Memory access updater
- Name
_access
- Parameters
- Read-only handle to the instance.
- Memory context change (unsigned integer).
- Return value
- Status boolean.
This function is called when a memory context gains or loses access to the instance; it propagates the call to relevant referenced objects.
This function executes within the same context as the initial caller. The instance is locked before the call and unlocked after.
This MUST be an automatically generated function which is exactly the same (machine code is the same) regardless of implementation. One reason is to be able to verify that the machine code is correct. The function is crucial for correct node operation.
Register functions
This section defines register functions.
Load and save
These functions apply to every register class.
- Name
_load
- Parameters
- None.
- Return value
- Register type; current value.
This function loads the value from memory into CPU registers.
- Name
_save
- Parameters
- Register type; new value.
- Return value
- None.
This function saves the value from CPU registers into memory.
Bitwise operations
These functions apply to every register class.
- Name
_not
- Parameters
- None.
- Return value
- None.
Performs the bitwise NOT operation (logical negation on each bit) on the value, then saves the result into the object, overwriting the initial value.
- Name
_and
- Parameters
- Second value; register type.
- Return value
- None.
Performs the bitwise AND operation (logical conjunction on each pair of corresponding bits) on the value and the second value, then saves the result into the object, overwriting the initial value.
- Name
_xor
- Parameters
- Second value; register type.
- Return value
- None.
Performs the bitwise XOR operation (exclusive disjunction on each pair of corresponding bits) on the value and the second value, then saves the result into the object, overwriting the initial value.
- Name
_set
- Parameters
- Second value; register type.
- Return value
- None.
Performs the bitwise OR operation (logical disjunction on each pair of corresponding bits) on the value and the second value, then saves the result into the object, overwriting the initial value.
- Name
_clr
- Parameters
- Second value; register type.
- Return value
- None.
Performs the bitwise AND operation on the value and the second value, after performing the bitwise NOT operation on the second value, then saves the result into the object, overwriting the initial value.
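The five bitwise functions can be sketched for an 8-bit register class as follows. The `reg_` prefixed names are hypothetical stand-ins for the predefined functions, and a C++ reference models the object in memory.

```cpp
#include <cassert>
#include <cstdint>

// Each function overwrites the object's value with the result.
void reg_not(std::uint8_t& object) {
    object = static_cast<std::uint8_t>(~object);
}
void reg_and(std::uint8_t& object, std::uint8_t second) {
    object = static_cast<std::uint8_t>(object & second);
}
void reg_xor(std::uint8_t& object, std::uint8_t second) {
    object = static_cast<std::uint8_t>(object ^ second);
}
// _set: bitwise OR, i.e. sets the bits that are set in the second value.
void reg_set(std::uint8_t& object, std::uint8_t second) {
    object = static_cast<std::uint8_t>(object | second);
}
// _clr: bitwise AND with the negated second value, i.e. clears those bits.
void reg_clr(std::uint8_t& object, std::uint8_t second) {
    object = static_cast<std::uint8_t>(object & ~second);
}
```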
Logical bit shifts
These functions apply to every register class.
- Name
_lsl
- Parameters
- Amount of bits; unsigned integer of width N.
- Return value
- None.
Performs a logical left shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
- Name
_lsr
- Parameters
- Amount of bits; unsigned integer of width N.
- Return value
- None.
Performs a logical right shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
Circular bit shifts
These functions apply to every register class.
- Name
_csl
- Parameters
- Amount of bits; unsigned integer of width N.
- Return value
- None.
Performs a circular left shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
- Name
_csr
- Parameters
- Amount of bits; unsigned integer of width N.
- Return value
- None.
Performs a circular right shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
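A circular shift can be built from two logical shifts, as in this sketch for a 32-bit register class. The `reg_` names are hypothetical. Reducing the shift amount modulo the width is an assumption made here; the text does not say how amounts of 32 or more are treated.

```cpp
#include <cassert>
#include <cstdint>

// Circular left shift: bits shifted out on the left re-enter on the right.
void reg_csl(std::uint32_t& object, unsigned amount) {
    amount %= 32;  // assumption: amounts wrap around the register width
    if (amount != 0) {
        object = (object << amount) | (object >> (32 - amount));
    }
}

// Circular right shift: bits shifted out on the right re-enter on the left.
void reg_csr(std::uint32_t& object, unsigned amount) {
    amount %= 32;  // assumption: amounts wrap around the register width
    if (amount != 0) {
        object = (object >> amount) | (object << (32 - amount));
    }
}
```

The check for a zero amount avoids shifting by the full width, which is undefined behaviour in C++.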
Arithmetic shifts
These functions apply only to signed integers.
- Name
_asl
- Parameters
- Amount of bits; unsigned integer of width N.
- Return value
- None.
Performs an arithmetic left shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
- Name
_asr
- Parameters
- Amount of bits; unsigned integer of width N.
- Return value
- None.
Performs an arithmetic right shift on the value by the amount of bits, then saves the result into the object, overwriting the initial value.
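The difference from the logical shifts is that an arithmetic right shift replicates the sign bit. A sketch for a 32-bit signed register class follows; the `reg_` names are hypothetical, the shift amount is assumed to be less than 32, and the final conversion back to a signed type relies on two's-complement wrapping (guaranteed only since C++20, but universal in practice).

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic left shift: same bit pattern as a logical left shift.
void reg_asl(std::int32_t& object, unsigned amount) {
    const std::uint32_t bits = static_cast<std::uint32_t>(object) << amount;
    object = static_cast<std::int32_t>(bits);
}

// Arithmetic right shift: vacated high bits are filled with the sign bit.
void reg_asr(std::int32_t& object, unsigned amount) {
    std::uint32_t bits = static_cast<std::uint32_t>(object) >> amount;
    if (object < 0 && amount != 0) {
        bits |= ~std::uint32_t{0} << (32 - amount);  // replicate the sign bit
    }
    object = static_cast<std::int32_t>(bits);
}
```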
Fundamental arithmetic
These functions apply to values that are single numbers.
- Name
_neg
- Parameters
- None.
- Return value
- None.
Performs an arithmetic negation (sign inversion) on the value, then saves the result into the object, overwriting the initial value.
This function does nothing if the number is unsigned.
- Name
_add
- Parameters
- Addend; register type.
- Return value
- For unsigned integers: carry bit; boolean.
- For signed integers: overflow bit; boolean.
Performs an arithmetic addition on the value and the addend, then saves the sum into the object, overwriting the initial value.
- Name
_sub
- Parameters
- Subtrahend; register type.
- Return value
- For unsigned integers: carry bit; boolean.
- For signed integers: overflow bit; boolean.
Performs an arithmetic subtraction on the value (minuend) and the subtrahend, then saves the difference into the object, overwriting the initial value.
- Name
_mul
- Parameters
- Multiplier; register type.
- Return value
- For integers: higher half bits of the product; register type.
Performs an arithmetic multiplication on the value and the multiplier, then saves the lower half bits of the product into the object, overwriting the initial value.
- Name
_div
- Parameters
- Divisor; register type.
- Return value
- For integers: remainder; register type.
Performs an arithmetic division on the value (dividend) and the divisor, then saves the quotient into the object, overwriting the initial value.
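The return values of the arithmetic functions can be illustrated for 32-bit integers as follows. This is a sketch, not the definitive behaviour: the `reg_` names are hypothetical, division assumes a non-zero divisor (the text does not say what happens otherwise), and the signed sum is stored with two's-complement wrapping.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Unsigned addition: stores the sum, returns the carry bit.
bool reg_add_unsigned(std::uint32_t& object, std::uint32_t addend) {
    const std::uint32_t sum = object + addend;  // wraps modulo 2^32
    const bool carry = sum < object;
    object = sum;
    return carry;
}

// Signed addition: stores the wrapped sum, returns the overflow bit.
bool reg_add_signed(std::int32_t& object, std::int32_t addend) {
    const std::int64_t wide = std::int64_t{object} + addend;
    const bool overflow = wide < std::numeric_limits<std::int32_t>::min() ||
                          wide > std::numeric_limits<std::int32_t>::max();
    object = static_cast<std::int32_t>(static_cast<std::uint32_t>(wide));
    return overflow;
}

// Multiplication: stores the lower half of the product, returns the higher half.
std::uint32_t reg_mul(std::uint32_t& object, std::uint32_t multiplier) {
    const std::uint64_t product = std::uint64_t{object} * multiplier;
    object = static_cast<std::uint32_t>(product);
    return static_cast<std::uint32_t>(product >> 32);
}

// Division: stores the quotient, returns the remainder. Requires divisor != 0.
std::uint32_t reg_div(std::uint32_t& object, std::uint32_t divisor) {
    const std::uint32_t remainder = object % divisor;
    object /= divisor;
    return remainder;
}
```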
Document identification
This section contains information on how to identify and mark Documents as such in their respective systems.
Data Format Descriptor
Data Format Descriptor for Documents is TO-BE-DEFINED.
Documents have the class #Document.
Internet Media Type
Media type of Documents is text/prs.kagomeko.k1os.
The charset parameter MUST be included with the value UTF-8.
References
Hyperlinks in the document will point here in a later revision.
Development considerations
Modules ought to follow the KISS (Keep It Simple, Stupid) rule. They are to be narrow in scope so that their development ends one day, the module becomes finalized and no more levels are ever added to it. If it does not need revision, it means it’s good and can be safely used.
Having too much functionality in a module makes it more unstable. Stability is the most important trait every module author ought to pursue. Modules that are constantly revised are broken by design. Such modules ought to be scrapped, made obsolete, and then redesigned as new modules (with a new identifier).
The scale of a module should ideally be small enough for one person to not become mentally exhausted (burned out) while implementing it alone. There should be many implementations available that a user can choose from.
One should differentiate between a module and a software project. The module ought to be a part of the project, not the project itself. For example, if a Python interpreter were to be a module, then Python 2 and Python 3 interpreters would be different modules, which might be developed as part of the same project.
They are different because the most crucial parts (the parser and the interpreter) are different for version 2 and for version 3. The other reason is that Python 2 is being phased out in favour of 3. Keeping both 2 and 3 in the same module is counter-productive. Remember that items, once defined, cannot be removed from a module.
Some modules are published and standardized for the general public. Other modules may be known only to a select few, or be created as part of development or a user session and have a lifetime of only a few hours.