Introduction
This document defines the syntax and semantics of a Kueea Abstract Machine Version 1 Module Declaration Document, shortened throughout this document to just "Document".
One Document declares one Kueea AMv1 Module. The syntax is designed to be human-readable and fairly easy to read and write using even the simplest text editors.
Document Processors are programs which take Documents as input. The primary output is source files in a given programming language. Other outputs include module documentation in HTML or other formats. They are part of toolchains that generate Kueea AMv1 M-Build images.
Keywords
The key words ‘MUST,’ ‘MUST NOT,’ ‘REQUIRED,’ ‘SHALL,’ ‘SHALL NOT,’ ‘SHOULD,’ ‘SHOULD NOT,’ ‘RECOMMENDED,’ ‘NOT RECOMMENDED,’ ‘MAY,’ and ‘OPTIONAL’ in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Document
A Document is a sequence of Unicode characters. The encoding of characters MUST be UTF-8.
Lines are sequences of characters separated by a sequence of two characters (in order): U+000D CARRIAGE RETURN and U+000A LINE FEED.
Whitespace is either U+0020 SPACE or U+0009 HORIZONTAL TAB.
Documents are processed line by line. The maximum length of a line is 1024 code units (bytes), including the line separator.
Three types of data may appear on a line: instructions, comments and text.
Instructions
An instruction is a line directed at the Document Processor. It begins with optional whitespace followed by one U+002E FULL STOP character, the instruction name and its arguments. Each argument is preceded by at least one whitespace character.
Instruction names and arguments are case-sensitive.
Instruction names are sequences of four small Latin letters.
Instruction character set is limited to the range [U+0000, U+007F].
Examples:
.inst arg1 arg2
.inst arg1
Comments
A comment is a line which is discarded. There are two kinds of comments.
A one-line comment begins with optional whitespace followed by a single U+0023 NUMBER SIGN character, that is, one that is not immediately followed by a second U+0023 NUMBER SIGN.
Examples:
# This is a one-line comment.
# This is a one-line comment.
THIS IS NOT A COMMENT.
# # # This is a one-line comment.
A multi-line comment begins and ends with a line beginning with optional whitespace followed by 2 consecutive U+0023 NUMBER SIGN characters.
Examples:
## This is the first line of a multi-line comment.
This is a comment.
.This is a comment.
# # This is inside a multi-line comment.
## This is the last line of a multi-line comment.
THIS IS NOT A COMMENT.
Text
Any other line is text - secondary data associated with the current item, stored in a named buffer.
It is OPTIONAL for a Document Processor to process text.
Syntax and semantics of text are out of scope of this document.
By default, text is a human-readable textual description of the associated item, in Markdown. [MARKDOWN]
Examples:
.item example1
Description of the example1 item.
Description of the example1 item.
.item example2
Description of the example2 item.
Line indentation
The leading whitespace of an instruction sets the amount of leading whitespace that is ignored in the text that follows it.
Both whitespace characters count as one unit of indentation. Choose a single indentation character for the whole document. U+0020 SPACE is RECOMMENDED because the visual presentation of tabs varies.
Consider the following example:
   .inst first
   Line 1-1.
     Line 1-2.
  Line 1-3.
.inst second
   Line 2-1.
     Line 2-2.
Line 2-3.
The first line is an instruction indented by 3 whitespace characters. The ignored indentation becomes 3.
The second line is text indented by 3 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has no leading whitespace characters.
The third line is text indented by 5 whitespace characters. The parser removes the first 3 whitespace characters. The resulting line has 2 leading whitespace characters.
The fourth line is text indented by 2 whitespace characters. The parser removes the first 3 whitespace characters. In this case, the line has fewer whitespace characters than that, so all of them are removed. The resulting line has 0 leading whitespace characters.
Text for the first item is thus:
Line 1-1.
  Line 1-2.
Line 1-3.
The fifth line is an instruction indented by 0 whitespace characters. The ignored indentation becomes 0.
The sixth line is text indented by 3 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 3 leading whitespace characters.
The seventh line is text indented by 5 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 5 leading whitespace characters.
The eighth line is text indented by 0 whitespace characters. The parser removes the first 0 whitespace characters. The resulting line has 0 leading whitespace characters.
Text for the second item is thus:
   Line 2-1.
     Line 2-2.
Line 2-3.
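The walk-through above can be reproduced with a short sketch. This is only an illustration in Python, assuming the lines are already split and that any line whose first non-whitespace character is a full stop is an instruction; comments and the backslash escape are left out on purpose. The names leading_whitespace and strip_indentation are hypothetical and not part of this document.

```python
WHITESPACE = " \t"  # U+0020 SPACE and U+0009 HORIZONTAL TAB, each counting as one


def leading_whitespace(line: str) -> int:
    """Count the whitespace characters at the beginning of a line."""
    return len(line) - len(line.lstrip(WHITESPACE))


def strip_indentation(lines):
    """Group text lines per instruction and remove the ignored indentation."""
    items = []  # list of (instruction, [text lines])
    skip = 0    # ignored indentation set by the most recent instruction
    for line in lines:
        indent = leading_whitespace(line)
        if line[indent:].startswith("."):
            skip = indent
            items.append((line.strip(WHITESPACE), []))
        elif items:
            # Remove at most `skip` characters; fewer when the line has fewer.
            items[-1][1].append(line[min(skip, indent):])
    return items


example = [
    "   .inst first",
    "   Line 1-1.",
    "     Line 1-2.",
    "  Line 1-3.",
    ".inst second",
    "   Line 2-1.",
    "     Line 2-2.",
    "Line 2-3.",
]
result = strip_indentation(example)
```

Here result[0] holds the text of the first item and result[1] the text of the second item, each with the indentation described in the example.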
Formal syntax (ABNF)
The following <MDOC> rule expresses the syntax in ABNF.
MDOC = k1md CRLF *( line CRLF ) ; k1md defined in another section
line = comm / inst / text
comm = cmul / cone
cmul = *WSP "##" *OCTET CRLF *WSP "##" *OCTET
cone = *WSP "#" *OCTET
inst = *WSP "." *( WSP / VCHAR )
text = [ *WSP "\" ] *OCTET
ABNF rules are referenced in prose like this: <rule>.
The first non-whitespace character of <text> MAY be a U+005C REVERSE SOLIDUS. If so, the '\' character is removed before further processing of the line. This is an escape mechanism in case the line begins with '#' or '.'.
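As an illustration, the classification of a single line might be sketched as follows in Python. The function name classify and the returned labels are hypothetical; the sketch assumes the line separator has already been removed and that the caller tracks whether a multi-line comment is currently open.

```python
def classify(line: str):
    """Classify one line (without its CRLF) as a (kind, payload) pair."""
    body = line.lstrip(" \t")
    indent = len(line) - len(body)
    if body.startswith("##"):
        # Opens or closes a multi-line comment; the caller keeps that state.
        return ("comment-delimiter", None)
    if body.startswith("#"):
        return ("comment", None)  # one-line comment, discarded
    if body.startswith("."):
        # Instruction: drop the full stop and surrounding whitespace.
        return ("instruction", body[1:].rstrip(" \t"))
    if body.startswith("\\"):
        # Escape: remove only the backslash, keep everything else as text.
        return ("text", line[:indent] + body[1:])
    return ("text", line)
```

For example, classify("\\.not an instruction") yields a text line whose content starts with a full stop, which is the purpose of the escape mechanism.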
Objects
An object is a set of variables. Variables of an object are referenced in prose like this: object.variable (using a dot separator).
Objects are categorized as items and non-items.
Non-items
This section defines non-item objects.
These objects are part of items.
Characters
A character is a Unicode code point.
Object reference
An object reference references another object.
When an object reference refers to nothing, it means that the reference is set to a value that does not reference any objects.
Containers
Some objects are stored in a container object.
A list is an ordered container, which means that position of elements in a list is significant.
A set is an unordered container.
A pair is a set of two elements.
Class identifier
A class identifier is an object composed of:
- mid: module identifier.
- cno: class number.
A module identifier is a 120-bit value.
A module identifier is nil when all of its bits are cleared; this value is reserved.
If the first (most significant) bit of a module identifier is set, then the module is a standard module. Standard modules are the same on all Abstract Machines.
If the bit is cleared, the module is non-standard. Such modules are local to a given Abstract Machine.
The remaining 119 bits are randomly generated.
A class number is an integer in the range [0,255]. It identifies a class within the module.
Class numbers are assigned by name or explicitly by module authors. The exception is the module class (self), which has class number 0.
Class level
Members of a class are grouped under its levels.
A class level is an integer in the range [0,255].
An instance of a class at level n contains all members of the class declared at levels [0, n].
Memory alignment
An alignment exponent is an integer within the range [0, 31]. It is the exponent e in the expression 2^e, the result of which is the associated memory alignment in octets. Any other value is outside of a memory alignment's value range.
Tag
A tag is a list of characters.
The minimum length of the list is 1.
The maximum length of the list is 16.
Name
A name is a list of characters.
The minimum length of the list is 1.
The maximum length of the list is 64.
Most items have an associated name. Names are local in scope to the document.
Member reference
A member reference is a list of names.
Member references are relative to a class. The list designates the class's member by following the referenced names.
Names are dereferenced in order and there is no way to reference a previously dereferenced name, so infinite recursion never happens.
Item reference
An item reference is a pair of a name and a member reference.
The name contains a module's class identifier, a user-defined module alias or a predefined name.
When the name is empty, the item reference is an unknown item reference.
The member reference is relative to the module's class.
Type reference
A type reference is an object which consists of:
- ref: item reference to a class.
- clv: class level of the referenced class.
- hnd: type of reference.
Possible values of hnd are:
- null: No value set. Type reference is empty.
- regt: Register type.
- creg: Register type of the class.
- cmem: Instance of the class.
- none: Handle with no access rights; just the address. Use to avoid unnecessary duplication of rights.
- read: Handle with read rights.
- rdex: Handle with read and execute rights.
- rdwr: Handle with read and write rights.
- rwex: Handle with read, write and execute rights.
A handle is a system address associated with access rights to the referenced object.
The system gives access as many times as there are handles in a class. A class only needs to have one handle with specific rights to an object. Other handles to the same object SHOULD be of the no-access type in order to avoid unnecessary computations when passing objects around.
If ref is an unknown item reference and hnd designates a handle, the handle is to an unspecified object.
Array length
An array length is an object which consists of:
- ref: member reference to the data member which stores the amount of elements.
- min: minimum amount of elements.
- max: maximum amount of elements.
Amount of elements is a 32-bit unsigned binary integer.
The min and max values are used in calculation of the minimum length and the possible maximum length of a class instance.
If max is zero, an array is not declared. If it is equal to 4294967295 (2^32 - 1), the value represents the maximum possible amount of elements; otherwise, the value is the given number.
The data member referenced by ref MUST be an instance of a register class with an unsigned integer register type. This member's memory address MUST be lower than the array's, i.e. it must be declared before the array. Both min and max MUST NOT be greater than the maximum value representable by the referenced member's class.
The maximum possible amount of elements is either 4294967295 divided by the maximum class length of an element, rounded down, or the maximum value representable by the type of the data member referenced by ref if ref is not empty.
An invalid array length is one of which either:
- min is greater than max.
- min is equal to max and ref is not empty.
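The two invalidity rules above translate directly into a small check. The following Python sketch is illustrative only; the ArrayLength class and the is_invalid function are hypothetical names, and an empty tuple is assumed to stand for an empty member reference.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ArrayLength:
    ref: Tuple[str, ...] = ()  # member reference; empty tuple means "empty"
    min: int = 0               # minimum amount of elements
    max: int = 0               # maximum amount of elements (0: no array)


def is_invalid(alen: ArrayLength) -> bool:
    """Apply the two rules for an invalid array length."""
    if alen.min > alen.max:
        return True
    # A fixed-size array leaves nothing for a length member to store.
    return alen.min == alen.max and bool(alen.ref)
```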
Value
Representation of a value is implementation-dependent. This document only defines the syntax of a value in an instruction.
A value may be the undefined value. This state represents an omitted parameter. (There is no syntax for representing it in a Document.)
Condition
A condition is an object which consists of:
- expr: expression of the condition.
- name: name of the condition.
An expression is an object which consists of:
- exop: operator of the expression.
- data: data of the expression determined by exop.
Each expression evaluates to either true or false.
A condition holds when its expression evaluates to true.
Logical expression
data for exop values 'any is true', 'any is false', 'all are true', and 'all are false' is a logical expression, which is an object that consists of:
- exp: list of expressions, evaluated in list order.
An empty list evaluates to true.
Comparison
data for exop values 'is equal to', 'is not equal to', 'is less than', 'is less than or is equal to', 'is greater than', and 'is greater than or is equal to' is a comparison, which is an object that consists of:
- ref: member reference to the data member which is compared with the value,
- val: the value.
The type of the referenced member SHOULD be a register class. Implementations MAY omit support for values that are arrays or instances of non-register classes.
Condition reference
data for exop values 'condition holds' and 'condition does not hold' is a condition reference, which is an object that consists of:
- cnd: name of a condition.
Path component
A path component is a list of characters.
The minimum length of the list is 7.
The maximum length of the list is 1024.
Path components name external (out-of-memory) resources.
Interface reference
An interface reference is an object which consists of:
- type: data member type of the interface.
- mref: member reference to a data member.
- clv: class level of the associated class.
Module import
A module import is an object which consists of:
- mid: a module identifier.
- mclv: minimum class level of the module class.
- name: an alias, name for the module.
Module
A module is an object which consists of:
- self: object reference to the module's class.
- types: set of classes.
- paths: set of path components.
- mload: set of module imports.
Function number
A function number is a 64-bit unsigned binary integer.
There MUST NOT be two functions with the same number within a module.
By default, the value is the result of passing a character string, constructed from the module's class identifier and relevant item names, to the 64-bit FNV-1a (Fowler-Noll-Vo) hash function:
hash = 0xCBF29CE484222325
for each octet_of_data to be hashed
    hash = hash XOR octet_of_data
    hash = hash * 0x100000001B3
# BEG DOCUMENT SPECIFIC
if hash == 0 then
    hash = 0xFFFFFFFFFFFFFFFF
# END DOCUMENT SPECIFIC
return hash
The value 0 is invalid; it can be used as an undefined value.
Examples:
Input string | Output value |
---|---|
module_func | 0x0F7E93E1AF686350 |
class$00$function | 0x2862790D0CE9E837 |
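The pseudocode above can be written out in Python as a minimal sketch. Python integers are unbounded, so the 64-bit wrap-around that the pseudocode leaves implicit is applied here with an explicit mask. The function names fnv1a_64 and default_function_number are hypothetical.

```python
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Plain 64-bit FNV-1a over a sequence of octets."""
    h = FNV_OFFSET_BASIS
    for octet in data:
        h ^= octet
        h = (h * FNV_PRIME) & MASK64  # keep the product within 64 bits
    return h


def default_function_number(text: str) -> int:
    """FNV-1a of the UTF-8 encoded string, with the document-specific zero fix."""
    h = fnv1a_64(text.encode("utf-8"))
    return MASK64 if h == 0 else h  # 0 is invalid, so it is remapped
```

Calling default_function_number("module_func") hashes the first input string of the table above.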
Function numbers should only be explicitly declared in case of a hash collision. The encoding of the string is the same as for the document (UTF-8), although it is practically limited to just the first 128 code points (U+0000 to U+007F).
The input character string for the default function number (FID) of a function depends on the function's location within the item tree. If the function is a member of the module's class, the FID is computed over the name of the function only. Otherwise, the input string is computed as follows:
- Let s be an empty character string.
- Let c be the class.
- Let f be the function member.
- Append c.name to s.
- Append a U+0024 DOLLAR SIGN character to s.
- If c.clv is less than 16, append a U+0030 DIGIT ZERO to s.
- Append the hexadecimal representation of c.clv, using character ranges [U+0030,U+0039] and [U+0041,U+0046] (0-9 and A-F) for the values [0,15], to s.
- Append a U+0024 DOLLAR SIGN character to s.
- Append f.name to s.
- Return s as the input character string.
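The steps above amount to joining the class name, a two-digit uppercase hexadecimal class level and the function name with dollar signs. A minimal Python sketch follows; the function name and its parameters are hypothetical, and in_module_class is an assumed flag for the case where the function is a member of the module's class.

```python
def function_input_string(class_name: str, class_level: int,
                          function_name: str,
                          in_module_class: bool = False) -> str:
    """Build the string that is hashed into the default function number."""
    if in_module_class:
        # Members of the module's class are hashed by name only.
        return function_name
    if not 0 <= class_level <= 255:
        raise ValueError("class level must be in the range [0, 255]")
    # 02X pads levels below 16 with a zero and uses the digits 0-9 and A-F.
    return f"{class_name}${class_level:02X}${function_name}"
```

With a class named class at level 0 and a function named function, the result is class$00$function, which matches the second input string in the examples table.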
Items
This section defines item objects.
All items contain a text variable, which is a list of objects, each of which consists of:
- data: list of characters,
- format: name; format of the data.
When appending an object n to text, where o is the last object in text: if n.format equals o.format, n.data MAY be appended to o.data with a preceding line separator instead of appending a new object.
Data member
A data member is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- tags: set of tags.
- type: type reference.
- alen: array length.
- align: memory alignment or zero.
- value: default value.
- cond: name of a condition.
- name: name of the object.
The alignment of data member d is a 32-bit unsigned binary integer computed as follows:
- Return the default alignment of class referenced by type.ref at level type.clv if d.align is equal to zero.
- Return d.align.
The minimum length of data member d is a 32-bit unsigned binary integer computed as follows:
- Let ltype be a 32-bit unsigned binary integer.
- Set ltype to the minimum class length of class referenced by d.type.ref at level d.type.clv.
- Return ltype if d.alen.max is equal to zero.
- Return zero if d.alen.min is equal to zero.
- Let len be a 34-bit unsigned binary integer.
- Let off be a 34-bit unsigned binary integer.
- Let rem be a 32-bit unsigned binary integer.
- Set len to ltype.
- Set rem to d.alen.min.
- Set off to the alignment of d.
- Subtract one from off.
- While rem is greater than one:
  - Add off to len.
  - Set len to the binary AND of len and the binary NOT of off.
  - Add ltype to len.
  - Abort the program if len is greater than 4294967295.
  - Subtract one from rem.
- Return len as a 32-bit unsigned binary integer.
The maximum length of data member d is a 32-bit unsigned binary integer computed as follows:
- Let ltype be a 32-bit unsigned binary integer.
- Set ltype to the maximum class length of class referenced by d.type.ref at level d.type.clv.
- Return ltype if d.alen.max is equal to zero.
- Let len be a 34-bit unsigned binary integer.
- Let off be a 34-bit unsigned binary integer.
- Let rem be a 32-bit unsigned binary integer.
- Set len to ltype.
- Set rem to d.alen.max.
- Set off to the alignment of d.
- Subtract one from off.
- While rem is greater than one:
  - Add off to len.
  - Set len to the binary AND of len and the binary NOT of off.
  - Add ltype to len.
  - Set len to 4294967295 if len is greater than 4294967295.
  - Subtract one from rem.
- Return len as a 32-bit unsigned binary integer.
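The two computations above differ only in which class length and element count they start from and in how they react to overflow. The Python sketch below shares the loop between them; all function and parameter names are hypothetical, the class lengths and the alignment are assumed to be already resolved, and the alignment is assumed to be a power of two.

```python
U32_MAX = 0xFFFFFFFF  # 4294967295


def array_octets(ltype: int, count: int, alignment: int, saturate: bool) -> int:
    """Length of `count` elements of `ltype` octets, each placed on `alignment`."""
    off = alignment - 1
    length = ltype
    rem = count
    while rem > 1:
        length = (length + off) & ~off  # round up to the next aligned offset
        length += ltype
        if length > U32_MAX:
            if not saturate:
                raise OverflowError("minimum length exceeds 4294967295")
            # Once saturated the value can no longer change, so stop early.
            return U32_MAX
        rem -= 1
    return length


def minimum_length(min_class_len: int, alen_min: int, alen_max: int,
                   alignment: int) -> int:
    if alen_max == 0:   # not an array
        return min_class_len
    if alen_min == 0:   # the array may be empty
        return 0
    return array_octets(min_class_len, alen_min, alignment, saturate=False)


def maximum_length(max_class_len: int, alen_max: int, alignment: int) -> int:
    if alen_max == 0:   # not an array
        return max_class_len
    return array_octets(max_class_len, alen_max, alignment, saturate=True)
```

For instance, three elements of 3 octets each on a 4-octet alignment occupy offsets 0, 4 and 8, so minimum_length(3, 3, 3, 4) is 11. The loop mirrors the listed steps literally; an implementation that has to handle very large element counts would probably prefer a closed-form calculation.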
The allocated space of a data member is greater than or equal to its minimum length and less than or equal to its maximum length. This is the amount of octets that an instance occupies in memory. The value is a run-time variable if alen.ref is not empty.
The following tags are recognized in tags:
- sameaddr: Member is a non-canonical member of a data union.
- sametext: Item description is shared with the previous member.
A data union is a range in a list of data members, which begins with a member a, inclusive, and ends with a member b, exclusive, such that a.tags does not contain sameaddr and b.tags does not contain sameaddr. The member a is the canonical member of the union.
A non-canonical member of a data union is any data member d, d.tags of which contains sameaddr.
The canonical member of a data union defines the allocated space of the data union as a whole as well as its memory alignment.
The maximum length of a non-canonical member of a data union MUST NOT be greater than the maximum length of the canonical member.
If a non-canonical member d is such that d.alen.max represents the maximum possible amount of elements and d.alen.ref is empty, then the maximum possible amount of elements is the maximum length of the canonical member divided by the maximum length of an element, rounded down.
If alignment of a non-canonical member is greater than the alignment of the canonical member, then the non-canonical member is aligned farther into the union in order to match its memory alignment. This reduces the remaining allocated space available to the member.
If cond is not empty, the member exists only when the referenced condition holds. In case of a canonical member of a data union, the condition applies to the data union in whole.
All non-canonical members d of a data union, d.cond of which is not empty, are mutually exclusive with each other and the canonical member. The canonical member exists if the data union exists and none of conditions referenced by d.cond hold.
All non-canonical members d of a data union, d.cond of which is empty, always exist. These members have their destructors and accessors ignored. Thus, types of such members SHOULD NOT have these.
Function member
A function member is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- fno: function number.
- tags: set of tags.
- params: list of function parameters.
- codes: list of error codes.
- name: name of the function.
Function parameter
A function parameter is an item which consists of:
- tags: set of tags.
- type: type reference.
- name: name of the parameter.
Error code
An error code is an item which consists of:
- name: name of a message function.
- fno: number of the function.
Error codes are declared in a predefined module.
Named value
A named value is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- value: value.
- name: name of the value.
Named reference
A named reference is an item which consists of:
- mlv: class level of the associated module's class.
- clv: class level of the associated class.
- iref: item reference.
- name: name of the reference.
Path
A path is an item which consists of:
- mlv: class level of the associated module's class.
- path: path component.
Class
A class is an item which consists of:
- cid: class identifier.
- clv: class level.
- regt: type reference.
- rclv: class level (for regt).
- desc: list of data members (descriptor).
- data: list of data members (instance).
- cond: set of conditions.
- func: set of function members.
- nref: set of named references.
- nval: set of named values.
- tags: set of tags.
- ifaces: set of interface references.
- name: name of the class.
A class, regt of which is non-empty, is called a register class. It has an implicit reserved member called regt, which is a named reference to the register type referenced by regt.
The default alignment of class c at level lv is computed as follows:
- Let cur be a 32-bit unsigned binary integer.
- Let max be a 32-bit unsigned binary integer.
- Set max to zero.
- For each data member d in c.data:
  - Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
  - Set cur to the alignment of d.
  - Advance to the next element if cur is less than or equal to max.
  - Set max to cur.
- Return max.
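In other words, the default alignment is the largest alignment among the members that are visible at the given level and are not non-canonical union members. A minimal Python sketch, assuming each member's alignment has already been resolved to a number of octets; the DataMember class here is a hypothetical stand-in that carries only the fields the computation needs.

```python
from dataclasses import dataclass


@dataclass
class DataMember:
    clv: int                       # class level at which the member is declared
    alignment: int                 # resolved alignment of the member, in octets
    tags: frozenset = frozenset()  # set of tags, e.g. {"sameaddr"}


def default_alignment(data, lv: int) -> int:
    """Largest alignment of the members that count at level lv."""
    largest = 0
    for d in data:
        if d.clv > lv or "sameaddr" in d.tags:
            continue  # not part of this level, or a non-canonical union member
        if d.alignment > largest:
            largest = d.alignment
    return largest


members = [
    DataMember(clv=0, alignment=4),
    DataMember(clv=1, alignment=8),
    DataMember(clv=0, alignment=16, tags=frozenset({"sameaddr"})),
]
```

With these members, the level 0 alignment is 4 and the level 1 alignment is 8; the 16-octet member is skipped because it carries the sameaddr tag.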
The minimum class length of class c at level lv is computed as follows:
- Let cur be a 32-bit unsigned binary integer.
- Let len be a 33-bit unsigned binary integer.
- Set len to zero.
- For each data member d in c.data:
  - Advance to the next element if d.clv is greater than lv or d.tags contains sameaddr.
  - If d.alen.max is not equal to zero and d.alen.ref is empty and d is not the last element in c.data, set cur to the maximum length of d; otherwise, set cur to the minimum length of d.
  - Add cur to len.
  - Abort the program if len is greater than 4294967295.
- Return len.
The maximum class length of class c at level lv is computed as follows:
- Let cur be a 32-bit unsigned binary integer.
- Let len be a 33-bit unsigned binary integer.
- Set len to zero.
- For each data member d in c.data:
  - Advance to the next element if d.clv is greater than lv.
  - Set cur to the maximum length of d.
  - Add cur to len.
  - Set len to 4294967295 if len is greater than 4294967295.
- Return len.
Document Processor
A Document Processor is a program that uses a Parser to fill a set of loaded modules and then does something with them.
Parser
A Parser consists of a line parser, an instruction parser and an instruction processor.
The state of a Parser consists of the following objects:
- mset: set of loaded modules.
- mcur: object reference to the current module.
- text: object reference to the current item description.
- format: name of the current text format.
The mset is supplied by the caller as an input/output parameter. Additionally, the parser takes four objects as input parameters:
- mcid: the target module identifier.
- mclv: the target module's class level.
- ignoreText: boolean, if true, text lines are ignored.
- fetch: an external function for fetching Documents; input is the target module identifier and level.
The parser loads a module using the following algorithm and then postprocesses the set of loaded modules. Postprocessing MAY be done as part of loading or as a separate step. The result is either success or failure.
- Search mset for a module m such that m.self.cid is equal to mcid.
- If found and m.self.clv is no less than mclv, return success.
- If found and m.self.clv is less than mclv, remove m from mset.
- Obtain an octet stream doc by calling fetch, passing mcid and mclv as arguments.
- If fetch has failed, return failure.
- Set mcur to a newly created module.
- Set mcur.self to a newly created class.
- Insert mcur.self into mcur.types.
- Set mcur.self.cid to mcid.
- Set text to nothing.
- Set format to markdown.
- Invoke the line parser passing doc and ignoreText.
- If the parser failed, delete mcur and return failure.
- If mcur.self.clv is less than mclv, delete mcur and return failure.
- Insert mcur into mset.
- Iterate mcur.mload and recursively call this algorithm with the arguments from the elements; if any of the referenced modules failed to load, return failure.
- Return success.
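The control flow of the loading algorithm can be sketched in Python as shown below. This is a simplified illustration, not a complete Parser: the identifier is treated as an opaque value, mset is a plain list, and the line parser is replaced by a caller-supplied parse_lines callable. ClassItem, Module, load_module and parse_lines are all hypothetical names; the attribute self_ stands for the self variable because self is conventionally reserved in Python.

```python
from dataclasses import dataclass, field


@dataclass
class ClassItem:
    cid: object = None  # identifier, treated as an opaque value here
    clv: int = 0        # class level


@dataclass
class Module:
    self_: ClassItem = field(default_factory=ClassItem)
    mload: list = field(default_factory=list)  # (identifier, minimum level) pairs


def load_module(mset, mcid, mclv, fetch, parse_lines) -> bool:
    """Return True on success, False on failure.

    fetch(mcid, mclv) returns the Document as bytes, or None on failure.
    parse_lines(module, doc) fills the module and returns True on success.
    """
    for m in mset:
        if m.self_.cid == mcid:
            if m.self_.clv >= mclv:
                return True   # already loaded at a sufficient level
            mset.remove(m)    # loaded, but at a level that is too low
            break
    doc = fetch(mcid, mclv)
    if doc is None:
        return False
    mcur = Module()
    mcur.self_.cid = mcid
    if not parse_lines(mcur, doc):
        return False
    if mcur.self_.clv < mclv:
        return False
    mset.append(mcur)
    # Imports are loaded after insertion, so import cycles terminate.
    for mid, level in mcur.mload:
        if not load_module(mset, mid, level, fetch, parse_lines):
            return False
    return True
```

Because the current module is inserted into mset before its imports are processed, two modules that import each other do not cause unbounded recursion in this sketch.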
Line parser
State of the line parser consists of the following objects:
- line: current line (list of characters).
- wslv: current line indentation (integer).
- skip: ignored line indentation (integer).
The line parser takes two objects as input parameters:
- input: byte stream.
- ignoreText: boolean.
The parser loads lines until the end of input. Each loaded line is then processed.
Line loading
To load a line, the parser follows this algorithm:
- Let cr be a boolean.
- Set line to the empty list.
- Set cr to false.
- While input is not empty:
  - Decode the next character c from input.
  - If line is longer than 1024 characters, return failure.
  - Append c to line.
  - If cr is true and c is U+000A LINE FEED, remove the last character in line and return success.
  - If c is U+000D CARRIAGE RETURN, set cr to true; otherwise, set cr to false.
- Return success.
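A minimal Python sketch of line loading follows. It reads from an iterator of already decoded characters and reports failure by raising ValueError; both choices, as well as the name load_line, are assumptions made for illustration. Note one deliberate deviation: the sketch drops both characters of the line separator, whereas the listed step mentions removing only the last character.

```python
def load_line(chars, limit: int = 1024) -> str:
    """Load one line from an iterator of characters, without its CRLF.

    Raises ValueError when the line grows beyond `limit` characters.
    Returns whatever was collected when the input ends without a CRLF.
    """
    line = []
    cr = False  # True when the previous character was a CARRIAGE RETURN
    for c in chars:
        if len(line) > limit:
            raise ValueError("line is too long")
        line.append(c)
        if cr and c == "\n":
            # Assumption: strip the whole separator (CR and LF), not just LF.
            del line[-2:]
            break
        cr = (c == "\r")
    return "".join(line)
```

Calling load_line repeatedly on the same iterator yields consecutive lines; a lone carriage return that is not followed by a line feed stays part of the line.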
Line processing
The line parser counts the amount of whitespace at the beginning of line and stores the resulting value in wslv.
The first line in the document MUST begin with the sequence of eight bytes with values as defined by <k1os>.
Further processing depends on the first character after the whitespace.
If the line is a one-line comment, the line is ignored.
If the line is a multi-line comment, the parser loads subsequent lines until another multi-line comment is encountered. All of these lines are ignored.
If the line is an instruction, skip is set to wslv. Whitespace at the beginning and end of line is removed. The U+002E FULL STOP at the beginning is removed. The instruction parser is then invoked on line. In case of failure, the Parser MUST immediately return failure.
The only remaining possibility is that the line is text. If ignoreText is true, the line is ignored. Otherwise, up to skip whitespace characters are removed from the beginning of line. If the first character after the removal is '\', the character is removed. If text references an object, an object o with o.data equal to line and o.format equal to format is created and appended to text.
Instruction parser
The instruction parser converts the loaded line into a list of typed arguments and invokes the named instruction processor's function.
Instructions begin with a four-letter function name, followed by argument tokens, each preceded by whitespace, as per the <cmd0> rule.
cmd0 = fun0 *( 1*WSP arg0 )
fun0 = 4LOALPHA
arg0 = TAGS / FNO1 / FNO2 / ALEN / LVAL / CREF
arg0 =/ EXPR / UINT / VALT / MEMT / AREF / RREF / NAME
The parser MUST return failure on any unrecognized function or argument, or when parsed arguments do not match the function's arguments.
Argument tokens are defined such that the parser can determine the type of an argument from the first few characters. The order of alternatives in <arg0> is the recommended order of tests.
Argument rules use capital letters by convention.
Unsigned integer
UINT = Udec / Uhex
Udec = 1*20DIGIT
Uhex = %s"0x" 1*16HEXDIGIT
The <UINT> rule represents a 64-bit unsigned binary integer value.
It is in decimal (<Udec>) or in hexadecimal (<Uhex>) notation.
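A minimal Python sketch of parsing a <UINT> token is shown below. The name parse_uint is hypothetical. Two assumptions are made: hexadecimal digits are accepted in either letter case, and a decimal token of up to 20 digits that does not fit into 64 bits is rejected, since the rule itself permits such digit strings.

```python
import re

# Case-sensitive "0x" prefix with 1 to 16 hex digits, or 1 to 20 decimal digits.
UINT_RE = re.compile(r"0x[0-9A-Fa-f]{1,16}|[0-9]{1,20}")
U64_MAX = 0xFFFFFFFFFFFFFFFF


def parse_uint(token: str) -> int:
    """Parse a UINT token into a 64-bit unsigned integer."""
    if UINT_RE.fullmatch(token) is None:
        raise ValueError(f"not a UINT: {token!r}")
    if token.startswith("0x"):
        value = int(token[2:], 16)
    else:
        value = int(token, 10)
    if value > U64_MAX:
        raise ValueError(f"value does not fit into 64 bits: {token!r}")
    return value
```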
Name
NAME = LOALPHA *63( LOALPHA / DIGIT / "_" )
The <NAME> rule represents an item name.
It is a character sequence of up to 64 characters. All letters in a name MUST be small.
Names are used in item references. References are resolved after all documents are fully loaded. The item being referenced MAY be declared in a document later on.
It is RECOMMENDED that names of functions be composed in subject-object-verb order, for example object_units_replace. The recommendation is for consistency and grouping of members. Note that one can generate aliases in camelCase, too, if needed. Source code could be converted back and forth.
Tags
TAGS = tag *( 1*WSP tag )
tag = "+" 1*16LOALPHA
The <TAGS> rule represents a list of tags (<tag>).
They are short character sequences of up to 16 characters. All letters in a tag MUST be small.
Relative item reference
RREF = 1*( "." NAME )
The <RREF> rule represents a relative item reference.
Items are referenced in tree order by their name.
Absolute item reference
AREF = ( Ahex / Atxt ) RREF
Ahex = "!" 2HEXDIGIT 15( [ ( "-" / ":" ) ] 2HEXDIGIT )
Atxt = "." [ NAME ]
The <AREF> rule represents an absolute item reference.
It begins with a reference to the module, either by identifier (<Ahex>) or by name (<Atxt>), and then an item reference (<RREF>) relative to the module.
If the name variant has an empty name, the named module is the currently declared module.
Memory type
TMEM = [ Thnd ] Tobj
Thnd = %s"none:" / %s"read:" / %s"rdex:" / %s"rdwr:" / %s"rwex:"
Tobj = %s"OCTET" / ( %s"C" / %s"T" ) AREF
The <TMEM> rule represents a type reference to a memory type.
If the type of handle (<Thnd>) is specified, the memory type is a handle; the remaining portion specifies the class to which the handle refers.
The class can be either OCTET (an octet; in case of handles, a handle to an octet is a handle to any class), a reference to a class item (when preceded by a C), or a reference to a type definition item (when preceded by a T).
Value type
TVAL = %s"tval:" ( NAME / Tobj )
The <TVAL> rule represents a type reference to a value type.
The type is either given directly by <NAME> or it is derived from the referenced class (<Tobj>).
Parameter type
TPAR = TVAL / TMEM
The <TPAR> rule represents a type reference to any type.
It is defined for function parameters, which can be either memory types or value types.
Function number
The <FNO1> rule represents an unnamed function number.
The <FNO2> rule represents a named function number.
FNO1 = "#" UINT
FNO2 = "#" NAME "#" UINT
Because more than one function may be declared with one instruction, there are two variants: named and unnamed.
The value, when specified as an argument, MUST be non-zero.
Array length
ALEN = "[" [ RREF ":" ] arrl [ ":" arrl ] "]"
arrl = UINT / %s"MAX"
The value obtained by parsing <arrl> into an unsigned integer MUST NOT be greater than 2^31.
The special keyword MAX means the maximum permitted value.
Parsing examples:
[10] => min = 10, max = 10, var = ()
[1:20] => min = 1, max = 20, var = ()
[2:MAX] => min = 2, max = MAX, var = ()
[.len:4:255] => min = 4, max = 255, var = ("len")
[.obj.len:MAX] => min = 0, max = MAX, var = ("obj", "len")
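The parsing examples above can be reproduced with a short Python sketch. The handling of a single length value is inferred from those examples rather than stated by the rule: without a member reference the value is taken as both min and max, and with a member reference min is taken as 0. The names parse_alen and MAX are hypothetical, MAX is assumed to map to 4294967295, and the upper bound on <arrl> values is not enforced here.

```python
import re

MAX = 0xFFFFFFFF  # assumed numeric value of the MAX keyword (4294967295)

_ARRL = r"MAX|0x[0-9A-Fa-f]{1,16}|[0-9]{1,20}"
ALEN_RE = re.compile(
    r"\[(?:(?P<ref>(?:\.[a-z][a-z0-9_]{0,63})+):)?"  # optional RREF and ":"
    rf"(?P<a>{_ARRL})"                                # first length value
    rf"(?::(?P<b>{_ARRL}))?\]"                        # optional second value
)


def _value(token: str) -> int:
    if token == "MAX":
        return MAX
    return int(token[2:], 16) if token.startswith("0x") else int(token, 10)


def parse_alen(token: str):
    """Parse an ALEN token into a (min, max, var) triple."""
    m = ALEN_RE.fullmatch(token)
    if m is None:
        raise ValueError(f"not an array length: {token!r}")
    var = tuple(m.group("ref").split(".")[1:]) if m.group("ref") else ()
    first = _value(m.group("a"))
    if m.group("b") is not None:
        return first, _value(m.group("b")), var
    if var:
        return 0, first, var   # inferred: only the maximum is given
    return first, first, var   # inferred: fixed-length array
```

For example, parse_alen("[.len:4:255]") yields a minimum of 4, a maximum of 255 and the member reference ("len",).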
Literal value
LVAL = "=" vtxt
vtxt = vval / vref / vobj / varr
vval = valf / vali / valu / valc / valb
vref = "&" IREF
vobj = "{" [ NAME "=" vtxt *( "," NAME "=" vtxt ) ] "}"
varr = "[" [ vtxt ] *( "," [ vtxt ] ) "]"
sign = "+" / "-"
vhex = "0x" 1*HEXDIGIT
vdec = 1*DIGIT
fdec = vdec [ "." 1*DIGIT ] [ "e" [ sign ] vdec ]
fbin = vhex [ "." 1*HEXDIGIT ] [ "p" [ sign ] vdec ]
valu = vhex / vdec
vali = sign valu
valf = [ sign ] ( "NaN" / "INF" / fdec / fbin )
valc = %s"ce" / %s"lt" / %s"eq" / %s"gt"
valb = %s"true" / %s"false"
Values specify values of octets on given positions of an object. They can also be assigned to a name, creating a named value for reference.
The <VAL> rule is the most complex one; it contains recursion. Parsers MUST verify that all value lists are correctly terminated. Please study the rules calmly and thoroughly.
A register value can only be assigned to an instance of a register class; it is one of:
- <valf>: real number.
- <vali>: (signed) integer.
- <valu>: unsigned integer.
- <valc>: comparison result value.
- <valb>: boolean.
<valn> references a named value.
<varr> is an array of values. The target object MUST also be an array. The values in the array MUST be valid elements of the object. If the target array is longer, the excess values are undefined. An element MAY be omitted, in which case its value is undefined.
<vobj> is a set of name-value pairs. The names reference a data member of the target object. It is an error if there is no data member with a matching name. Unreferenced members have undefined values.
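As an illustration of the register value rules, the sketch below classifies a scalar token with regular expressions. It is only one possible reading: the grammar is ambiguous for plain digits (both <valu> and <valf> match `10`), so the sketch assumes that the integer rules take priority, that keywords are matched exactly as written, and that hexadecimal digits may be in either case. The name `classify` is hypothetical.

```python
import re

_HEX = r"0x[0-9A-Fa-f]+"
_DEC = r"[0-9]+"

# Rules are tried in order; the order is an assumption, not part of the grammar.
_RULES = [
    ("valb", re.compile(r"true|false")),
    ("valc", re.compile(r"ce|lt|eq|gt")),
    ("valu", re.compile(rf"{_HEX}|{_DEC}")),
    ("vali", re.compile(rf"[+-](?:{_HEX}|{_DEC})")),
    ("valf", re.compile(
        rf"[+-]?(?:NaN|INF"
        rf"|{_DEC}(?:\.[0-9]+)?(?:e[+-]?{_DEC})?"
        rf"|{_HEX}(?:\.[0-9A-Fa-f]+)?(?:p[+-]?{_DEC})?)")),
]


def classify(text):
    """Return the name of the first scalar rule that matches, or None."""
    for name, pattern in _RULES:
        if pattern.fullmatch(text):
            return name
    return None
```

For example, `classify("-42")` returns `"vali"`, whereas `classify("1.5e-3")` falls through to `"valf"`.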
Condition reference
The <CREF> rule represents a condition reference.
CREF = "?" NAME
A condition reference begins with a question mark, followed by a name of a condition.
Expression
The <EXPR> rule represents an expression.
EXPR = "(" ( elog / eref / ecmp ) ")"
elog = elop *( 0*WSP EXPR )
elop = "all=1" / "all=0" / "any=1" / "any=0"
eref = erop 0*WSP CREF
erop = "is=1" / "is=0"
ecmp = ecop 0*WSP RREF 0*WSP VAL
ecop = "eq" / "ne" / "lt" / "le" / "gt" / "ge"
An expression begins with a left parenthesis, followed by the expression operator and its arguments, followed by a right parenthesis.
Input text | Expression type and operator name |
---|---|
all=1 | logical expression, 'all are true' |
all=0 | logical expression, 'all are false' |
any=1 | logical expression, 'any is true' |
any=0 | logical expression, 'any is false' |
eq | comparison; 'is equal to' |
ne | comparison; 'is not equal to' |
lt | comparison; 'is less than' |
le | comparison; 'is less than or is equal to' |
gt | comparison; 'is greater than' |
ge | comparison; 'is greater than or is equal to' |
is=1 | condition reference; 'condition holds' |
is=0 | condition reference; 'condition does not hold' |
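The operator table can be read as the following evaluation rules. This is a minimal sketch, assuming that the operands have already been parsed and evaluated; the function names are hypothetical, and the behaviour for an empty operand list (which the grammar permits) follows Python's `all` and `any` rather than anything stated in this document.

```python
import operator

# Comparison operators from the table, mapped to Python's operator module.
COMPARE = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


def eval_logical(op, operands):
    """Evaluate a logical expression over already-evaluated operands."""
    values = [bool(v) for v in operands]
    if op == "all=1":
        return all(values)          # all are true
    if op == "all=0":
        return not any(values)      # all are false
    if op == "any=1":
        return any(values)          # any is true
    if op == "any=0":
        return not all(values)      # any is false
    raise ValueError("unknown logical operator: " + op)


def eval_reference(op, holds):
    """Evaluate a condition reference: is=1 (holds) or is=0 (does not hold)."""
    if op == "is=1":
        return bool(holds)
    if op == "is=0":
        return not holds
    raise ValueError("unknown reference operator: " + op)


def eval_compare(op, lhs, rhs):
    """Evaluate a comparison between a referenced value and a literal."""
    return COMPARE[op](lhs, rhs)
```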
Path
PATH = "/" ppfx "/" 1*pchar ; pchar from RFC 3986
ppfx = %s"data" / %s"node" / %s"sync"
Path to an external resource.
Paths MUST begin with a predefined prefix and conform to the rules of URI [RFC3986] path components.
Path prefixes correspond to:
/data/
: read-only resources,
/node/
: node-specific resources,
/sync/
: synchronized resources,
/user/
: user-provided resources; cannot be declared; these are named by the user, not the module.
Paths are case-sensitive.
The full URI is kueea1am://<Ahex><PATH>.
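A validator for the <PATH> rule might look like the sketch below. It follows the grammar literally, so a path has exactly one segment after the prefix, because pchar in RFC 3986 excludes the slash. The <Ahex> value is defined elsewhere and is treated here as an opaque string; `full_uri` is a hypothetical helper name.

```python
import re

# pchar from RFC 3986: unreserved / pct-encoded / sub-delims / ":" / "@"
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
_PATH = re.compile(r"/(data|node|sync)/" + _PCHAR + r"+")


def full_uri(ahex, path):
    """Validate a declared path and build the full resource URI."""
    if _PATH.fullmatch(path) is None:
        raise ValueError("invalid path: " + path)
    return "kueea1am://" + ahex + path
```

Note that `/user/` paths are rejected on purpose: they cannot be declared by a module.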
Instruction processor
Functions are defined by listing their parameters in ascending order, describing the function and formally specifying its outcome.
State of the instruction processor consists of the following variables:
- mfin: level finalization state (boolean).
- clvl: current class level (integer).
- ccur: current class (reference).
- fcur: current function (reference).
An item name collision between a class c and a name name occurs when any of the following is true:
- name is regt.
- Any data member d in c.data is such that: d.name is equal to name.
- Any function member f in c.func is such that: f.name is equal to name.
- Any named value v in c.nval is such that: v.name is equal to name.
- Any named reference i in c.nref is such that: i.name is equal to name.
An FID collision occurs against a function number fid when any class c in mcur.types is such that: any function member f in c.func is such that: f.fid is non-zero and equal to fid.
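Both collision checks can be expressed as short predicates. The sketch below assumes a simplified in-memory model; the `Item` and `Class` types are hypothetical and carry only the fields that the two definitions refer to.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    fid: int = 0  # only meaningful for function members; zero means "none"


@dataclass
class Class:
    name: str
    data: list = field(default_factory=list)
    func: list = field(default_factory=list)
    nval: list = field(default_factory=list)
    nref: list = field(default_factory=list)


def item_name_collision(c, name):
    """True if name is reserved or already used by any item of class c."""
    if name == "regt":
        return True
    return any(m.name == name for m in c.data + c.func + c.nval + c.nref)


def fid_collision(types, fid):
    """True if any function member of any class has the non-zero FID fid."""
    return any(f.fid != 0 and f.fid == fid for c in types for f in c.func)
```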
The k1md function
k1md = %s".k1md" SP %s"M" Ahex 1*WSP ( %s"+draft" / %s"+final" )
; The first 8 bytes are: 2E 6B 31 61 6D 20 4D 21
Verifies input as a Document and begins declaration of the 'self' class of the module.
- <Mhex> mid
- Module identifier of the declared module.
- <TAGS> tags
- Finalization state. Either the +draft or the +final tag.
The instruction MUST be the first line of a Document. It is special in that it has no preceding whitespace. It MUST NOT appear on any line other than the first.
Implementations MUST modify the state as follows:
- Set mcur.self.name to self.
- Set mfin to true if tags contains +final.
- Set ccur to mcur.self.
- Set fcur to nothing.
- Set text to ccur.text.
The text function
Updates the active text buffer(s).
- <NAME> name
- Buffer name.
- <TAGS> tags
- Tags.
The default buffer is named markdown.
Example:
This will go into the `markdown` buffer.
.text html
<p>This will go into the <code>html</code> buffer.</p>
.text ia32
.text amd64 +multi
This will go into both "ia32" and "amd64" buffers.
Implementations MUST modify the state as follows:
- Set format to type.
The load function
Loads another module.
- <MID> mid
- Module identifier.
- <UINT> mlv
- Required minimum level of the module class.
- <NAME> name (optional)
- Name for the module.
Item references may reference items declared in other modules. A reference to a module that has not been loaded is invalid.
Item references are processed after all modules are loaded. This instruction does not have to appear before a reference. It is RECOMMENDED that it appears at the beginning of a document.
Implementations MUST return failure if any of the following is true:
- mid is nil.
- mlv is equal to or greater than 2^8.
- name was given and any module import i in mcur.mload is such that: i.name is equal to name.
Implementations MUST modify the state as follows:
- Find a module import i in mcur.mload, where i.mid is equal to mid.
- If found and i.mlv is less than mlv:
  - Set i.mlv to mlv.
- Create a new module import i.
- Set i.mid to mid.
- Set i.mlv to mlv.
- Set i.name to name.
- Insert i into mcur.mload.
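The steps above can be sketched as follows. The sketch rests on several assumptions that the text does not settle: the level limit is read as 2^8, a nil identifier is modelled as `None`, and a new import is created only when no import with the same mid already exists. The `Import` type and the `load` signature are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MLV_LIMIT = 2**8  # assumption: levels fit in one octet


@dataclass
class Import:
    mid: str
    mlv: int
    name: Optional[str] = None


def load(mload, mid, mlv, name=None):
    """Apply a load instruction to the import list; return False on failure."""
    if mid is None or mlv >= MLV_LIMIT:
        return False
    if name is not None and any(i.name == name for i in mload):
        return False
    found = next((i for i in mload if i.mid == mid), None)
    if found is not None:
        # Assumption: an existing import is only raised to the higher level.
        if found.mlv < mlv:
            found.mlv = mlv
        return True
    mload.append(Import(mid, mlv, name))
    return True
```

Loading the same module twice therefore keeps a single import whose level is the highest one requested.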
The mlvl function
Increases the current level of the module.
- <UINT> mlv
- Module level.
- <TAGS> tags
- Tags associated with the level.
The new level applies to all items declared afterward. Module levels may only be increased.
tags contains either +final or +draft. If the level is final, no changes to it (and lower ones) will ever be made. This rule refers to the resulting tree of declared items; descriptions of the items are not considered.
What this means is that if a Document declares the same module as another, already known Document, then the declared items in both of these documents MUST be the same, except for their textual descriptions.
Implementations MUST return failure if any of the following is true:
- mlv is equal to or greater than 2^8.
- mlv is less than mcur.self.clv.
- mlv is equal to 0, mcur.self.clv is equal to 0 and one of the following is true:
  - mcur.self.desc is not empty.
  - mcur.self.data is not empty.
  - mcur.self.func is not empty.
  - mcur.self.nref is not empty.
  - mcur.self.nval is not empty.
  - mcur.self.ifaces is not empty.
  - mcur.types has more than 1 element.
- tags contains both +final and +draft.
- tags contains neither +final nor +draft.
- tags contains +final and mfin is false.
Implementations MUST modify the state as follows:
- Set ccur to mcur.self.
- Set fcur to nothing.
- Set text to ccur.text.
- Set mfin to false if +draft in tags.
- Set ccur.clv to mlv.
The cbeg function
Begins a class declaration.
- <NAME> name
- Name of the class.
- <TAGS> tags
- Type of the class.
- <UINT> cno (optional)
- Class number.
By default, cno is set in the post-processing stage, by ordering classes by class level and name, and then assigning the numbers in ascending order, skipping those values which have been explicitly set.
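Since the post-processing stage is not yet specified, the following is only a guess at how the default numbering could work. It assumes that numbering starts at 1 (zero meaning "not set"), that numbers above 255 are an error, and that classes are plain dictionaries; `assign_cnos` is a hypothetical name.

```python
def assign_cnos(classes):
    """Assign class numbers to classes whose 'cno' is 0 (not set).

    classes is a list of dicts with the keys 'name', 'clv' and 'cno'.
    """
    used = {c["cno"] for c in classes if c["cno"] != 0}
    candidate = 1  # assumption: the first automatically assigned number
    for c in sorted(classes, key=lambda c: (c["clv"], c["name"])):
        if c["cno"] != 0:
            continue  # explicitly set, keep as is
        while candidate in used:
            candidate += 1  # skip values which have been explicitly set
        if candidate > 255:
            raise ValueError("out of class numbers")
        c["cno"] = candidate
        used.add(candidate)
    return classes
```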
If tags contains +nodesc, the class will not have a descriptor nor a class identifier. It is not possible to allocate an instance of such a class. These classes are only used as an object type.
If tags contains +iface, the class is an interface. Interfaces require cno for their identification. It is not possible to allocate an instance of an interface. These classes can be used as an object type.
Each class has a set of predefined function members. These functions are defined in another section of this document.
Implementations MUST modify the state as follows:
- Return failure if cno is greater than 255.
- Return failure if tags contains both +iface and +nodesc.
- Look for a class c in mcur.types, such that c.name is equal to name.
- If c was found:
  - Return failure if cno is not zero.
  - Return failure if tags is not empty.
  - Return failure if c.clv equals 255.
  - Increase c.clv by 1.
- If c was not found:
  - Let c be a newly created class.
  - Set c.cid to mcur.self.cid.
  - Set c.cid.cno to cno.
  - Set c.clv to 0.
  - Set c.name to name.
  - Set c.tags to tags.
  - If cno is zero (not set), insert +$autocno into c.tags.
  - Insert c into mcur.types.
- Set ccur to c.
- Set fcur to nothing.
- Set text to c.text.
- Insert predefined functions to ccur.func.
The cend function
Explicitly ends a class declaration.
No parameters.
Implementations MUST modify the state as follows:
- Set ccur to mcur.self.
- Set fcur to nothing.
- Set text to mcur.self.text.
The clvl function
Modifies the current level of the current class.
- <UINT> level
- New level.
- <TAGS> tags (optional)
- Tags.
The presence of +fini in tags specifies that the instance at this class level has a destructor. The destructor is also implied whenever any of the data members of the class at the level has a destructor because the members' destructors are called from the class destructor.
Implementations MUST return failure if any of the following is true:
- ccur refers to mcur.self.
- level is equal to or greater than 2^8.
- tags contains +fini and any function member f in ccur.func is such that: f.clv is equal to level and f.name is equal to _fini.
Implementations MUST modify the state as follows:
- Set ccur.clv to level.
- Set fcur to nothing.
- Set text to ccur.text.
- If tags does not contain +fini, finish.
- Let f be a new function member representation.
- Set f.mid to mcur.self.clv.
- Set f.cid to level.
- Set f.name to _fini.
- Let fid be the default FID of f.
- Verify that FID collision does not occur against fid.
- Set f.fid to fid.
- Insert f into ccur.func.
The creg function
Assigns a register type to a class.
- <REGT> type
- Register type.
This instruction declares that the class is a register class.
The class is assigned a register type and gains two predefined function members for transferring a value from a register to object data and vice-versa, called save and load respectively. Otherwise, there is no mapping between registers and object data. Object data is nothing more than an array of opaque octets.
Any class MAY define functions with these names. Their parameters and semantics are predefined only for register classes.
Implementations MUST modify the state as follows:
- Return failure if any of the following is true:
  - ccur.regt is not empty.
  - item name collision occurs between ccur and save.
  - item name collision occurs between ccur and load.
- Set ccur.regt to type.
- Set ccur.rclv to ccur.clv.
- Create a new function member save.
- Set save.name to save.
- Set save.fid to the default FID of save.
- Return failure if FID collision occurs against save.fid.
- Create a new function parameter ireg.
- Set ireg.type to (a copy of) type.
- Set ireg.name to reg.
- Append ireg to save.params.
- Insert save into ccur.func.
- Create a new function member load.
- Insert +read into load.tags.
- Set load.name to load.
- Set load.fid to the default FID of load.
- Return failure if FID collision occurs against load.fid.
- Create a new function parameter oreg.
- Set oreg.type to (a copy of) type.
- Set oreg.name to reg.
- Insert +output into oreg.tags.
- Append oreg to load.params.
- Insert load into ccur.func.
The cond function
Declares a named condition for the current class.
- <NAME> name
- Name of the condition.
- <EXPR> expr
- Expression of the condition.
Implementations MUST modify the state as follows:
- Return failure if any condition c in ccur.cond is such that c.name is equal to name.
- Create a new condition c.
- Set c.name to name.
- Set c.expr to expr.
- Insert c into ccur.cond.
The desc and data functions
Declares the next data member, in memory order, of an interface descriptor (desc) / of an instance (data).
Classes which are interfaces have an additional object, in addition to the usual instance of the class, called an interface descriptor. A class that implements an interface includes the interface descriptor as part of its class descriptor and also declares an instance of the (interface) class as one of its data members.
Interface descriptors are data used by programs which manipulate instances of unknown classes via their set of implemented interfaces.
All members of a class descriptor are read-only, constant values. The values of an interface descriptor vary by module implementation. Data in a class descriptor is valid for every instance of the class.
- <TYPE> type
- Type of the object.
- <NAME> name
- Name of the object.
- <ALEN> alen (optional)
- Length of an array.
- <VAL> value (optional, only in data)
- Default value.
- <UINT> align (optional)
- Memory alignment.
- <CREF> cond (optional)
- Condition reference. The member exists if the condition holds.
- <TAGS> tags (optional)
- Tags.
- <NAME> uref (optional)
- Reference to a previously declared union.
When omitted, alen.max is zero.
When omitted, value is the undefined value.
When omitted, align is zero.
When omitted, cond is empty.
When omitted, tags is empty.
When omitted, uref is empty.
If alen.min is less than alen.max and alen.ref is empty, then the currently declared data member SHOULD be the last data member declared at the current class level. The minimum amount is only a hint to the programmer in this case.
If alen.ref is not empty, then the allocated space of the declared data member is known only at program run time. Thus, memory addresses of subsequent data members are variables, too.
The current union of list data is the data union which begins from the last element d in data, such that d.tags does not contain sameaddr.
Implementations MUST modify the state as follows:
- Let data be an object reference.
- Let prev be an object reference.
- If in the desc function:
  - Return failure if ccur.tags does not contain iface.
  - Return failure if any data member d in ccur.desc is such that d.name is equal to name.
  - Make data refer to ccur.desc.
- If in the data function:
  - Return failure if item name collision occurs between ccur and name.
  - Make data refer to ccur.data.
- Return failure if any of the following is true:
  - align is neither zero nor a memory alignment.
  - alen.max is non-zero and alen is an invalid array length.
  - tags contains +sameaddr and any of:
    - data is empty.
    - cond is empty, alen.max is non-zero and alen.ref is not empty.
    - value is not the undefined value.
  - tags does not contain sameaddr and uref is not empty.
  - tags contains sametext and any of:
    - data is empty.
    - uref is not empty.
  - Class level violation occurs in data.
- Create a new data member d.
- Set d.mlv to mcur.self.clv.
- Set d.clv to ccur.clv.
- Set d.tags to tags.
- Set d.type to type.
- Set d.alen to alen.
- Set d.align to align.
- Set d.value to value.
- Set d.cond to cond.
- Set d.name to name.
- If uref is not empty:
  - Find a data member u in data, u.name of which is equal to uref.
  - Return failure if not found.
  - Return failure if u.tags contains sameaddr.
  - Insert d after the last element of the data union that begins with u.
  - Make prev refer to u.
- Otherwise:
  - Make prev refer to the last element in data.
  - Append d to data.
- Make fcur refer to nothing.
- Make text refer to prev.text if tags contains sametext. Otherwise, make text refer to d.text.
The nval function
Declares a named value.
- <NAME> name
- Name of the value.
- <VAL> value
- Named value.
Implementations MUST return failure if any of the following is true:
- item name collision occurs between ccur and name.
Implementations MUST modify the state as follows:
- Create a new named value v.
- Set v.name to name.
- Set v.value to value.
- Append v to ccur.nval.
- Set fcur to nothing.
- Set text to v.text.
The nref function
Declares a name for an item reference (an alias).
- <NAME> name
- Name for an item.
- <IREF> iref
- Reference to the item.
Implementations MUST return failure if any of the following is true:
- item name collision occurs between ccur and name.
Implementations MUST modify the state as follows:
- Create a new named reference r.
- Set r.name to name.
- Set r.iref to iref.
- Insert r into ccur.nref.
- Set fcur to nothing.
- Set text to r.text.
The fbeg function
Begins declaration of a function member.
- <NAME> name
- Name of the function.
- <TAGS> tags (optional)
- Tags.
- <FID> fid (optional)
- Identifier of the function.
- <FIDN> fidn1 (optional)
- Identifier of the first autogenerated function.
- <FIDN> fidn2 (optional)
- Identifier of the second autogenerated function.
The following tags may be specified in tags:
+message
- Message function declaration.
+proto
- Function prototype declaration.
+event
- Event declaration.
+init
- Class constructor declaration.
+static
- Function is independent of an instance.
+read
- Function does not write to an instance.
+module
- Function requires access to module memory.
+kernel
- Function requires access to kernel memory.
+more
- Function expects more parameters than declared.
The tags +message, +proto, +event and +init are mutually exclusive.
If tags contains +message, a function map is declared, from a pair of a language tag and a character coding to a function, which takes the declared parameters and returns a human-readable message. A module implementation SHOULD implement only one function per language. The kernel automatically converts the encoding to the requested one. The functions are local to an implementation and MAY NOT be exported. The FID of the function identifies the array of implemented functions. These functions are accessed through the kernel via a special call.
If tags contains +proto, only a function type is declared. No function is declared and no FID is allocated. A class may reference the prototype in order to declare one.
If tags contains +init, two functions are declared. The FID fid is assigned to the constructor. It is named name, which SHOULD begin with the word init. The function returns a status boolean and takes an implied read-write handle to an instance as the first parameter, called this. The FID named create is assigned to the creator. The name of the function is name with $create appended. The creator takes an implied set of parameters inserted at the front, which determine the location and type of allocated memory, where an instance will be constructed by the constructor. The just-constructed object is then returned to the caller.
If tags contains +event, three items are declared: a function prototype for an event handler, a handler installer function and a handler uninstaller function. The FID fid is assigned to the event prototype. The event prototype has an implicit first parameter: a read-write handle to an undefined, previously installed object. The prototype is named name. The FID named install is assigned to the event installer. It returns a status boolean and takes two parameters: a function reference to a handler and a read-write handle to an object. The object is passed as the first argument to the handler. The function is named name with $install appended. The FID named uninstall is assigned to the event uninstaller. It returns a status boolean and takes one parameter: a function reference to a handler to be uninstalled. The function is named name with $uninstall appended.
Implementations MUST conform to the following algorithm:
- Return failure if any of the following is true:
  - tags contains both +static and +read.
  - ccur refers to mcur.self and tags contains +read.
  - item name collision occurs between ccur and name.
- If tags contains +message:
  - If fid was omitted, set fid to the default FID of a member named name of ccur.
  - Return failure if any of the following is true:
    - tags contains +proto, +event or +init.
    - Either fidn1 or fidn2 was not omitted.
    - FID collision occurs against fid.
- If tags contains +proto:
  - Return failure if any of the following is true:
    - tags contains +message, +event or +init.
    - tags contains +module or +kernel.
    - Either fid, fidn1 or fidn2 was not omitted.
- If tags contains +init:
  - Let nameC be a copy of name.
  - Append $create to nameC.
  - If fid was omitted, set fid to the default FID of a member named name of ccur.
  - If fidn1 was not omitted, return failure if the associated name is not equal to create.
  - If fidn1 was omitted, set fidn1 to the default FID of a member named nameC of ccur.
  - Return failure if any of the following is true:
    - tags contains +message, +proto or +event.
    - fidn2 was not omitted.
    - FID collision occurs against fid or fidn1.
- If tags contains +event:
  - Let fidI be an invalid function identifier (zero).
  - Let fidU be an invalid function identifier (zero).
  - Let nameI be a copy of name.
  - Let nameU be a copy of name.
  - Append $install to nameI.
  - Append $uninstall to nameU.
  - If fidn1 was not omitted and its associated name is equal to install, set fidI to fidn1.
  - If fidI is invalid and fidn2 was not omitted and its associated name is equal to install, set fidI to fidn2.
  - If fidI is invalid, set fidI to the default FID of a member named nameI of ccur.
  - If fidn1 was not omitted and its associated name is equal to uninstall, set fidU to fidn1.
  - If fidU is invalid and fidn2 was not omitted and its associated name is equal to uninstall, set fidU to fidn2.
  - If fidU is invalid, set fidU to the default FID of a member named nameU of ccur.
  - Return failure if any of the following is true:
    - tags contains +message, +proto or +init.
    - tags contains +read.
    - ccur does not refer to mcur.self and tags contains +static but neither +module nor +kernel.
    - fid was not omitted.
    - fidn1 is named neither install nor uninstall.
    - fidn2 is named neither install nor uninstall.
    - Both fidn1 and fidn2 are named install.
    - Both fidn1 and fidn2 are named uninstall.
    - FID collision occurs against fidI or fidU.
  - Create a new function member fi.
  - Set fi.tags to a copy of tags.
  - Insert +$install into fi.tags.
  - Remove +more from fi.tags.
  - Set fi.mlv to mcur.self.clv.
  - Set fi.clv to ccur.clv.
  - Set fi.fid to fidI.
  - Set fi.name to nameI.
  - Insert fi into ccur.func.
  - Create a new function member fu.
  - Set fu.tags to a copy of tags.
  - Insert +$uninstall into fu.tags.
  - Remove +more from fu.tags.
  - Set fu.mlv to mcur.self.clv.
  - Set fu.clv to ccur.clv.
  - Set fu.fid to fidU.
  - Set fu.name to nameU.
  - Insert fu into ccur.func.
  - Create a new function parameter p.
  - Set p.type to read.
  - Set p.name to handler.
  - Append a copy of p to fi.params.
  - Append a copy of p to fu.params.
  - Set p.type to a rdwr>.
  - Set p.name to userdata.
  - Append p to fi.params.
  - Insert +proto into tags.
  - Remove +module from tags.
  - Remove +kernel from tags.
- If tags contains +proto, set fid to zero.
- If ccur refers to mcur.self, insert +static into tags.
- Create a new function member f.
- Set f.mlv to mcur.self.clv.
- Set f.clv to ccur.clv.
- Set f.fid to fid.
- Set f.tags to tags.
- Set f.name to name.
- Insert f into ccur.func.
- If tags contains +message:
  - Create a new function parameter p.
  - Insert +output into p.tags.
  - Set p.type to rdwr>.
  - Set p.name to message.
  - Append p to f.params.
  - Create a new function parameter p.
  - Set p.type to FID.
  - Set p.name to enc_and_lang.
  - Append p to f.params.
- Set fcur to f.
- Set text to f.text.
- Return success.
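The part of the algorithm that resolves the installer and uninstaller FIDs of an event can be sketched in isolation. The sketch assumes that a named function number is modelled as a `(name, value)` pair or `None` when omitted, and that `default_fid` is a caller-supplied stand-in for the default FID computation, which is not shown here. All names are hypothetical.

```python
def resolve_event_fids(fidn1, fidn2, default_fid, name):
    """Return (fidI, fidU) for an event, or None on failure.

    fidn1 and fidn2 are (name, value) pairs or None when omitted.
    default_fid maps a member name to its default FID (hypothetical).
    """
    given = [f for f in (fidn1, fidn2) if f is not None]
    names = [n for n, _ in given]
    # A given named FID must be called either install or uninstall.
    if any(n not in ("install", "uninstall") for n in names):
        return None
    # Both named install, or both named uninstall, is a failure.
    if len(names) != len(set(names)):
        return None
    by_name = dict(given)
    # Zero is the invalid identifier, so it falls back to the default.
    fid_i = by_name.get("install", 0) or default_fid(name + "$install")
    fid_u = by_name.get("uninstall", 0) or default_fid(name + "$uninstall")
    return fid_i, fid_u
```

The order in which install and uninstall are given does not matter; either may be fidn1.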
The fend function
Explicitly ends a function declaration.
No parameters.
Implementations MUST modify the state as follows:
- Set fcur to nothing.
- Set text to ccur.text.
The ferr function
Declares an error code that the current function may return.
- <NAME> name
- Name of the error message.
- <FID> sid (optional)
- Function identifier of the error message.
If the function does not return an error code, then the document simply does not declare any error codes.
Implementations MUST conform to the following algorithm:
- If fid is invalid or was omitted, set fid to the default FID of a member named name of a module.
- Return failure if any of the following is true:
- fcur is nothing.
  - fcur.tags contains +message.
  - fcur.tags contains +event.
- For each error code code in fcur.codes: Return failure if code.fid is equal to fid.
- Create a new error code code.
- Set code.name to name.
- Set code.fid to fid.
- Append code to fcur.codes.
- Set text to f.text.
- Return success.
The fpar function
Declares the next in-order parameter of the current function.
- <ARGT> type
- Type of the parameter.
- <NAME> name
- Name of the parameter.
- <TAGS> tags (optional)
- Parameter tags.
Function arguments are objects passed by either handle or value.
Objects passed by value are specified by simply writing a class reference.
.fpar module.class:0 by_value
If a class is a register class, the value is passed via one register. Otherwise, a reference to a local copy of the object is the value. Objects passed by value MUST NOT be longer than 4096 octets.
Objects passed by handle are written by specifying the handle.
.fpar read<module.class:0> by_handle
By default, a parameter is an input parameter. In order to declare an output parameter, include +output in tags. No other tag is recognized.
.fpar module.class:0 output_value +output
The order in which parameters are declared is significant. It is RECOMMENDED to declare output parameters first, then input parameters.
Implementations MUST conform to the following algorithm:
- Return failure if any of the following is true:
- fcur is nothing.
  - name is this.
- For each function parameter p in fcur.params: Return failure if p.name is equal to name.
- Create a new function parameter p.
- Set p.tags to tags.
- Set p.type to type.
- Set p.name to name.
- Append p to fcur.params.
- Set text to p.text.
- Return success.
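The algorithm above can be sketched with a minimal model of a function and its parameters. The `Param` and `Function` types are hypothetical and hold only the fields that the steps mention; types are kept as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Param:
    type: str
    name: str
    tags: tuple = ()


@dataclass
class Function:
    name: str
    params: list = field(default_factory=list)


def fpar(fcur, type_, name, tags=()):
    """Declare the next parameter of fcur; return False on failure."""
    if fcur is None or name == "this":
        return False
    if any(p.name == name for p in fcur.params):
        return False  # parameter names must be unique within a function
    fcur.params.append(Param(type_, name, tuple(tags)))
    return True
```

Because parameters are appended, the declaration order in the Document is the parameter order of the function.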
The impf function
Declares an implementation of a prototyped function.
- <IREF> proto
- Reference to a prototype function.
- <NAME> name
- Name of the function (the implementation).
- <TAGS> tags (optional)
- Tags.
- <FID> fid (optional)
- Identifier of the function.
Prototype implementations are marked with a special tag. The referenced prototype is stored as the return value type.
The following tags may be specified in tags:
+static
- Function is independent of an instance.
+module
- Function requires access to module memory.
+kernel
- Function requires access to kernel memory.
Implementations MUST return failure if any of the following is true:
- tags contains +proto.
- tags contains +event.
- tags contains +message.
- tags contains +init.
- tags contains +read.
- tags contains +more.
- item name collision occurs between ccur and name.
- FID collision occurs against fid.
Implementations MUST modify the state as follows:
- Insert +$protoref into tags.
- Create a new function member f.
- Set f.mlv to mcur.self.clv.
- Set f.clv to ccur.clv.
- Set f.fid to fid.
- Set f.tags to tags.
- Set f.name to name.
- Create a new function parameter p.
- Set p.type to proto.
- Append p into f.params.
- Insert f into ccur.func.
- Set fcur to nothing.
- Set text to f.text.
The impc function
Declares that the current class implements an interface.
- <TYPE> type
- Reference to the implemented interface.
- <MEMB> mref (optional)
- Reference to the data member associated with the interface.
An interface object is an instance of type.
If mref is omitted, then the class does not declare any data member as the interface object of the interface. This is only permitted if type has no data members.
The interface object MUST NOT be preceded by a variable-length member. The offset must be the same value for every instance of the class.
Module implementations set the values of interface descriptor fields via the implementation's class definition document. A RECOMMENDED syntax for these documents is proposed in another section.
Implementations MUST return failure if any of the following is true:
- ccur.tags contains +iface.
Implementations MUST modify the state as follows:
- Create a new interface reference iface.
- Set iface.type to type.
- Set iface.mref to mref.
- Set iface.clv to ccur.clv.
- Insert iface into ccur.ifaces.
- Set fcur to nothing.
- Set text to ccur.text.
The path function
Declares a path to an external resource.
- <PATH> path
- Path to the resource.
Implementations MUST return failure if any of the following is true:
- Any path p in mcur.paths is such that: p.path is equal to path.
Implementations MUST modify the state as follows:
- Create a new path p.
- Set p.path to path and p.mlv to mcur.self.clv.
- Insert p into mcur.paths.
- Set cbeg to nothing.
- Set func to nothing.
- Set data to nothing.
- Set text to p.text.
Postprocessing
Check this section later when it's written.
Register names
Specials
boolean
- A boolean value; either "true" or "false".
cmprval (comparison result value)
- Result of a comparison; one of: "error", "less than", "same as" or "more than".
Unsigned integers
u8
- Integer in the range [0, 2^8-1].
u16
- Integer in the range [0, 2^16-1].
u32
- Integer in the range [0, 2^32-1].
u64
- Integer in the range [0, 2^64-1].
u128
- Integer in the range [0, 2^128-1].
Signed integers
i8
- Integer in the range [-2^7, 2^7-1].
i16
- Integer in the range [-2^15, 2^15-1].
i32
- Integer in the range [-2^31, 2^31-1].
i64
- Integer in the range [-2^63, 2^63-1].
i128
- Integer in the range [-2^127, 2^127-1].
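The ranges listed for the integer register types follow directly from the bit width. A small sketch, with hypothetical helper names, that computes them and checks whether a value fits:

```python
def unsigned_range(bits):
    """Inclusive range of an unsigned integer register of the given width."""
    return 0, 2**bits - 1


def signed_range(bits):
    """Inclusive range of a signed integer register of the given width."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1


def fits(register, value):
    """True if value is representable in a register such as 'u8' or 'i32'."""
    bits = int(register[1:])
    lo, hi = unsigned_range(bits) if register[0] == "u" else signed_range(bits)
    return lo <= value <= hi
```

For instance, `fits("i8", -128)` holds while `fits("i8", 128)` does not.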
Vectors of integers
PLACEHOLDER.
Binary floating-point numbers
f16
- IEEE 754 arithmetic format with base=2, p=11, emax=15.
f32
- IEEE 754 arithmetic format with base=2, p=24, emax=127.
f64
- IEEE 754 arithmetic format with base=2, p=53, emax=1023.
f80x87
- IEEE 754 arithmetic format with base=2, p=64, emax=16383.
f128
- IEEE 754 arithmetic format with base=2, p=113, emax=16383.
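The parameters p and emax determine the remaining properties of a binary format. As a sketch, the function below derives the minimum exponent (emin = 1 - emax in IEEE 754), the largest finite value and the smallest positive normal value, using exact rational arithmetic. The function name is hypothetical.

```python
import sys
from fractions import Fraction


def binary_format_limits(p, emax):
    """Return (emin, largest finite value, smallest positive normal value)."""
    emin = 1 - emax
    largest = (2 - Fraction(2) ** (1 - p)) * Fraction(2) ** emax
    smallest_normal = Fraction(2) ** emin
    return emin, largest, smallest_normal


# f64 (p=53, emax=1023) corresponds to Python's own float type.
emin, largest, smallest = binary_format_limits(53, 1023)
assert largest == Fraction(sys.float_info.max)
assert smallest == Fraction(sys.float_info.min)
```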
Vectors of binary floating-point numbers
PLACEHOLDER.
Decimal floating-point numbers
d32
- IEEE 754 arithmetic format with base=10, p=7, emax=96.
d64
- IEEE 754 arithmetic format with base=10, p=16, emax=384.
d128
- IEEE 754 arithmetic format with base=10, p=34, emax=6144.
Predefined classes
Predefined classes have no associated class level.
Their names are written with capital letters.
Octet
There is only one fundamental type: an OCTET. All classes are essentially arrays of octets.
An octet occupies one memory address, under which there are at least 8 bits. If there are more than 8 bits, the excess bits MUST be cleared. There is no meaning associated with the bits of an octet.
Length of a class is expressed in octets.
Its register type is a vector of 8 bits. It is mapped to an 8-bit unsigned integer in practice.
Boolean
A BOOLEAN is either true (non-zero) or false (zero).
The length of a boolean is 1 octet.
Its memory alignment is 1 octet.
Its register type is a 1-bit unsigned integer. It is mapped to an 8-bit unsigned integer in practice.
Status boolean
A STATUS is a special boolean. It is used as the return value of functions.
If false, it means that there is nothing to report (success). Otherwise (if true), it means that the task's status stack was pushed onto. The caller should examine the stack before continuing.
The idea is that programs are written like this:
if function() returns true {
    code when function reports something, usually failure
}
otherwise, continue
If the function returns a status boolean, it is implied that the code in the if-block is an unlikely branch, because functions are assumed to generally execute successfully.
A status code is returned via the status stack along with other, supplementary information.
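The calling convention can be imitated in an ordinary programming language. The sketch below is only an analogy: the status stack is modelled as a module-level list, functions return a `(status, value)` pair, and all names are hypothetical.

```python
status_stack = []  # hypothetical stand-in for the task's status stack


def parse_digit(text):
    """Return (status, value); status is True when something was reported."""
    if len(text) == 1 and text in "0123456789":
        return False, int(text)
    status_stack.append("not a digit: " + repr(text))
    return True, None


def sum_digits(text):
    """Sum the decimal digits of text, propagating a reported status."""
    total = 0
    for ch in text:
        status, value = parse_digit(ch)
        if status:
            # Unlikely branch: the callee pushed onto the status stack.
            return True, None
        total += value
    return False, total
```

A caller that receives a true status examines `status_stack` before continuing.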
Comparison result
A CMPRVAL is the result of a comparison. It is used as the return value of functions.
The length of a comparison result is 1 octet.
Its memory alignment is 1 octet.
Its register type is a 2-bit signed integer. It is mapped to an 8-bit signed integer in practice.
Possible values of a comparison result, when comparing object LHS against object RHS, are:
- 0
- The objects are equal.
- 1 (and greater)
- LHS is greater than RHS.
- -1
- Comparison failed. Interpreted as a true status boolean.
- -2 (and less)
- LHS is less than RHS.
Simply put, one first tests for -1 (unsuccessful) and then compares with 0.
Functions that return this value also set associated CPU flags accordingly, so that a conditional jump may immediately follow the function call.
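The test order described above (first -1, then a comparison with 0) might look like the following Python sketch. The function names are hypothetical, and the byte-string comparison merely stands in for whatever a real comparison function would do.

```python
def compare_octets(lhs, rhs):
    """Hypothetical comparison returning a CMPRVAL-like integer."""
    if not isinstance(lhs, bytes) or not isinstance(rhs, bytes):
        return -1              # comparison failed
    if lhs == rhs:
        return 0               # equal
    return 1 if lhs > rhs else -2

def describe(result):
    """Interpret a CMPRVAL: test for -1 first, then compare with 0."""
    if result == -1:
        return "failed"
    if result == 0:
        return "equal"
    return "greater" if result > 0 else "less"
```

Testing -1 first matters because -1 would otherwise be mistaken for "less" by a plain sign test.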
Object length
The length of an objsize is 4 octets.
Its memory alignment is 4 octets.
Its register type is a 32-bit unsigned integer.
It represents a length of an object, in octets. Octets are ordered in increasing order of significance.
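As a small illustration, "increasing order of significance" means the least significant octet comes first (little-endian). The helper names below are hypothetical; the sketch uses Python's standard `struct` module.

```python
import struct

def encode_objsize(length):
    """Encode an object length as 4 octets, least significant octet first."""
    return struct.pack("<I", length)

def decode_objsize(octets):
    """Decode 4 octets back into a 32-bit unsigned integer."""
    return struct.unpack("<I", octets)[0]
```

For example, the length 258 (0x00000102) is stored as the octets 02 01 00 00.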
Memory address
The length of an ADDRESS is 8 octets.
Its memory alignment is 8 octets.
Its register type is a 64-bit unsigned integer.
It represents a memory address. Octets are ordered in increasing order of significance.
Function identifier
The length of an FID is 8 octets.
Its memory alignment is 8 octets.
Its register type is a 64-bit unsigned integer.
It represents a function identifier. Octets are ordered in increasing order of significance.
16-octet identifier
The length of an ID16 is 16 octets.
Its memory alignment is 8 octets.
It has no register type.
It is an opaque array of 16 octets.
System memory reference
The length of a HANDLE is 32 octets.
Its memory alignment is 8 octets.
Its register type is a CPU-defined virtual memory reference.
Its data members are as follows:
.cbeg HANDLE !NOID
.data ADDRESS address
.data ID16 node_id
.data OCTET nonce [8]
address
- Lower bits of the system memory address.
node_id
- Higher bits of the system memory address.
nonce
- Random value associated with the referenced object.
Loading from and saving into a handle are kernel functions, which translate the system memory address in the handle from and to a virtual memory address of the calling task.
This is not an object identifier. The address may point to any octet within an object.
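A minimal sketch of the 32-octet memory layout, assuming the members are stored in declaration order without padding and that `address` is little-endian like any ADDRESS. The function names are hypothetical. This only models the layout and does not perform the kernel's address translation.

```python
import struct

# address: 8 octets (little-endian), node_id: 16 octets, nonce: 8 octets
HANDLE_FORMAT = "<Q16s8s"

def pack_handle(address, node_id, nonce):
    """Serialize the three data members into 32 octets."""
    return struct.pack(HANDLE_FORMAT, address, node_id, nonce)

def unpack_handle(octets):
    """Return (address, node_id, nonce) from 32 octets."""
    return struct.unpack(HANDLE_FORMAT, octets)
```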
Module reference
The length of an MREF is 24 octets.
Its memory alignment is 8 octets.
It has no register type.
Its data members are as follows:
.cbeg MREF !NOID
.data ID16 mcid
.data OCTET mclv
.data OCTET mbid [8] +sameaddr
mcid
- Class identifier of the referenced module.
mclv
- 8-bit unsigned integer. Minimum class level of the module.
mbid
- Build identifier of the target M-Build.
A reference is either to a specific M-Build by its build identifier or to any M-Build which implements the given module at the given level.
A build identifier is an 8-octet (64-bit) value, which is computed from relevant parts of a program image. The exact way to compute it depends on the image format.
The class level is considered when octets of the mbid array at positions [1,7] have all of their bits cleared. Thus, such build identifiers are out of range and invalid.
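The rule above can be sketched as follows. The sketch assumes that `+sameaddr` makes mbid start at the same address as mclv, so that mclv overlaps the first octet of mbid; the stated total length of 24 octets is consistent with that reading. The function name and the returned dictionary are hypothetical.

```python
def parse_mref(octets):
    """Classify a 24-octet MREF as a reference by class level or by build identifier."""
    if len(octets) != 24:
        raise ValueError("an MREF is 24 octets long")
    mcid = octets[0:16]
    mbid = octets[16:24]          # assumed to overlap mclv at its first octet
    if any(mbid[1:8]):
        # Some bit is set in positions [1,7]: a specific M-Build is referenced.
        return {"mcid": mcid, "kind": "build", "mbid": mbid}
    # Positions [1,7] are all zero: the first octet is the minimum class level.
    return {"mcid": mcid, "kind": "level", "mclv": mbid[0]}
```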
Function reference
The length of an FREF is 32 octets.
Its memory alignment is 8 octets.
It has no register type.
Its data members are as follows:
.cbeg FREF !NOID
.data MREF mref
.data FID fid
mref
- Module reference.
fid
- Function identifier.
Interface descriptor
The length of an IFACE is variable. The minimum length is 24 octets.
Its memory alignment is 8 octets.
It has no register type.
Its data members are as follows:
.cbeg IFACE !NOID
.data ID16 cid
.data objsize clv_len
.data objsize offset
.data OCTET members [0:MAX]
cid
- Class identifier of the interface.
clv_len
- Class level and total length of the descriptor in octets. The level is in the most significant 8 bits. The length is in the remaining least significant 24 bits. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
offset
- Offset from the beginning of a class instance, specifying the location of the associated interface object. If equal to 4294967295, then there is no such object.
members
- Data members of the interface descriptor.
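A minimal sketch of decoding the fixed 24-octet part of an interface descriptor, assuming the members are stored in declaration order without padding and with little-endian objsize values. The function names, the `NO_OBJECT` constant, and the returned dictionary are hypothetical.

```python
import struct

NO_OBJECT = 0xFFFFFFFF  # 4294967295: no associated interface object

def split_clv_len(clv_len):
    """Split clv_len into (class level, descriptor length in octets)."""
    return clv_len >> 24, clv_len & 0xFFFFFF

def parse_iface_header(octets):
    """Decode cid, clv_len and offset from the start of a descriptor."""
    cid, clv_len, offset = struct.unpack_from("<16sII", octets, 0)
    level, length = split_clv_len(clv_len)
    return {
        "cid": cid,
        "level": level,
        "length": length,   # also the offset to the next descriptor
        "offset": None if offset == NO_OBJECT else offset,
    }
```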
Class descriptor
The length of a CLASS is variable. The minimum length is 32 octets.
Its memory alignment is 8 octets.
It has no register type.
It is a structure defined as follows:
.cbeg CLASS !NOID
.data ID16 cid
.data objsize len_dsc
.data objsize len_min
.data objsize len_max
.data OCTET align
.data OCTET clv
.data OCTET flags
.data OCTET ifaces_len
.data IFACE ifaces [ifaces_len:MAX]
All OCTET-typed fields are interpreted as 8-bit unsigned integers.
cid
- Class identifier.
len_dsc
- Total length of the descriptor, in octets. The length is rounded up to a multiple of 8. The value is also an offset from the beginning of the descriptor to the next one.
len_min
- Minimum length of an instance, in octets.
len_max
- Maximum length of an instance, in octets.
align
- Alignment exponent.
clv
- Class level.
flags
- Vector of 8 flag bits.
ifaces_len
- Number of implemented interfaces.
ifaces
- Descriptors of implemented interfaces.
The class descriptor for level n of a class is directly followed by the descriptor for level n-1.
The descriptor for level n contains only those interfaces that were introduced at level n.
The defined flag bits in flags are, counted from the least significant:
- bit 0: The class has a destructor.
- bit 1: The class contains handles. (It has an accessor.)
- bits 2-7: Undefined. Must be cleared.
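A minimal sketch of decoding the fixed 32-octet part of a class descriptor, assuming the members are stored in declaration order without padding and with little-endian objsize values. The function name and the returned dictionary keys are hypothetical. The alignment exponent is returned as stored.

```python
import struct

# cid (16), len_dsc, len_min, len_max (4 each), align, clv, flags, ifaces_len (1 each)
CLASS_HEADER = "<16sIIIBBBB"

def parse_class_header(octets):
    """Decode the fixed part of a class descriptor and its flag bits."""
    (cid, len_dsc, len_min, len_max,
     align, clv, flags, ifaces_len) = struct.unpack_from(CLASS_HEADER, octets, 0)
    if flags & 0xFC:
        raise ValueError("undefined flag bits 2-7 must be cleared")
    return {
        "cid": cid,
        "len_dsc": len_dsc,            # also the offset to the next descriptor
        "len_min": len_min,
        "len_max": len_max,
        "align": align,                # alignment exponent, as stored
        "clv": clv,
        "has_destructor": bool(flags & 0x01),    # bit 0
        "contains_handles": bool(flags & 0x02),  # bit 1
        "ifaces_len": ifaces_len,
    }
```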
Document identification
This section contains information on how to identify and mark Documents as such in their respective systems.
Data Format Descriptor
Data Format Descriptor for Documents is TO-BE-DEFINED.
Documents have the class #Document.
Internet Media Type
Media type of Documents is text/prs.kagomeko.k1os.
The charset parameter MUST be included with the value UTF-8.
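As an illustration, a receiver might check a Content-Type header value as sketched below, using Python's standard `email.message.Message` to parse it. The function name is hypothetical, and comparing the charset case-insensitively is an assumption based on common MIME practice, not something this document states.

```python
from email.message import Message

DOCUMENT_MEDIA_TYPE = "text/prs.kagomeko.k1os"

def is_document(content_type_header):
    """Return True if the header names the Document media type with charset UTF-8."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() and get_content_charset() both return lowercase strings.
    return (msg.get_content_type() == DOCUMENT_MEDIA_TYPE
            and msg.get_content_charset() == "utf-8")
```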
References
Hyperlinks in the document will point here in a later revision.
Development considerations
Modules ought to follow the KISS (Keep It Simple, Stupid) rule. They are to be narrow in scope so that their development ends one day, the module becomes finalized and no more levels are ever added to it. If it does not need revision, it means it’s good and can be safely used.
Having too much functionality in a module makes it more unstable. Stability is the most important trait every module author ought to pursue. Modules that are constantly revised are broken by design. Such modules ought to be scrapped, made obsolete, and then redesigned as new modules (with a new identifier).
The scale of a module should ideally be small enough for one person to not become mentally exhausted (burned out) while implementing it alone. There should be many implementations available that a user can choose from.
One should differentiate between a module and a software project. The module ought to be a part of the project, not the project itself. For example, if a Python interpreter were to be a module, then Python 2 and Python 3 interpreters would be different modules, which might be developed as part of the same project.
They are different because the most crucial parts, the parser and the interpreter, differ between version 2 and version 3. The other reason is that Python 2 is being phased out in favour of Python 3. Keeping both 2 and 3 in the same module is counter-productive. Remember that items, once defined, cannot be removed from a module.
Some modules are published and standardized for the general public. Other modules may be known only to a select few, or be created as part of development or a user session and have a lifetime of only a few hours.