Kueea (author)
<inumi@fumu-no-kagomeko.kueea.cyou>

Kueea Network

Draft; do not implement.


Introduction

Kueea Network is a network of computing systems which share resources.

The main purpose of the network is gathering and preservation of knowledge, whatever the knowledge in question is about. The first ideas leading to the creation of the network emerged around the year 2004, when the project was just another Web site. The project evolved into a network system around the year 2015.

The goal was to avoid duplication of information by connecting databases. As research progressed, it became clear that a database alone is useless. There needs to be a system of programs that use and manage the data. Database usefulness needs to be proven by using it for something perceivable, like an audio or video player (that reads signals from the database).

Kueea Network consists of both the programs and the database.

Definition of terms

A kueea /kwiːa/ (uncountable) is a network resource that is distinct from other network resources in that it has the ability to cause change. Most commonly, kueea represent their respective users, but a kueea may also be something more abstract, such as the brand or public image of a company, or a natural phenomenon such as wind or water (for example, when describing the "author" of a rock formation).

The term user is reserved to unambiguously refer to the physical entity (usually a human) interfacing with a Kueea Node.

A Kueea (System) Node is a virtual or physical machine with computing and data storage capabilities. It is a terminal that users use to send and receive messages as kueea. Kueea Nodes are abbreviated to just "nodes" within this document.

A Kueea System is a computing system, which is implemented by a network of Kueea Nodes. Each such system has a corresponding set of immutable parameters which self-identify a given system. The parameters describe and configure a system's subsystems.

Kueea Network (without "the") is a network of Kueea Relay Nodes, which need not be Kueea Nodes, that pass messages between Kueea Nodes independently of all Kueea Systems that exist within Kueea Network. A Kueea System Node MUST also be a Kueea Relay Node.

System parameters

System parameters are a set of name-value pairs. Names are unique within a set. A value can be one of: a number, a sequence of characters, an array of values, or a set of name-value pairs.

When a named value is a set, a dot (.) character in a name reference refers to a named value within the named set; the dot is a separator. The part before the dot names the enclosing set, the rest of the name is the named value within the set. The rule is applied recursively on the text after the separator.
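The recursive rule for dotted name references can be sketched as follows. This is a minimal illustration, assuming that sets of name-value pairs are held as Python dictionaries; the function name resolve and the sample parameter names are hypothetical and are not defined by this document.

```python
def resolve(params, name):
    """Return the value that a dotted name reference points to."""
    head, dot, rest = name.partition(".")
    if head not in params:
        raise KeyError(name)
    value = params[head]
    if not dot:
        return value  # no separator left: this is the named value
    if not isinstance(value, dict):
        raise KeyError(name)  # a dot may only descend into a set
    return resolve(value, rest)  # apply the rule to the text after the dot


# Hypothetical parameters, for illustration only.
example = {"outer": {"inner": {"leaf": 7}}, "top": 1}
print(resolve(example, "outer.inner.leaf"))
```

The sketch treats a dot after a name whose value is not a set as an error; the text above does not say what should happen in that case.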

Names are case sensitive and are sequences of characters. Names SHOULD be in English.

The character set is the most recent revision of the Universal Coded Character Set. [ISO10646] [UNICODE]

When a value is said to be an external reference, it is a unique sequence of characters that references something. The value SHOULD be a Uniform Resource Identifier [RFC3986] that has a well-defined mechanism for dereferencing it to the referenced thing.

System identifier

A ruleset is a serialization of system parameters.

The canonical ruleset is a serialization using JSON. Sets of name-value pairs are expressed as objects, numbers as numbers, arrays of values as arrays, and sequences of characters as strings. [RFC8259] This serialization is devoid of whitespace between values and, within sets, values are ordered by name in ascending order by code point.
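A minimal sketch of the canonical serialization, using the json module of the Python standard library. The function name canonical_ruleset is hypothetical. Two choices here are assumptions not stated above: non-ASCII characters are written unescaped, and the resulting text is encoded as UTF-8 before hashing.

```python
import json


def canonical_ruleset(params):
    """Serialize system parameters without whitespace and with sorted names."""
    text = json.dumps(
        params,
        sort_keys=True,         # Python compares strings by code point
        separators=(",", ":"),  # no whitespace between values
        ensure_ascii=False,     # assumption: no \uXXXX escaping of non-ASCII
    )
    return text.encode("utf-8")  # assumption: hashes are taken over UTF-8


# Hypothetical parameters, for illustration only.
print(canonical_ruleset({"b": 1, "a": {"d": [1, 2], "c": "x"}}))
```

Sorting applies to nested sets as well, so two implementations that agree on the parameters should produce identical output regardless of insertion order.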

A system identifier is a set of at least two hashes, where a hash is a pair of a hash identifier and a hash value.

The hashes are used for looking up the ruleset in a database. At least two hashes are used as a protection against hash collisions. All of the hashes MUST match the looked up ruleset.

A hash identifier is a number. Recognized hash identifiers are those registered in the IANA Named Information Hash Algorithm Registry. [RFC6920] The number corresponds to the Suite ID field.

A hash value is a sequence of bits. Hash values are computed over the canonical ruleset.

Therefore, all system parameters are immutable and any change in any parameter changes the system identifier.
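The following sketch computes and checks a system identifier over an already canonicalized ruleset. The function names are hypothetical. Suite ID 1 is sha-256 in RFC 6920; Suite ID 8 is assumed here to be sha-512 in the IANA registry and should be checked against the registry before use.

```python
import hashlib

# Suite ID -> hash constructor. ID 8 = sha-512 is an assumption (see above).
SUITES = {1: hashlib.sha256, 8: hashlib.sha512}


def system_identifier(canonical, suite_ids=(1, 8)):
    """Return a list of (hash identifier, hash value) pairs."""
    if len(set(suite_ids)) < 2:
        raise ValueError("a system identifier needs at least two hashes")
    return [(sid, SUITES[sid](canonical).digest()) for sid in suite_ids]


def matches(identifier, canonical):
    """True only if every hash in the identifier matches the ruleset."""
    return all(SUITES[sid](canonical).digest() == value
               for sid, value in identifier)


ruleset = b'{"netProto":"urn:example:proto"}'  # hypothetical ruleset
ident = system_identifier(ruleset)
print(matches(ident, ruleset))
```

Changing a single byte of the ruleset changes every hash, which is why the parameters are effectively immutable under a given identifier.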

Node identifier

A Kueea Node may freely move around within the underlying network, causing its network-protocol address (physical location) to change.

A node identifier is a bit sequence which uniquely identifies a Kueea Node within the scope of a given Kueea System.

The length of a node identifier MUST NOT exceed 65528 bits.

As a system parameter value, a node identifier is a sequence of characters.

The domain of node identifiers depends on the node subsystem.

In order to uniquely reference a Kueea Node within Kueea Network, one needs to create a pair of system and node identifiers. While having node identifiers be globally unique might be desirable, it would require all Kueea Relay Nodes to synchronize their data.

Kueea identifier

A kueea identifier is a bit sequence which uniquely identifies a kueea within the scope of a given Kueea System.

The length of a kueea identifier MUST NOT exceed 65528 bits.

As a system parameter value, a kueea identifier is a sequence of characters.

The domain of kueea identifiers depends on a system parameter. The section about receiving a message contains more information.

Systems database

Each Kueea Relay Node contains a systems database, which collects information about Kueea Systems: their system parameters and data about their nodes and kueea.

Information in this database is considered public knowledge.

The initial set of systems in the database is transmitted out-of-band. The ideal situation is when these systems are passed by word of mouth, when this information is something one is taught at school.

If a node receives a message to a system it does not recognize, the receiver can ask the sender about the system parameters and consequently automatically populate the database.

Message

A message is a data structure composed of the following fields:

  1. destination system identifier,
  2. destination node identifier,
  3. source kueea identifier,
  4. message payload,
  5. source kueea signature.

Message payload is a sequence of bits, which MUST NOT be longer than 4294967288 bits.

The payload begins with a subsystem number, which is a non-negative number, followed by subsystem payload.

Subsystems are defined by the system parameter named subsystems. Its value is an array of sets of name-value pairs. The subsystem number is an index to this array.

Each name-value pair in the set is called a subsystem parameter. Subsystem parameters named proto and required MUST exist. The value of proto is an external reference identifying the subsystem. The value of required is a number, which when non-zero, indicates that every node in the system MUST support the subsystem in question.

A signature is a pair of a signature scheme identifier and a sequence of bits - the signature data.

The set of recognized signature scheme identifiers is determined by the system parameter sigSchemes, which is an array of external references, each naming a signature scheme. The signature scheme identifier is an index to this array.
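As an illustration, the two index lookups described above might look like this. The function names and the urn:example references are hypothetical; only the parameter names subsystems, proto, required and sigSchemes come from this document.

```python
def subsystem_for(params, number):
    """Look up a subsystem by its subsystem number (an array index)."""
    subsystems = params["subsystems"]
    if not 0 <= number < len(subsystems):
        raise LookupError("unknown subsystem number: %d" % number)
    subsystem = subsystems[number]
    for name in ("proto", "required"):  # both parameters MUST exist
        if name not in subsystem:
            raise ValueError("subsystem %d lacks %r" % (number, name))
    return subsystem


def signature_scheme_for(params, scheme_id):
    """Look up a signature scheme by its identifier (an array index)."""
    schemes = params["sigSchemes"]
    if not 0 <= scheme_id < len(schemes):
        raise LookupError("unknown signature scheme: %d" % scheme_id)
    return schemes[scheme_id]


# Hypothetical system parameters, for illustration only.
params = {
    "subsystems": [
        {"proto": "urn:example:roles", "required": 1},
        {"proto": "urn:example:rpc", "required": 0},
    ],
    "sigSchemes": ["urn:example:sig-a"],
}
print(subsystem_for(params, 1)["proto"])
print(signature_scheme_for(params, 0))
```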

Message bitstream

In order to encode a bit sequence with a length width of m bits: append m bits that encode the length of the sequence as an unsigned integer, most significant bits first, followed by the bits of the sequence, in sequence order.
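A minimal sketch of this length-prefixed encoding. For readability it represents a bit sequence as a Python string of '0' and '1' characters, which is an assumption of the sketch and not a wire format; the function names are hypothetical.

```python
def encode_uint(value, width):
    """Encode value as an unsigned integer of width bits, MSB first."""
    if value < 0 or value >> width:
        raise ValueError("value does not fit in %d bits" % width)
    return format(value, f"0{width}b")


def encode_bits(bits, m):
    """Prefix a bit sequence with its length, using a length width of m bits."""
    return encode_uint(len(bits), m) + bits


print(encode_bits("101", 8))
```

The length field counts bits, not octets, so a sequence does not have to be a whole number of octets long.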

In order to encode a subsystem number (number) and a subsystem payload (data, a bit sequence) into a message payload (payload):

  1. Let payload be an empty sequence of bits.
  2. While number is greater than or equal to 255:
    1. Append eight set bits (11111111 in binary) to payload.
    2. Subtract 255 from number.
  3. Append number to payload, encoded as an unsigned integer spanning 8 bits, most significant bits first.
  4. Append data to payload.
  5. Return payload.
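The steps above can be sketched as follows, together with a matching decoder. Bit sequences are represented as strings of '0' and '1' characters for readability, which is an assumption of the sketch; the decoder is not described in this document and is included only to show that the encoding is reversible.

```python
def encode_payload(number, data):
    """Encode a subsystem number and subsystem payload into a message payload."""
    if number < 0:
        raise ValueError("subsystem number must be non-negative")
    payload = ""
    while number >= 255:
        payload += "1" * 8          # eight set bits mean "add 255 and go on"
        number -= 255
    payload += format(number, "08b")  # final octet is always below 255
    payload += data
    return payload


def decode_payload(payload):
    """Inverse of encode_payload: return (number, data)."""
    number, pos = 0, 0
    while True:
        if pos + 8 > len(payload):
            raise ValueError("truncated subsystem number")
        octet = int(payload[pos:pos + 8], 2)
        pos += 8
        number += octet
        if octet != 255:
            return number, payload[pos:]


print(encode_payload(300, "01"))
```

Subsystem numbers below 255 cost a single octet, and every further 255 costs one more octet.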

In order to encode a message into a message bitstream:

  1. Let msg be an empty sequence of bits.
  2. Append the number of hashes in the destination system identifier to msg, encoded as an unsigned integer spanning 8 bits, most significant bits first.
  3. For each hash in the destination system identifier: encode the hash into the Binary Name Format [RFC6920], then append the resulting sequence of bits to msg, encoded as a bit sequence with a length width of 16 bits.
  4. Append the destination node identifier to msg, encoded as a bit sequence with a length width of 16 bits.
  5. Append the source kueea identifier to msg, encoded as a bit sequence with a length width of 16 bits.
  6. Append the chosen signature scheme identifier to msg, encoded as an unsigned integer spanning 8 bits, most significant bits first.
  7. Append the message payload to msg, encoded as a bit sequence with a length width of 32 bits.
  8. Compute signature data over msg and append the resulting bit sequence to msg. (Length is implied by the identifier.)
  9. Return msg.

The kueea signs the message as part of generating a message bitstream. The bitstream cannot be computed if the signature cannot be generated.
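The message encoding steps can be sketched as follows. Bit sequences are strings of '0' and '1' characters, and sign is a hypothetical callable standing in for whatever signature scheme the system selects. The Binary Name Format is assumed here to be two reserved zero bits, a 6-bit Suite ID and then the hash value; this should be checked against RFC 6920.

```python
def uint(value, width):
    """Unsigned integer of width bits, most significant bits first."""
    if value < 0 or value >> width:
        raise ValueError("value does not fit in %d bits" % width)
    return format(value, f"0{width}b")


def sized(bits, width):
    """Bit sequence prefixed by its length, with the given length width."""
    return uint(len(bits), width) + bits


def binary_name(suite_id, hash_value):
    """Assumed RFC 6920 binary format: 2 reserved bits, 6-bit Suite ID, hash."""
    return "00" + uint(suite_id, 6) + "".join(format(b, "08b") for b in hash_value)


def encode_message(system_id, node_id, kueea_id, scheme_id, payload, sign):
    """system_id is a list of (suite id, hash bytes); sign maps bits to bits."""
    msg = uint(len(system_id), 8)                 # step 2
    for suite_id, hash_value in system_id:        # step 3
        msg += sized(binary_name(suite_id, hash_value), 16)
    msg += sized(node_id, 16)                     # step 4
    msg += sized(kueea_id, 16)                    # step 5
    msg += uint(scheme_id, 8)                     # step 6
    msg += sized(payload, 32)                     # step 7
    msg += sign(msg)                              # step 8: no length prefix
    return msg                                    # step 9


# Dummy values and a dummy signer, for illustration only.
stream = encode_message([(1, bytes(32)), (8, bytes(64))], "1010", "11", 0,
                        "00000000", lambda bits: "1" * 8)
print(len(stream))
```

Because the signature is computed over everything that precedes it, the signature scheme identifier is itself covered by the signature.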

Kueea Network Protocol

Each Kueea Relay Node MUST implement at least one Kueea Network Protocol. A Kueea Network Protocol defines the method of message transfer, which is composed of three functions: node distance, sending a message and receiving a message.

Kueea Nodes MUST implement the protocol of their Kueea System; Kueea Relay Nodes MAY implement multiple protocols simultaneously. They are thus able to pass on messages to and from Kueea Systems utilizing different Kueea Network Protocols. Such a translation would happen when one Kueea Node sends a message to another one which is not part of the same Kueea System.

The protocol of a Kueea System is identified by the system parameter netProto. Its value is an external reference to an implementation. Each implementation MAY define additional system parameters. The names of such parameters MUST begin with the character sequence net.

Node distance

Given a system identifier, source and destination node identifiers, the distance function outputs two values: the estimated time of delivery and the amount of routing points on the way to the node.

The source node counts as one routing point, so the minimum for a reachable node is one. If the number of routing points is zero, the node is unreachable.

The estimated time SHOULD be in nanoseconds.

Sending a message

Given a destination system identifier, destination node identifier, source kueea, message payload and a signature scheme identifier, the send function outputs a value indicating whether the message bitstream was successfully accepted for delivery over a physical network link. The output value is either a boolean or an error code.

Once the message is accepted, the caller assumes that it will be delivered. There is no indication that the message actually reached its destination. Higher-level protocols must define a reply if a confirmation is needed.

The destination system identifier in a message MUST NOT be empty.

If the destination node identifier is empty, the message is broadcast to all nodes in the destination system.

An implementation MUST protect the message bitstream from tampering and eavesdropping while the data is in transit. One way to achieve this is to encrypt and sign a message to its recipient so that only the recipient is able to read it. Broadcast messages may be protected with symmetric encryption using encryption data communicated via a system parameter.

Receiving a message

Upon receiving a message, an implementation verifies the signature. Kueea Relay Nodes MAY skip signature verification; the destination Kueea Node MUST verify the signature. A message that fails signature verification MUST be discarded.

The mechanism for dereferencing a kueea identifier to the data necessary for verification of a signature is defined by the system parameter sigVerify. Its value is an external reference to said mechanism. This mechanism also implicitly defines the domain of kueea identifiers.

The recv event handler is executed with the decoded message as input once the signature in the message has been successfully verified. Decoded here means decoded from the bitstream into a message object.
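The receive flow at a destination node can be sketched as follows. The Message class, its field names and the verify callable are hypothetical; verify stands in for whatever mechanism the sigVerify parameter references.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """A decoded message; field names are illustrative only."""
    system_id: tuple
    node_id: str
    kueea_id: str
    payload: str
    signature: tuple  # (signature scheme identifier, signature data)


def receive(message, verify, recv):
    """Run the recv handler only after successful signature verification."""
    if not verify(message):
        return False  # the message is discarded, recv never sees it
    recv(message)
    return True


# Dummy verifier and handler, for illustration only.
handled = []
msg = Message((), "", "k", "", (0, ""))
print(receive(msg, lambda m: True, handled.append))
```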

Subsystem: Roles

Kueea Nodes may serve a particular role in a Kueea System. The role assignment is managed by the role assignment subsystem, which MUST be defined and is always required.

A node broadcasts to its system an intent to serve a given role. Other nodes, upon receiving the intent, may accept it or not. The acceptance is done either explicitly by a reply message or implicitly by receiving incoming data associated with the role.

Examples of roles are: value registry, data archive, public cache, etc. Basically, these are nodes that are frequently accessed by others. Most of them are expected to have fast connections and high availability.

The intent is periodically broadcast to ensure it reaches every node. (One cannot obtain this data from a node that rejected the intent.) Its frequency is a function of how much a service is used and time. If the role is actively utilized, informing others is unnecessary. If no one utilizes the role, the intent should be sent more often, but if it is still not being used, then it is better to stop for a while.

A healthy network is one where multiple nodes take on the same role, so that if one of them becomes unavailable, there are others remaining; the availability of a service is thus preserved.

An important part here is that roles are established dynamically and may not require any intervention from a user. Role assignment can be made fully automatic.

Joining and leaving a Kueea System

In order to join a Kueea System, the administrator of a node broadcasts a "join request" message. Its syntactical structure is defined by the implementation.

The message contains identifiers of the node and its administrator (kueea) as well as the node's addresses in the message transfer subsystem. The message MUST be signed by the administrator, who is held responsible for the node's network behaviour.

Note that the administrator is not necessarily a node's user, but as an administrator they have the means to know who was using the machine and when, and can lecture the user in question.

Recipients individually reply to the message with one that either accepts the new node into the system or refuses the join request. If a Kueea Node accepts the request, the reply MUST be sent. The refusal message is RECOMMENDED.

Each node updates its systems database with the received data.

The request to leave a system is done similarly, although nodes MAY be automatically removed if they are unreachable.

Subsystem: Remote Procedure Call

Each Kueea System SHOULD define a required RPC subsystem. It is the core subsystem in a Kueea System responsible for controlling nodes. The subsystem defines machine requirements of a node in the system. Its primary purpose is execution of decentralized programs, which are the foundation for a network-based operating system.

This should actually be a "must" but is only a "should", because Kueea Network technology may also be useful in other contexts.

Resources

Resources in Kueea Network are all vertices of a graph. They are either device-independent, in which case their identifiers are node-agnostic and do not take device or node identifiers into account; or they are device-specific, in which case the node needs to be specified.

Device-specific resources are generally limited to physical devices only.

Kueea

Resources called kueea are special resources, in that these resources have the ability to make changes in the network. Other resources are inanimate objects, which do not act on their own. For example, an author of a document would be a kueea.

Kueea do not necessarily represent living beings. They may be abstract concepts (like nature) or fictional characters. Organizations also have kueea – their public image. Think of them as masks a human puts on when interacting with others. One may also think of a communication channel or a label.

Kueea is a resource which has authentication properties. These properties are used to verify cryptographic signatures. Technically, kueea are what sign messages in Kueea Network.

Users

The term user refers to a physical being (usually a human) which interacts with a Kueea Node via its terminal device(s), by which the user issues commands to its kueea to do things within a system.

A given user is always the same user. A kueea may represent a user, but it is not the user. The current user of a kueea is a property that may change, for example because one user sold its kueea to another.

From the point of view of Kueea Network, users are out of its scope. The network can contain information about them, though.

A user resource is a collection of kueea credentials. The credentials are required to generate signatures (control kueea). The user resource should be stored on a device one carries with oneself. It could be a pendrive or some other removable storage device. The data should never be stored on a publicly accessible storage medium.

Data and metadata

Bit sequences are categorized as either data or metadata.

Format of a metadata sequence is predefined. The sequences contain statements about resources in the network. Resources might be binary or might not have any binary representation. All of these sequences are publicly accessible resources.

Format of a data sequence is undefined. These sequences SHOULD be immutable resources. Every data sequence has a corresponding metadata sequence, which contains a description of the former.

Subsystem: Metadata

Metadata sequences have a name in a metadata subsystem. This system defines the format of metadata sequences and their identification and management protocols.

It names and describes resources for Kueea Node users.

The systems database SHOULD be stored within the metadata subsystem.

Subsystem: Data storage

Both data and metadata sequences are stored in a data storage subsystem.

This system assigns lookup parameters to bit sequences, which are all based entirely on the sequences themselves. The parameters are generally the length of the sequence and the results of applying a hash function on it.
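A minimal sketch of deriving lookup parameters from a sequence alone. The choice of SHA-256, the parameter names and the restriction to whole octets are assumptions of the sketch; this document does not fix any of them.

```python
import hashlib


def lookup_parameters(sequence):
    """Derive lookup parameters from the bytes of a sequence only."""
    return {
        "length": len(sequence) * 8,  # length in bits
        "sha-256": hashlib.sha256(sequence).hexdigest(),
    }


print(lookup_parameters(b"hello"))
```

Since nothing but the sequence itself goes into the parameters, any node holding the same bits derives the same lookup parameters.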

Users generally concern themselves with the metadata subsystem only. Data sequences are managed by node software in the background.

Because other subsystems need to somehow initialize themselves, the data storage subsystem MUST provide in its interface access to special bit sequences called bootstrap sequences. Implementations MUST support at least one such sequence (for the metadata subsystem).

Subsystem: Data archival

Within a Kueea System there is a data archival subsystem. The subsystem ensures that bit sequences can be accessed by users.

It monitors availability of bit sequences and tries to preserve it. Because Kueea Systems are meant to be decentralized, this subsystem SHOULD maintain a state where a sequence is available on more than one Kueea Node.

Policy

Kueea Nodes automatically enforce their system's policy, which is a set of rules written in a machine-readable format. The policy is communicated via system parameters, most of which are tied to the metadata subsystem.

Access denial

Kueea that break the rules are banned for a period of time. The banishment is done by sending a broadcast message with the proof of the breakage, which is one of the reasons all messages are signed.

Denial of access is never permanent. The system parameters banMin and banMax are both numbers that specify, in seconds, the minimum and maximum duration for which a kueea may be banned. The maximum allowed time is 31 days (2678400 seconds); the minimum allowed time is 1 minute (60 seconds). The value of banMin MUST be lower than the value of banMax.
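The constraints on banMin and banMax can be checked as in the sketch below. It assumes that the 60-second floor applies to banMin and the 31-day ceiling to banMax; the function names are hypothetical.

```python
BAN_FLOOR = 60                      # 1 minute, in seconds
BAN_CEILING = 31 * 24 * 60 * 60     # 31 days = 2678400 seconds


def validate_ban_parameters(ban_min, ban_max):
    """Raise ValueError unless 60 <= banMin < banMax <= 2678400."""
    if not BAN_FLOOR <= ban_min < ban_max <= BAN_CEILING:
        raise ValueError("invalid banMin/banMax: %r, %r" % (ban_min, ban_max))


def clamp_ban(duration, ban_min, ban_max):
    """Bring a requested ban duration into the range the system allows."""
    validate_ban_parameters(ban_min, ban_max)
    return max(ban_min, min(duration, ban_max))


print(clamp_ban(10_000_000, 3600, 604800))
```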

The policy is publicly available for review and inspection. Every ban MUST refer to the policy rule that was broken.

Nodes are banned by banning their administrator, which in turn invalidates all join requests sent by the kueea. All of the administrator's nodes are thus banned, not just one.

The punishment for breaking the rules is that a kueea becomes isolated, without access to any of the resources and services offered by the system. If it wishes to regain access to them, it must follow the policy.

If a user does not agree with the rules in the policy, the options left are to establish a new system or look for another. Discussing a policy can only be done while following the rules.

The policy SHOULD only contain rules about a node's network activity, not about the content of human-readable messages on a forum, etc. Rules of the latter kind may exist, but there should be few of them and they should be lax.

A node that does not follow the policy in its Kueea System effectively creates another, hidden system with some other policy. In order to become part of this hidden system, one would need to somehow obtain addresses of its hidden nodes. In other words, life is very difficult for such systems. All nodes would also have to put trust in each other for staying hidden. Discovering these nodes exposes the ones that break the rules.

Access control

Access control in Kueea Systems is consensus-based. Rights are determined by the policy of a system.

Metadata sequences contain access control information among other things. Metadata is always transferred before the data, which might be protected. Access is to be denied before a data request is sent over the wire.

In other words, the metadata subsystem's messages are exchanged first. If the metadata indicates a kueea can access a resource, then the corresponding subsystem's messages are sent.

If the data is unavailable locally, the node will look for it on the network. In order to successfully fetch it, the other node must also agree on the assessment that the requesting kueea may access the data.

Kueea that request data without the necessary credentials to said data are malicious and SHOULD be banned. The access tests MUST be done on both local and remote nodes.
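The order of checks can be sketched as follows. Everything about the data model here is hypothetical: nodes are plain dictionaries, and the metadata field readers is an invented stand-in for whatever access control information the metadata subsystem actually carries.

```python
def may_access(metadata, kueea_id):
    """Hypothetical access test based on a 'readers' list in the metadata."""
    return kueea_id in metadata.get("readers", ())


def fetch(kueea_id, name, local, remote):
    """Fetch data, running the access test on the local and the remote node."""
    if not may_access(local["metadata"][name], kueea_id):
        raise PermissionError("denied locally; no data request is sent")
    if name in local["data"]:
        return local["data"][name]
    # The remote node repeats the test against its own copy of the metadata.
    if not may_access(remote["metadata"][name], kueea_id):
        raise PermissionError("denied by the remote node")
    return remote["data"][name]


# Hypothetical nodes, for illustration only.
meta = {"doc": {"readers": ["alice"]}}
local = {"metadata": meta, "data": {}}
remote = {"metadata": meta, "data": {"doc": b"contents"}}
print(fetch("alice", "doc", local, remote))
```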