Intro to Multi-Keyed Files

From CometWiki

Comet Keyed Files

The most common file type that is handled by the Comet file system is the Keyed file. Using files of this type, programs can store and retrieve information with unmatched ease and performance.

Terminology: The data stored in a file takes the form of RECORDS. Records are usually described by format statements. Each record is composed of FIELDS, and each field is composed of a varying number of bytes of ASCII characters.

Here is an example of a format statement:

Student: Format FirstName$; LastName$; ID$; balance; group; grade; Account$ 

Fields may be string or numeric.

A string field corresponds to an Internet Basic string variable. Its length is exactly the declared length of that variable. String fields are padded on the right with blanks.

A numeric field corresponds to an Internet Basic numeric variable. Its length is the declared length of that variable plus one byte for the sign and another for the decimal point (numeric length + 2). Numeric fields are padded on the left with blanks. The sign is the rightmost character in the field; a blank sign means that the number is positive.

Note: The lengths of values returned by the STR() function differ slightly from the lengths of fields placed in the records of a file. A numeric field in a file always contains space for a sign and a decimal point. The STR() result for a numeric with a precision of zero contains a position for the sign but none for the decimal point.

Declaration            Field Length   STR() Length
Length 5 & local a$    5
Length 5.0 & local X   7              6
Length 5.3 & local Y   7              7
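
The length rules in the table can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not Comet code; the function names `field_length` and `str_length` are hypothetical and exist only for this illustration.

```python
def field_length(length: int, is_string: bool = False) -> int:
    """Bytes a field occupies in a record (hypothetical helper)."""
    # Strings occupy exactly their declared length; numerics reserve two
    # extra bytes, one for the sign and one for the decimal point.
    return length if is_string else length + 2


def str_length(length: int, precision: int) -> int:
    """Length of the STR() result for a numeric (hypothetical helper)."""
    # One position for the sign, plus a decimal point only when the
    # declared precision is greater than zero.
    return length + 1 + (1 if precision > 0 else 0)


print(field_length(5, is_string=True))    # Length 5 & local a$
print(field_length(5), str_length(5, 0))  # Length 5.0 & local X
print(field_length(5), str_length(5, 3))  # Length 5.3 & local Y
```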


Keys

  • Records in a keyed file are also composed of KEYS.
  • Keys may correspond to fields, partial fields, or span several fields.
  • Keys are described by their byte length and position within the record.
  • The bytes comprising a key must be contiguous within the record.
  • The maximum key length is 254 bytes.
  • There must be one and only one PRIMARY KEY for each keyed file.
  • Optionally there may be multiple SUB KEYS.
  • The data comprising the Primary key will be unique within the file.
  • Uniqueness is not required for sub keys.
  • Each sub key has a name associated with it.
  • Sub Key names are not necessarily the same as field names.
  • Sub key names are case insensitive and may have blank characters embedded within them.
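
Because a key is just a contiguous run of bytes at a known position, it can span several adjacent fields. The Python sketch below illustrates the idea; it assumes key positions are counted from 1, and the record layout (two 10-byte names followed by a 5-byte ID) and the helper name `extract_key` are made up for the example.

```python
MAX_KEY_LENGTH = 254  # maximum key length stated in the list above


def extract_key(record: str, position: int, length: int) -> str:
    """Return the contiguous bytes forming a key (position assumed 1-based)."""
    if not 1 <= length <= MAX_KEY_LENGTH:
        raise ValueError("key length must be between 1 and 254 bytes")
    start = position - 1
    if start < 0 or start + length > len(record):
        raise ValueError("key does not fit inside the record")
    return record[start:start + length]


# Hypothetical fixed-width layout: FirstName$ (10), LastName$ (10), ID$ (5).
record = "Ada".ljust(10) + "Lovelace".ljust(10) + "00042"

primary = extract_key(record, 21, 5)   # a key that matches one field
by_name = extract_key(record, 1, 20)   # a key that spans two fields
print(repr(primary), repr(by_name))
```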

Operations

Keyed files may be erased just like any other Comet file:

Erase file-name, dir=directory-name, [excp=exception address]

Creating a Keyed file:

CREATE file-name, Record-length, K, Key-length, Key-position, Dir=directory-name, [excp=exception address]

This creates the keyed file with a primary key only.

Creating a Subkey:

CreateKey file-name, key-name, Key-length, Key-position, Dir=directory-name, [excp=exception address]

This creates a sub key for the file with a name of key-name. The file must exist and MUST BE EMPTY when creating sub keys. Up to 99 sub keys may be created for each file. Key names may be up to 64 characters in length, are case insensitive, and spaces within key names are not significant.

Using Keyed files

All operations on Comet files are done through LUNs. LUN stands for “Logical Unit Number”. Each Program may have up to 100 LUNs open at any given time. LUN 0 is reserved for the terminal or display. This leaves LUN 1 to LUN 99 available for connection to files or devices.

Opening a keyed file

To open a given file for use on a LUN using the PRIMARY KEY:

Open(LUN)file-name[, Dir=directory-name][, excp=exception address]

To open a given file for use on a LUN using a SUB KEY of a given name:

Open(LUN)file-name, key=key-name[, Dir=directory-name][, excp=exception address]

Once the file is opened using a sub key, SOME operations on that LUN will be directed to that sub key, while others are oriented around the record and not the key.

Write operations

  • Write
  • Insert
  • Rewrite

Write(LUN, Format) [, excp=exception address]

Write operations always work on the record, no matter which sub key the file is opened on. The primary key governs whether an existing record is overwritten. The Insert and Rewrite statements control the action taken when the same primary key already exists in the file: if the primary key already exists, Insert fails with excp=56; if the primary key is not found, Rewrite fails with excp=32. If the Write, Insert, or Rewrite is successful, every sub key is maintained correctly: if a sub key value has changed for a particular record, the old sub key entry is deleted and the new one takes its place.
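
The difference between the three statements can be modelled in a few lines. This is a Python sketch of the behavior described above, not the Comet implementation; `CometException`, `KeyedFileModel`, and the dictionary-based records are hypothetical names chosen for the illustration.

```python
class CometException(Exception):
    """Hypothetical stand-in for a Comet exception and its excp number."""
    def __init__(self, code: int):
        super().__init__(f"excp={code}")
        self.code = code


class KeyedFileModel:
    """In-memory model: one primary key plus named sub key indexes."""

    def __init__(self, primary, sub_keys):
        self.primary = primary      # function: record -> primary key value
        self.sub_keys = sub_keys    # name -> function: record -> sub key value
        self.records = {}           # primary key value -> record
        self.index = {name: {} for name in sub_keys}

    def _store(self, record):
        pk = self.primary(record)
        old = self.records.get(pk)
        for name, get in self.sub_keys.items():
            if old is not None:
                # Drop the entry that pointed at the old sub key value.
                entries = self.index[name][get(old)]
                entries.discard(pk)
                if not entries:
                    del self.index[name][get(old)]
            self.index[name].setdefault(get(record), set()).add(pk)
        self.records[pk] = record

    def write(self, record):
        self._store(record)         # adds or overwrites

    def insert(self, record):
        if self.primary(record) in self.records:
            raise CometException(56)    # primary key already exists
        self._store(record)

    def rewrite(self, record):
        if self.primary(record) not in self.records:
            raise CometException(32)    # primary key not found
        self._store(record)


students = KeyedFileModel(
    primary=lambda r: r["ID"],
    sub_keys={"LASTNAME": lambda r: r["LastName"]},
)
students.insert({"ID": "00042", "LastName": "Lovelace"})
students.rewrite({"ID": "00042", "LastName": "King"})
print(students.index["LASTNAME"])   # {'King': {'00042'}}
```

In the model, the rewrite replaces the `Lovelace` entry in the sub key index with `King`, which mirrors the sub key maintenance described above.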

Since all key information is contained in the record for the file, it is never necessary to supply a key value in a write statement. If a key value is supplied, that value will be copied into the record as the primary key no matter which key the LUN designates.

Read operations

  • Read
  • Input
  • Inquire

Read (LUN, Format)[key=Key-value][, excp=exception address]

Read operations use the particular key designated by the LUN. If a key value is specified for a file opened using a sub key, the record read will be the first record encountered with that sub key. (There will be more on sub key order later.) If a key is not specified, the next record in key order will be returned.

Functions

Key Function

X$ = key(LUN [, excp=exception address])

The key function returns the next key in the file. If it is performed for a LUN opened on a sub key, and the file is read sequentially, the key value returned will be the same until the file points to a different key value.

Fstat Function

x$ = FSTAT(filename[,DIR=directory][,EXCP=statement-label])

The Fstat function returns information about the file. See the FSTAT reference for the full contents of the information string returned.


Keystat Function

x$ = keystat(LUN)KeyNumber[, excp=exception address]

Where KeyNumber is a number from 0 to 99 indicating the key index. An index of zero refers to the primary key, an index of one refers to the first sub key, and so on.

The data returned by the KeyStat Function is:

Position   Length   Description/Values
1          5        Key length
6          5        Key position
11         64       Key name

The key name returned is in internal format: all alphabetic characters are converted to upper case and internal spaces are removed.
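
The layout above is fixed-width, so the string can be split by position. The Python sketch below shows one way to do that and to reproduce the internal key name format. It assumes the two numeric parts are digits padded with blanks; the exact padding is not documented here, and the names `parse_keystat` and `internal_key_name` are hypothetical.

```python
def internal_key_name(name: str) -> str:
    """Upper-case a key name and drop its spaces (the internal format)."""
    return name.replace(" ", "").upper()


def parse_keystat(info: str):
    """Split a KeyStat-style string into (key length, key position, key name)."""
    key_length = int(info[0:5])       # positions 1-5
    key_position = int(info[5:10])    # positions 6-10
    key_name = info[10:74].rstrip()   # positions 11-74
    return key_length, key_position, key_name


# Sample string built by hand for the illustration, not real KeyStat output.
sample = "   20" + "    1" + internal_key_name("Last Name").ljust(64)
print(parse_keystat(sample))   # (20, 1, 'LASTNAME')
```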

Other operations

Position

Position(LUN)key=key-value[, excp=exception address]

Position on a LUN opened with a sub key will set the file so that the next record read will be the first record containing that sub key. Subsequent sequential reads will read subsequent records.

Extract

Extract (LUN, Format)[key=Key-value][, excp=exception address]

Extract reads a record from the file and locks that particular record from other reads or extracts until the next write to the LUN. Any key value supplied must be a primary key. The LUN must refer to a file opened with the primary key.

Delete

Delete(LUN)key=key-value[, excp=exception address]

Deleting a record removes the record from all of the keys associated with the file. The vacant space will be re-used for a subsequent write. Any key value supplied must be a primary key. The LUN must refer to a file opened with the primary key.

Extract and delete always refer to the record, not the key, but they require a key to be supplied. If one of these operations is attempted on a LUN opened for a sub key, an excp 53 will result. Extract and Delete MUST be performed on a LUN opened to the primary key only.
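
The rule can be pictured as a guard that runs before the operation. This is a Python sketch only; `CometException` and `check_record_operation` are hypothetical names, and the real check is made by the Comet file system.

```python
class CometException(Exception):
    """Hypothetical stand-in for a Comet exception and its excp number."""
    def __init__(self, code: int):
        super().__init__(f"excp={code}")
        self.code = code


def check_record_operation(opened_key_name):
    """Raise excp 53 unless the LUN was opened on the primary key.

    opened_key_name is None for a primary key open, else the sub key name.
    """
    if opened_key_name is not None:
        raise CometException(53)


check_record_operation(None)            # primary key LUN: allowed
try:
    check_record_operation("LASTNAME")  # sub key LUN: rejected
except CometException as exc:
    print(exc)                          # excp=53
```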

Key Order

Since there is one and only one record for each primary key, record order for files opened with the primary key is not an issue. Files opened using a sub key are a different matter. Within a given sub key value, records are ordered by their position in the data file. When a file is initially created and filled with records, this is the order in which the records were written to the file. However, if any records are deleted, the space they occupied is saved and assigned to new records that are written later. That affects the order in which the records for a particular sub key are read. Thus it is best to assume that multiple records with the same sub key value are in no particular order.
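
The effect of re-using deleted space can be seen in a small model. This Python sketch assumes freed positions are handed to the next write; which free position the real file system picks is not stated here. `SlotFileModel` and the two-part records (primary key, sub key value) are hypothetical.

```python
class SlotFileModel:
    """Data file modelled as a list of slots; deleted slots are re-used."""

    def __init__(self):
        self.slots = []   # one record, or None, per data-file position
        self.free = []    # positions released by deletes

    def write(self, record):
        if self.free:
            self.slots[self.free.pop()] = record   # re-use vacant space
        else:
            self.slots.append(record)

    def delete(self, primary_key):
        for pos, rec in enumerate(self.slots):
            if rec is not None and rec[0] == primary_key:
                self.slots[pos] = None
                self.free.append(pos)
                return
        raise KeyError(primary_key)

    def read_by_sub_key(self, value):
        # Within one sub key value, records come back in data-file order.
        return [rec[0] for rec in self.slots
                if rec is not None and rec[1] == value]


f = SlotFileModel()
for pk in ("A1", "A2", "A3"):
    f.write((pk, "MATH"))
print(f.read_by_sub_key("MATH"))   # ['A1', 'A2', 'A3']

f.delete("A1")
f.write(("A4", "MATH"))            # lands in the slot A1 vacated
print(f.read_by_sub_key("MATH"))   # ['A4', 'A2', 'A3']
```

The newest record is read first after the delete, which is why write order should not be relied upon for records that share a sub key value.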

Performance

Reading a file using a sub key is just as fast as reading the same file using the primary key.

Writing is another matter. Using sub keys skews performance in an interesting way. Since a separate key tree is maintained for each sub key, writing to a file with sub keys is slower than writing to a file with a primary key only. Our guess is that writing each sub key takes 75-80% of the time it takes to write the primary key. Compared with the old method of maintaining a separate keyed-only file for each sub key, the whole write procedure is still much faster, since a single transaction to the file system maintains all keys instead of multiple transactions on multiple LUNs. Keep in mind that more sub keys will degrade performance. For example, writing a record to a file with 50 sub keys is equivalent to writing about 40 keyed files!
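
The closing figure follows from the 75-80% guess. Here is a small Python sketch of the arithmetic, where a cost of 1.0 stands for one write to a file with a primary key only; `write_cost_range` is a hypothetical helper and the percentages are the estimate quoted above, not measurements.

```python
def write_cost_range(sub_keys: int, low: float = 0.75, high: float = 0.80):
    """Estimated cost of one write, in units of a primary-key-only write."""
    # One unit for the primary key, plus 75-80% of a unit per sub key.
    return 1 + sub_keys * low, 1 + sub_keys * high


low, high = write_cost_range(50)
print(f"{low:.1f} to {high:.1f}")   # 38.5 to 41.0
```

With 50 sub keys the estimate lands between 38.5 and 41 primary-key writes, which is where the figure of about 40 keyed files comes from.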
