Atoms
Purpose
To represent a real-world thing in an information system context, you use atoms.
Description
An atom refers to an individual object in the real world, such as the student called "Caroline". But what if there are three different Carolines? What does it mean to say "Caroline has passed the exam for Spanish Medieval Literature"? This sentence might be true for one Caroline but false for the others. Clearly, to avoid ambiguous sentences, an atom must identify exactly one real-world object, no more, no less. Or rather, it suffices that the atom identifies exactly one object within the context in which we are working: if the context is a group with only one Caroline, there is no ambiguity. Similarly, ABBA is unique among all pop groups in the world, there ought to be only one building permit with number 5678, and so on.
Examples
"Caroline", 5, 1917-11-07
48, 10.34, 2., .001, -125, +5.33333, 2.5E2, 5E-3
Syntax and meaning
The syntax of atoms is largely taken from ISO 8601 and corresponds to the syntax of SQL and Excel. (Acknowledgement: the following text was adapted from Wikipedia.)
- Date and time values are ordered from the largest to smallest unit of time: year, month (or week), day, hour, minute, second, and fraction of second. The lexicographical order of the representation thus corresponds to chronological order, except for date representations involving negative years. This allows dates to be naturally sorted by, for example, file systems.
- Each date and time value has a fixed number of digits that must be padded with leading zeros.
- Representations can be written in one of two formats: a basic format with a minimal number of separators, or an extended format with separators added to enhance human readability. The separator used between date values (year, month, week, and day) is the hyphen, while the colon is used as the separator between time values (hours, minutes, and seconds).
- For reduced accuracy, any number of values may be dropped from any of the date and time representations, but in the order from the least to the most significant. For example, "2004-05" is a valid ISO 8601 date, which indicates May (the fifth month) 2004. This format will never represent the 5th day of an unspecified month in 2004, nor will it represent a time-span extending from 2004 into 2005.
- If necessary for a particular application, the standard supports the addition of a decimal fraction to the smallest time value in the representation.
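The ordering and padding rules above can be illustrated with a small sketch. It uses Python's standard datetime module rather than Ampersand itself, and the sample dates and variable names are made up for illustration.

```python
from datetime import date, datetime

# ISO 8601 extended-format dates, deliberately out of order (sample data).
stamps = ["2004-05-17", "1917-11-07", "2004-05-03", "1999-12-31"]

# Sorting the raw strings gives the same order as sorting the parsed dates,
# because the units run from largest to smallest and are zero-padded.
lexicographic = sorted(stamps)
chronological = sorted(stamps, key=date.fromisoformat)
assert lexicographic == chronological

# Reduced accuracy: "2004-05" means May 2004 (year and month only).
may_2004 = datetime.strptime("2004-05", "%Y-%m")
assert (may_2004.year, may_2004.month) == (2004, 5)

# Without leading zeros the string order no longer matches time order:
# November sorts before May here.
unpadded = ["2004-5-3", "2004-11-7"]
assert sorted(unpadded) == ["2004-11-7", "2004-5-3"]
```

The last assertion shows why the fixed number of digits matters: dropping the padding keeps the values readable but breaks the correspondence between text order and chronological order.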
Atomic types
Atoms are represented in an SQL database. For this purpose, every atom has a type (sometimes called the technical type). The representation in SQL is given in the following table.
type | purpose | SQL | closed | eq |
---|---|---|---|---|
ALPHANUMERIC | to represent strings of short length, i.e. less than 255 characters | VARCHAR(255) | yes | yes |
BIGALPHANUMERIC | to represent large strings of limited length, i.e. less than 64 KB | TEXT | no | yes |
HUGEALPHANUMERIC | to represent strings of arbitrary length | MEDIUMTEXT | no | no |
PASSWORD | to represent passwords in a secure way | VARCHAR(255) | no | yes |
BINARY | to represent uninterpreted binary data of short length | BLOB | no | no |
BIGBINARY | to represent large binary data of limited length | MEDIUMBLOB | no | no |
HUGEBINARY | to represent large binary data of arbitrary length | LONGBLOB | no | no |
DATE | to represent dates compatible with ISO8601 | DATE | yes | yes |
DATETIME | to represent timestamps compatible with ISO8601 | DATETIME | yes | yes |
BOOLEAN | to represent True and False values | BOOLEAN | yes | yes |
INTEGER | to represent positive and negative whole numbers in the range [-2^63 .. 2^63-1] | BIGINT | yes | yes |
FLOAT | to represent floating-point numbers compatible with ISO8601 | FLOAT | no | no |
Object | to represent a key value for objects; it is not meant to be visible to end-users. | VARCHAR(255) | yes | yes |
all other atoms | | VARCHAR(255) | yes | yes |
The last column, eq, tells whether Ampersand implements equality on these types. If equality is not defined, the operators \/, /\, -, \, /, ;, and <> cannot be used.
The distinction between closed and open types is relevant in the following situations:
- The complement of a relation, -r[A*B], is defined only if both A and B are closed.
- The full relation, V[A*B], is defined only if both A and B are closed.
- An interface INTERFACE X : e requires that the target of e is closed.
Inconsistencies are currently signaled at runtime, but future versions of Ampersand will signal these violations at compile time.
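To make the role of the closed and eq columns concrete, here is a minimal Python sketch of the checks described above. It is not how Ampersand implements them: the names AtomicType, ATOMIC_TYPES, complement_defined, interface_allowed, and equality_defined are hypothetical, and only a few rows of the table are copied in.

```python
from typing import NamedTuple


class AtomicType(NamedTuple):
    sql: str      # SQL representation
    closed: bool  # "closed" column of the table
    eq: bool      # "eq" column of the table


# A few rows copied from the table of atomic types (not the full table).
ATOMIC_TYPES = {
    "ALPHANUMERIC": AtomicType("VARCHAR(255)", True, True),
    "BIGALPHANUMERIC": AtomicType("TEXT", False, True),
    "HUGEALPHANUMERIC": AtomicType("MEDIUMTEXT", False, False),
    "DATE": AtomicType("DATE", True, True),
    "INTEGER": AtomicType("BIGINT", True, True),
    "FLOAT": AtomicType("FLOAT", False, False),
}


def complement_defined(source: str, target: str) -> bool:
    """-r[A*B] and V[A*B] are defined only if both A and B are closed."""
    return ATOMIC_TYPES[source].closed and ATOMIC_TYPES[target].closed


def interface_allowed(target: str) -> bool:
    """INTERFACE X : e requires the target of e to be closed."""
    return ATOMIC_TYPES[target].closed


def equality_defined(type_name: str) -> bool:
    """Operators that rely on equality need eq = yes."""
    return ATOMIC_TYPES[type_name].eq
```

For example, complement_defined("ALPHANUMERIC", "FLOAT") is False because FLOAT is not closed, while equality_defined("BIGALPHANUMERIC") is True even though that type is open.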
Miscellaneous
Every atom whose atomic type is marked "yes" in the column "eq" can be compared for equality. For all other atoms, equality is not defined.
The following Ampersand statement declares the atomic type of a concept:
REPRESENT <Concepts> TYPE <Atomic type>
e.g.
REPRESENT LegalEntity TYPE ALPHANUMERIC
If Person and Company are both LegalEntity, then both of them will be implicitly declared as ALPHANUMERIC too.
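The implicit declaration can be pictured as a lookup that walks from a concept up to its generalization. The following Python sketch only illustrates that idea; the dictionaries represent and is_a and the function atomic_type are hypothetical, and the sketch assumes each concept has at most one generalization, which is a simplification.

```python
from typing import Optional

# Hypothetical data mirroring the example: one REPRESENT declaration
# and two concepts that are both a LegalEntity.
represent = {"LegalEntity": "ALPHANUMERIC"}
is_a = {"Person": "LegalEntity", "Company": "LegalEntity"}


def atomic_type(concept: str) -> Optional[str]:
    """Walk up the generalizations until a declared atomic type is found."""
    seen = set()
    current: Optional[str] = concept
    while current is not None and current not in seen:
        if current in represent:
            return represent[current]
        seen.add(current)           # guard against cyclic "is a" links
        current = is_a.get(current)
    return None                     # no declaration found in this sketch


assert atomic_type("Person") == "ALPHANUMERIC"
assert atomic_type("Company") == "ALPHANUMERIC"
```

A concept without any declaration on its path yields None here; how Ampersand treats such concepts is described by the last row of the table of atomic types, not by this sketch.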