The fundamental unit of data storage in CycleServer is a “record”. A record is a set of name-value pairs that define an entity in CycleServer or in an external system that CycleServer is monitoring, such as a grid or other distributed environment. Records are schemaless, so they can contain any relevant attributes describing the entity as long as they have an AdType attribute to identify the type of the record.

For example, suppose we wanted to track the filesystems on a machine. We could define a record type to represent the properties of a filesystem, such as:

AdType = "Filesystem"
Machine = "server1"
Mount = "/"
FSType = "ext2"
Space = 1000000000
Usage = 434094105

Because records are schemaless, we do not need to define these attributes in advance. We do need to define one piece of information about the type: which attributes uniquely identify a record of that type. There is also a benefit to defining additional information about the type (e.g., Filesystem) and the attributes associated with that type (e.g., Machine, Mount, FSType). For instance, CycleServer uses record metadata to determine how to display values for a given attribute and to add help items like tooltips when appropriate. This metadata is defined in separate metadata records.

There are two main types of metadata records:

  • Type records describe the type of other records. These must be explicitly created (and stored in
    the datastore) by the plugin or component before storing records of that type.
  • Attribute records describe the attributes of other records. These are created automatically
    by CycleServer as needed, but may also be created in advance.

ClassAd Language

Records in the CycleServer datastore are implemented as classads according to the ClassAd Specification. The ClassAd Specification may be used as a reference for the base expression language used by the CycleServer datastore.

Note

In some cases older CycleServer documentation and APIs may use the abbreviation “ad” to refer to a “ClassAd”. Because this terminology can be confusing, the more generic term “record” is preferred. Similarly, the term “adstore” may be used to refer to the datastore. The term “datastore” is preferred where possible.

ClassAds support a multi-typed evaluation language:

  • integers and reals: 123, 1.0
  • strings: "hello world"
  • booleans: true, false
  • timestamps (AbsTime): `2011-12-13T12:00:00`
  • durations (RelTime): `1.5m`, `90`
  • lists: { 123, "hello world" }
  • records: [ count = 123; message = "hello world"; 'send time'=`2011-12-13T12:00:00`; ]
  • attribute references: message, 'send time'
  • composite expressions: a + b + floor(c)
  • special values: undefined, error

Note

The list and record delimiters are the opposite of those in Python/JavaScript!

CycleServer also supports some Cycle-specific extensions:

  • blobs: up to 2GB, with an optional metadata record: blob("aGVsbG8=", [ MimeType : "text/plain"
    ])
  • identifiers: 128-bit values that can be autogenerated: #123,
    #4ee73065-0021-292c0a0f-59a9-1

The standard ClassAd comparison operators are supported:

  • Equals (==): True or false for numbers, strings, booleans, AbsTimes, or RelTimes, defined as you
    would expect; error for any other arguments. Note: this comparison is case-insensitive for
    strings, and ignores the timezone for dates.
  • Identical (is, =?=): True or false for every operand (never error), case-sensitive. Gotchas:
      • "abc" is "ABC" -> false
      • 123 is 123.0 -> false
      • { 123, "abc" } is { 123, "abc" } -> false
      • `2011-12-13T12:00:00-0500` is `2011-12-13T11:00:00-0600` -> false

In addition, CycleServer adds the following comparison operators:

  • In (in): Used to test for membership in a list.
      • 123 in { 123, 456 } -> true
      • 123 in { "123" } -> false
      • { 123, "abc" } in { { 123.0, "ABC" }, { 456, "def" } } -> true
      • { 123, "abc" } in { { 123, "def" }, { 123, "def" }, } -> false
  • Equivalent (===): Used for group-by operations, key comparisons, and the in operator. Always
    produces true or false (like is). It is the same as == for simple value types, but lists and
    records are compared by comparing their contents piece-wise:
      • "abc" === "ABC" -> true
      • 123 === 123.0 -> true
      • [ label = "abc" ] === [ LABEL = "Abc" ] -> true
      • { 123 } === { 123 } -> true
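
The piece-wise comparison described for === and in can be modeled in a few lines of plain Python 3. This is only an illustrative sketch, not CycleServer code: Python lists stand in for ClassAd lists, dicts stand in for records, and the function names equivalent and member are hypothetical. Treating booleans as distinct from numbers is an assumption that the text above does not confirm.

```python
def equivalent(a, b):
    """Rough model of the === operator described above (not CycleServer code)."""
    # Assumption: a boolean is only equivalent to another boolean.
    # (Plain Python would otherwise treat True == 1.)
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    # Numbers compare by value, so 123 and 123.0 match.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    # Strings compare case-insensitively, like ==.
    if isinstance(a, str) and isinstance(b, str):
        return a.lower() == b.lower()
    # Lists compare element by element.
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(equivalent(x, y) for x, y in zip(a, b))
    # Records (dicts here) compare attribute by attribute; names ignore case.
    if isinstance(a, dict) and isinstance(b, dict):
        la = {name.lower(): value for name, value in a.items()}
        lb = {name.lower(): value for name, value in b.items()}
        return la.keys() == lb.keys() and all(equivalent(la[k], lb[k]) for k in la)
    # Anything else (mixed types) is not equivalent; === never yields error.
    return False


def member(value, items):
    """Model of the in operator: membership is tested with equivalent()."""
    return any(equivalent(value, item) for item in items)
```

For example, member([123, "abc"], [[123.0, "ABC"], [456, "def"]]) is true in this model, matching the third in example above.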

Note

In some parts of the code, == and is are (incorrectly) implemented more like ===.

The ClassAd language is an evaluation language. Every expression is evaluated in the context of a record:

expression: a + b
context: [ a = 123; b = 456; ]
result: 579

Expressions are eagerly evaluated, down to an atomic value or collections of atomic values:

expression: [ c = a + b; d = e ]
context: [ a = 123; b = 456; ]
result: [ c = 579; d = undefined ]

Evaluation with incompatible types produces error.

You can experiment with the ClassAd language using the Jython interpreter that ships with CycleServer. As noted above, each classad expression must be evaluated in the context of a record, so we must first create a record before evaluating expressions:

$ $CS_HOME/util/jython
Jython 2.7.1b3 (, Feb 16 2016, 20:04:41)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_77
Type "help", "copyright", "credits" or "license" for more information.
>>> from application import records
>>> from application.expressions import parse, evaluate
>>> context = records.create({ "Test": "foo" })
>>> evaluate(context, parse('"a" == "A"'))
true
>>> evaluate(context, parse('"a" is "A"'))
false
>>> evaluate(context, parse('1 in {1, 2, 3}'))
true
>>> evaluate(context, parse('1 in {2, 3}'))
false

Record Specification

A record is a set of name-value pairs in which the names are strings and the values are classad expressions. Records can be typed, in which case the type specifies certain characteristics of all records of that type. These include a list of well-known attributes (which can have a custom formatting function for display, such as a time representation), access control on who can edit records of that type, and derived attributes that are computed from the other values.

The only required information about a type is its key attributes. The key is an ordered set of attributes that, taken together, uniquely identify a record among all records of the same type. (The ordering allows the key to be expressed as a simple tuple of values.) It is an error if a record does not contain values for all key attributes. When records are stored, they overwrite records with the same key value.
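
As an illustration of how an ordered key becomes a tuple, here is a minimal plain-Python sketch. It is not the CycleServer API: key_of is a hypothetical helper, and the choice of Machine and Mount as the key of the Filesystem type is an assumption made only for the example. Attribute names are matched exactly here for simplicity.

```python
def key_of(record, key_attributes):
    """Return the record's key as a tuple, in the order the type defines.

    Raises ValueError if any key attribute is missing, mirroring the rule
    that a record must contain values for all key attributes.
    """
    missing = [name for name in key_attributes if name not in record]
    if missing:
        raise ValueError("record is missing key attributes: " + ", ".join(missing))
    return tuple(record[name] for name in key_attributes)


# Hypothetical key for the Filesystem example: (Machine, Mount).
filesystem = {
    "AdType": "Filesystem",
    "Machine": "server1",
    "Mount": "/",
    "FSType": "ext2",
}
filesystem_key = key_of(filesystem, ["Machine", "Mount"])
```

Two Filesystem records with the same Machine and Mount would produce the same tuple, which is why storing the second one replaces the first.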

If a record type is keyless, it indicates that there is no natural key in the record’s attributes. In this case there must be a single key attribute, which will be a value automatically assigned to the Id attribute (or an automatically assigned GUID if the record type has the attribute AutoGenerateKey equal to true). If a record from outside does not have the key attribute present, it will be appended to the set of existing records (and it will get that attribute defined for it). This is appropriate for event-based records, or audit log entries.

A record of a given type is a collection of name-value pairs. Each attribute of that type has properties that are also represented as name-value pairs. This means that the definition of a given attribute for a given type is also a record. The AdType for an attribute record is Attribute. Attribute records are identified by the ForName and ForType attributes. The definition for an attribute can include information like the format of that attribute’s value, the name of a plugin that generates the value for that attribute, a description of that attribute, etc. These attributes (format, plugin, description, etc.) are themselves described in an attribute record, but this record (the attribute record for an attribute record for a standard record) is unmodifiable.

Timeline Model

CycleServer supports two forms of hidden records that are not normally visible. First, when records are deleted, they remain in the datastore, flagged with a special _Deleted attribute set to true. Normal queries do not include these records, because by default an automatic constraint of _Deleted is false is added to every query. This constraint is not added if the query includes any reference to the _Deleted attribute, in any form. Thus if you want to include deleted rows, add _Deleted is true to the query. If you want to see all rows, deleted or not, add _Deleted is _Deleted.

Second, when records are updated, the earlier record remains in the datastore, flagged with a special __Latest attribute set to false. As with _Deleted, there is an automatic constraint of __Latest is true that gets added to every query, but you can disable this by referencing __Latest in any way. Thus, to get all records, include __Latest is __Latest.
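
The rule for the two automatic constraints can be sketched as a small plain-Python function that works on the text of a WHERE expression. This is a model of the behavior described above, not how CycleServer implements it: effective_constraint is a hypothetical name, and detecting a reference with a regular expression is a simplification (it would also count the attribute name appearing inside a string literal).

```python
import re


def _references(where, attribute):
    """True if the WHERE text mentions the attribute as a whole identifier."""
    pattern = r"(?<![A-Za-z0-9_])" + re.escape(attribute) + r"(?![A-Za-z0-9_])"
    return re.search(pattern, where, re.IGNORECASE) is not None


def effective_constraint(where=""):
    """Model the constraint that results after the automatic additions."""
    parts = ["(" + where + ")"] if where.strip() else []
    # Added unless the query refers to _Deleted in any form.
    if not _references(where, "_Deleted"):
        parts.append("_Deleted is false")
    # Added unless the query refers to __Latest in any form.
    if not _references(where, "__Latest"):
        parts.append("__Latest is true")
    return " && ".join(parts)
```

In this model, a query with `_Deleted is _Deleted && __Latest is __Latest` is left unchanged, so it sees every stored row.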

Note

_Deleted has a single underscore and __Latest has two. Attributes that start with an underscore are hidden by default, and have a system-defined meaning, but the value can be set by the user. (For instance, you can undelete a record by setting _Deleted to false.) Attributes that start with two underscores are also hidden and have a system-defined meaning, but cannot be changed by the user.

All records get the _Timestamp attribute set when they are saved. You can specify this to be any timestamp you want, but the current time is used if you do not specify any. In addition, all records get a __SystemTimestamp attribute, which is the current time of the datastore and cannot be changed.

Query Language

CycleServer supports a query/command language similar to SQL. You can run any of these at the command line:

$CS_HOME/cycle_server execute STATEMENT

For queries that return data, the -format FORMAT argument can be used to get the output in any supported format (including csv, tabular, json, xml, and the default text).
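
If you want to run statements from a script, a thin wrapper around the command line is one option. The sketch below is written in plain Python 3 under some assumptions: the helper names are hypothetical, CS_HOME is read from the environment, and placing -format FORMAT after the statement is a guess, since the text above does not specify the argument order.

```python
import os
import subprocess


def build_execute_command(statement, output_format=None, cs_home=None):
    """Build the argument list for `cycle_server execute STATEMENT`."""
    cs_home = cs_home or os.environ.get("CS_HOME", "")
    argv = [os.path.join(cs_home, "cycle_server"), "execute", statement]
    if output_format is not None:
        # Assumption: -format follows the statement; adjust if your install differs.
        argv += ["-format", output_format]
    return argv


def run_statement(statement, output_format=None, cs_home=None):
    """Run the statement and return whatever the command prints to stdout."""
    argv = build_execute_command(statement, output_format, cs_home)
    completed = subprocess.run(argv, check=True, capture_output=True, text=True)
    return completed.stdout
```

Passing the statement as a single list element avoids shell quoting problems with the double quotes that string literals require.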

SELECT

Select queries are a text representation of views:

SELECT * FROM YourType WHERE Attr1 == "value" ORDER BY Attr2 DESC, Attr3 ASC LIMIT 10

SELECT count(*) as Count, Attr2, sum(Attr3) as Total FROM YourType WHERE Attr1 == "value" GROUP BY Attr2

SELECT Attr1, T2.Attr2 FROM Type1 JOIN Type2 T2 ON { Attr3, Attr4 } === { T2.Attr5, strcat(T2.Attr6, T2.Attr7) }

SELECT * USING your.plugin

Select queries support the following case-insensitive elements (optional unless otherwise noted):

SELECT select_list: The attributes or expressions to select from each input record. The list can be a single * or a list of attributes or expressions. Each element supports an optional AS label modifier to set the name of the attribute in the output record.

FROM type: The name of the type to select records from. If not given, it matches all types.

WHERE expression: Limits the records that are returned to those for which the expression evaluates to true.

GROUP BY expressions: Makes this query an aggregation. Records are grouped by the values given when each expression is evaluated. If the query is an aggregation, then the select list must contain a list of either aggregation functions or the group-by expressions.

ORDER BY expression [ASC|DESC]: The value to sort the returned rows by. If the DESC keyword is given, the values are sorted in descending order. Multiple expressions can be given, each with its own ASC or DESC modifier.

LIMIT number: The number of records to return. If ORDER BY is given, rows are sorted first before being limited.

[INNER|OUTER] JOIN foo ON test: Joins to another type. You can use as many joins as you want. Note: currently this does one-to-one joins only. In general, the ON condition is a list of expressions evaluated against the main type compared to expressions evaluated against the join type, using ===.

USING datasource: Calls the given datasource plugin and gets rows from that instead of from querying stored data. The plugin can get data in any manner it wants. In this case, the filter, order by, and other aspects of the view are passed to the plugin, which can use them (e.g., for filtering) or not.
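
To make the aggregation rules concrete, the following plain-Python sketch computes what the second example query above (count(*) and sum(Attr3) grouped by Attr2) would return for a list of dicts. It is only a model: the function name is hypothetical, and groups are formed by exact value rather than by ===, which is a simplification.

```python
def count_and_total_by_attr2(records):
    """Model of: SELECT count(*) as Count, Attr2, sum(Attr3) as Total
    FROM YourType WHERE Attr1 == "value" GROUP BY Attr2"""
    groups = {}
    for record in records:
        # FROM YourType
        if record.get("AdType") != "YourType":
            continue
        # WHERE Attr1 == "value" (== ignores case for strings)
        if str(record.get("Attr1", "")).lower() != "value":
            continue
        # GROUP BY Attr2: one output row per distinct value.
        row = groups.setdefault(
            record["Attr2"], {"Count": 0, "Attr2": record["Attr2"], "Total": 0}
        )
        row["Count"] += 1
        row["Total"] += record["Attr3"]
    return list(groups.values())
```

Every output attribute is either a group-by expression (Attr2) or an aggregate (Count, Total), which is the restriction the GROUP BY element places on the select list.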

Timeline Queries

In addition to the standard SQL-style select query, CycleServer supports extended timeline queries that aggregate data over time. For instance, suppose you have a Machine type with Platform, Role, and Memory attributes, and you update it every minute with the memory currently in use on each machine. You could then run a timeline aggregation like this:

SELECT avg@(sum(Memory)) FROM Machine WHERE Role == "compute" && @duration(`1d`) GROUP BY Platform, @intervals(`1h`)

This performs a two-level aggregation over the last 24 hours (i.e., 1d) by grouping all machines with a “compute” Role by their Platform attribute and computing the sum of the memory of all machines in each group, at each instant over the 24 hours. The values for each group are averaged over each 1-hour interval in that 24 hours, and the result is output.

Every timeline aggregation is in the form temp@(agg(expression)). The expression is any supported expression evaluated on each record. The inner aggregation function can be any standard supported aggregation (e.g., count(), avg(), sum(), min(), max()), and aggregates the values across records at any point in time. The outer temporal aggregation aggregates values across time (that is, it downsamples or collapses adjacent samples). The following temporal aggregations are supported:

avg@(): Computes the average of the values over the interval, weighted by the time it has that value.

sum@(): Computes the sum of the values over the interval. This is appropriate for values that represent an accumulated value that is reset each time, for instance BytesTransferred.

min@(): Takes the lowest value seen over the interval.

max@(): Takes the highest value seen over the interval.
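
The difference between avg@() and a plain average is the weighting by time. The plain-Python sketch below shows one way to compute such a time-weighted average for a single interval. It is an illustration under assumptions, not CycleServer's implementation: the function name is hypothetical, each sample is assumed to hold its value until the next sample, and time before the first sample is simply not counted.

```python
def time_weighted_average(samples, start, end):
    """Average of a step function over [start, end).

    samples is a list of (time, value) pairs sorted by time; times can be
    any numbers in a common unit (for example, minutes).
    """
    total = 0.0
    covered = 0.0
    for index, (time, value) in enumerate(samples):
        # The value stays current until the next sample (or the interval end).
        next_time = samples[index + 1][0] if index + 1 < len(samples) else end
        # Clip the span during which this value was current to the interval.
        span_start = max(time, start)
        span_end = min(next_time, end)
        if span_end > span_start:
            total += value * (span_end - span_start)
            covered += span_end - span_start
    return total / covered if covered else None


# A value of 10 for 45 minutes, then 30 for the last 15 minutes of the hour.
hourly = time_weighted_average([(0, 10), (45, 30)], 0, 60)
```

Here the result is 15.0 rather than the unweighted mean of 20, because the value 10 was in effect three times as long as the value 30.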

The amount of time included in the output is determined by the temporal filter in the WHERE clause. This is included with the rest of the constraint via the AND operator (&&). The following temporal filters are supported:

@timerange(T1, T2): Selects the time between T1 (inclusive) and T2 (exclusive)

@duration(D): Selects the last D amount of time (equivalent to @timerange(now()-D, now()))

@anytime(): Selects all time for which there is any stored data

@now(): Selects only the latest data (i.e., the default, which is not a timeline query at all)
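
The boundary behavior of @timerange and its relationship to @duration can be written down in plain Python using datetime values. This is a sketch for illustration only; in_timerange and duration_range are hypothetical names, not CycleServer functions.

```python
from datetime import datetime, timedelta


def in_timerange(t, t1, t2):
    """@timerange(T1, T2): T1 is inclusive, T2 is exclusive."""
    return t1 <= t < t2


def duration_range(d, now):
    """@duration(D) covers the same span as @timerange(now - D, now)."""
    return (now - d, now)


now = datetime(2011, 12, 13, 12, 0, 0)
day_start, day_end = duration_range(timedelta(days=1), now)
```

With these definitions, a sample stamped exactly at day_start falls inside the range, while one stamped exactly at day_end does not.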

The @intervals() function in the GROUP BY clause indicates that the query is a timeline aggregation. Alternatively, if the query has no GROUP BY clause but includes a temporal filter, the raw data is returned over the time given.

DELETE and PURGE

Delete statements perform a “soft delete” of data:

DELETE FROM YourType WHERE Attr1 == "value"
  • All the existing rows in the timeline remain
  • A new row is added to the timeline with _Deleted set to true so it is normally hidden
  • Deleted records are not matched by default, so “re-deleting” data has no effect

Purge statements perform a “hard delete” of data:

PURGE FROM YourType WHERE Attr1 == "value"
  • Data is removed from disk
  • For each matching record, all rows in the timeline are removed
  • Deleted records are matched by default

This means that PURGE FROM YourType removes all data from disk for YourType.
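
The contrast between the two statements can be modeled with a dict that maps each record key to its timeline (a list of rows, oldest first). The plain-Python sketch below is a toy model under that assumption; soft_delete and purge are hypothetical names, and matches stands in for the WHERE expression.

```python
def soft_delete(datastore, matches):
    """Model of DELETE: append a row flagged _Deleted; keep earlier rows."""
    for timeline in datastore.values():
        current = timeline[-1]
        # Deleted records are not matched by default, so re-deleting is a no-op.
        if current.get("_Deleted", False) or not matches(current):
            continue
        tombstone = dict(current)
        tombstone["_Deleted"] = True
        timeline.append(tombstone)


def purge(datastore, matches):
    """Model of PURGE: drop the whole timeline of every matching record."""
    # Deleted records are matched too, so only the WHERE expression is checked.
    doomed = [key for key, timeline in datastore.items() if matches(timeline[-1])]
    for key in doomed:
        del datastore[key]
```

After soft_delete the earlier rows are still present and could be queried by referencing _Deleted; after purge nothing about the record remains.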

Delete and purge statements support the following elements:

FROM type: The name of the type to select records from. If not given, it matches all types.

WHERE expression: Only deletes records for which the expression evaluates to true.

Note that statements that do not include a FROM keyword must reference AdType in the WHERE clause. This allows queries that affect multiple types:

DELETE WHERE AdType in { "YourType1", "YourType2" }

UPDATE

Update statements change the values for existing records:

UPDATE YourType SET Attr2 = "value2", Attr3 = "value3" WHERE Attr1 == "value"

Update statements support the following elements:

WHERE expression: Only updates records for which the expression evaluates to true.

SET name=expression: For each record, evaluates expression in the context of the existing record and sets the value of name to the result.

STORE

Store statements add new records or overwrite existing data:

STORE [AdType="YourType"; Attr1="value1"; Attr2="value2"], [AdType="YourType"; Attr1="other1"; Attr2="other2"]

Each listed record is saved. If there is an existing one with the same key attributes, the attributes given are replaced, but other attributes on it are not affected.
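
The merge behavior can be sketched in plain Python with a dict keyed by type and key values. This is only a model of the rule above, not the datastore's implementation: the store function is hypothetical, and using Attr1 as the key of YourType is an assumption made for the example.

```python
def store(datastore, record, key_attributes):
    """Model of STORE: merge the given attributes into any existing record."""
    key = (record["AdType"],) + tuple(record[name] for name in key_attributes)
    # Start from the existing record (if any) so untouched attributes survive.
    merged = dict(datastore.get(key, {}))
    # Attributes given in the statement replace the stored values.
    merged.update(record)
    datastore[key] = merged
    return merged


datastore = {}
store(datastore, {"AdType": "YourType", "Attr1": "value1", "Attr2": "value2", "Attr3": "keep"}, ["Attr1"])
result = store(datastore, {"AdType": "YourType", "Attr1": "value1", "Attr2": "changed"}, ["Attr1"])
```

In this model the second call changes Attr2 but leaves Attr3 as it was, and the datastore still holds a single record.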