The Datastore API includes a RESTful interface to load, store and delete records from the system.

/db Method   Description
GET          Returns an overall summary of all records of all types
POST         Stores the supplied records

The /db URL gives an overall summary of the records defined in CycleServer, categorized by type. It always includes the two built-in types Type and Attribute, plus any additional types defined. For instance, with 342 Condor slots and 100 Condor jobs stored, the response might be as follows:

<?xml version="1.0"?>
<classads>
  <c>
     <a n="ForType"><s>Type</s></a>
     <a n="KeyAttributes"><l><s>ForType</s></l></a>
     <a n="Count"><i>4</i></a>
  </c>
  <c>
     <a n="ForType"><s>Attribute</s></a>
     <a n="KeyAttributes"><l><s>ForType</s><s>ForName</s></l></a>
     <a n="Count"><i>290</i></a>
  </c>
  <c>
     <a n="ForType"><s>Condor.Slot</s></a>
     <a n="KeyAttributes"><l><s>Name</s></l></a>
     <a n="Count"><i>342</i></a>
  </c>
  <c>
     <a n="ForType"><s>Condor.Job</s></a>
     <a n="KeyAttributes"><l><s>cycleJobKey</s></l></a>
     <a n="Count"><i>100</i></a>
  </c>
</classads>
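
A client will usually want the per-type counts rather than the raw XML. The following is a minimal sketch, assuming the response has exactly the shape shown above; the helper name summarize_counts and the trimmed sample document are illustrative and not part of CycleServer.

```python
import xml.etree.ElementTree as ET

# Sample /db summary, trimmed from the response shown above.
SUMMARY_XML = """<?xml version="1.0"?>
<classads>
  <c>
    <a n="ForType"><s>Condor.Slot</s></a>
    <a n="KeyAttributes"><l><s>Name</s></l></a>
    <a n="Count"><i>342</i></a>
  </c>
  <c>
    <a n="ForType"><s>Condor.Job</s></a>
    <a n="KeyAttributes"><l><s>cycleJobKey</s></l></a>
    <a n="Count"><i>100</i></a>
  </c>
</classads>"""


def summarize_counts(xml_text):
    """Map each ForType to its Count (hypothetical helper)."""
    counts = {}
    for record in ET.fromstring(xml_text).findall("c"):
        # Index the <a n="..."> elements of this record by attribute name.
        attrs = {a.get("n"): a for a in record.findall("a")}
        type_name = attrs["ForType"].findtext("s")
        counts[type_name] = int(attrs["Count"].findtext("i"))
    return counts


print(summarize_counts(SUMMARY_XML))
```
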

Note: For a POST, the records must have the AdType attribute defined because it cannot be inferred from the URL.

To control the format, you can add the format parameter to any request that accepts or returns records.

format=xml|native|text|csv|json

Here is an example of how to query for AdTypes and view the output in the “text” format.

When querying the REST interface with curl, you specify the format using a query string:

$ curl 'http://CYCLE_SERVER/db/Application.Timer?format=text'

Note that you must use & instead of ? in front of every parameter after the first. Here are a few examples that combine format= with a few other parameters that you will meet shortly:

# show the names of Timers known to Cycle Server
$ curl 'http://CYCLE_SERVER/db/Application.Timer?attr=Name&format=text'

# find all running Timers
$ curl 'http://CYCLE_SERVER/db/Application.Timer?f=(CurrentState===%22Running%22)&format=text'

Note: Authentication is required to query the REST interface:

$ curl -u "username:passwd" 'http://CYCLE_SERVER/db/Application.Timer'

First, there is the default XML format used above:

<?xml version="1.0"?>
<classads>
 <c>
  <a n="AdType"><s>Filesystem</s></a>
  <a n="Machine"><s>server1</s></a>
  <a n="Mount"><s>/</s></a>
  <a n="FSType"><s>ext2</s></a>
  <a n="Space"><i>1000000000</i></a>
  <a n="Usage"><i>434034034</i></a>
 </c>
</classads>

There is support for CSV:

AdType,Machine,Mount,FSType,Space,Usage
Filesystem,server1,/,ext2,1000000000,434034034

Here is the text format:

AdType = "Filesystem"
Machine = "server1"
Mount = "/"
FSType = "ext2"
Space = 1000000000
Usage = 434034034

Finally, also in 4.0 there is a native classad format:

{
  [
    AdType = "Filesystem";
    Machine = "server1";
    Mount = "/";
    FSType = "ext2";
    Space = 1000000000;
    Usage = 434034034;
  ]
}
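
The text format shown above is easy to consume in a script. The following is a rough sketch of a parser, assuming that each attribute sits on its own line as Name = value, that string values are double-quoted, and that records are separated by blank lines. Those layout details are inferred from the sample and are not a specification; parse_text_records is a hypothetical helper.

```python
SAMPLE = '''AdType = "Filesystem"
Machine = "server1"
Mount = "/"
FSType = "ext2"
Space = 1000000000
Usage = 434034034
'''


def parse_text_records(text):
    """Parse 'Attr = value' blocks separated by blank lines (assumed layout)."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current record, if one is open.
            if current:
                records.append(current)
                current = {}
            continue
        name, _, raw = line.partition(" = ")
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            value = raw[1:-1]  # quoted string
        else:
            try:
                value = int(raw)
            except ValueError:
                value = raw  # leave anything else as raw text
        current[name] = value
    if current:
        records.append(current)
    return records


print(parse_text_records(SAMPLE))
```
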

/db/{T} Method   Description for Given AdType
GET              Returns all records of type T
PUT              Synchronizes the supplied records with all existing records of type T
POST             Stores the supplied records (which must be of type T)
DELETE           Removes all records of type T

In general, for a given /db/{T}/… URL, a GET returns the current records that correspond to that URL. A DELETE to that URL deletes records that match that URL. A PUT to that URL overwrites all existing records that correspond to that URL with the records given in the body. This is a synchronize operation: records supplied in the body are stored as-is, and “missing” records (existing records that match the URL but are not in the body) are deleted from CycleServer. The net effect is that a PUT to a URL will cause a subsequent GET to that same URL to download only those records that were PUT.

A POST is a relaxed form of PUT: records given in the body are stored, but no records are deleted as a result of the operation (although existing versions of the supplied records are overwritten).
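
The difference between the two verbs can be modeled in a few lines. The following is a toy in-memory sketch, not CycleServer code: the store is a plain dict keyed by a Name attribute, and the matches argument stands in for "records that correspond to the URL". All names here are illustrative.

```python
def post(store, records, key="Name"):
    """POST: store the supplied records; nothing else is deleted."""
    updated = dict(store)
    for record in records:
        updated[record[key]] = record
    return updated


def put(store, records, matches=lambda record: True, key="Name"):
    """PUT: synchronize. Records matching the URL are replaced by the body."""
    # Drop every existing record the URL covers, then store the body as-is.
    kept = {k: r for k, r in store.items() if not matches(r)}
    for record in records:
        kept[record[key]] = record
    return kept


store = {"a": {"Name": "a", "v": 1}, "b": {"Name": "b", "v": 2}}
body = [{"Name": "a", "v": 10}]

print(sorted(post(store, body)))  # record "b" survives a POST
print(sorted(put(store, body)))   # record "b" is removed by a PUT
```
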

Let’s walk through creating and deleting a record. Here are two records describing background jobs for CycleServer to execute periodically.

<?xml version="1.0"?>
<classads>
<c>
  <a n="AdType"><s>Application.Timer</s></a>
  <a n="Name"><s>Boil the Oceans</s></a>
  <a n="Plugin"><s>cycle.application.task.cleanup</s></a>
  <a n="Interval"><rt>1:00</rt></a>
</c>
<c>
  <a n="AdType"><s>Application.Timer</s></a>
  <a n="Name"><s>Cool the Seas</s></a>
  <a n="Plugin"><s>cycle.application.task.cleanup</s></a>
  <a n="Interval"><rt>1:00</rt></a>
</c>
</classads>

To create the records in the datastore, first write the above to timers.xml and then execute the following command:

curl -u user:pass -X POST -H 'Content-Type: application/xml' \
  'http://localhost:8080/db/Application.Timer?format=xml' --data-binary @timers.xml

Note

Do not forget to set the Content-Type. Omitting it is a frequent source of errors.
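
For scripts that cannot shell out to curl, roughly the same request can be assembled with Python's standard library. This is a sketch under assumptions: it only builds the request object and does not send it, the placeholder body stands in for the contents of timers.xml, and build_post_request is a hypothetical helper.

```python
import base64
import urllib.request


def build_post_request(url, body, user, password, content_type="application/xml"):
    """Build (but do not send) an authenticated POST with an explicit Content-Type."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": "Basic " + token,
        },
    )


request = build_post_request(
    "http://localhost:8080/db/Application.Timer?format=xml",
    b"<classads></classads>",  # placeholder; normally the bytes of timers.xml
    "user",
    "pass",
)
# urllib stores header names capitalized, hence "Content-type" in the lookup.
print(request.get_method(), request.get_header("Content-type"))
# Sending is left out of this sketch: urllib.request.urlopen(request)
```
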

Because the URL specifies the type, the AdType attribute is not required in the definition of the ad. (In fact, it is overwritten if present.) Furthermore, records in the body of a PUT or POST that do not match the URL are ignored.

For instance, to get all Condor slot records, make a GET to http://CYCLE_SERVER/db/Condor.Slot, which will return something like the following:

<?xml version="1.0"?>
<classads>
<c>
  <a n="AdType"><s>Condor.Slot</s></a>
  <a n="Name"><s>slot1@server1</s></a>
  <a n="Activity"><s>Idle</s></a>
  <a n="LoadAvg"><r>0.0</r></a>
</c>
<c>
  <a n="AdType"><s>Condor.Slot</s></a>
  <a n="Name"><s>slot2@server1</s></a>
  <a n="Activity"><s>Busy</s></a>
  <a n="LoadAvg"><r>1.2</r></a>
</c>
</classads>

To limit the records returned, you can add a filter constraint:

/db/{T}[filter_constraints]   Description
GET                           Returns all records of type T where the filter is true
PUT                           Synchronizes the supplied records with all existing records where the filter is true
POST                          Stores the supplied records (which must be of type T), if the filter is true for the supplied records
DELETE                        Removes all records of type T where the filter is true

The filter constraint takes two forms. The simpler is a list of name-value pairs separated by semicolons:

;attr1==value1;attr2==value2

This is equivalent to string(attr1)=="value1" && string(attr2)=="value2", but it does not need URL-encoding unless the value itself contains characters like “=”, “;”, etc. For more general classad expressions, you can add an ?f= parameter. For example, to use a constraint of attr1==attr2||attr3>0, you would URL-encode it and supply it as the f parameter: ?f=attr1%3D%3Dattr2%7C%7Cattr3%3E0. You can use a website like dencoder to encode your URL when using curl to interact with the Datastore.

f=constraint

Alternately, you can use the Python library urllib to URL encode your classad expressions:

# Python 3: quote lives in urllib.parse (it was urllib.quote in Python 2)
from urllib.parse import quote
url = 'http://CYCLE_SERVER/db/resource?f=%s' % quote('attr1==attr2||attr3>0')

There can be as many name-value constraints as needed, but these must all come before the “?” (if any). There can only be one f= parameter, and it must come after the “?”. Other parameters are supported as well, and all of them must come after a single “?”:

attr=attr1,attr2,...attrN

To limit the downloaded data, you can add the attr= parameter, which is a comma-separated list of attribute names. If this parameter is present, GET will only return those attributes, and PUT, POST and DELETE are not allowed. Here is an example to show instance ids for all AWS instances:

http://CYCLE_SERVER/db/AWS.Instance?attr=InstanceId&format=text

Here is an example returning the instance id, public hostname, and public IP of every AWS instance:

http://CYCLE_SERVER/db/AWS.Instance?attr=InstanceId,PublicHostname,PublicIp&format=text

In some cases, the records being stored will not include all the attributes needed. For those cases, you can add a set_A=V parameter to set or overwrite attribute A to V on a PUT or POST. (GET and DELETE are not allowed.) For example, POSTing slot records to the following URL would cause every ad posted to include a Disabled attribute with a value of TRUE:

http://CYCLE_SERVER/db/Condor.Slot?set_Disabled=true
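
The ordering rules above (name-value constraints before the “?”, every other parameter after it, joined by “&”) are easy to get wrong by hand. The following is a small sketch of a URL builder that applies them. The function db_url and its constraints argument are hypothetical names for illustration, and commas are deliberately left unencoded so the attr list matches the examples above.

```python
from urllib.parse import quote, urlencode


def db_url(base, ad_type, constraints=None, **params):
    """Assemble a /db URL: ;name==value pairs first, then one ?query string."""
    path = f"{base}/db/{ad_type}"
    for name, value in (constraints or {}).items():
        # Values containing characters like "=" or ";" are URL-encoded here.
        path += f";{name}=={quote(str(value), safe='')}"
    # urlencode joins the parameters with "&"; quote (not quote_plus) turns
    # spaces into %20, and safe="," keeps attr lists readable.
    query = urlencode(params, safe=",", quote_via=quote)
    return f"{path}?{query}" if query else path


print(db_url("http://CYCLE_SERVER", "AWS.Instance",
             attr="InstanceId,PublicHostname,PublicIp", format="text"))
print(db_url("http://CYCLE_SERVER", "Condor.Slot", set_Disabled="true"))
print(db_url("http://CYCLE_SERVER", "Application.Timer",
             f='Name=="Boil the Oceans"'))
```
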

For practice, let’s delete the Application.Timer we created earlier. Note that the classad expression Name=="Boil the Oceans" has been encoded so that it can be a legal part of the URL:

curl -u user:pass -X DELETE \
  'http://localhost:8080/db/Application.Timer?f=Name%3D%3D%22Boil%20the%20Oceans%22'

To check that the “Boil the Oceans” task has been deleted, execute:

curl -u user:pass 'http://localhost:8080/db/Application.Timer?f=Name%3D%3D%22Boil%20the%20Oceans%22'

The response should be empty. For the sake of completeness, let’s make sure that our “Cool the Seas” task is still there:

curl -u user:pass 'http://localhost:8080/db/Application.Timer?f=Name%3D%3D%22Cool%20the%20Seas%22&format=json'

The response should be:

[
{
  "AdType" : "Application.Timer",
  "Id" : {
    "$id" : "#571a4140-0031-524738736a1c-0"
  },
  "Name" : "Cool the Seas"
}
]

Queries

/exec/query [filter constraints]   Description
GET                                Runs the query in the URL and returns the resulting records
POST                               Runs the query in the body and returns the resulting records

The /exec/query URL runs a text-based SELECT … query. You can specify it as the q parameter for a GET, or in the body (as text/plain) for a POST. It supports the standard f parameter, attr parameter, and ;name=param filtering, as well as the policy_* and policyexpr_* parameters.

The example above to return the instance id, public hostname, and public IP of every AWS instance could instead be written as:

http://CYCLE_SERVER/exec/query?q=select InstanceId,PublicHostname,PublicIp from AWS.Instance&format=text

An /exec/query URL is often easier to assemble than the corresponding /exec/aggregation URL.
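
Because a SELECT statement contains spaces, it needs URL-encoding before it can be used as the q parameter from a script. A minimal sketch using Python's urllib.parse, with the query text taken from the example above and commas left unencoded for readability:

```python
from urllib.parse import quote

query = "select InstanceId,PublicHostname,PublicIp from AWS.Instance"
# quote() turns each space into %20; safe="," keeps the attribute list intact.
url = "http://CYCLE_SERVER/exec/query?q=%s&format=text" % quote(query, safe=",")
print(url)
```
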

Updates

/exec/statement   Description
POST              Runs the given modification statement

The /exec/statement URL runs a text-based UPDATE …, DELETE … or PURGE … query. Because it modifies data, it only supports POST (as text/plain). If the URL contains an s parameter, that is used instead of the body. It supports the policy_* and policyexpr_* parameters.

This warrants special emphasis. You can delete a record with a POST request. This works differently from using the HTTP DELETE verb to delete a resource:

curl -u user:pass -H 'Content-Type: text/plain' -X POST \
  http://localhost:8080/exec/statement --data-binary \
  'DELETE FROM Application.Timer where Name=="Boil the Oceans"'

Sequences

/exec/sequence/{name}   Description
POST                    Returns the next value for a sequence

The /exec/sequence URL returns the next value or values from a sequence. The values are returned as unquoted plain text, with one value per line. If name is not supplied, it pulls from the default sequence. If the count=N parameter is included, it returns the next N values.
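
Since the values come back as plain text with one value per line, splitting the response body is all a client needs to do. A minimal sketch, with a sample body invented for illustration; values are kept as strings because the description above does not say they are always numeric, and parse_sequence_values is a hypothetical helper.

```python
def parse_sequence_values(body):
    """Split a /exec/sequence response body into one value per line."""
    return [line.strip() for line in body.splitlines() if line.strip()]


# Invented body, as it might look for a request with count=3.
print(parse_sequence_values("41\n42\n43\n"))
```
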

Aggregation

CycleServer allows you to group records by common attributes and compute any combination of aggregation functions on each ad’s attributes. For instance, to determine how much compute power each user is currently using, you would group slot records by the RemoteOwner attribute and add up each slot’s KFlops attribute. If there were two users on the grid, then the result might be the following:

RemoteOwner = "jsmith"
TotalKFlops = 152504148

RemoteOwner = "bjones"
TotalKFlops = 9004148

The /exec/aggregation URL pattern lets you do this.

/exec/aggregation/{T}/{G1},..{Gn}/C1=F1(A1),..Cn=Fn(An)

HTTP Method   Description
GET           Groups all records of type T by attributes G1…Gn, computes the aggregation functions F1…Fn on attributes A1…An, and returns the results as records with attributes G1…Gn and C1…Cn

The following aggregation functions are available:

Function           Description
sum(attribute)     The total value of that attribute in each group
avg(attribute)     The average value of that attribute in each group
min(attribute)     The minimum value of that attribute in each group
max(attribute)     The maximum value of that attribute in each group
count(attribute)   The number of records that contain that attribute in each group
count(*)           The number of records in each group
any(attribute)     The value of the attribute for one of the records in the group, typically used to get text values that do not vary within the group

The aggregation supports a filter to limit the records that are aggregated, with the same specification as the /db URL. In addition, any of the values being computed can be qualified with an additional filter that determines whether the current record should be included in that value.

For the above slot example, you would make an HTTP GET to:

http://CYCLE_SERVER/exec/aggregation/Condor.Slot/RemoteOwner/TotalKFlops=sum(KFlops)?format=text

The body of the response would be the above two records.
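
To make the grouping semantics concrete, here is a local sketch of what sum() grouped by one attribute computes. It runs over invented slot records in memory and is only a model of the behavior described above, not the server's implementation; aggregate_sum is a hypothetical helper.

```python
from collections import defaultdict

# Invented sample slot records; attribute names follow the example above.
slots = [
    {"Name": "slot1@server1", "RemoteOwner": "jsmith", "KFlops": 100},
    {"Name": "slot2@server1", "RemoteOwner": "jsmith", "KFlops": 250},
    {"Name": "slot1@server2", "RemoteOwner": "bjones", "KFlops": 75},
]


def aggregate_sum(records, group_by, source, target):
    """Group records by one attribute and sum another within each group."""
    totals = defaultdict(int)
    for record in records:
        totals[record[group_by]] += record[source]
    # One result record per group, holding the group key and the total.
    return [{group_by: key, target: total} for key, total in totals.items()]


for row in aggregate_sum(slots, "RemoteOwner", "KFlops", "TotalKFlops"):
    print(row)
```
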

As another example, suppose you wanted to find out how many jobs a user had submitted to each scheduler, and how many of them were running. First, the group-by attributes are Owner and Queue. To limit the jobs to just the ones for a single user jsmith, you would add a constraint of ;Owner=jsmith to the URL. The first function, to count all jobs, is just count(*). The second function is also a count(*), but to count just running jobs requires a constraint on a single aggregation function, rather than a constraint on the whole query. This is done with the “if” qualifier (expressed as a ?): count(*)?JobStatus==2, which URL-encodes to count(*)%3FJobStatus%3D%3D2.

The complete URL is thus:

/exec/aggregation/CondorJob/Owner,Queue/total=count(*),running=count(*)%3FJobStatus%3D%3D2;Owner=jsmith?format=text

This would return the following results:

total = 1000
running = 251

total = 100
running = 19

Note that the group-by attributes are not included in the result. To include them, add an any() function for each group-by attribute to the URL. For example:

/exec/aggregation/CondorJob/Owner,Queue/Owner=any(Owner),Queue=any(Queue),total=count(*),running=count(*)%3FJobStatus%3D%3D2;Owner=jsmith?format=text
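
Aggregation URLs have several parts that are encoded differently, so assembling them in code can help. The following sketch rebuilds the first jobs URL above from its pieces. The helper aggregation_url and its argument layout are assumptions made for illustration; only the “if” qualifier and its filter are URL-encoded, as in the text above.

```python
from urllib.parse import quote


def aggregation_url(ad_type, group_by, functions, constraint="", fmt="text"):
    """Assemble an /exec/aggregation path (hypothetical helper).

    functions maps a result name to (function, optional "if" filter).
    """
    parts = []
    for name, (function, condition) in functions.items():
        part = f"{name}={function}"
        if condition:
            # The "?" qualifier and its filter must be URL-encoded.
            part += quote("?" + condition, safe="")
        parts.append(part)
    return (f"/exec/aggregation/{ad_type}/{','.join(group_by)}/"
            f"{','.join(parts)}{constraint}?format={fmt}")


jobs_url = aggregation_url(
    "CondorJob",
    ["Owner", "Queue"],
    {"total": ("count(*)", None), "running": ("count(*)", "JobStatus==2")},
    constraint=";Owner=jsmith",
)
print(jobs_url)
```
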

Using Python’s Requests to interact with the Datastore

As your interactions with the RESTful interface become more sophisticated, you will likely spend more time using Python’s Requests library to interact with the Datastore. As noted earlier, it is very important to set the Content-Type HTTP header to the appropriate value. Requests handles many low-level HTTP details for you, but this is not one of them. Use the application/json MIME type when submitting JSON data and application/xml for XML.

Here we create the same Application.Timer records using only Python:

import requests
import json
session = requests.session()
session.auth = ('user', 'pass')
session.headers['Content-Type'] = 'application/json'

timers = [
           {
           "AdType": "Application.Timer",
           "Name": "Boil the Oceans",
           "Plugin": "cycle.application.task.cleanup",
           "Interval": "1:00"
           },
           {
           "AdType": "Application.Timer",
           "Name": "Cool the Seas",
           "Plugin": "cycle.application.task.cleanup",
           "Interval": "1:00"
           }
         ]

response = session.post('http://localhost:8080/db/Application.Timer?input_format=json', json.dumps(timers))

Now let’s delete the “Boil the Oceans” task. Note that HTTP DELETEs do not explicitly support submitting a JSON or XML document in the body of the request so we must encode our filter into the URL.

import requests
# Python 3: quote lives in urllib.parse (it was urllib.quote in Python 2)
from urllib.parse import quote
session = requests.session()
session.auth = ('user', 'pass')
session.headers['Content-Type'] = 'application/json'
# We must `quote` the special characters in the expression so that it can be part of a URL
expression = quote('Name=="Boil the Oceans"')
response = session.delete('http://CYCLESERVER/db/Application.Timer?f=%s' % expression)
assert response.status_code == 200

To verify the record has been deleted:

response = session.get('http://CYCLESERVER/db/Application.Timer?f=%s' % expression)
# no records returned; compare the decoded text, since response.content is bytes in Python 3
assert response.text == ''