[RIL] New Versa draft
Mike Olson
Mike.Olson@fourthought.com
18 Jan 2002 21:16:33 -0700
On Mon, 2001-12-31 at 00:07, Uche Ogbuji wrote:
> I've attached (sorry not to post it on the net yet) an updated draft of Versa,
> the query portion of RIL.
>
> Some examples:
>
> * all resources of type "h:Person"
>
> type(h:Person)
>
> or using a traversal expression
>
> all() - rdf:type -> h:Person
>
>
> * the name of all people
>
> type(h:Person) - h:formattedName -> *
>
> or using abbreviated traversal expressions
>
> h:formattedName(type(h:Person))
We did decide to support this? I thought there was a lot of opposition
to it? I like it though, so I'm not going to argue against it :)
Mike
>
>
> * all people named "Ezra Pound"
>
> type(h:Person) - h:formattedName -> eq("Ezra Pound")
>
>
> The spec still needs work, which I hope to finish up next week, and put up the
> spec and an issue tracker. Until then, if you're inclined, just post prelim
> comments here.
>
>
> --
> Uche Ogbuji Principal Consultant
> uche.ogbuji@fourthought.com +1 303 583 9900 x 101
> Fourthought, Inc. http://Fourthought.com
> 4735 East Walnut St, Boulder, CO 80301-2537, USA
> XML strategy, XML tools (http://4Suite.org), knowledge management
>
> ----
>
Versa
Mike Olson (Fourthought, Inc.)
Revision (Initial release) [MO]
________________________________________________________________________
________________________________________________________________________
Versa is a specialized language for addressing and querying nodes and
arcs in a Resource Description Framework (RDF) model. It uses a simple
and expressive syntax, designed to be incorporated into other expression
systems, including XML, where, for instance, Versa can be used in
extension functions or attributes of extension elements that provide
RDF-related capabilities. Versa operates on the abstract graph model of
RDF, and not any particular serialization.
Where used in this document, the keywords "SHOULD", "MUST", and "MUST
NOT" are to be interpreted as described in RFC 2119 [RFC2119]. However,
for readability, these words do not appear in all uppercase letters in
this specification.
Versa uses constructs from XML namespaces for convenient abbreviation of
URIs. Within this document, in examples and other discussion, some
prefixes are commonly used without being expanded. These prefixes are to
be considered bound to the following namespaces:
* h: http://rdfinference.org/eg/humanitas
* dc: http://purl.org/dc/elements/1.1
* daml: http://www.daml.org/2001/03/daml+oil#
Any other prefixes can be used equivalently in such expressions, as long
as they are bound to the same namespaces.
DARPA Agent Markup Language (DAML), while mostly based on RDF 1.0, does
include some slight modifications and enhancements to RDF semantics.
None of these changes appear to affect the abstract model, but Versa is
designed to work with systems that use RDF, RDF Schema (RDFS) and DAML.
Versa operates on the abstract graph model of RDF. As such it operates
on labeled nodes and arcs. In support of this processing, Versa defines
a small set of standard data types.
A resource is a special string-like object that represents the URI of a
resource in the model. Its literal expression can be in one of two
forms. The first is as a simple QName (as defined in [XMLNS]), in which
case the URI mapped to the prefix used in the QName is expanded to a
string and concatenated to the local portion of the QName to derive the
resource object's URI. Versa does not define a mechanism for mapping
prefixes to URIs. Such a facility must be provided by a Versa
implementation. For instance, when Versa is expressed within an XML
document, prefixes might be mapped according to the namespace
declarations in scope of the relevant element. A resource can also be
expressed using the full URI in string form within curly braces. This
form is not a true literal: the curly braces are actually a conversion
operator that takes a string and returns a resource object (see the
section on conversions below).
The following are examples of literal resources:
spam:eggs
    If the prefix spam is mapped to the URI http://python.org/, the
    resulting resource has the URI http://python.org/eggs.
myobj:oute66
    If the prefix myobj is mapped to the URI
    urn:oid:this.is.not.really.a.valid.oid.r, the resulting resource has
    the URI urn:oid:this.is.not.really.a.valid.oid.route66.
{"http://rdfinference.org"}
    A resource with URI http://rdfinference.org
Note that the lexical rules of XML QNames may limit the situations in
which this abbreviation may be used in Versa.
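The QName expansion described above can be sketched in Python. This is only an illustration of the concatenation rule: the function name expand_qname and the PREFIXES table are hypothetical, since Versa leaves the prefix-mapping mechanism to the implementation.

```python
def expand_qname(qname, prefix_map):
    """Expand prefix:local into a URI by plain string concatenation."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed QName: {qname!r}")
    if prefix not in prefix_map:
        raise KeyError(f"unbound prefix: {prefix!r}")
    # The URI bound to the prefix is concatenated with the local part.
    return prefix_map[prefix] + local


# Hypothetical bindings, mirroring the two QName examples above.
PREFIXES = {
    "spam": "http://python.org/",
    "myobj": "urn:oid:this.is.not.really.a.valid.oid.r",
}

print(expand_qname("spam:eggs", PREFIXES))
print(expand_qname("myobj:oute66", PREFIXES))
```

Because the rule is pure concatenation, the prefix URI does not have to end at a path boundary, which is what the myobj example relies on.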
A string is a sequence of zero or more characters, each of which is
defined as in the XML 1.0 recommendation. Versa strings are the same as
XPath strings.
Literal strings are likewise expressed as in XPath: using either single
or double quotes. The following are examples of literal strings:
"What thou lovest well remains, the rest is dross"
'What thou lovest well remains, the rest is dross'
    Equivalent to the above
"Use character entities
    One can use character entities for clarity or to express forbidden
    characters according to XML 1.0 rules
"Embedded'Apostrophe"
    If the string contains an apostrophe or quotation mark, you must use
    the other to delimit the string.
Note that it is impossible to represent a Versa string literal with both
a quotation mark and an apostrophe. Such a string would have to be
computed somehow, most commonly using the concat function (see below).
Versa numbers are the same as XPath numbers: positive or negative
floating-point numbers, based on the rules and semantics for double
precision, 64-bit numbers in IEEE 754.
The following are examples of literal numbers:
2
    Note that this is not stored or processed as an integer, since it is
    actually a floating-point number expressed in abbreviated form.
3.14
    pi to two decimal places
6.022e23
    Avogadro's number: an example of using scientific notation
Boolean types represent logical truth or falsehood. As such there are
two boolean literals: true and false. * is provided as a synonym for
true: a more readable form in such cases as traversal expressions.
A list is a homogeneous ordered collection of any data type (including
other lists or sets). Duplicates are allowed. The following are examples
of literal lists:
["W. B. Yeats", "T. S. Eliot"]
[2000, 2001, 2000]
[x:epound, x:tseliot, {"http://rdfinference.org/eg/versa/wyeats"}]
[]
    An empty list
A set is a homogeneous unordered collection of any data type (including
other sets or lists), with no duplicate values.
There is strictly no set literal. Sets can be expressed most simply
using the set() conversion function (see below) with simple values as
the arguments. The following are examples of sets:
set("J. Alfred Prufrock", "Hugh Selwyn Mauberley")
set(4.7)
    Just one item
When implicit conversions between data types are needed, the following
matrix defines the operations applied. It is listed here row by row:
each entry names the source type, followed by the result of converting
it to each target type.
From Resource:
  * to Resource: identity
  * to String: a resource with a URI as given by the string, using
    escaping as required
  * to Number: the result of converting the number, and then to a
    resource *
  * to Boolean: false if the resource is the nil resource, otherwise
    true
  * to List: list of length one with the resource in it
  * to Set: set of length one with the resource in it
From String:
  * to Resource: a resource with a URI as given by the string, using
    escaping as required
  * to String: identity
  * to Number: the number that is represented by the string, or NaN
  * to Boolean: false if the string is empty, otherwise true
  * to List: list of length one with the string in it
  * to Set: set of length one with the string in it
From Number:
  * to Resource: a resource with a URI as given by the string, using
    escaping as required
  * to String: string representation of the number
  * to Number: identity
  * to Boolean: false if the number is positive or negative 0,
    otherwise true
  * to List: list of length one with the number in it
  * to Set: set of length one with the number in it
From List:
  * to Resource: the result of conversion to resource of the first item
    in the list, or nil if the list is empty
  * to String: the result of conversion to string of the first item in
    the list, or nil if the list is empty
  * to Number: the result of conversion to number of the first item in
    the list, or nil if the list is empty
  * to Boolean: false if the list is empty, otherwise true
  * to List: identity
  * to Set: a set with the same entries as the list, except that if
    there are duplicate values, any equivalent values following the
    first are omitted (e.g. set(list(1,2,1)) = set(1,2))
From Set:
  * to Resource: the result of conversion to resource of the first item
    in the set, or nil if the set is empty
  * to String: the result of conversion to string of the first item in
    the set, or nil if the set is empty
  * to Number: the result of conversion to number of the first item in
    the set, or nil if the set is empty
  * to Boolean: false if the set is empty, otherwise true
  * to List: a list with the same entries as the set, in arbitrary
    order
  * to Set: identity
* Note that conversions from non-string literals to URIs are subject to
change as the RDFCore working group works on URI-based datatype literal
representations.
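A few of the conversions above can be sketched in Python to make the rules concrete. This is a rough illustration, not part of the specification: the function names are hypothetical, Python str, float, list and set stand in for the Versa types, and resources are left out.

```python
import math


def to_number(value):
    """String -> Number: the number the string represents, or NaN."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def to_boolean(value):
    """String, Number, List and Set -> Boolean, following the rules above."""
    if isinstance(value, str):
        return value != ""          # false only for the empty string
    if isinstance(value, (int, float)):
        return value != 0           # 0.0 and -0.0 both compare equal to 0
    if isinstance(value, (list, set, frozenset)):
        return len(value) > 0       # false only for an empty collection
    raise TypeError(f"no conversion sketched for {type(value).__name__}")


def list_to_set(items):
    """List -> Set: duplicates after the first occurrence are dropped."""
    return set(items)


print(to_number("3.14"), to_number("not a number"))
print(to_boolean(""), to_boolean(-0.0), to_boolean([0]))
print(list_to_set([1, 2, 1]))
```

Note that Python's float() accepts a few spellings (such as "inf") that a Versa implementation might treat differently; the sketch does not try to match XPath number syntax exactly.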
Versa allows queries themselves to be treated as first-class objects. In
particular, query objects can be passed to functions where they are
dynamically evaluated as needed. Query objects are dynamically evaluated
using a context that is determined by the semantics of the function in
which they are used. See each function's specification for details.
Query objects can be assigned to variables, and used as arguments in
certain functions, but there are no conversions to or from query
objects. Query object literals are of the form
q(query)
Conversion functions are special functions that take a single argument,
which is an object of any data type, and return the conversion of this
object to a particular data type, using the conversion rules described
above.
list(expression[, expression, [...]])
Create a list comprising each of the arguments in order.
set(expression[, expression, [...]])
Create a set comprising each of the arguments with duplicate values
removed.
Return the boolean value of the argument
Versa defines queries. A query is either a traversal or a function. A
traversal is an expression that matches patterns in the RDF model by
specifying the valid starting points, ending points and arcs to be
traversed. If the query is a traversal the results of the query will be
a list or set. If the query is a function the results can be any data
type.
Many Versa constructs are evaluated with regard to a context. The
context is a value of any data type, and it can always be referred to in
an expression using the token "."
Note that there is always a set of variable bindings in effect, and a
set of function definitions in scope, but these are not formally
considered part of the context, as they are in XPath. Variables and
functions are discussed below.
Traversal expressions are the core of Versa. They provide a system for
matching patterns in an RDF model by specifying desired nodes and arcs
in the graph representing the model. The traversal operator is the basis
of a traversal expression, and results in a set or list.
The forward traversal operator matches patterns based on given
subjects and predicates. It returns a set of resulting objects. It takes
the following form:
set-expression - set-expression -> boolean-expression
The first set-expression is a set of resources which are the subjects of
statements, the predicates of which are given by the resources in the
second set expression. This results in a set of matching statements, and
the object of each statement is evaluated as the context of the boolean
expression. If the result, after conversion to boolean type, is true,
the object is added to the set of the results. Conversions are
automatically applied as necessary for the set expressions: any Versa
expression can actually be used, the final operation being the
conversion of whatever data the expression returned to a set. In
addition, each member of the resulting set is converted to a resource
for pattern checking within the model.
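As a rough illustration of the forward traversal semantics, the following Python sketch treats the model as a list of (subject, predicate, object) tuples. The function name forward, the keep callable standing in for the boolean expression, and the sample statements are all assumptions made for the example; plain strings stand in for resources and literals.

```python
def forward(triples, subjects, predicates, keep=lambda obj: True):
    """subjects - predicates -> keep: collect objects of matching statements."""
    results = set()
    for s, p, o in triples:
        # A statement matches when its subject and predicate are both in the
        # given sets; the object is then tested as the context of `keep`.
        if s in subjects and p in predicates and keep(o):
            results.add(o)
    return results


# Hypothetical sample statements.
MODEL = [
    ("h:epound", "rdf:type", "h:Person"),
    ("h:epound", "h:formattedName", "Ezra Pound"),
    ("h:teliot", "rdf:type", "h:Person"),
    ("h:teliot", "h:formattedName", "T. S. Eliot"),
]

people = {"h:epound", "h:teliot"}
names = forward(MODEL, people, {"h:formattedName"})
pound = forward(MODEL, people, {"h:formattedName"}, lambda o: o == "Ezra Pound")
print(names, pound)
```

The default keep callable plays the role of "*", accepting every object.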
Editor's note: when data types are formally incorporated into the Versa
model, the treatment of object set expressions will certainly change.
An abbreviated syntax is allowed for the forward traversal operator. It
consists of a resource literal in the form of a QName, followed by a
parenthesized set expression, as follows:
qname(set-expression)
which is equivalent to the following traversal expression:
set-expression - qname -> *
The set expression can be omitted, in which case it defaults to ".". An
abbreviated forward traversal thus retrieves the set of all objects of
statements whose subject is one of the resources in the set from the
expression and whose predicate is given by the resource literal.
The backward traversal operator is similar to the forward traversal
operator, but it is used to match patterns using the inverses of
predicates. A backward traversal expression takes the following form:
set-expression <- set-expression - boolean-expression
The first set-expression is a set of resources which are the objects of
statements, the predicates of which are given by the resources in the
second set expression. This results in a set of matching statements, and
the subject of each statement is evaluated as the context of the boolean
expression. If the result, after conversion to boolean type, is true,
the subject is added to the set of results. Conversions are
automatically applied, as with forward traversal expressions.
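Backward traversal can be sketched the same way, again over a list of (subject, predicate, object) tuples. The function name backward, the keep callable and the sample statements are hypothetical and only illustrate the matching rule described above.

```python
def backward(triples, objects, predicates, keep=lambda subj: True):
    """objects <- predicates - keep: collect subjects of matching statements."""
    results = set()
    for s, p, o in triples:
        # Match on object and predicate, then test the subject as context.
        if o in objects and p in predicates and keep(s):
            results.add(s)
    return results


# Hypothetical sample statements.
MODEL = [
    ("h:epound", "rdf:type", "h:Person"),
    ("h:teliot", "rdf:type", "h:Person"),
    ("h:cantos", "rdf:type", "h:Poem"),
]

persons = backward(MODEL, {"h:Person"}, {"rdf:type"})
print(persons)
```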
[1] query ::= expression
[2] expression ::= traversal | base-expression
[3] base-expression ::= '(' expression ')' | variable-reference | literal | function-call
[4] traversal ::= forward-traversal | backward-traversal | abbreviated-forward-traversal
[5] forward-traversal ::= set-expression "-" set-expression "->" boolean-expression
[6] abbreviated-forward-traversal ::= qname "(" set-expression ")"
[7] backward-traversal ::= set-expression "<-" set-expression "-" boolean-expression
[8] set-expression ::= expression
[9] boolean-expression ::= expression
[10] function-call ::= function-name '(' ( expression ( ',' expression ) * ) ? ')'
[11] literal ::= string-literal | number-literal | list-literal | boolean-literal
[12] predicate ::= '[' boolean-expression ']'
[13] variable-reference ::= '$' variable-name
Versa defines a core function library. Extension functions can be
defined using the same mechanism as provided by XPath.
Return true if the number value of argument 1 is less than the number
value of argument 2.
Return true if the number value of argument 1 is greater than the number
value of argument 2.
Return true if the number value of argument 1 is less than or equal to
the number value of argument 2.
Return true if the number value of argument 1 is greater than or equal
to the number value of argument 2.
First, convert argument 2 to the same type as argument 1. Then compare
them and return true if they are equal. Resource, String, Number and
Date compare as values. List and Set are equal if len(intersection(a,b))
== len(a) == len(b)
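The list and set equality rule can be read literally as the following Python sketch. The function name collections_equal is hypothetical, and the prior conversion of argument 2 to the type of argument 1 is left out.

```python
def collections_equal(a, b):
    """len(intersection(a, b)) == len(a) == len(b), as stated above."""
    common = set(a) & set(b)
    return len(common) == len(a) == len(b)


print(collections_equal([1, 2], [2, 1]))
print(collections_equal({1, 2}, {1, 3}))
print(collections_equal([1, 1], [1, 1]))
```

One consequence of reading the rule literally: a list containing duplicates never compares equal, even to itself, because the intersection is a set and is therefore shorter than the list. Whether that is intended is a question for the spec.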
not(eq(a,b))
Set and list functions are functions that operate on or return sets and
lists.
distribute(list-expression, query-object, [query-object, [...]])
distribute converts the first argument into a list. The second and
subsequent arguments (the query arguments) are Versa queries. It uses
each item in the list as the context for evaluating each of the query
arguments. The result is a list of lists; each entry in the outer list
is a list containing the results from evaluating each of the query
arguments in order using the Nth list item as context.
For example, the query:
distribute(list({"http://4suite.org"}, {"http://rdfinference.org"}), q(.), q(string-length()), q(substring-after(., "://")))
returns
[[{"http://4suite.org"}, 17, "4suite.org"], [{"http://rdfinference.org"}, 23, "rdfinference.org"]]
The outer list is of length two because there are two items in the first
argument. Each inner list has three items because there are three query
arguments.
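The evaluation order of distribute can be sketched in Python, with plain callables standing in for query objects and strings standing in for resources. The three lambdas are assumptions chosen to mimic the context itself, its string length, and the text after "://".

```python
def distribute(items, *queries):
    """Evaluate every query against every item; one inner list per item."""
    return [[query(item) for query in queries] for item in items]


uris = ["http://4suite.org", "http://rdfinference.org"]
result = distribute(
    uris,
    lambda ctx: ctx,                      # the context itself
    lambda ctx: len(ctx),                 # its length as a string
    lambda ctx: ctx.partition("://")[2],  # everything after "://"
)
print(result)
```

The outer comprehension walks the items and the inner one walks the queries, which is why the result has one inner list per item and one entry per query.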
map(query-object, list-expression, [list-expression, [...]])
map takes the Versa query given as the first argument and evaluates it
with one or more lists as the context. These lists are constructed as
follows: The first item from each of the list expressions in the second
and subsequent arguments is gathered into a list, as long as at least
one list from the list expression arguments has an item. Then the second
item is taken from each list expression, if at least one of them has two
or more items, and so on, with as many iterations as the length of the
longest list from the list expression arguments. If the lists from the
list expression arguments are of differing lengths, then all lists that
are shorter than the longest are padded with nil resources (daml:nil).
The result is a list of values, as long as the longest list from the
list expression arguments.
As an example, the query:
map(q(concat-string()), ["A", "B", "C"], ["1", "2", "3"])
will return a list of length 3:
["A1", "B2", "C3"]
And the query:
map(q(h:formatted-name()), h:author(h:principia))
returns the formatted name of the author of the book identified as
"h:principia", and thus in our sample model would return
["Isaac Newton"]
This is equivalent to the chained traversal expression
h:principia - h:author -> * - h:formatted-name -> *
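The way map builds its context lists, including the padding of shorter lists, can be sketched with itertools.zip_longest from the Python standard library. The name versa_map and the NIL stand-in are hypothetical, and a plain callable takes the place of the query object.

```python
from itertools import zip_longest

NIL = "daml:nil"  # stand-in for the nil resource used as padding


def versa_map(query, *lists):
    """Call query once per position, with the gathered items as context."""
    return [query(list(group)) for group in zip_longest(*lists, fillvalue=NIL)]


joined = versa_map("".join, ["A", "B", "C"], ["1", "2", "3"])
padded = versa_map(list, ["A", "B"], ["1"])
print(joined)
print(padded)
```

The second call shows the padding: the shorter list runs out after one item, so the second context list receives the nil stand-in in its place.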
filter(list-expression, query-object, [query-object, [...]])
filter converts the first argument into a list. Then each item is used
as the context for evaluating each of the query arguments, and each
result is converted to boolean. If all of these evaluations return true,
then the item is added to the result list.
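A minimal Python sketch of this behaviour, assuming plain callables in place of query objects (the name versa_filter is hypothetical):

```python
def versa_filter(items, *predicates):
    """Keep the items for which every predicate evaluates to true."""
    return [item for item in items if all(bool(pred(item)) for pred in predicates)]


evens_over_two = versa_filter(
    [1, 2, 3, 4, 5, 6],
    lambda n: n % 2 == 0,
    lambda n: n > 2,
)
print(evens_over_two)
```

Order is preserved because the input is treated as a list, and an item is dropped as soon as any one predicate fails.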
sort(list-expression[, conversion-indicator[, direction-indicator[, query-object ]]])
The argument is converted to a list or a set. The result is the list
obtained by sorting according to the given criteria. The second
parameter indicates the conversion that should be applied to each item
before sorting, and the style of the resulting sort. It must be a
resource with one of the following URIs:
* http://rdfinference.org/versa/sort/string: convert to string
and sort according to Unicode sorting conventions [provide
reference]
* http://rdfinference.org/versa/sort/number: convert to number
and sort according to the magnitude of the number
The default is http://rdfinference.org/versa/sort/string. The third
parameter indicates the direction of sorting. It is converted to a
resource and must have one of the following URIs:
* http://rdfinference.org/versa/sort/ascending: sort in
ascending order
* http://rdfinference.org/versa/sort/descending: sort in
descending order
The default is http://rdfinference.org/versa/sort/ascending.
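The interplay of the conversion and direction indicators can be sketched in Python. Several things here are assumptions: the name versa_sort, the choice to return the original items rather than their converted forms, the omission of the optional query-object argument, and the use of Python's code point ordering in place of proper Unicode collation.

```python
SORT_STRING = "http://rdfinference.org/versa/sort/string"
SORT_NUMBER = "http://rdfinference.org/versa/sort/number"
ASCENDING = "http://rdfinference.org/versa/sort/ascending"
DESCENDING = "http://rdfinference.org/versa/sort/descending"


def versa_sort(items, conversion=SORT_STRING, direction=ASCENDING):
    """Sort items after converting each one as the indicator requests."""
    if conversion == SORT_STRING:
        key = str      # compared by code point, not by Unicode collation
    elif conversion == SORT_NUMBER:
        key = float    # compared by numeric magnitude
    else:
        raise ValueError(f"unknown conversion indicator: {conversion}")
    if direction not in (ASCENDING, DESCENDING):
        raise ValueError(f"unknown direction indicator: {direction}")
    return sorted(items, key=key, reverse=(direction == DESCENDING))


values = ["10", "9", "2"]
print(versa_sort(values))
print(versa_sort(values, SORT_NUMBER))
print(versa_sort(values, SORT_NUMBER, DESCENDING))
```

The same input sorts differently under the two conversions, which is the reason the indicator exists: as strings "10" comes before "2", as numbers it comes last.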
max(list-expression[, conversion-indicator[, query-object ]])
The argument is converted to a list or a set. The result is the maximum
value in the list according to the given criteria. The second parameter
indicates the conversion that should be applied to each item before
sorting, and the style of the resulting sort. It must be a resource with
one of the following URIs:
* http://rdfinference.org/versa/sort/string: convert to string
and sort according to Unicode sorting conventions [provide
reference]
* http://rdfinference.org/versa/sort/number: convert to number
and sort according to the magnitude of the number
The default is http://rdfinference.org/versa/sort/string.
max($a, $b, $c) is equivalent to head(sort($a, $b, v:descending, $c))
min(list-expression[, conversion-indicator[, query-object ]])
The argument is converted to a list or a set. The result is the minimum
value in the list according to the given criteria. The second parameter
indicates the conversion that should be applied to each item before
sorting, and the style of the resulting sort. It must be a resource with
one of the following URIs:
* http://rdfinference.org/versa/sort/string: convert to string
and sort according to Unicode sorting conventions [provide
reference]
* http://rdfinference.org/versa/sort/number: convert to number
and sort according to the magnitude of the number
The default is http://rdfinference.org/versa/sort/string.
min($a, $b, $c) is equivalent to head(sort($a, $b, v:ascending, $c))
union(set-expression, set-expression)
Both arguments are converted to sets, and the result is a set consisting
of all items that are in either argument set.
intersection(set-expression, set-expression)
Both arguments are converted to sets, and the result is a set consisting
of all items that are in both argument sets.
difference(set-expression, set-expression)
Both arguments are converted to sets, and the result is a set consisting
of all items that are in the first argument set but not in the second.
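The three set functions map directly onto Python's built-in set operators, as in the sketch below. It assumes the usual reading of difference, namely the items of the first set that are not in the second.

```python
def union(a, b):
    """Items that are in either argument."""
    return set(a) | set(b)


def intersection(a, b):
    """Items that are in both arguments."""
    return set(a) & set(b)


def difference(a, b):
    """Items of the first argument that are not in the second (assumed reading)."""
    return set(a) - set(b)


left, right = [1, 2, 3, 3], [3, 4]
print(union(left, right), intersection(left, right), difference(left, right))
```

Converting with set() first mirrors the rule that both arguments are converted to sets, so duplicates in a list argument disappear before the operation is applied.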
concat(list-expression[, list-expression, [...]])
Each argument is converted to a list, and the result is a list which
consists of the concatenation of all the argument lists in order.
head(list-expression, [number-expression])
Converts the first argument to a list L, and the second to a number N.
Returns a list consisting of the first N items in L. N defaults to 1. If
N is negative, or exceeds the length of the list, the entire list is the
result.
rest(list-expression, [number-expression])
Converts the first argument to a list L, and the second to a number N.
Returns a list consisting of all items in L after position N. N defaults
to 1. If N is negative, or exceeds the length of the list, an empty list
is the result. The following expression returns the same list as L,
regardless of the value of N:
concat(head(L, N), rest(L, N))
tail(list-expression, [number-expression])
Converts the first argument to a list L, and the second to a number N.
Returns a list consisting of the last N items in L. N defaults to 1. If
N is negative, or exceeds the length of the list, an empty list is the
result.
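The boundary rules for head, rest and tail can be sketched in Python as follows. The sketch truncates N to an integer, which is an assumption, since the text only says that N is a number.

```python
def head(items, n=1):
    """First n items; the whole list if n is negative or too large."""
    n = int(n)
    if n < 0 or n > len(items):
        return list(items)
    return list(items[:n])


def rest(items, n=1):
    """Items after position n; empty if n is negative or too large."""
    n = int(n)
    if n < 0 or n > len(items):
        return []
    return list(items[n:])


def tail(items, n=1):
    """Last n items; empty if n is negative or too large."""
    n = int(n)
    if n < 0 or n > len(items):
        return []
    return list(items[len(items) - n:])


poets = ["Pound", "Eliot", "Yeats", "Auden"]
print(head(poets, 2), rest(poets, 2), tail(poets, 2))
```

With these rules, joining the results of head and rest for the same N gives back the original list for every N, including the out-of-range cases.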
length(list-expression)
Converts the argument to a list and returns the number of items in the
list.
Create a list of lists [clarify the semantics of this...]
Return a sub list. [clarify the semantics of this...]
all([query-object, [query-object, [...]]])
Without any arguments, all returns a list of all resources in the model.
If there are arguments, they are treated as query objects, evaluated, and
the results converted to boolean; this form is a shortcut for filter.
That is,
all(qo1, qo2, ..., qoN)
is equivalent to
filter(all(), qo1, qo2, ..., qoN)
type(resource-expression)
Returns a list of all resources of a specified type, as defined by RDFS
and optionally DAML schema specifications. This function is essentially
a short cut for:
all() - rdf:type -> *
Number functions are functions that work with numbers. All return number
types.
String functions are functions that work with strings. All return string
types.
Boolean functions are functions that work with booleans. All return
number types of 0 or 1.
Return true if the boolean values of argument one and argument two are
both true.
Return true if the boolean value of argument one or argument two is
true.
If the boolean value of the argument is true, return false. If the
boolean value of the argument is false, return true.
Return true if the argument is a resource. The argument defaults to ".".
Return true if the argument is not a resource. The argument defaults to
".".
Return true if argument 1 is in argument 2.
Definition of all expressions that can be used to generate sets (or
lists). Set functions all return lists, and traversals always return a
list. Resource expressions return a single resource, which is converted
to a list.
Definition of all expressions that can be used as filters. Boolean
functions all return a boolean value. Relational functions all return a
boolean value. The special "*" character is used to indicate that no
filter is needed.
Resource Expressions are a way to represent a single unique resource in
the model. If a qname is used, the name is expanded using an available
prefix to namespace mapping. If the "{}" operators are used, then the
string literal must represent the full URI. In both cases, the resulting
string must be a resource in the model. "." and "current()" are ways to
access the current resource from the context list.
In order to guide the development of Versa some use cases for RDF query
have been developed. This section presents these use cases, as well as
how they can be addressed using the current specification of Versa.
Often one wants to simply check a model for all resources with a given
RDF type.
Versa provides the type function to deal with this common case
conveniently:
type(h:Person)
returns
set(h:epound, h:teliot, h:wyeats)
Or, using a traversal expression (and giving the same result):
all() - rdf:type -> h:Person
type(h:Person) - h:formattedName -> eq("Ezra Pound")
which results in
set(h:epound)
There are alternative ways to express this. For instance, using
abbreviated forward traversal and filters:
filter(h:formattedName(type(h:Person)), eq("Ezra Pound"))
Remember that we have provided no range constraint on the h:author
predicate, so we must explicitly check that objects are of type person.
Note that we only want to get one of the people, so if we use a
traversal expression, which results in a set, we must then extract one
of the entries from the set. We shall use a backward traversal
expression:
head(h:formattedName(type(h:Poem) - dc:title -> eq("The Love Song of J Alfred Prufrock") - h:author -> *))
distribute(max(all() - h:author -> *, v:number, h:age()), q(h:age()), q(h:formattedName()))
A DAML-aware Versa implementation will allow easy querying of explicitly
transitive properties, but often one needs to interpret properties
transitively without help from the schema.
--
Mike Olson Principal Consultant
mike.olson@fourthought.com +1 303 583 9900 x 102
Fourthought, Inc. http://Fourthought.com
4735 East Walnut St, http://4Suite.org
Boulder, CO 80301-2537, USA
XML strategy, XML tools, knowledge management