ABCL-web: RDF bindings


Overview

RDF bindings/wrappers, covering the 'triple store' level of functionality, can act as a data layer for web applications, and they appear to be very convenient to use in ABCL-web's web engine. The bindings cover both basic triple access and a wrapper for the SPARQL query language, which makes it possible to write complex queries directly in Lisp syntax. The current implementation is at the 'proof-of-concept' level: a lot of functionality is missing, but it should be straightforward to add.

The Jena2 Java library is used as the underlying RDF/SPARQL implementation. It is quite mature, so it should provide most of the functionality a Semantic Web framework is supposed to have. Jena can use an SQL database as a backend and provide persistence that way, but the bindings currently do not aim to provide persistence.

RDF introduction

RDF basically deals with triples of the form (subject predicate object), also read as (subject property value). For example:

("Mary" :has-father "John")

Many triples can be organized into a 'triple store' to make a database. (There is also a notion of a document in RDF, but we won't use it here.) Triples are formed from literals (strings, integers, floating-point numbers, etc.) and resources. A resource can be named by a URI (Uniform Resource Identifier), which is typically a URL (Uniform Resource Locator, for example http://example.com/someResource) or a URN (Uniform Resource Name, for example urn:oid:2.16.840); such a resource can be referred to externally. Alternatively, a resource can be anonymous, a so-called 'blank node'. Blank nodes can be used (for example, compared) locally within a single model (database) or document, but they cannot be referenced from outside. W3C prefers HTTP URLs, but it seems somewhat strange to use an HTTP URL for an object that cannot be retrieved via HTTP (for example, to denote a person; see Jena's tutorial for an example), so we will use URNs instead.
In the example above, "Mary" and "John" are literals, but the property name is a resource; it is very unusual for a property to be a literal. :has-father can be a short form of the full URN urn:x-ibash:prop:has-father; incidentally, it looks like a Lisp keyword symbol. It is not a good idea to define the has-father property on literals; it is better to introduce Mary and John as resources, for example:

		urn:x-ibash:person:Mary :has-father urn:x-ibash:person:John.
		urn:x-ibash:person:Mary :first-name "Mary".
		urn:x-ibash:person:Mary :last-name "Johnson".
		urn:x-ibash:person:John :first-name "John".
		urn:x-ibash:person:John :last-name "Johnson".
	

In this example we use meaningful URNs only for ease of reading. They have no meaning of their own, and we could replace them with random UUIDs or just blank nodes:

		_1 :has-father _2.
		_1 :first-name "Mary".
		_1 :last-name "Johnson".
		_2 :first-name "John".
		_2 :last-name "Johnson".
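
To make the triple model concrete, here is a minimal sketch in plain Common Lisp that does not use the bindings or Jena at all. Triples are modeled as three-element lists, resources as symbols, and literals as strings; the names *toy-store* and toy-objects are hypothetical and exist only for this illustration.

```lisp
;; Toy model only: this is not how the bindings or Jena store data.
;; Resources are symbols (MARY, JOHN), literals are strings,
;; and the "triple store" is a list of (subject predicate object) lists.
(defparameter *toy-store*
  '((mary :has-father john)
    (mary :first-name "Mary")
    (mary :last-name "Johnson")
    (john :first-name "John")
    (john :last-name "Johnson")))

(defun toy-objects (subject predicate &optional (store *toy-store*))
  "Return the objects of all triples in STORE with SUBJECT and PREDICATE."
  (loop for (s p o) in store
        when (and (eq s subject) (eq p predicate))
          collect o))

(toy-objects 'mary :first-name)  ; => ("Mary")
(toy-objects 'mary :has-father)  ; => (JOHN)
```

Note that the lookup returns a list: nothing in the triple model prevents a subject from having several values for the same property.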
	

Value accessors

At this point we can see that resources can represent objects, much like CLOS objects, and that they can have properties. The function rdf-prop is roughly the equivalent of CLOS's slot-value: it provides access to values. For example:

		(defparameter *Mary* (rdf-resource "urn:x-ibash:person:Mary"))
		(rdf-prop :first-name *Mary*) ; => "Mary" ;reader
		(setf (rdf-prop :last-name *Mary*) "Kennedy") ; change value with writer
		(rdf-prop :last-name *Mary*) ; => "Kennedy"
		(rdf-prop :has-father *Mary*) ; => we'll see an object representing resource here
		(rdf-prop :first-name 
			(rdf-prop :has-father *Mary*)) ; => "John", we can see Mary's father first name this way
	

As you can see, we looked up Mary's resource with the rdf-resource function and then used rdf-prop to read and write values.
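
The slot-value analogy can be illustrated with a small stand-alone sketch. It assumes a toy in-memory list of triples instead of Jena, and it assumes that writing a property replaces any previous value; toy-prop and *toy-triples* are hypothetical names, not part of the bindings.

```lisp
;; Toy model only: a setf-able accessor over a list of triples.
(defvar *toy-triples* '()
  "Toy store: a list of (subject predicate object) lists.")

(defun toy-triple-matches-p (triple subject predicate)
  "True if TRIPLE has the given SUBJECT and PREDICATE."
  (and (eq (first triple) subject)
       (eq (second triple) predicate)))

(defun toy-prop (predicate subject)
  "Return the object of the first triple for SUBJECT and PREDICATE, or NIL."
  (third (find-if (lambda (triple)
                    (toy-triple-matches-p triple subject predicate))
                  *toy-triples*)))

(defun (setf toy-prop) (new-value predicate subject)
  "Replace any existing value of PREDICATE on SUBJECT with NEW-VALUE."
  (setf *toy-triples*
        (cons (list subject predicate new-value)
              (remove-if (lambda (triple)
                           (toy-triple-matches-p triple subject predicate))
                         *toy-triples*)))
  new-value)

(setf (toy-prop :first-name 'mary) "Mary"
      (toy-prop :last-name 'mary) "Johnson"
      (toy-prop :has-father 'mary) 'john
      (toy-prop :first-name 'john) "John")

(setf (toy-prop :last-name 'mary) "Kennedy")         ; overwrite "Johnson"
(toy-prop :last-name 'mary)                          ; => "Kennedy"
(toy-prop :first-name (toy-prop :has-father 'mary))  ; => "John"
```

The argument order (property first, then subject) mirrors rdf-prop as used above.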

We can also construct Mary's object programmatically, using blank nodes (made with rdf-new-blank-object):

		(let ((mary (rdf-new-blank-object))
		      (john (rdf-new-blank-object)))

		    (setf (rdf-prop :first-name mary) "Mary"
		          (rdf-prop :last-name mary) "Johnson"
		          (rdf-prop :has-father mary) john
		          (rdf-prop :first-name john) "John"
		          (rdf-prop :last-name john) "Johnson"))
	

But how will we find Mary's resource then?

Queries

So far we have used rdf-prop, which looks up a value in a triple given the triple's predicate and subject. CLOS is restricted to this kind of query, but RDF is not: we can perform arbitrary queries, and those queries can return more than one object (or possibly none). We can use rdf-query-1 to find Mary:

		(defparameter *Mary* (rdf-query-1 (? :first-name "Mary")))
		(rdf-query-1 (*Mary* :first-name ?)) ;=> "Mary"
	

As you can see, rdf-query-1 can also find Mary's first name, so it can replace rdf-prop, although rdf-prop may be more convenient.

More advanced queries can contain more than one triple, in which case the whole pattern is matched. For example, we can find a child of John this way:

		(rdf-query-1 (? :has-father ?john)
		             (?john :first-name "John")
		             (?john :last-name "Johnson"))
	

Here you can see a pattern of three triples, each triple being a constraint. ? and ?john are variables. The query tries to find variable values that satisfy all constraints; ?john is a temporary variable, and only ? (the child) is returned.
But what if John has more than one child? We can use the rdf-query macro, which returns all matching objects. Moreover, we can find the children's first names at the same time:

		(rdf-query (?child :has-father ?john)
		           (?child :first-name ?)
		           (?john :first-name "John")
		           (?john :last-name "Johnson"))
	

This returns a list of the first names of all of John's children.
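
The idea of matching several triple patterns with shared variables can be sketched in a few lines of plain Common Lisp. The real bindings delegate queries to Jena's SPARQL engine; the matcher below is only a toy model over an in-memory list, and every name in it (*family*, toy-query and the helpers) is hypothetical.

```lisp
;; Toy model only: a naive matcher for triple patterns with ?variables.
(defparameter *family*
  '((mary :has-father john)
    (jane :has-father john)
    (mary :first-name "Mary")
    (jane :first-name "Jane")
    (john :first-name "John")
    (john :last-name "Johnson")))

(defun variable-p (x)
  "True for symbols whose name starts with ?, such as ?CHILD or ?."
  (and (symbolp x)
       (plusp (length (symbol-name x)))
       (char= (char (symbol-name x) 0) #\?)))

(defun match-term (term value bindings)
  "Extend the alist BINDINGS so that TERM equals VALUE, or return :FAIL."
  (let ((bound (assoc term bindings)))
    (cond ((not (variable-p term))
           (if (equal term value) bindings :fail))
          (bound
           (if (equal (cdr bound) value) bindings :fail))
          (t (acons term value bindings)))))

(defun match-triple (pattern triple bindings)
  "Match the three terms of PATTERN against TRIPLE, threading BINDINGS."
  (loop for term in pattern
        for value in triple
        do (setf bindings (match-term term value bindings))
           (when (eq bindings :fail) (return :fail))
        finally (return bindings)))

(defun match-all (patterns store &optional (bindings '()))
  "Return a list of binding alists that satisfy every pattern in PATTERNS."
  (if (null patterns)
      (list bindings)
      (loop for triple in store
            for extended = (match-triple (first patterns) triple bindings)
            unless (eq extended :fail)
              append (match-all (rest patterns) store extended))))

(defun toy-query (patterns &optional (store *family*))
  "Return the value bound to the anonymous variable ? in every solution."
  (loop for solution in (match-all patterns store)
        collect (cdr (assoc '? solution))))

(toy-query '((?child :has-father ?john)
             (?child :first-name ?)
             (?john :first-name "John")
             (?john :last-name "Johnson")))
;; => ("Mary" "Jane")
```

Each pattern narrows the set of candidate bindings, which is why ?john acts as a temporary variable: it links the constraints together but is not part of the result.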

rdf-query can return more than one result not only for the subject position: a property can also have more than one value for the same subject. In that case, however, the values need to be added with the rdf-add-statement function:

		(rdf-add-statement *John* :has-child *Mary*)
		(rdf-add-statement *John* :has-child *Jane*)
	

Powerful query macro

But what if we'd like to query several variables at the same time, for example both the first name and the last name of each child? We can fetch the resources and then get the first and last names:

		(loop for child in (rdf-query (? :has-father ?john)
		                              (?john :first-name "John")
		                              (?john :last-name "Johnson"))
		      do (print (rdf-prop :first-name child))
		      do (print (rdf-prop :last-name child)))
	

But it is easier to do this with the rdf-do-query macro, which can bind multiple variables at once:

		(rdf-do-query
			((?child :has-father ?john)
			 (?john :first-name "John")
			 (?john :last-name "Johnson")
			 (?child :first-name ?c-fname)
			 (?child :last-name ?c-lname))

			 (print ?c-fname)
			 (print ?c-lname))
	

The rdf-do-query macro binds all variables found in the pattern and iterates over all results.
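
Since the bindings wrap SPARQL, a pattern like this presumably corresponds to a SPARQL SELECT over all variables in the pattern. The sketch below shows one way such a translation could look; it is not taken from the bindings' source. The function names are hypothetical, keyword properties are assumed to expand into the urn:x-ibash:prop: namespace, and hyphens in variable names are replaced with underscores because SPARQL variable names cannot contain hyphens.

```lisp
;; Hypothetical sketch: render Lisp triple patterns as SPARQL text.
(defparameter *toy-prop-namespace* "urn:x-ibash:prop:"
  "Namespace assumed for keyword properties in this sketch.")

(defun sparql-term (term)
  "Render one pattern term as SPARQL text."
  (cond ((keywordp term)                  ; :has-father -> <urn:...has-father>
         (format nil "<~A~A>" *toy-prop-namespace*
                 (string-downcase (symbol-name term))))
        ((symbolp term)                   ; ?c-fname -> ?c_fname
         (substitute #\_ #\- (string-downcase (symbol-name term))))
        ((stringp term)                   ; string literal, printed with quotes
         (format nil "~S" term))
        (t (format nil "~A" term))))      ; numbers and anything else

(defun pattern-variables (patterns)
  "Collect the distinct variables of PATTERNS in order of first appearance."
  (let ((vars '()))
    (dolist (triple patterns (nreverse vars))
      (dolist (term triple)
        (when (and (symbolp term) (not (keywordp term)))
          (pushnew term vars))))))

(defun patterns->sparql (patterns)
  "Build a SPARQL SELECT string from a list of (subject predicate object) patterns."
  (format nil "SELECT ~{~A~^ ~} WHERE { ~{~{~A~^ ~} . ~}}"
          (mapcar #'sparql-term (pattern-variables patterns))
          (mapcar (lambda (triple) (mapcar #'sparql-term triple))
                  patterns)))

(patterns->sparql '((?child :has-father ?john)
                    (?john :first-name "John")
                    (?child :first-name ?c-fname)))
```

For the call above, the resulting query selects ?child, ?john and ?c_fname, and its WHERE clause contains one triple pattern per Lisp pattern.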

Better property access

In the examples above we used properties written as Lisp keywords. By default they are prefixed with the default namespace, urn:x-ibash:prop: (you can override it via the *rdf-default-prop-namespace* special variable). But what if you'd like to use other properties? You can use the rdf-bind-property-to-symbol function to bind a property with an arbitrary namespace and name to a Lisp keyword:

		(rdf-bind-property-to-symbol :first-name 
			'("http://www.w3.org/2001/vcard-rdf/3.0#" "FN"))
	

This binds :first-name to the real vCard FN property (strictly speaking, FN is vCard's formatted full name rather than a first name).
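
The keyword-to-URI mapping described here can be modeled with a default namespace plus a table of explicit bindings. The following stand-alone sketch only illustrates that idea: the toy- names are hypothetical, and the lower-casing of keyword names is an assumption based on the urn:x-ibash:prop:has-father example.

```lisp
;; Toy model only: resolve property keywords to full URIs.
(defvar *toy-default-namespace* "urn:x-ibash:prop:"
  "Namespace used for keywords without an explicit binding.")

(defvar *toy-bound-properties* (make-hash-table :test #'eq)
  "Maps a keyword to a (namespace name) list.")

(defun toy-bind-property (keyword namespace name)
  "Bind KEYWORD to the property NAME in NAMESPACE."
  (setf (gethash keyword *toy-bound-properties*) (list namespace name)))

(defun toy-property-uri (keyword)
  "Return the full URI for KEYWORD, preferring an explicit binding."
  (destructuring-bind (namespace name)
      (or (gethash keyword *toy-bound-properties*)
          (list *toy-default-namespace*
                (string-downcase (symbol-name keyword))))
    (concatenate 'string namespace name)))

(toy-property-uri :has-father)
;; => "urn:x-ibash:prop:has-father"

(toy-bind-property :first-name "http://www.w3.org/2001/vcard-rdf/3.0#" "FN")
(toy-property-uri :first-name)
;; => "http://www.w3.org/2001/vcard-rdf/3.0#FN"
```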

You can also create accessor functions for a property with the def-rdf-property macro. For example, for the property :first-name, the functions rdfp:first-name and (setf rdfp:first-name) will be defined (rdfp stands for "RDF property"). Additionally, you can bind the property at the same time, as described above:

		(def-rdf-property :first-name "http://www.w3.org/2001/vcard-rdf/3.0#" "FN")

		(rdfp:first-name *Mary*) ; => "Mary"
		(setf (rdfp:first-name *Mary*) "Merry")
	

Types

So far we have used these RDF bindings as if they were a native Lisp facility, like CLOS. But they are actually just bindings to a Java library which, moreover, uses a somewhat different data model. The bindings try to do all conversions automatically, but sometimes you need to know about the underlying data conversions.

Functions like rdf-prop treat string values as string literals. If you have a resource denoted by a URL, for example, you need to say so explicitly by resolving it with the rdf-resource function: (rdf-resource "http://www.udo.com/idio"). Literals (strings and numbers) are boxed and unboxed automatically. Resources, however, must be not only boxed explicitly but also unboxed explicitly: a resource is returned as a Java object. You might want to extract its URI with the Resource.getURI method.

The system will also try to convert symbols to resources, but unfortunately for now this is usable only for properties; using symbols as subjects and values might be somewhat confusing.

Use in HTML templates

It's quite natural to use the RDF bindings together with the LML2 HTML generation facilities:

		(:p
		 (:table
		  (:tr (:th "username") (:th "department") (:th "salary"))
		  (rdf-do-query
		   ((?user :username ?uname)
		    (?user :salary ?salary)
		    (?user :department ?dept)
		    (?dept :name ?dept-name))
		   (html
		    (:tr
		     (:td (:princ ?uname))
		     (:td (:princ ?dept-name))
		     (:td (:princ ?salary)))))))
	
