KRSMS Semantic Annotation Language

This is version 92.

Intended audience

This document is intended for SEEK and Kepler developers. It is a DRAFT DESIGN DOCUMENT and does not reflect functionality as it currently exists in Kepler or SEEK. Comments and feedback are appreciated.

Introduction

This page describes an interchange syntax for expressing semantic types.

KR/SMS Semantic Types

A semantic type classifies and constrains the semantic, as opposed to structural, interpretation of a resource. Datasets, actors (also known as services), and actor input and output ports are examples of resources that may have semantic types within SEEK.

A semantic type is expressed as a set of semantic annotations. A semantic annotation assigns objects of a resource a "meaning" via ontology expressions (that is, using ontology terms), thus serving to "link" or "glue" a portion of a resource to a portion of an ontology. In this way, the semantic interpretation of a resource (its semantic type) is built from the annotations of its parts.

Semantic types can be expressed using the following XML representation:

<sms:SemanticType id="..." xmlns:sms="http://seek.ecoinformatics.org/sms">
 
   <sms:Label name="..." resource="..."/>

   ...

   <sms:Annotation object="..." meaning="..."/>

   ...

</sms:SemanticType>

A semantic type is required to have a unique identifier, which is given by the id attribute. The identifier should preferably be an LSID, with the semantic type managed as an LSID data object.

Labels

Labels within a semantic-type description provide a mechanism to identify and name the resources and ontology terms used in the corresponding annotations. A Label element assigns the value of its name attribute to the resource identified by its resource attribute. Each Label element is required to have exactly one name attribute and exactly one resource attribute. A SemanticType element must contain at least two Label elements: one identifying an actor or dataset and the other identifying an ontology term. Further, no two Label elements within a semantic type may have the same name attribute value.
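
The label constraints above can be checked mechanically. The following Python sketch is only an illustration and is not part of SEEK or Kepler: the function name check_labels is hypothetical, and it covers only the simple form in which every Label carries a resource attribute. It cannot verify that one label names an actor or dataset and another names an ontology term, since that depends on knowing what each resource identifier refers to.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the examples on this page.
SMS_NS = "http://seek.ecoinformatics.org/sms"


def check_labels(semantic_type_xml):
    """Return a list of problems found in the Label elements (empty if none).

    Hypothetical helper: checks the count, required attributes, and
    name uniqueness of Label elements in a SemanticType document.
    """
    root = ET.fromstring(semantic_type_xml)
    labels = root.findall(f"{{{SMS_NS}}}Label")
    problems = []
    if len(labels) < 2:
        problems.append("fewer than two Label elements")
    seen = set()
    for label in labels:
        name = label.get("name")
        resource = label.get("resource")
        if name is None or resource is None:
            problems.append("Label is missing a name or resource attribute")
            continue
        if name in seen:
            problems.append(f"duplicate label name: {name}")
        seen.add(name)
    return problems
```

An empty result means the three checked constraints hold; each string in a non-empty result describes one violation.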

The first label shown below associates a dataset with the name crops, and the second label associates an ontology concept with the name Biomass.

<sms:Label name="crops" resource="KBS019-003"/>

<sms:Label name="Biomass" resource="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#Biomass"/>

Annotations

An annotation asserts that an object of a resource has a particular meaning according to definitions within an ontology. The object and meaning attributes of an Annotation element relate the associated object and ontology expressions, respectively. We provide a uniform annotation language for identifying resource objects and specifying ontology expressions.

Some resources (in particular, data sets and actors with input/output ports) can have complex data structures. For example, a data set typically is structured according to a schema, which specifies among other things a relation name (that is, the name of the table) and names for each attribute of the relation. Actor ports can also have complex structure, including arbitrarily nested relations. The annotation language facilitates the selection of the various (sub-) objects of structured resources.

The annotation language has two forms: a simple "shorthand" syntax, and a more complex, full syntax. For resources and their objects, the simple syntax permits the following expressions given a resource label T and attributes A1 to An:

T

T.A1

T.A1.A2. ... .An

The expression T selects corresponding objects of the resource. For example, if the resource is a data set, T selects the tuple objects of the resource. If the resource is an actor, T selects instances of the actor. Similarly, the expression T.A1 selects the nested A1 objects of T. For T representing a data set, this expression would select the values of attribute A1. The last expression selects nested attributes for complex structures that can occur in actor input/output ports. For example, if T represented an input port to some actor, the expression T.A1.A2 selects the A2 objects nested within A1 objects of T instances.
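
As a rough illustration of how the shorthand object expressions decompose, the Python sketch below splits an expression into its resource label and its path of nested attributes. The function name parse_object_expression is hypothetical, and the sketch assumes that labels and attribute names never contain a dot; this page does not define a formal grammar for the shorthand.

```python
def parse_object_expression(expression):
    """Split 'T.A1. ... .An' into (label, [A1, ..., An]).

    Hypothetical helper: assumes labels and attribute names contain no dots.
    """
    parts = expression.split(".")
    # An empty component means a leading, trailing, or doubled dot,
    # or an empty expression.
    if any(part == "" for part in parts):
        raise ValueError(f"malformed object expression: {expression!r}")
    return parts[0], parts[1:]
```

For the expression T the attribute path is empty, which corresponds to selecting the objects of the resource itself (tuples of a data set, instances of an actor).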

In the simple version of the annotation syntax, ontology expressions consist of single concept labels C defined as resources.

To illustrate, consider the following semantic-type definition for a data-set resource.

<sms:SemanticType id="mySemType" xmlns:sms="http://seek.ecoinformatics.org/sms" xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#">

  <sms:Label name="Crops"         resource="KBS019-003"/>
  <sms:Label name="Measurement"   resource="ont:Measurement"/>
  <sms:Label name="Biomass"       resource="ont:Biomass"/>
  <sms:Label name="Species"       resource="ont:Species"/>
  <sms:Label name="Year"          resource="ont:Year"/>
  <sms:Label name="Location"      resource="ont:Location"/>

  <sms:Annotation object="Crops"         meaning="Measurement"/>
  <sms:Annotation object="Crops.bm"      meaning="Biomass"/>
  <sms:Annotation object="Crops.spp"     meaning="Species"/>
  <sms:Annotation object="Crops.yr"      meaning="Year"/>
  <sms:Annotation object="Crops.station" meaning="Location"/>

</sms:SemanticType>

In this simple example, we (1) associate the label Crops with the data set resource, (2) associate the remaining labels with corresponding ontology concepts (note that we simplify their definitions via an XML namespace), (3) state with the first annotation that each Crops tuple is a Measurement, (4) state with the second annotation that each bm attribute value is a Biomass, (5) state with the third annotation that each spp attribute value is a Species, and so on.
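
The reading of this example can be sketched in Python as follows: each Annotation is resolved through the Label table into a (resource, attribute path, concept) triple. This is an illustrative sketch only, not existing SEEK or Kepler functionality. The function name resolve_annotations is hypothetical, and prefixed values such as ont:Biomass are returned as written rather than expanded against the xmlns:ont declaration.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the examples on this page.
SMS_NS = "http://seek.ecoinformatics.org/sms"


def resolve_annotations(semantic_type_xml):
    """Return (resource, attribute_path, concept) triples, one per Annotation.

    Hypothetical helper: handles only the simple syntax, where object is
    'T' or 'T.A1. ... .An' and meaning is a single concept label.
    """
    root = ET.fromstring(semantic_type_xml)
    # Map each label name to the resource it identifies.
    resources = {
        label.get("name"): label.get("resource")
        for label in root.findall(f"{{{SMS_NS}}}Label")
    }
    triples = []
    for annotation in root.findall(f"{{{SMS_NS}}}Annotation"):
        # First component is the resource label; the rest is the attribute path.
        label, *path = annotation.get("object").split(".")
        concept = resources[annotation.get("meaning")]
        triples.append((resources[label], path, concept))
    return triples
```

Applied to the semantic type above, the second annotation would resolve to the data set KBS019-003, the attribute path bm, and the concept ont:Biomass. A label that is used in an annotation but never declared raises a KeyError in this sketch.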

For convenience, we also permit ontology concept expressions to be "embedded" within labels. The purpose of this feature is to allow one to specialize certain concepts to annotate objects more accurately, without having to create a new ontology or edit an existing one. These concept definitions are expressed using OWL[1].

To illustrate, the previous semantic type is shown below with an embedded concept. (Note that, to keep the definition simple, it violates the namespace usage in OWL.)

<sms:SemanticType id="mySemType" xmlns:sms="http://seek.ecoinformatics.org/sms" 
                                 xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#"
                                 xmlns:owl="http://www.w3.org/2002/07/owl#"
                                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                                 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

   <sms:Label name="Crops" resource="KBS019-003"/>

   <sms:Label name="MyMeasurement">
      <sms:Resource sms:resourceType="OWL">
         <owl:equivalentClass>
            <owl:intersectionOf rdf:parseType="Collection">
              <owl:Class rdf:resource="ont:Measurement"/> 
              <owl:Class rdf:resource="ont:SubjectiveObservation"/>
            </owl:intersectionOf>
         </owl:equivalentClass>
      </sms:Resource>
   </sms:Label>

   ...

   <sms:Annotation object="Crops" meaning="MyMeasurement"/>

   ...

</sms:SemanticType>

Examples

[#1] Perhaps we can also offer Sparrow syntax.


This particular version was published on 03-Mar-2005 19:40:44 PST by SDSC.bowers.