This is version 134. It is not the current version.
Intended audience
This document is intended for SEEK and Kepler developers. It is a DRAFT DESIGN DOCUMENT and does not reflect functionality as it currently exists in Kepler or SEEK. Comments and feedback are appreciated.
Introduction
This page describes an interchange syntax that can be used to express semantic types.
KR/SMS Semantic Types
A semantic type classifies and constrains the semantic, as opposed to structural, interpretation of a resource. Datasets, actors (also known as services), and actor input and output ports are examples of resources that may have semantic types within SEEK. A semantic type is expressed as a set of semantic annotations. The purpose of a semantic annotation is to assign a "meaning" to objects of a resource using ontology terms. A semantic annotation serves to "link" a portion of a resource to a portion of an ontology. In this way, the semantic interpretation of a resource (its semantic type) is built from the annotation of its parts. Semantic types can be expressed using the following XML representation.
<sms:SemanticType id="..." xmlns:sms="http://seek.ecoinformatics.org/sms">
   <sms:Label name="..." resource="..."/>
   ...
   <sms:Annotation object="..." meaning="..."/>
   ...
   <sms:Definitions>
      ...
   </sms:Definitions>
</sms:SemanticType>

Each semantic type can be uniquely identified through the id attribute of the SemanticType element. The identifier is preferably expressed as a Life Science Identifier (LSID), in which case the semantic type is managed as an LSID data object. Alternatively, if a semantic type is embedded within a document (for example, within EML), the semantic-type id can be expressed as a fragment identifier. As shown above, a semantic type consists of a set of labels and annotations as well as an optional set of definitions. The rest of this page describes these components.
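As a rough illustration of how a tool might read this representation, the following Python sketch loads one SemanticType element with the standard-library ElementTree parser and collects its id, labels, and annotations. The function name read_semantic_type and the SAMPLE document are assumptions made for this example; they are not part of SEEK or Kepler.

```python
import xml.etree.ElementTree as ET

SMS_NS = "http://seek.ecoinformatics.org/sms"

# Hypothetical sample that follows the template above.
SAMPLE = """<sms:SemanticType id="mySemType"
    xmlns:sms="http://seek.ecoinformatics.org/sms">
  <sms:Label name="crops" resource="KBS019-003"/>
  <sms:Label name="Biomass"
      resource="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#Biomass"/>
  <sms:Annotation object="crops.bm" meaning="Biomass"/>
</sms:SemanticType>"""


def read_semantic_type(xml_text):
    """Return (id, labels, annotations) for one sms:SemanticType element.

    labels maps each label name to its resource string; annotations is a
    list of (object, meaning) pairs in document order.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}SemanticType" % SMS_NS:
        raise ValueError("root element is not sms:SemanticType")
    ns = {"sms": SMS_NS}
    labels = {
        el.get("name"): el.get("resource")
        for el in root.findall("sms:Label", ns)
    }
    annotations = [
        (el.get("object"), el.get("meaning"))
        for el in root.findall("sms:Annotation", ns)
    ]
    return root.get("id"), labels, annotations


sem_id, labels, annotations = read_semantic_type(SAMPLE)
print(sem_id, labels, annotations)
```

The sketch ignores the optional Definitions element and treats the id as an opaque string, so it makes no distinction between an LSID and a fragment identifier.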
Semantic-Type Labels
Labels within a semantic-type description provide a mechanism to identify and name the resources and ontology terms used in the corresponding annotations. In a Label element, the value of the name attribute is assigned to the resource identified by the value of the resource attribute. Each Label element is required to have exactly one name attribute and exactly one resource attribute. A SemanticType element must contain at least two Label elements: one identifying an actor or dataset and the other identifying an ontology term. Further, no two Label elements within a semantic type may have the same name attribute value. The first label shown below associates a dataset with the name crops, and the second label associates an ontology concept with the name Biomass.
<sms:Label name="crops" resource="KBS019-003"/>
<sms:Label name="Biomass" resource="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#Biomass"/>
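The label constraints stated above (a name and a resource on every label, at least two labels, no repeated names) can be checked mechanically. The sketch below is one possible way to do so; the function check_labels and its message strings are hypothetical, and it only covers labels written in the attribute form shown here.

```python
def check_labels(labels):
    """Check a list of (name, resource) pairs against the label rules.

    Returns a list of human-readable problems; an empty list means the
    labels satisfy the constraints described in the text.
    """
    problems = []
    seen = set()
    for name, resource in labels:
        if not name or not resource:
            problems.append("label %r lacks a name or resource" % (name,))
        if name in seen:
            problems.append("duplicate label name %r" % (name,))
        seen.add(name)
    if len(labels) < 2:
        problems.append("fewer than two labels")
    return problems


ok = [
    ("crops", "KBS019-003"),
    ("Biomass", "http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#Biomass"),
]
print(check_labels(ok))
print(check_labels([("crops", "KBS019-003"), ("crops", "KBS019-004")]))
```

Whether one of the labels actually names an ontology term, as the text requires, cannot be decided from the strings alone, so the sketch does not attempt that check.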
Semantic Annotations
An annotation asserts that an object of a resource has a particular meaning according to definitions within an ontology. The object and meaning attributes of an Annotation element relate the object and ontology expressions, respectively. We provide a uniform annotation language for identifying resource objects and specifying ontology expressions.
Some resources (in particular, data sets and actors with input/output ports) can have complex data structures. For example, a data set typically is structured according to a schema, which specifies among other things a relation name (that is, the name of the table) and names for each attribute of the relation and their data types. Actor ports can also have complex structure, including arbitrarily nested relations. The annotation language facilitates the selection of the various (sub-) objects of structured resources. The entire resource itself can also be selected using the annotation language. The annotation language has two forms: an abbreviated syntax, and a more complex, full syntax.
The Abbreviated Annotation-Language Syntax
For expressing annotation objects, the abbreviated syntax permits the following atoms, given a resource label T and attributes A1 to An.
T
T.A1
T.A1.A2. ... .An
The atom T selects corresponding objects of the resource. For example, if the resource is a data set, T selects the tuple objects of the resource. If the resource is an actor, T selects instances of the actor. The expression T.A1 selects the nested A1 objects for objects of T. For T representing a data set, T.A1 selects the values of attribute A1 for tuples of T. The last atom selects nested attributes for complex structures occurring, for example, in actor input/output ports. For instance, if T represents an input port to some actor[2], T.A1.A2 selects the A2 objects nested within A1 objects for T objects. Atoms can be combined to form expressions. In particular, an expression is composed of: (a) a single atom or (b) a comma-separated list of atoms of the form T.A1 or T.A1.A2. ... .An. In the abbreviated syntax, ontology expressions can only consist of a single concept label C. To illustrate, consider the following semantic-type definition for a data-set resource.
<sms:SemanticType id="mySemType"
      xmlns:sms="http://seek.ecoinformatics.org/sms"
      xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#">
   <sms:Label name="crops" resource="KBS019-003"/>
   <sms:Label name="Measurement" resource="ont:Measurement"/>
   <sms:Label name="Biomass" resource="ont:Biomass"/>
   <sms:Label name="Species" resource="ont:Species"/>
   <sms:Label name="Year" resource="ont:Year"/>
   <sms:Label name="Location" resource="ont:Location"/>
   <sms:Annotation object="crops" meaning="Measurement"/>
   <sms:Annotation object="crops.bm" meaning="Biomass"/>
   <sms:Annotation object="crops.spp" meaning="Species"/>
   <sms:Annotation object="crops.yr" meaning="Year"/>
   <sms:Annotation object="crops.station" meaning="Location"/>
</sms:SemanticType>

In this simple example, we (1) associate the label crops with the data-set resource identified as KBS019-003, (2) associate the remaining labels with corresponding ontology concepts (simplifying their identifiers using XML namespaces), (3) state with the first annotation that each crops tuple is a Measurement instance, (4) state with the second annotation that each bm attribute value is a Biomass instance, (5) state with the third annotation that each spp attribute value is a Species instance, and so on.
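To make the reading of abbreviated annotations concrete, the sketch below splits each object expression into a resource label and an attribute path, then replaces the labels with the resources they name. The helper names split_object and resolve are assumptions for this example, and the label table reproduces only part of the crops example.

```python
def split_object(atom):
    """Split an abbreviated atom such as 'crops.bm' into (label, path).

    'T' yields ('T', []) and 'T.A1.A2' yields ('T', ['A1', 'A2']).
    """
    label, *path = atom.strip().split(".")
    return label, path


def resolve(labels, annotations):
    """Expand (object, meaning) pairs into (resource, path, concept) triples.

    labels maps label names to resources; an object expression may be a
    comma-separated list of atoms, each of which yields one triple.
    """
    triples = []
    for obj, meaning in annotations:
        for atom in obj.split(","):
            label, path = split_object(atom)
            triples.append((labels[label], path, labels[meaning]))
    return triples


labels = {
    "crops": "KBS019-003",
    "Measurement": "ont:Measurement",
    "Biomass": "ont:Biomass",
}
annotations = [("crops", "Measurement"), ("crops.bm", "Biomass")]
for triple in resolve(labels, annotations):
    print(triple)
```

An empty path corresponds to the atom T (the tuples of the data set), while a one-element path such as ['bm'] corresponds to T.A1 (the values of that attribute).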
Semantic-Type Definitions
For convenience, we permit ontology concept expressions to be included in a semantic type. The purpose of this feature is to allow one to specialize certain concepts to annotate objects more accurately, without having to create a new ontology or edit an existing one. These concept definitions are expressed using OWL[1]. To illustrate, the previous semantic type is shown below with an embedded concept. (Note that, to simplify the definition below, we take some liberties with the use of namespaces in OWL.) This embedded concept definition states that MyMeasurement is both a Measurement and a SubjectiveObservation.
<sms:SemanticType id="mySemType"
      xmlns:sms="http://seek.ecoinformatics.org/sms"
      xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
   <sms:Label name="Crops" resource="KBS019-003"/>
   <sms:Label name="MyMeasurement">
      <sms:Resource sms:resourceType="OWL">
         <owl:equivalentClass>
            <owl:intersectionOf rdf:parseType="Collection">
               <owl:Class rdf:resource="ont:Measurement"/>
               <owl:Class rdf:resource="ont:SubjectiveObservation"/>
            </owl:intersectionOf>
         </owl:equivalentClass>
      </sms:Resource>
   </sms:Label>
   ...
   <sms:Annotation object="Crops" meaning="MyMeasurement"/>
   ...
</sms:SemanticType>
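A consumer of such a semantic type would need to find the embedded definitions before interpreting the annotations that use them. The following sketch, which assumes the simplified OWL layout of the example above, lists the classes intersected under each label that carries an embedded Resource element. The function name embedded_concepts and the SAMPLE string (the example with its elided parts left out) are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "sms": "http://seek.ecoinformatics.org/sms",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

# The example above, without the elided ("...") parts.
SAMPLE = """<sms:SemanticType id="mySemType"
    xmlns:sms="http://seek.ecoinformatics.org/sms"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <sms:Label name="Crops" resource="KBS019-003"/>
  <sms:Label name="MyMeasurement">
    <sms:Resource sms:resourceType="OWL">
      <owl:equivalentClass>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:resource="ont:Measurement"/>
          <owl:Class rdf:resource="ont:SubjectiveObservation"/>
        </owl:intersectionOf>
      </owl:equivalentClass>
    </sms:Resource>
  </sms:Label>
  <sms:Annotation object="Crops" meaning="MyMeasurement"/>
</sms:SemanticType>"""


def embedded_concepts(xml_text):
    """Map each label with an embedded definition to its intersected classes."""
    root = ET.fromstring(xml_text)
    rdf_resource = "{%s}resource" % NS["rdf"]
    result = {}
    for label in root.findall("sms:Label", NS):
        # Look for owl:Class elements directly under an owl:intersectionOf.
        classes = label.findall(".//owl:intersectionOf/owl:Class", NS)
        if classes:
            result[label.get("name")] = [c.get(rdf_resource) for c in classes]
    return result


print(embedded_concepts(SAMPLE))
```

The sketch reads the rdf:resource values as plain strings and does not expand the ont: prefix, in keeping with the liberties the example itself takes with namespaces.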
Full Annotation-Language Syntax
The full annotation-language syntax provides access to various parts of a complex structure and the ability to assign those parts to ontology expressions. To support a wide variety of complex structures (the primary ones being relational, XML, and Ptolemy types), we consider a generic data model consisting of nested-relational-style constructs. In addition, we permit multi-valued attributes, in which an attribute can have an associated collection of values. The abbreviated annotation-language syntax is shorthand for a subset of the full syntax. In the full syntax, resource expressions consist of lists of atoms (separated by commas) taking one of the following forms.
x:T
x[A1=y]

Here, symbols x and y denote either constants or variables. Variables are prefixed with a $ sign. Constants that contain spaces must be delimited using single quotes. For x and y constants, the atom x:T is true if x is a T object, and the atom x[A1=y] is true if x is an object that has y as one of its A1 attribute values. Atoms can be composed to form more complex expressions as follows. Atoms x:T and x[A1=y] can be composed to form the expression x:T[A1=y]. Atoms x[A1=y] and x[A2=z] can be composed to form the expression x[A1=y, A2=z]. In a similar way, atoms and expressions (or multiple expressions) can be composed to form additional expressions. The same syntax is used to describe the meaning of an annotation. In particular, T must be a concept label, and A1 a property label. The meaning of a full annotation can be interpreted as follows. Define var function. Then we have forall var(...) ... -> exists var(...) ... For instance, ... give the one from above and notice the difference in clarity.
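The composition rules for atoms can be sketched in code. The example below models the two atom forms as small Python classes and merges atoms that share a subject into a single expression string, quoting constants that contain spaces. The class and function names are assumptions made for illustration; the rule that a subject carries at most one type atom is also an assumption of this sketch, since the text does not address that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeAtom:
    """The atom x:T."""
    subject: str
    type_label: str


@dataclass(frozen=True)
class AttrAtom:
    """The atom x[A=y]."""
    subject: str
    attribute: str
    value: str


def quote(term):
    """Single-quote a constant containing spaces; leave $variables alone."""
    if " " in term and not term.startswith("$"):
        return "'" + term + "'"
    return term


def compose(atoms):
    """Merge atoms with the same subject, e.g. x:T and x[A1=y] into x:T[A1=y]."""
    types, attrs, order = {}, {}, []
    for atom in atoms:
        if atom.subject not in order:
            order.append(atom.subject)
        if isinstance(atom, TypeAtom):
            # Assumption of this sketch: one type label per subject.
            if types.setdefault(atom.subject, atom.type_label) != atom.type_label:
                raise ValueError("conflicting types for " + atom.subject)
        else:
            pair = "%s=%s" % (atom.attribute, quote(atom.value))
            attrs.setdefault(atom.subject, []).append(pair)
    parts = []
    for subject in order:
        text = quote(subject)
        if subject in types:
            text += ":" + types[subject]
        if subject in attrs:
            text += "[" + ", ".join(attrs[subject]) + "]"
        parts.append(text)
    return ", ".join(parts)


print(compose([TypeAtom("$x", "T"), AttrAtom("$x", "A1", "$y")]))
print(compose([AttrAtom("x", "A1", "y"), AttrAtom("x", "A2", "z")]))
```

Expressions for different subjects are joined with commas, matching the comma-separated lists of atoms described above.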
[#1] Perhaps originally converted from a Sparrow expression.
[#2] We note that actor ports may not always be represented as an identifiable resource, and instead may be modeled as components of an actor. For example, consider an actor A having two ports P1 and P2. For the case where P1 and P2 are not separate resources, we can define the structural type of A as having two attributes P1 and P2 where A.P1 denotes port P1 and A.P2 denotes port P2.
This material is based upon work supported by the National Science Foundation under award 0225676. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF). Copyright 2004 Partnership for Biodiversity Informatics, University of New Mexico, The Regents of the University of California, and University of Kansas