Science Environment for Ecological Knowledge

KRSMS Semantic Annotation Language

Difference between version 138 and version 137:

At line 1 added 1 line.
+
Lines 4-5 were replaced by lines 5-8
- It is a __DRAFT DESIGN DOCUMENT__ and does not reflect functionality as it
- currently exists in Kepler or SEEK. Comments and feedback are appreciated.
+ It is a __DRAFT DESIGN DOCUMENT__ and does not reflect functionality
+ as it currently exists in Kepler or SEEK. Comments and feedback are
+ appreciated (see [Comments Page|KRSMSSemanticAnnotationComments]).
+
Line 9 was replaced by lines 12-13
- This page describes an interchange syntax that can be used to express semantic types.
+ This document describes an interchange syntax that can be used to express
+ semantic types.
Line 14 was replaced by lines 18-28
- A __semantic type__ classifies and constrains the semantic, as opposed to structural interpretation of a __resource__. Datasets, actors (also known as services), and actor input and output ports are examples of resources that may have semantic types within SEEK.
+ A __semantic type__ classifies and constrains the semantic, as opposed
+ to structural, interpretation of a __resource__. Datasets, actors
+ (services), and actor input and output ports are examples of resources
+ that may have semantic types within SEEK.
+
+ A semantic type is expressed as a set of __semantic annotations__. The
+ purpose of a semantic annotation is to assign objects of a resource a
+ "meaning" using ontology terms. A semantic annotation serves to "link"
+ a portion of a resource to a portion of an ontology. In this way, the
+ semantic interpretation of a resource (its semantic type) is built
+ from the annotation of its parts.
Line 16 was replaced by lines 30-31
- A semantic type is expressed as a set of __semantic annotations__. The purpose of a semantic annotation is to assign objects of a resource a "meaning" using ontology terms. A semantic annotation serves to "link" a portion of a resource to a portion of an ontology. In this way, the semantic interpretation of a resource (its semantic type) is built from the annotation of its parts.
+ Semantic types can be expressed using the following XML
+ representation (see below [3] for the DTD).
Removed line 18
- Semantic types can be expressed using the following XML representation.
Line 31 was replaced by line 45
- <sms:Definitions> ... </sms:Definitions>
+ <sms:OntologyDefinitions> ... </sms:OntologyDefinitions>
Lines 36-39 were replaced by lines 50-61
- Semantic types can be uniquely identified. The unique identifier of a semantic type can be stated using the {{id}} attribute of the {{SemanticType}} element. An identifier is (preferably) expressed as a Life-Science Identifier (LSID) in which the semantic type is managed as an LSID data object. Alternatively, if a semantic type is embedded within a document, the semantic-type id can be expressed as a fragment identifier (for example, when used within EML).
-
- As shown above, a semantic type consists of a set of labels and annotations as well as an optional set of definitions. The rest of this page describes these components.
-
+ To be used within the SEEK architecture, semantic types must be
+ uniquely identified. The unique identifier of a semantic type can be
+ stated using the {{id}} attribute of the {{SemanticType}} element. An
+ identifier is (preferably) expressed as an LSID in which the semantic
+ type is managed as an LSID data object. Alternatively, if a semantic
+ type is embedded within a document, the semantic-type id can be
+ expressed as a fragment identifier (for example, when used within
+ EML).
+
+ As shown above, a semantic type consists of a set of labels, a set of
+ annotations, and an optional ontology definition section. The rest of
+ this page describes these components.
At line 42 added 16 lines.
+ !!! Semantic-Type Labels
+
+ Labels within a semantic-type description provide a mechanism to
+ identify and name the resources and ontology terms used in the
+ corresponding annotations. In a {{Label}} element, the value of the
+ {{name}} attribute is assigned to the resource identified by the
+ {{resource}} attribute. Each {{Label}} element is required to have
+ exactly one {{name}} and one {{resource}} attribute. A {{SemanticType}} element must
+ contain at least two {{Label}} elements: one identifying the resource
+ to be annotated and the other identifying an ontology
+ concept. Further, no two {{Label}} elements within a semantic type may
+ have the same {{name}} attribute.
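
The label constraints above can be checked mechanically. The following is a minimal Python sketch, not part of KRSMS or Kepler: the function name {{check_labels}} is hypothetical, and the namespace URI is the one used in the examples on this page. It checks only the syntactic rules (at least two labels, both attributes present, unique names); whether a label refers to a resource or to an ontology concept cannot be decided from the XML alone.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the examples on this page.
SMS_NS = "http://seek.ecoinformatics.org/sms"


def check_labels(semantic_type_xml: str) -> list[str]:
    """Return a list of violations of the Label rules (empty if none)."""
    root = ET.fromstring(semantic_type_xml.strip())
    labels = root.findall(f"{{{SMS_NS}}}Label")
    problems = []
    if len(labels) < 2:
        problems.append("a SemanticType needs at least two Label elements")
    seen = set()
    for label in labels:
        name = label.get("name")
        resource = label.get("resource")
        if name is None or resource is None:
            problems.append("every Label needs a name and a resource attribute")
            continue
        if name in seen:
            problems.append(f"duplicate label name: {name}")
        seen.add(name)
    return problems


EXAMPLE = """
<sms:SemanticType id="mySemType"
    xmlns:sms="http://seek.ecoinformatics.org/sms"
    xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#">
  <sms:Label name="crops" resource="KBS019-003"/>
  <sms:Label name="Biomass" resource="ont:Biomass"/>
</sms:SemanticType>
"""

print(check_labels(EXAMPLE))
```

For the two-label example the function returns an empty list; renaming the second label to {{crops}} would yield a duplicate-name violation.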
+
+ The first label shown below associates a dataset to the name {{crops}}
+ and the second label associates an ontology concept to the name
+ {{Biomass}}.
Removed lines 44-49
- !! Semantic-Type Labels
-
- Labels within a semantic-type description provide a mechanism to identify and name the resources and ontology terms used in the corresponding annotations. In a {{Label}} element, the value of the {{name}} attribute is assigned to the associated resource identified by the value of the {{resource}} attribute. Each {{Label}} element is required to have exactly one {{name}} and {{resource}} attribute. A {{SemanticType}} element must contain at least two {{Label}} elements: one identifying an actor or dataset and the other identifying an ontology term. Further, no two {{Label}} elements within a semantic type may have the same {{name}} attribute value.
-
- The first label shown below associates a dataset to the name {{crops}} and the second label associates an ontology concept to the name {{Biomass}}.
-
Lines 59-61 were replaced by line 91
- !! Semantic Annotations
-
- An annotation asserts that an object of a resource has a particular meaning according to definitions within an ontology. The {{object}} and {{meaning}} attributes of an {{Annotation}} element relate the object and ontology expressions, respectively. We provide a uniform __annotation language__ for identifying resource objects and specifying ontology expressions.
+ !!! Semantic Annotations
At line 62 added 25 lines.
+ An annotation asserts that an object of a resource has a particular
+ meaning according to definitions within an ontology. The {{object}}
+ and {{meaning}} attributes of an {{Annotation}} element relate the
+ object and ontology expressions, respectively. We provide a uniform
+ __annotation language__ for identifying resource objects and
+ specifying ontology expressions.
+
+ Some resources (in particular, data sets and actors with input/output
+ ports) can have complex data structures. For example, a data set
+ typically is structured according to a schema, which specifies among
+ other things a relation name (that is, the name of the table) and
+ names for each attribute of the relation and their data types. Actor
+ ports can also have complex structure, including arbitrarily nested
+ relations. The annotation language facilitates the selection of the
+ various (sub-) objects of structured resources. The entire resource
+ itself can also be selected using the annotation language.
+
+ The annotation language has two forms: an abbreviated syntax, and a
+ more complex, full syntax.
+
+ !! The Abbreviated Annotation-Language Syntax
+
+ For expressing annotation objects, the abbreviated syntax permits the
+ following atoms given a resource label {{T}} and attributes {{A1}} to
+ {{An}}.
Removed lines 64-71
- Some resources (in particular, data sets and actors with input/output ports) can have complex data structures. For example, a data set typically is structured according to a schema, which specifies among other things a relation name (that is, the name of the table) and names for each attribute of the relation and their data types. Actor ports can also have complex structure, including arbitrarily nested relations. The annotation language facilitates the selection of the various (sub-) objects of structured resources. The entire resource itself can also be selected using the annotation language.
-
- The annotation language has two forms: an abbreviated syntax, and a more complex, full syntax.
-
- ! The Abbreviated Annotation-Language Syntax
-
- For expressing annotation objects, the abbreviated syntax permits the following atoms given a resource label {{T}} and attributes {{A1}} to {{An}}.
-
Line 82 was replaced by lines 129-143
- The atom {{T}} selects corresponding objects of the resource. For example, if the resource is a data set, {{T}} selects the tuple objects of the resource. If the resource is an actor, {{T}} selects instances of the actor. The expression {{T.A1}} selects the nested {{A1}} objects for objects of {{T}}. For {{T}} representing a data set, {{T.A1}} selects the values of attribute {{A1}} for tuples of {{T}}. The last atom selects nested attributes for complex structures occurring, for example, in actor input/output ports. For instance, if {{T}} represents an input port to some actor[1], {{T.A1.A2}} selects the {{A2}} objects nested within {{A1}} objects for {{T}} objects.
+ The first atom {{T}} selects corresponding objects of the
+ resource. For example, if the resource is a data set, {{T}} selects
+ the tuple objects of the resource. If the resource is an actor, {{T}}
+ selects instances of the actor. The second atom {{T.A1}} selects
+ {{A1}} objects contained within {{T}} objects. For {{T}} representing
+ a data set, {{T.A1}} selects the values of attribute {{A1}} for tuples
+ of {{T}}. The last atom selects nested attributes for complex
+ structures occurring, for example, in actor input/output ports. For
+ instance, if {{T}} represents an input port to some actor[1],
+ {{T.A1.A2}} selects the {{A2}} objects nested within {{A1}} objects
+ contained in {{T}} objects.
+
+ Atoms can be combined to form expressions. In particular, an
+ expression is either: (a) a single atom or (b) a comma-separated
+ list of atoms of the form {{T.A1}} or {{T.A1.A2. ... An}}.
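
As an illustration of the expression rules above, here is a small Python sketch that splits an abbreviated object expression into atoms and rejects lists that contain a bare resource label. The function name {{parse_abbreviated_object}} and the tuple representation are assumptions made for this example; they are not part of the interchange syntax.

```python
def parse_abbreviated_object(expr: str) -> list[tuple[str, ...]]:
    """Split an abbreviated object expression into atoms.

    Each atom is returned as a tuple (T, A1, ..., An); a bare label T
    becomes a one-element tuple.
    """
    atoms = [
        tuple(piece.strip() for piece in atom.split("."))
        for atom in expr.split(",")
    ]
    for atom in atoms:
        if any(not piece for piece in atom):
            raise ValueError(f"empty name in atom {'.'.join(atom)!r}")
    # Rule (b): a comma-separated list may only contain T.A1 or
    # T.A1.A2. ... .An atoms, so a bare T is only valid on its own.
    if len(atoms) > 1 and any(len(atom) < 2 for atom in atoms):
        raise ValueError("a bare resource label may only appear on its own")
    return atoms


print(parse_abbreviated_object("crops"))
print(parse_abbreviated_object("crops.bm, crops.spp"))
```

The first call yields a single one-element atom; the second yields the two atoms {{crops.bm}} and {{crops.spp}}.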
Line 84 was replaced by lines 145-146
- Atoms can be combined to form expressions. In particular, an expression is composed of: (a) a single atom or (b) a comma-separated list of atoms of the form {{T.A1}} or {{T.A1.A2. ... An}}.
+ In the abbreviated syntax, ontology expressions only consist of a
+ single concept label {{C}}.
Line 86 was replaced by lines 148-149
- In the abbreviated syntax, ontology expressions can only consist of a single concept label {{C}}.
+ To illustrate, consider the following semantic-type description for
+ the {{crops}} data-set resource.
Removed line 88
- To illustrate, consider the following semantic-type definition for a data-set resource.
Removed line 109
- In this simple example, we (1) associate the label {{crops}} to the data-set resource identified as {{KBS019-003}}, (2) associate the remaining labels to corresponding ontology concepts (simplifying their identifiers using XML namespaces), (3) state with the first annotation that each {{crops}} tuple is a {{Measurement}} instance, (4) state with the second annotation that each {{bm}} attribute value is a {{Biomass}} instance, (5) state with the third annotation that each {{spp}} attribute value is a {{Species}} instance, and so on.
At line 110 added 8 lines.
+ In this simple example, we (1) associate the label {{crops}} to the
+ data-set resource identified as {{KBS019-003}}, (2) associate the
+ remaining labels to corresponding ontology concepts (simplifying their
+ identifiers using XML namespaces), (3) state with the first annotation
+ that each {{crops}} tuple is a {{Measurement}} instance, (4) state
+ with the second annotation that each {{bm}} attribute value is a
+ {{Biomass}} instance, (5) state with the third annotation that each
+ {{spp}} attribute value is a {{Species}} instance, and so on.
Line 115 was replaced by lines 184-197
- ! Semantic-Type Definitions
+ !! Semantic-Type Ontology Definitions
+
+ For convenience, we permit ontology concept definitions to be directly
+ included within a semantic type using the {{OntologyDefinitions}}
+ element. The purpose of this feature is to allow specialized concept
+ definitions to more accurately annotate objects, without having to go
+ through the process of creating a new ontology, or editing an existing
+ one. These concept definitions are expressed using OWL[2].
+
+ To illustrate, part of the previous semantic type is shown below with an
+ embedded concept. (Note that to simplify the definition below we take
+ liberty with the use of namespaces in OWL). This embedded concept
+ definition states that {{MyMeasurement}} is both a {{Measurement}} and
+ a {{SubjectiveObservation}}.
Removed line 117
- For convenience, we permit ontology concept definitions to be directly included within a semantic type using the {{Definitions}} element. The purpose of this feature is to allow one to specialize certain concepts to more accurately annotate objects, without having to go through the process of creating a new ontology, or editing an existing one. These concept definitions are expressed using OWL[2].
Removed lines 119-121
- To illustrate, the previous semantic type is shown below with an embedded concept. (Note that to simplify the definition below we take liberty with the use of namespaces in OWL). This embedded concept definition states that {{MyMeasurement}} is both a {{Measurement}} and a {{SubjectiveObservation}}.
-
-
Line 129 was replaced by lines 207-211
- <sms:Label name="Crops" resource="KBS019-003"/>
+ <sms:Label name="crops" resource="KBS019-003"/>
+
+ <sms:Label name="Measurement" resource="MyMeasurement"/>
+
+ <sms:Annotation object="crops" meaning="MyMeasurement"/>
Lines 131-132 were replaced by lines 213-215
- <sms:Label name="MyMeasurement">
- <sms:Resource sms:resourceType="OWL">
+ <sms:OntologyDefinitions>
+
+ <owl:Class rdf:ID="MyMeasurement">
Lines 135-136 were replaced by lines 218-219
- <owl:Class rdf:resource="ont:Measurement"/>
- <owl:Class rdf:resource="ont:SubjectiveObservation"/>
+ <owl:Class rdf:resource="ont:Measurement"/>
+ <owl:Class rdf:resource="ont:SubjectiveObservation"/>
Removed lines 140-144
- </sms:Label>
-
- ...
-
- <sms:Annotation object="Crops" meaning="MyMeasurement"/>
Line 146 was replaced by line 224
- ...
+ </sms:OntologyDefinitions>
Lines 152-154 were replaced by line 230
- ! Full Annotation-Language Syntax
-
- The full annotation-language syntax provides access to various parts of a complex structure and the ability to assign those parts to ontology expressions. To support a wide variety of complex structures -- the primary ones including relational, XML, and Ptolemy types -- we consider a generic data model consisting of nested-relational-style constructs. In addition, we permit multi-valued attributes in which an attribute can have an associated collection of values.
+ !! The Full Annotation-Language Syntax
Line 156 was replaced by lines 232-244
- The abbreviated annotation-language syntax is shorthand for a subset of the full syntax. In the full syntax, resource expressions consist of lists of atoms (separated by commas) taking one of the following forms.
+ The full annotation-language syntax provides more access to various
+ parts of a complex structure and the ability to assign those parts to
+ more complex ontology expressions. To support a wide variety of
+ structural models -- the primary ones including relational, XML, and
+ the Ptolemy type system -- we consider a generic model consisting
+ of nested-relational-style constructs. In addition, we permit
+ multi-valued attributes in which an attribute can have an associated
+ collection of values.
+
+ The abbreviated annotation-language syntax is shorthand for a subset
+ of the full syntax (we give more details of the relationship below).
+ In the full syntax, resource expressions consist of lists
+ of atoms (separated by commas) taking one of the following forms.
Line 165 was replaced by lines 253-331
- Here, symbols {{x}} and {{y}} denote either constants or variables. Variables are prefixed with a $ sign. Constants that contain spaces must be delimited using single quotes. For {{x}} and {{y}} constants, the atom {{x:T}} is true if {{x}} is a {{T}} object, and the atom {{x[[A1=y]}} is true if {{x}} is an object that has {{y}} as one of its {{A1}} attribute values.
+ Here, symbols {{x}} and {{y}} denote either constants, variables, or
+ skolem terms. Variables are prefixed with a '$' sign. Constants that
+ contain spaces must be delimited using single quotes. A skolem term
+ takes the form {{f(z1, ..., zn)}} for symbols {{z1}} to {{zn}} and {{n
+ > 0}}.
+
+ For {{x}} and {{y}} constants, the atom {{x:T}} is true if {{x}} is a
+ {{T}} object, and the atom {{x[[A1=y]}} is true if {{x}} is an object
+ that has {{y}} as one of its {{A1}} attribute values.
+
+ Complex expressions are constructed as follows: each atom is an
+ expression; expressions {{x:T}} and {{x[[A1=y]}} can be composed to
+ form the expression {{x:T[[A1=y]}}; expressions {{x[[A1=y]}} and
+ {{x[[A2=z]}} can be composed to form the expression {{x[[A1=y, A2=z]}};
+ expressions {{y:T1[[A2=z]}} and {{x:T[[A1=y]}} can be composed to form
+ the expression {{x:T[[A1=y:T1[[A2=z]]}}; and so on.
+
+ This same syntax is used to describe ontology expressions, where {{T}}
+ can be replaced with a concept label {{C}} and {{A1}} represents a
+ property label. For {{x}} and {{y}} constants, the atom {{x:C}} is
+ true if {{x}} is an instance of concept {{C}}, and the atom
+ {{x[[A1=y]}} is true if {{x}} has {{y}} as one of its {{A1}} property
+ values.
+
+ The meaning of an annotation using the full syntax can be interpreted
+ as follows. Assume we have an annotation {{A}} such that {{R}} is the
+ expression selecting resource objects (the expression in the
+ {{object}} attribute) and {{O}} is the expression selecting ontology
+ objects (the expression in the {{meaning}} attribute). The annotation
+ is a constraint that says whenever the {{object}} expression is true,
+ the {{meaning}} expression is also true. Let {{Vo}} be the set of variables
+ in the {{object}} expression and {{Vm}} be the set of variables in the
+ {{meaning}} expression that are not in {{Vo}}. We interpret {{A}} as:
+ {{(forall Vo) R => (exists Vm) O}}. That is, the annotation asserts
+ that for each variable assignment making {{R}} true there are variable
+ assignments for {{Vm}} that make {{O}} true.
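
The interpretation can also be written out as a first-order formula. The block below is a sketch in LaTeX, reading {{Vm}} as the variables of the meaning expression that do not already occur in the object expression. Rendering {{x:T}} as a unary predicate and {{x[[A1=y]}} as a binary predicate is an assumption made for illustration, and the property name {{P}} in the second instance is hypothetical.

```latex
% General reading of an annotation with object expression R
% and meaning expression O
\forall V_o \,\bigl( R \;\Rightarrow\; \exists V_m \; O \bigr)

% Instance 1: object="$x:T[A1=$y]", meaning="$y:C"
% Here V_o = {x, y} and V_m is empty.
\forall x \,\forall y \,\bigl( T(x) \wedge A_1(x, y) \;\Rightarrow\; C(y) \bigr)

% Instance 2 (hypothetical): object="$x:T", meaning="$x[P=$z:C]"
% Here V_o = {x} and V_m = {z}.
\forall x \,\bigl( T(x) \;\Rightarrow\; \exists z \,( P(x, z) \wedge C(z) ) \bigr)
```

In the first instance every variable of the meaning expression is bound by the object expression, so no existential quantifier is needed; in the second, {{$z}} occurs only in the meaning expression and is therefore existentially quantified.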
+
+ For instance, consider the semantic type below, which is a more
+ detailed version of the previous semantic type.
+
+ {{{
+ <sms:SemanticType id="mySemType" xmlns:sms="http://seek.ecoinformatics.org/sms" xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#">
+
+ <sms:Label name="crops" resource="KBS019-003"/>
+ <sms:Label name="Measurement" resource="ont:Measurement"/>
+ <sms:Label name="Biomass" resource="ont:Biomass"/>
+ <sms:Label name="Species" resource="ont:Species"/>
+ <sms:Label name="Year" resource="ont:Year"/>
+ <sms:Label name="Location" resource="ont:Location"/>
+ <sms:Label name="measProp" resource="ont:measurementProperty"/>
+ <sms:Label name="measItem" resource="ont:measurementItem"/>
+ <sms:Label name="measContext" resource="ont:measurementContext"/>
+
+ <sms:Annotation object="$x:crops" meaning="$x:Measurement"/>
+ <sms:Annotation object="$x:crops[bm=$y]" meaning="$x[measProp=$y:Biomass]"/>
+ <sms:Annotation object="$x:crops[spp=$y]" meaning="$x[measItem=$y:Species]"/>
+ <sms:Annotation object="$x:crops[yr=$y]" meaning="$x[measContext=$y:Year]"/>
+ <sms:Annotation object="$x:crops[station=$y]" meaning="$x[measContext=$y:Location]"/>
+
+ </sms:SemanticType>
+ }}}
+
+ The advantage of using the full syntax here is that we can connect
+ the attributes of a given tuple to their proper semantic
+ components.
+
+ Another advantage of using the full syntax is that it can provide
+ support for data sets that have "promoted" data to schema. Consider
+ the following semantic-type description for a data set with attributes
+ {{site}}, {{MEDSA}}, and {{GLYMX}}, where {{MEDSA}} and {{GLYMX}}
+ are species codes whose values are biomass measurements.
+
+
+ {{{
+ <sms:SemanticType id="mySemType" xmlns:sms="http://seek.ecoinformatics.org/sms" xmlns:ont="http://seek.ecoinformatics.org/seek/ontos/DefaultOnto#">
Line 167 was replaced by lines 333-345
- Atoms can be composed to form more complex expressions as follows. Atoms {{x:T}} and {{x[[A1=y]}} can be composed to form the expression {{x:T[[A1=y]}}. Atoms {{x[[A1=y]}} and {{x[[A2=z]}} can be composed to form the expression {{x[[A1=y, A2=z]}}. In a similar way, atoms and expressions (or multiple expressions) can be composed to form additional expressions.
+ <sms:Label name="ds" resource="..."/>
+ <sms:Label name="Measurement" resource="ont:Measurement"/>
+ <sms:Label name="Biomass" resource="ont:Biomass"/>
+ <sms:Label name="Species" resource="ont:Species"/>
+ <sms:Label name="Location" resource="ont:Location"/>
+ <sms:Label name="measProp" resource="ont:measurementProperty"/>
+ <sms:Label name="measItem" resource="ont:measurementItem"/>
+ <sms:Label name="measContext" resource="ont:measurementContext"/>
+
+ <sms:Annotation object="$x:ds[site=$y, MEDSA=$z]"
+ meaning="f1($x):Measurement[measContext=$y:Location, measProp=$z:Biomass, measItem=MEDSA]"/>
+ <sms:Annotation object="$x:ds[site=$y, GLYMX=$z]"
+ meaning="f2($x):Measurement[measContext=$y:Location, measProp=$z:Biomass, measItem=GLYMX]"/>
Line 169 was replaced by lines 347-348
- The same syntax is used to describe the meaning of an annotation. In particular, {{T}} must be a concept label, and A1 a property label.
+ </sms:SemanticType>
+ }}}
Line 171 was replaced by lines 350-354
- The meaning of a full annotation can be interpreted as follows. Define {{var}} function. Then we have {{forall var(...) ... -> exists var(...) ...}}
+ Here, each tuple of the dataset represents two distinct measurements
+ of biomass: one for the MEDSA species and the other for the GLYMX
+ species. The skolem terms {{f1($x)}} and {{f2($x)}} distinguish these
+ two observations given a tuple {{$x}}; that is, the skolem terms can be
+ seen as creating new objects from the original object {{$x}}.
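
The effect of the skolem terms can be sketched in Python. This is an illustration only: the functions {{skolem}} and {{annotate_row}}, the row identifier {{r1}}, and the sample values are all invented for the example, and the dictionary keys simply mirror the property labels in the annotations above.

```python
def skolem(symbol: str, *args):
    """Build a skolem term as a hashable value.

    The same symbol applied to the same arguments always denotes the
    same object; different symbols denote different objects.
    """
    return (symbol, args)


def annotate_row(row_id: str, row: dict) -> dict:
    """Apply the two annotations above to one tuple of the data set."""
    measurements = {}
    for symbol, species in (("f1", "MEDSA"), ("f2", "GLYMX")):
        measurements[skolem(symbol, row_id)] = {
            "measContext": row["site"],   # $y:Location
            "measProp": row[species],     # $z:Biomass
            "measItem": species,          # the species code itself
        }
    return measurements


# Made-up sample tuple, for illustration only.
row = {"site": "T1", "MEDSA": 12.5, "GLYMX": 3.0}
for term, measurement in annotate_row("r1", row).items():
    print(term, measurement)
```

One input tuple produces two distinct measurement objects, keyed by {{f1(r1)}} and {{f2(r1)}}, which share the same context but differ in item and value.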
Removed line 173
- For instance, ... give the one from above and notice the difference in clarity.
At line 174 added 1 line.
+ The abbreviated syntax has a natural "translation" to the full syntax. In particular, the following two annotations are equivalent.
At line 175 added 2 lines.
+ {{{
+ <sms:Annotation object="T" meaning="C"/>
At line 176 added 2 lines.
+ <sms:Annotation object="$x:T" meaning="$x:C"/>
+ }}}
At line 177 added 1 line.
+ For atoms {{T.A1}}, the following two annotations are equivalent.
Line 179 was replaced by lines 367-368
- ----
+ {{{
+ <sms:Annotation object="T.A1" meaning="C"/>
Line 181 was replaced by lines 370-391
- [#1] We note that actor ports may not always be represented as an identifiable resource, and instead may be modeled as components of an actor. For example, consider an actor ''A'' having two ports ''P1'' and ''P2''. For the case where ''P1'' and ''P2'' are not separate resources, we can define the structural type of ''A'' as having two attributes {{P1}} and {{P2}} where {{A.P1}} denotes port ''P1'' and {{A.P2}} denotes port ''P2''.
+ <sms:Annotation object="$x:T[A1=$y]" meaning="$y:C"/>
+ }}}
+
+ For atoms {{T.A1.A2. ... .An}}, the following two annotations are equivalent.
+
+ {{{
+ <sms:Annotation object="T.A1.A2. ... .An" meaning="C"/>
+
+ <sms:Annotation object="$x:T[A1=$y1], $y1[A2=$y2], ..., $yn-1[An=$yn]" meaning="$yn:C"/>
+ }}}
+
+ And finally, for atoms of the form {{T.A1, T.A2, ..., T.Am}}, the
+ following two annotations are equivalent, where {{f}} is a unique
+ skolem symbol.
+
+ {{{
+ <sms:Annotation object="T.A1, T.A2, ..., T.Am" meaning="C"/>
+
+ <sms:Annotation object="$x:T, $x[A1=$y1], $x[A2=$y2], ..., $x[Am=$ym]" meaning="f($x, $y1, $y2, ..., $ym):C"/>
+ }}}
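
The translation rules can be collected into one function. The following Python sketch is an assumption-laden illustration, not an existing SEEK or Kepler API: the name {{to_full_syntax}} is hypothetical, and the nested-path case is rendered as a chain in which each step starts from the value bound by the previous step, which is one reading of the rule for {{T.A1.A2. ... .An}} rather than a confirmed one.

```python
def to_full_syntax(obj: str, concept: str, skolem: str = "f") -> tuple[str, str]:
    """Translate an abbreviated annotation into (object, meaning) in full syntax."""
    atoms = [[p.strip() for p in a.split(".")] for a in obj.split(",")]

    if len(atoms) == 1:
        label, *path = atoms[0]
        if not path:
            # T  ->  $x:T  /  $x:C
            return f"$x:{label}", f"$x:{concept}"
        if len(path) == 1:
            # T.A1  ->  $x:T[A1=$y]  /  $y:C
            return f"$x:{label}[{path[0]}=$y]", f"$y:{concept}"
        # T.A1.A2. ... .An, read as a chain (assumption, see lead-in)
        parts = [f"$x:{label}[{path[0]}=$y1]"]
        for i, attr in enumerate(path[1:], start=2):
            parts.append(f"$y{i - 1}[{attr}=$y{i}]")
        return ", ".join(parts), f"$y{len(path)}:{concept}"

    # T.A1, T.A2, ..., T.Am  ->  skolem term over $x and all $yi
    label = atoms[0][0]
    if any(len(a) != 2 or a[0] != label for a in atoms):
        raise ValueError("this sketch only handles lists of T.Ai atoms over one label")
    parts = [f"$x:{label}"]
    parts += [f"$x[{a[1]}=$y{i}]" for i, a in enumerate(atoms, start=1)]
    args = ", ".join(["$x"] + [f"$y{i}" for i in range(1, len(atoms) + 1)])
    return ", ".join(parts), f"{skolem}({args}):{concept}"


print(to_full_syntax("T", "C"))
print(to_full_syntax("T.A1", "C"))
print(to_full_syntax("T.A1, T.A2", "C"))
```

Lists that mix labels or contain nested paths are rejected, since no translation for them is given above.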
+
+ ----
Line 183 was replaced by lines 393-399
- [#2] Perhaps originally converted from a Sparrow expression.
+ [#1] We note that actor ports may not always be represented as an
+ identifiable resource, and instead may be modeled as components of an
+ actor. For example, consider an actor ''A'' having two ports ''P1''
+ and ''P2''. For the case where ''P1'' and ''P2'' are not separate
+ resources, we can define the structural type of ''A'' as having two
+ attributes {{P1}} and {{P2}} where {{A.P1}} denotes port ''P1'' and
+ {{A.P2}} denotes port ''P2''.
At line 184 added 1 line.
+ [#2] Perhaps originally converted from a Sparrow expression.
At line 185 added 4 lines.
+ [#3] The semantic type interchange syntax DTD is:
+ {{{
+ ...
+ }}}
