
KRSMS Semantic Annotation Language

Difference between version 143 and version 142:

Lines 5-7 were replaced by line 5

- It is a __DRAFT DESIGN DOCUMENT__ and does not reflect functionality

- as it currently exists in Kepler or SEEK. Comments and feedback are

- appreciated (see [Comments Page|KRSMSSemanticAnnotationComments]).

+ It is a __DRAFT DESIGN DOCUMENT__ and does not reflect functionality as it currently exists in Kepler or SEEK. Comments and feedback are appreciated (see [Comments Page|KRSMSSemanticAnnotationComments]).

Lines 12-13 were replaced by line 10

- This document describes an interchange syntax that can be used to express

- semantic types.

+ This document describes an interchange syntax that can be used to express semantic types.

Lines 18-28 were replaced by line 15

- A __semantic type__ classifies and constrains the semantic, as opposed

- to structural interpretation of a __resource__. Datasets, actors

- (services), and actor input and output ports are examples of resources

- that may have semantic types within SEEK.

- A semantic type is expressed as a set of __semantic annotations__. The

- purpose of a semantic annotation is to assign objects of a resource a

- "meaning" using ontology terms. A semantic annotation serves to "link"

- a portion of a resource to a portion of an ontology. In this way, the

- semantic interpretation of a resource (its semantic type) is built

- from the annotation of its parts.

+ A __semantic type__ classifies and constrains the semantic, as opposed to structural, interpretation of a __resource__. Datasets, actors (services), and actor input and output ports are examples of resources that may have semantic types within SEEK.

Lines 30-31 were replaced by line 17

- Semantic types can be expressed using the following XML

- representation (see below [3] for the DTD).

+ A semantic type is expressed as a set of __semantic annotations__. The purpose of a semantic annotation is to assign objects of a resource a "meaning" using ontology terms. A semantic annotation serves to "link" a portion of a resource to a portion of an ontology. In this way, the semantic interpretation of a resource (its semantic type) is built from the annotation of its parts.

At line 32 added 1 line.

+ Semantic types can be expressed using the following XML representation (see below [3] for the DTD).

At line 33 added 1 line.

Lines 50-61 were replaced by lines 38-41

- To be used within the SEEK architecture, semantic types must be

- uniquely identified. The unique identifier of a semantic type can be

- stated using the {{id}} attribute of the {{SemanticType}} element. An

- identifier is (preferably) expressed as an LSID in which the semantic

- type is managed as an LSID data object. Alternatively, if a semantic

- type is embedded within a document, the semantic-type id can be

- expressed as a fragment identifier (for example, when used within

- EML).

- As shown above, a semantic type consists of a set of labels, a set of

- annotations, and an optional ontology definition section. The rest of

- this page describes these components.

+ To be used within the SEEK architecture, semantic types must be uniquely identified. The unique identifier of a semantic type can be

+ stated using the {{id}} attribute of the {{SemanticType}} element. An identifier is (preferably) expressed as an LSID in which the semantic type is managed as an LSID data object. Alternatively, if a semantic type is embedded within a document, the semantic-type id can be expressed as a fragment identifier (for example, when used within EML).

+ As shown above, a semantic type consists of a set of labels, a set of annotations, and an optional ontology definition section. The rest of this page describes these components.

Lines 67-80 were replaced by lines 47-50

- Labels within a semantic-type description provide a mechanism to

- identify and name the resources and ontology terms used in the

- corresponding annotations. In a {{Label}} element, the the {{name}}

- attribute is assigned to the resource identified by the {{resource}}

- attribute. Each {{Label}} element is required to have exactly one

- {{name}} and {{resource}} attribute. A {{SemanticType}} element must

- contain at least two {{Label}} elements: one identifying the resource

- to be annotated and the other identifying an ontology

- concept. Further, no two {{Label}} elements within a semantic type may

- have the same {{name}} attribute.

- The first label shown below associates a dataset to the name {{crops}}

- and the second label associates an ontology concept to the name

- {{Biomass}}.

+ Labels within a semantic-type description provide a mechanism to identify and name the resources and ontology terms used in the

+ corresponding annotations. In a {{Label}} element, the {{name}} attribute is assigned to the resource identified by the {{resource}} attribute. Each {{Label}} element is required to have exactly one {{name}} and one {{resource}} attribute. A {{SemanticType}} element must contain at least two {{Label}} elements: one identifying the resource to be annotated and the other identifying an ontology concept. Further, no two {{Label}} elements within a semantic type may have the same {{name}} attribute.

+ The first label shown below associates a dataset to the name {{crops}} and the second label associates an ontology concept to the name {{Biomass}}.
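
The label rules above can be checked mechanically. The following Python sketch is only an illustration, not part of the design: it assumes that {{Label}} elements are direct children of {{SemanticType}}, and the {{id}} value and the concept identifier {{onto:Biomass}} are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical example document; only the element and attribute names
# (SemanticType, Label, id, name, resource) come from the text above.
EXAMPLE = """
<SemanticType id="example-type">
  <Label name="crops" resource="KBS019-003"/>
  <Label name="Biomass" resource="onto:Biomass"/>
</SemanticType>
"""

def check_labels(semantic_type_xml: str) -> list[str]:
    """Return a list of violated label rules (empty if all rules hold)."""
    root = ET.fromstring(semantic_type_xml)
    problems = []
    # Assumption: Label elements are direct children of SemanticType.
    labels = root.findall("Label")
    if len(labels) < 2:
        problems.append("fewer than two Label elements")
    seen = set()
    for label in labels:
        name = label.get("name")
        resource = label.get("resource")
        # XML itself forbids repeated attributes, so "exactly one"
        # reduces to checking that both attributes are present.
        if name is None or resource is None:
            problems.append("Label missing name or resource")
            continue
        if name in seen:
            problems.append(f"duplicate label name: {name}")
        seen.add(name)
    return problems

print(check_labels(EXAMPLE))
```

A document that reuses a label name, or that has only one {{Label}}, yields a non-empty list of problems.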

Lines 93-108 were replaced by line 63

- An annotation asserts that an object of a resource has a particular

- meaning according to definitions within an ontology. The {{object}}

- and {{meaning}} attributes of an {{Annotation}} element relate the

- object and ontology expressions, respectively. We provide a uniform

- __annotation language__ for identifying resource objects and

- specifying ontology expressions.

- Some resources (in particular, data sets and actors with input/output

- ports) can have complex data structures. For example, a data set

- typically is structured according to a schema, which specifies among

- other things a relation name (that is, the name of the table) and

- names for each attribute of the relation and their data types. Actor

- ports can also have complex structure, including arbitrarily nested

- relations. The annotation language facilitates the selection of the

- various (sub-) objects of structured resources. The entire resource

- itself can also be selected using the annotation language.

+ An annotation asserts that an object of a resource has a particular meaning according to definitions within an ontology. The {{object}} and {{meaning}} attributes of an {{Annotation}} element relate the object and ontology expressions, respectively. We provide a uniform __annotation language__ for identifying resource objects and specifying ontology expressions.

Lines 110-111 were replaced by line 65

- The annotation language has two forms: an abbreviated syntax, and a

- more complex, full syntax.

+ Some resources (in particular, data sets and actors with input/output ports) can have complex data structures. For example, a data set typically is structured according to a schema, which specifies among other things a relation name (that is, the name of the table) and names for each attribute of the relation and their data types. Actor ports can also have complex structure, including arbitrarily nested relations. The annotation language facilitates the selection of the various (sub-) objects of structured resources. The entire resource itself can also be selected using the annotation language.

At line 112 added 2 lines.

+ The annotation language has two forms: an abbreviated syntax, and a more complex, full syntax.

Lines 115-117 were replaced by line 71

- For expressing annotation objects, the abbreviated syntax permits the

- following atoms given a resource label {{T}} and attributes {{A1}} to

- {{An}}.

+ For expressing annotation objects, the abbreviated syntax permits the following atoms given a resource label {{T}} and attributes {{A1}} to {{An}}.

Lines 128-140 were replaced by line 82

- The first atom {{T}} selects corresponding objects of the

- resource. For example, if the resource is a data set, {{T}} selects

- the tuple objects of the resource. If the resource is an actor, {{T}}

- selects instances of the actor. The second atom {{T.A1}} selects

- {{A1}} objects contained within {{T}} objects. For {{T}} representing

- a data set, {{T.A1}} selects the values of attribute {{A1}} for tuples

- of {{T}}. The last atom selects nested attributes for complex

- structures occuring, for example, in actor input/output ports. For

- instance, if {{T}} represents an input port to some actor[1],

- {{T.A1.A2}} selects the {{A2}} objects nested within {{A1}} objects

- contained in {{T}} objects. Atoms can be combined to form expressions. In particular, an

- expression is either: (a) a single atom or (b) a comma-separated

- list of atoms of the form {{T.A1}} or {{T.A1.A2. ... An}}.

+ The first atom {{T}} selects corresponding objects of the resource. For example, if the resource is a data set, {{T}} selects the tuple objects of the resource. If the resource is an actor, {{T}} selects instances of the actor. The second atom {{T.A1}} selects {{A1}} objects contained within {{T}} objects. For {{T}} representing a data set, {{T.A1}} selects the values of attribute {{A1}} for tuples of {{T}}. The last atom selects nested attributes for complex structures occurring, for example, in actor input/output ports. For instance, if {{T}} represents an input port to some actor[1], {{T.A1.A2}} selects the {{A2}} objects nested within {{A1}} objects contained in {{T}} objects. Atoms can be combined to form expressions. In particular, an expression is either: (a) a single atom or (b) a comma-separated list of atoms of the form {{T.A1}} or {{T.A1.A2. ... An}}.
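
As a rough illustration of the abbreviated object syntax just described, the Python sketch below splits an expression into atoms and each atom into its label path. The identifier pattern (letters, digits, underscores) is an assumption, since this page does not fix the lexical form of labels, and the function name is hypothetical.

```python
import re

# Assumed lexical form of a label or attribute name (not specified on this page).
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
_ATOM = re.compile(rf"{_IDENT}(\.{_IDENT})*")

def parse_object_expression(text: str) -> list[list[str]]:
    """Parse 'T', 'T.A1', or 'T.A1, T.A2.A3' into lists of path segments."""
    parsed = []
    for atom in (part.strip() for part in text.split(",")):
        if not _ATOM.fullmatch(atom):
            raise ValueError(f"not a valid atom: {atom!r}")
        parsed.append(atom.split("."))
    # Rule (b): a comma-separated list may only contain dotted atoms,
    # so a bare resource label T is allowed only on its own.
    if len(parsed) > 1 and any(len(path) < 2 for path in parsed):
        raise ValueError("a bare resource label may only appear alone")
    return parsed

print(parse_object_expression("crops"))
print(parse_object_expression("crops.bm, crops.spp"))
```

Each inner list starts with the resource label, followed by the attribute names in nesting order.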

Lines 142-143 were replaced by line 84

- In the abbreviated syntax, ontology expressions only consist of a

- single concept label {{C}}.

+ In the abbreviated syntax, ontology expressions only consist of a single concept label {{C}}.

Lines 145-146 were replaced by line 86

- To illustrate, consider the following semantic-type description for

- the {{crops}} data-set resource.

+ To illustrate, consider the following semantic-type description for the {{crops}} data-set resource.

Lines 169-176 were replaced by lines 109-110

- In this simple example, we (1) associate the label {{crops}} to the

- data-set resource identifed as {{KBS019-003}}, (2) associate the

- remaining labels to corresponding ontology concepts (simplifying their

- identifiers using XML namespaces), (3) state with the first annotation

- that each {{crops}} tuple is a {{Measurement}} instance, (4) state

- with the second annotation that each {{bm}} attribute value is a

- {{Biomass}} instance, (5) state with the thrid annotation that each

- {{spp}} attribute value is a {{Species}} instance, and so on.

+ In this simple example, we (1) associate the label {{crops}} to the data-set resource identified as {{KBS019-003}}, (2) associate the

+ remaining labels to corresponding ontology concepts (simplifying their identifiers using XML namespaces), (3) state with the first annotation that each {{crops}} tuple is a {{Measurement}} instance, (4) state with the second annotation that each {{bm}} attribute value is a {{Biomass}} instance, (5) state with the third annotation that each {{spp}} attribute value is a {{Species}} instance, and so on.

Lines 183-194 were replaced by lines 117-119

- For convenience, we permit ontology concept definitions to be directly

- included within a semantic type using the {{OntologyDefinitions}}

- element. The purpose of this features is to allow specialized concept

- definitions to more accurately annotate objects, without having to go

- through the process of creating a new ontology, or editing an existing

- one. These concept definitions are expressed using OWL[2].

- To illustrate, part of the previous semantic type is shown below with an

- embedded concept. (Note that to simplify the definition below we take

- liberty with the use of namespaces in OWL). This embedded concept

- definition states that {{MyMeasurement}} is both a {{Measurement}} and

- a {{SubjectiveObservation}}.

+ For convenience, we permit ontology concept definitions to be directly included within a semantic type using the {{OntologyDefinitions}} element. The purpose of this feature is to allow specialized concept definitions to more accurately annotate objects, without having to go through the process of creating a new ontology or editing an existing one. These concept definitions are expressed using OWL[2].

+ To illustrate, part of the previous semantic type is shown below with an embedded concept. (Note that to simplify the definition below we take liberty with the use of namespaces in OWL). This embedded concept definition states that {{MyMeasurement}} is both a {{Measurement}} and a {{SubjectiveObservation}}.

Lines 229-241 were replaced by line 154

- The full annotation-language syntax provides more access to various

- parts of a complex structure and the ability to assign those parts to

- more complex ontology expressions. To support a wide variety of

- structural models -- the primary ones including relational, XML, and

- the Ptolemy type system -- we consider a generic model consisting

- of nested-relational-style constructs. In addition, we permit

- multi-valued attributes in which an attribute can have an associated

- collection of values.

- The abbreviated annotation-language syntax is shorthand for a subset

- of the full syntax (we give more details of the relationship below).

- In the full syntax, resource expressions consist of lists

- of atoms (separated by commas) taking one of the following forms.

+ The full annotation-language syntax provides more access to various parts of a complex structure and the ability to assign those parts to more complex ontology expressions. To support a wide variety of structural models -- the primary ones including relational, XML, and the Ptolemy type system -- we consider a generic model consisting of nested-relational-style constructs. In addition, we permit multi-valued attributes in which an attribute can have an associated collection of values.

At line 242 added 2 lines.

+ The abbreviated annotation-language syntax is shorthand for a subset of the full syntax (we give more details of the relationship below). In the full syntax, resource expressions consist of lists of atoms (separated by commas) taking one of the following forms.

Lines 249-284 were replaced by lines 164-173

- Here, symbols {{x}} and {{y}} denote either constants, variables, or

- skolem terms. Variables are prefixed with a '$' sign. Constants that

- contain spaces must be delimited using single quotes. A skolem term

- takes the form {{f(z1, ..., zn)}} for symbols {{z1}} to {{zn}} and {{n

- > 0}}.

- For {{x}} and {{y}} constants, the atom {{x:T}} is true if {{x}} is a

- {{T}} object, and the atom {{x[[A1=y]}} is true if {{x}} is an object

- that has {{y}} as one of its {{A1}} attribute values.

- Complex expressions are constructed as follows: each atom is an

- expression; expressions {{x:T}} and {{x[[A1=y]}} can be composed to

- form the expression {{x:T[[A1=y]}}; expressions {{x[[A1=y]}} and

- {{x[[A2=z}} can be composed to form the expression {{x[[A1=y, A2=z]}};

- expressions {{y:T1[[A2=z]}} and {{x:T[[A1=y]}} can be composed to form

- the expression {{x:T[[A1=y:T1[[A2=z]]}}; and so on.

- This same syntax is used to describe ontology expressions, where {{T}}

- can be replaced with a concept label {{C}} and {{A1}} represents a

- property label. For {{x}} and {{y}} constants, the atom {{x:C}} is

- true if {{x}} is an instance of concept {{C}}, and the atom

- {{x:[[A1=y]}} is true if {{x}} has {{y}} as one of its {{A1}} property

- values.

- The meaning of an annotation using the full syntax can be interpreted

- as follows. Assume we have an annotation {{A}} such that {{R}} is the

- expression selecting resource objects (the expression in the

- {{object}} attribute) and {{O}} is the expression selecting ontology

- objects (the expression in the {{meaning}} attribute). The annotation

- is a constraint that says whenever the {{object}} attribute is true,

- the {{meaning}} attribute is true. Let {{Vo}} be the set of variables

- in the {{object}} expression and {{Vm}} be the set of variables in the

- {{meaning}} expression not in {{Vm}}. We interpret {{A}} as:

- {{(forall Vo) R => (exists Vm) O}}. That is, the annotation asserts

- that for each variable assignment making {{R}} true there are variable

- assignments for {{Vm}} that make {{O}} true.

+ Here, symbols {{x}} and {{y}} denote either constants, variables, or skolem terms. Variables are prefixed with a '$' sign. Constants that contain spaces must be delimited using single quotes. A skolem term takes the form {{f(z1, ..., zn)}} for symbols {{z1}} to {{zn}} and {{n > 0}}.
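
The three kinds of symbols can be told apart by their surface form. The Python sketch below is a minimal illustration under stated assumptions: function symbols are assumed to be plain identifiers, anything that is neither a variable nor a skolem term is treated as a constant, and the function name is hypothetical.

```python
import re

# Assumed shape of a skolem term: identifier, then a non-empty
# parenthesized argument list (n > 0).
_SKOLEM = re.compile(r"[A-Za-z_]\w*\(.+\)")

def classify_term(term: str) -> str:
    """Classify a symbol as 'variable', 'skolem', or 'constant'."""
    term = term.strip()
    if term.startswith("$"):
        # Variables are prefixed with a '$' sign.
        return "variable"
    if _SKOLEM.fullmatch(term):
        return "skolem"
    # Everything else, including single-quoted constants with spaces.
    return "constant"

for t in ["$x", "f1($x)", "MEDSA", "'Kellogg Biological Station'"]:
    print(t, "->", classify_term(t))
```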

+ For {{x}} and {{y}} constants, the atom {{x:T}} is true if {{x}} is a {{T}} object, and the atom {{x[[A1=y]}} is true if {{x}} is an object that has {{y}} as one of its {{A1}} attribute values.

+ Complex expressions are constructed as follows: each atom is an expression; expressions {{x:T}} and {{x[[A1=y]}} can be composed to

+ form the expression {{x:T[[A1=y]}}; expressions {{x[[A1=y]}} and {{x[[A2=z]}} can be composed to form the expression {{x[[A1=y, A2=z]}}; expressions {{y:T1[[A2=z]}} and {{x:T[[A1=y]}} can be composed to form the expression {{x:T[[A1=y:T1[[A2=z]]}}; and so on.

+ This same syntax is used to describe ontology expressions, where {{T}} can be replaced with a concept label {{C}} and {{A1}} represents a property label. For {{x}} and {{y}} constants, the atom {{x:C}} is true if {{x}} is an instance of concept {{C}}, and the atom {{x[[A1=y]}} is true if {{x}} has {{y}} as one of its {{A1}} property values.

+ The meaning of an annotation using the full syntax can be interpreted as follows. Assume we have an annotation {{A}} such that {{R}} is the expression selecting resource objects (the expression in the {{object}} attribute) and {{O}} is the expression selecting ontology objects (the expression in the {{meaning}} attribute). The annotation is a constraint that says whenever the {{object}} expression is true, the {{meaning}} expression is true. Let {{Vo}} be the set of variables in the {{object}} expression and {{Vm}} be the set of variables in the {{meaning}} expression not in {{Vo}}. We interpret {{A}} as: {{(forall Vo) R => (exists Vm) O}}. That is, the annotation asserts that for each variable assignment making {{R}} true there are variable assignments for {{Vm}} that make {{O}} true.
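
To make this reading concrete, the Python sketch below collects the variables of the two expressions and prints the quantified form, taking the existential variables to be those of the {{meaning}} expression that do not occur in the {{object}} expression. It is an illustration only: the variable pattern after the '$' sign, the property name {{value}}, and the function name are assumptions, and expressions are written in their rendered form with single brackets.

```python
import re

# Assumed lexical form of a variable: '$' followed by an identifier.
_VAR = re.compile(r"\$[A-Za-z_]\w*")

def quantified_reading(object_expr: str, meaning_expr: str) -> str:
    """Render '(forall Vo) R => (exists Vm) O' for one annotation."""
    vo = sorted(set(_VAR.findall(object_expr)))
    # Existential variables: in the meaning expression but not in Vo.
    vm = sorted(set(_VAR.findall(meaning_expr)) - set(vo))
    forall = f"(forall {', '.join(vo)}) " if vo else ""
    exists = f"(exists {', '.join(vm)}) " if vm else ""
    return f"{forall}{object_expr} => {exists}{meaning_expr}"

# Hypothetical annotation: 'value' is an illustrative property name.
print(quantified_reading("$x:crops[bm=$b]", "$o:Measurement[value=$b]"))
```

Here {{$b}} is shared by both expressions and is therefore universally quantified, while {{$o}} occurs only in the meaning expression and is existentially quantified.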

Lines 286-287 were replaced by line 175

- For instance, consider the semantic type below, which is a more

- detailed version of the previous semantic type.

+ For instance, consider the semantic type below, which is a more detailed version of the previous semantic type.

Lines 312-313 were replaced by line 200

- The advantage of using full syntax here is that we can properly

- connect the attributes of a given tuple to its proper semantic

+ The advantage of using full syntax here is that we can properly connect the attributes of a given tuple to its proper semantic

Lines 316-320 were replaced by line 203

- Another advantage of using the full syntax is that it can provide

- support for data sets that have "promoted" data to schema. Consider

- the following semantic-type description for a data set with attributes

- {{station}}, {{MEDSA}}, and {{GLYMX}}, where {{MEDSA}} and {{GLYMX}}

- are species codes whose values are biomass measurements.

+ Another advantage of using the full syntax is that it can provide support for data sets that have "promoted" data to schema. Consider the following semantic-type description for a data set with attributes {{station}}, {{MEDSA}}, and {{GLYMX}}, where {{MEDSA}} and {{GLYMX}} are species codes whose values are biomass measurements.

Lines 348-352 were replaced by line 231

- Here, each tuple of the dataset represents two distinct measurements

- of biomass: one for the MEDSA species and the other for the GLYMX

- species. The skolem terms {{f1($x)}} and {{f2($x)}} distinguish these

- two observations given a tuple {{$x}}, that is, the skolem terms can be

- seen as an creating new objects from the original object {{$x}}.

+ Here, each tuple of the dataset represents two distinct measurements of biomass: one for the MEDSA species and the other for the GLYMX species. The skolem terms {{f1($x)}} and {{f2($x)}} distinguish these two observations given a tuple {{$x}}; that is, the skolem terms can be seen as creating new objects from the original object {{$x}}.

Lines 379-381 were replaced by line 258

- And finally, atoms of the form {{T.A1, T.A2, ..., T.Am}}, the

- following two annotations are equivalent, where {{f}} is a unique

- skolem symbol.

+ And finally, for atoms of the form {{T.A1, T.A2, ..., T.Am}}, the following two annotations are equivalent, where {{f}} is a unique skolem symbol.
