Science Environment for Ecological Knowledge




Semantic Data Integration

Difference between current version and current version:

At line 0 added 95 lines.
+ Semantic Data Integration
+ The area of scientific data integration provides a number of
+ challenges in addition to the traditional ones in data integration and
+ database mediation. While data volume can be, and often is, a problem, in a
+ number of disciplines (e.g., ecology, life sciences in general,
+ geosciences, etc.) the **semantic heterogeneity and complexity** of
+ the data can be a significant impediment in itself. For example, an
+ ecologist may want to combine a number of different data sources as
+ part of an analytical pipeline or scientific workflow. In order to
+ facilitate data integration, additional semantic information is often
+ necessary, for example, the unit type of a measurement, the
+ protocol by which the data was created or derived, or simply
+ additional information about the data at the conceptual/ontological
+ level.
+ The purpose of a semantic mediation system is to utilize semantic
+ annotations, e.g., for smarter (ontology-enabled) data discovery and for
+ semantic type checking and conversion when linking analytical steps to
+ one another or when binding data sets to analysis steps.
+ The goal of this meeting is to identify common techniques and
+ procedures for semantic mediation and to explore opportunities for
+ collaborations between UK and US research groups.
+ In particular, we plan to concentrate on semantic registration of data sets, provenance sets, parameter sets, workflow sets, and service sets. We will attempt to address questions like the following:
+ * What kind of resources should be semantically typed (datasets, databases, services), and what is the semantic typing language for those?
+ * How is semantic typing employed for data discovery, query rewriting, and scientific workflow planning?
+ * What does a registry of semantic types look like, and how are data and services registered to it? What is the semantic registration procedure?
+ * What tools exist to support semantic registration, querying, and reasoning with ontologies, schemas, etc.?
+ * What standards can be employed and extended? In particular, what support does EML already provide, and what else is needed?
+ Prior to the meeting, attendees will be given some **homework**: to illustrate their approach to semantic registration of data sets, web services, and workflows using two examples. One of them (tentative) can be found here:
+ {{{
+ The presentations will specifically address requirements and current
+ architecture in the context of the SEEK Semantic Mediation System
+ (SMS):
+ 1. How to register data sets, ontologies, workflows, and associations
+ between them (semantic registration).
+ 2. How to put the above to good use, e.g., for "smart data discovery",
+ semantics-enhanced data integration and mediation, semantics-enhanced
+ workflow design and execution (this also includes, e.g., required
+ reasoning services).
+ Hopefully we will also be able to report on the linkage between the
+ SEEK EcoGrid and the SEEK SMS: what are the "structural and semantic
+ commitments" of the EcoGrid that can be used by SMS.
+ In a sense, SMS uses the EcoGrid to do (1). Conversely, applications
+ such as the SEEK workflow system (AMS/Kepler) use both the EcoGrid and
+ SMS. This overall picture should also be fleshed out to some extent as
+ part of this session.
+ 10:30-11:00 TEA AND COFFEE
+ Same as above for SEEK, but now for MyGrid! As part of the "homework
+ assignment" both SEEK and MyGrid folks use the same running example(s)
+ to illustrate their approaches.
+ 12:30-14:00 LUNCH
+ This 2.5 hour session might be parallelized into break-out
+ sessions. The goal is to flesh out an interoperable semantic
+ registration approach that will work across SEEK, MyGrid, and related
+ "semantics-aware" systems. A MyGrid semantically registered service
+ should be usable from a semantics-aware SEEK workflow. Conversely, a
+ semantics-aware MyGrid workflow should be able to invoke SEEK services
+ and take advantage of semantic types.
+ Part of this discussion should also deal with non-procedural data
+ integration, i.e., based on declarative views as opposed to procedural
+ workflows.
+ 15:30-16:00 TEA AND COFFEE
+ 18:00 CLOSE
+ }}}
+ -----------------------------------------------
+ [Back to the meeting agenda | EdinburghMeeting]
