E Science Link Up Oct 04


Meeting notes and updates on the e-Science Link-Up Meeting

Semantic Registration in Taverna (Pinar Alper)

Feta Architecture

Ontologist (Chris Wroe) -> Ontology Editor -> DL Reasoner -> Classification (in RDF(S)) -> obtain classification -> Feta, PeDRo
WSDL descriptions are stored (in a special XML schema), then annotated and handed to Feta
The classified ontology and the annotated WSDL descriptions are merged into a single graph
Taverna Workflow Workbench issues "semantic discovery via conceptual descriptions" against Feta ... a set of canned queries
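
The merge-then-query idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: triples are plain Python tuples rather than a Jena model, and the predicate name `hasTask`, the class names, and the operation names are all made up for the example, not taken from Feta.

```python
# Hypothetical sketch: triples are plain (subject, predicate, object) tuples.
SUBCLASS_OF = "rdfs:subClassOf"
HAS_TASK = "hasTask"  # assumed predicate name, not taken from Feta

# Classified ontology, as a reasoner might export it (made-up class names).
ontology = {
    ("blastp_alignment", SUBCLASS_OF, "sequence_alignment"),
    ("sequence_alignment", SUBCLASS_OF, "bioinformatics_task"),
}

# Annotations attached to service operations (made-up operation names).
annotations = {
    ("op:runBlastp", HAS_TASK, "blastp_alignment"),
    ("op:fetchRecord", HAS_TASK, "retrieval"),
}

# Both parts are sets of triples, so merging them is a set union.
graph = ontology | annotations


def subclasses_of(graph, cls):
    """Return cls plus every class reachable through subClassOf edges."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p == SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def operations_performing(graph, task):
    """Canned query: operations whose task is `task` or a subclass of it."""
    wanted = subclasses_of(graph, task)
    return sorted(s for s, p, o in graph if p == HAS_TASK and o in wanted)


print(operations_performing(graph, "sequence_alignment"))  # ['op:runBlastp']
```

The point of the sketch is that a query phrased against a general concept also finds operations annotated with more specific concepts, because the classification is part of the same graph.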

Feta Engine

Feta Loader uses the myGrid service ontology and domain ontology
uses Jena, e.g., to run RDQL queries, etc.

Feta Data Model

Operation (name, description, task -- from a bio service ontology, method -- a particular type of algorithm/code, also from the ontology but not used much, resource, application, hasInput : Parameter, hasOutput : Parameter)
Parameter (name, description, semantic type, format, transport type, collection type, collection format)
Service (name, description, author, organizations)
WSDL based operation is a subclass of Operation
WSDL based Web Service is a subclass of Service (hasOperation : WSDL based operation)
workflows, bioMoby services, soaplab services, and local java code are also subclasses of Service and Operation
seqHound service is an operation

Each parameter can have a semantic type, stating that the parameter is an instance of a class. An operation can have a "task" and a "method", which are also semantic types.
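
As a rough illustration of the data model listed above, the classes could be written as Python dataclasses. The field names follow the notes; the Python types, the defaults, and the example values are assumptions made for the sketch and are not the actual Feta schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    name: str
    description: str = ""
    semantic_type: Optional[str] = None  # ontology class the value is an instance of
    format: Optional[str] = None
    transport_type: Optional[str] = None
    collection_type: Optional[str] = None
    collection_format: Optional[str] = None


@dataclass
class Operation:
    name: str
    description: str = ""
    task: Optional[str] = None    # term from the bio service ontology
    method: Optional[str] = None  # algorithm/code term, reportedly little used
    resource: Optional[str] = None
    application: Optional[str] = None
    inputs: List[Parameter] = field(default_factory=list)   # hasInput
    outputs: List[Parameter] = field(default_factory=list)  # hasOutput


@dataclass
class Service:
    name: str
    description: str = ""
    author: str = ""
    organizations: List[str] = field(default_factory=list)


@dataclass
class WSDLOperation(Operation):
    """WSDL based operation, a subclass of Operation."""


@dataclass
class WSDLService(Service):
    """WSDL based Web Service, a subclass of Service."""
    operations: List[WSDLOperation] = field(default_factory=list)  # hasOperation


# Example values are illustrative only.
blastp = WSDLOperation(
    name="blastp",
    task="sequence_alignment",
    inputs=[Parameter(name="query", semantic_type="protein_sequence")],
)
service = WSDLService(name="example_blast_service", operations=[blastp])
print(service.operations[0].inputs[0].semantic_type)  # protein_sequence
```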

SHIM breakout (Jim leads discussion)

SHIM (need acronym)

semantically compatible, syntactically incompatible services
uniprot database (uniprot_record) -> parser and filter shim -> blastp analysis (protein_sequence)
working definition: a software component whose main purpose is to syntactically match otherwise incompatible resources. It takes some input, performs some task, and produces an output. Depending on usage, a shim can be semantically neutral ...
in myGrid, shims basically do type manipulations (mapping between abstract and concrete types), e.g., embl, genbank, and fasta are concrete types; dna_sequence is an abstract type
examples:

parser / filter
dereferencer
syntax translator
mapper
iterator

dereferencer

service a (genbank id) -> dereferencer -> service b (genbank record)
retrieves information from a URL

syntax translator

service a (dna seq; bsml) -> syntax translator -> service b (dna seq; agave)

mapper

service a (genbank id) -> mapper -> service b (embl id)

iterator

service a (collection of x) -> iterator -> service b (a single x)
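
A few of the shim kinds above can be sketched as small functions. This is a minimal sketch, not myGrid code: the function names are hypothetical, a FASTA-style record stands in for whatever record format the upstream service produces, and the mapping table passed to the mapper is a placeholder.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def parser_filter_shim(fasta_record: str) -> str:
    """Parser/filter: keep only the sequence from a FASTA-style record."""
    lines = fasta_record.strip().splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def mapper_shim(identifier: str, table: Dict[str, str]) -> str:
    """Mapper: translate an identifier from one naming scheme to another."""
    return table[identifier]


def iterator_shim(service_b: Callable[[X], Y], items: Iterable[X]) -> List[Y]:
    """Iterator: feed a collection to a service that accepts a single item."""
    return [service_b(item) for item in items]


record = ">example_protein hypothetical entry\nMKT\nAYI\n"
print(parser_filter_shim(record))            # MKTAYI
print(iterator_shim(len, ["MKT", "AYIAK"]))  # [3, 5]
```

In each case the shim changes only the shape or naming of the data so that service b can consume what service a produced.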

seven steps to shim "nirvana"
recognize that two services are not compatible (syntactically, possibly semantically)
recognize the degree of mismatch

everything connected to everything

identify what type of shim(s) is/are needed
find or manufacture the shim
advise user on "semantic safety" of the shim

not clear what this means ...

invoke the shim
record provenance
my (Shawn's) proposal: a shim is an actor/service whose input semantic type is the same or more general than the output semantic type
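
The proposal above can be read as a check against a class hierarchy. The following is a sketch under assumptions: the hierarchy is a made-up single-parent table, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Made-up hierarchy: each semantic type maps to its more general parent.
PARENT: Dict[str, Optional[str]] = {
    "sequence": None,
    "dna_sequence": "sequence",
    "protein_sequence": "sequence",
}


def is_same_or_more_general(general: str, specific: str,
                            parent: Dict[str, Optional[str]] = PARENT) -> bool:
    """True if `general` equals `specific` or is one of its ancestors."""
    current: Optional[str] = specific
    while current is not None:
        if current == general:
            return True
        current = parent.get(current)
    return False


def is_shim(input_type: str, output_type: str,
            parent: Dict[str, Optional[str]] = PARENT) -> bool:
    """Proposed test: the input type is the same as or more general than the output type."""
    return is_same_or_more_general(input_type, output_type, parent)


print(is_shim("dna_sequence", "dna_sequence"))      # True
print(is_shim("sequence", "dna_sequence"))          # True
print(is_shim("dna_sequence", "protein_sequence"))  # False
```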

Workflow management and AI Planning (Jim Blythe)

Motivation

workflows in grid-using communities
challenges in supporting workflow management

research on workflow planning at usc/isi

using ai techniques in Pegasus to generate executable grid workflows

using metadata descriptions as first step, to get away from the file encodings of VDL and Pegasus
an operator is specified generally as an (if preconditions then add <stuff>) form, in Lisp/Scheme syntax

example: user can say: I want the results of a pulsar search at this time and location

the operator definitions are generated by hand ... began looking at how to construct them automatically
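
The "if preconditions then add" operator form described above can be illustrated with a small forward-chaining sketch. This is not Pegasus or its planner: the operators are written in Python rather than Lisp/Scheme syntax, the fact and operator names are invented, and the greedy loop is a deliberately simple stand-in for a real planning algorithm.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Set


class Operator(NamedTuple):
    name: str
    preconditions: FrozenSet[str]
    adds: FrozenSet[str]


def plan(initial: Set[str], goal: Set[str],
         operators: List[Operator]) -> Optional[List[str]]:
    """Greedy forward chaining: apply any applicable operator that adds new facts."""
    state = set(initial)
    steps: List[str] = []
    progress = True
    while not goal <= state and progress:
        progress = False
        for op in operators:
            if op.preconditions <= state and not op.adds <= state:
                state |= op.adds
                steps.append(op.name)
                progress = True
                break
    return steps if goal <= state else None


# Hypothetical operators for a pulsar-search request.
operators = [
    Operator("run_pulsar_search",
             frozenset({"raw_data_local"}),
             frozenset({"pulsar_search_results"})),
    Operator("stage_raw_data",
             frozenset({"raw_data_registered"}),
             frozenset({"raw_data_local"})),
]

print(plan({"raw_data_registered"}, {"pulsar_search_results"}, operators))
# ['stage_raw_data', 'run_pulsar_search']
```

The user states only the goal (the search results); the planner works out that the data must be staged first because that is the only way to satisfy the search operator's precondition.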

Access Grid Meeting

The information model

Organization of people, projects, experiments, and so on
Operations, ... (Pinar)
every data item can be annotated with various type information ... some slides
mime types
primary objective is to model e-Science processes, not the domain -- capturing the process provides added value: it facilitates contextualization, data-model contracts between components, visualization of an integrated result object (as a result of a workflow), ...
data fusion/integration not guided by this model

The aim

providing more direct support for the implementation of e-Science processes by:

increasing the synergy between components
facilitating data-model contracts between myGrid components
defining a coherent myGrid architecture

Some benefits:

automatically capturing provenance and context information that is relevant to the interpretation and sharing of the results of the e-science experiments
facilitating personalization and collaboration

Implementation

a database with a web service interface ... as canned queries
generic interface, i.e., sql query
performance penalty -- overhead, access calls, etc.
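
The difference between the canned-query interface and the generic SQL interface can be sketched with an in-memory SQLite database. Everything concrete here is an assumption for illustration: the table layout, the query name, and the sample rows are invented, and the real service sits behind a web service interface rather than direct function calls.

```python
import sqlite3

# Hypothetical canned queries, keyed by name.
CANNED_QUERIES = {
    "experiments_by_person": (
        "SELECT e.title FROM experiment e "
        "JOIN person p ON p.id = e.person_id "
        "WHERE p.name = ? ORDER BY e.title"
    ),
}


def run_canned(conn, name, *params):
    """Canned interface: callers pick a named query and supply parameters."""
    return conn.execute(CANNED_QUERIES[name], params).fetchall()


def run_generic(conn, sql, params=()):
    """Generic interface: callers supply the SQL themselves."""
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE experiment (id INTEGER PRIMARY KEY, title TEXT, person_id INTEGER);
INSERT INTO person VALUES (1, 'alice');
INSERT INTO experiment VALUES (1, 'expt-a', 1), (2, 'expt-b', 1);
""")

print(run_canned(conn, "experiments_by_person", "alice"))
# [('expt-a',), ('expt-b',)]
print(run_generic(conn, "SELECT COUNT(*) FROM experiment"))
# [(2,)]
```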

Questions

Does the model support "synthetic" versus "raw/natural" data?
What about the set-up and calibration of tools?
Also, predicted data versus experimentally observed
The model is based on CCRC model
There are also a lot of standards that should be incorporated, so need some kind of extensibility
There needs to be place-holders for these within the information model
Related issue is where the results should be stored
three stores: one is the third-party databases (e.g., arrayexpress gene expression database ...) and link back
this is encompassed by the MIR -- myGrid Information Repository; like a notebook

First thing done with information model

Workbench: MIR browser, metadata browser, WF model editor/explorer, feta search gui
Taverna execution environment: freefluo, and various plug-ins for MIR, Metadata Storage, and Feta
MIR external
Interestingly, the information model is "viewed" through a tree browser

The Mediator

Application oriented

directly supports the e-Scientist by:

providing pre-configured e-Science process templates (i.e., system level workflows)
helping capture and maintain context information that is relevant to the interpretation and sharing of the results of the e-science experiments
facilitating personalization and collaboration

middleware-oriented

contributes to the synergy between myGrid services by

acting as a sink for e-Science events initiated by myGrid components
interpreting the intercepted events and triggering interactions w/ other related components entailed by the semantics of those events
compensating for possible impedance mismatches with other services both in terms of data types and interaction protocols

not really an issue -- won't do much here -- but might be some other components that want to participate, and would need to have this service
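
The middleware-oriented role described above (a sink for events that triggers follow-up interactions) might look roughly like this. It is a sketch only: the class, the method names, and the event fields are hypothetical, and real myGrid components would communicate through services rather than in-process callbacks.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class Mediator:
    """Receives events from components and triggers the registered reactions."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_type: str, handler: Handler) -> None:
        """Register a reaction for one kind of event."""
        self._handlers[event_type].append(handler)

    def sink(self, event: Event) -> int:
        """Accept an event, run its reactions, and return how many ran."""
        handlers = self._handlers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Hypothetical wiring: a finished workflow run triggers a provenance record.
provenance_log: List[str] = []
mediator = Mediator()
mediator.on("workflow_completed",
            lambda e: provenance_log.append("recorded " + e["workflow"]))

mediator.sink({"type": "workflow_completed", "workflow": "example_workflow"})
print(provenance_log)  # ['recorded example_workflow']
```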


This particular version was published on 20-Oct-2004 09:04:38 PDT by SDSC.bowers.