- Feta Architecture
- Ontologist (Chris Wroe) -> Ontology Editor -> DL Reasoner -> Classification (in RDF(S)) -> obtain classification -> Feta, PeDRo
- Store WSDL Descriptions (in special XML schema), then annotate, and give to Feta
- The classified ontology and the annotated WSDL are merged into a single graph
- Taverna Workflow Workbench issues "semantic discovery via conceptual descriptions" against Feta ... a set of canned queries
- Feta Engine
- Feta Loader uses the myGrid service ontology and domain ontology
- use Jena, e.g., to do RDQL queries, etc.
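The merged graph and a canned query can be sketched as follows. This is a minimal illustration in plain Python, not Jena or RDQL; the triple vocabulary (`subClassOf`, `performsTask`) and all class and operation names are assumptions made up for the example.

```python
# Hypothetical triples; none of these names come from the actual myGrid ontology.
ontology = {
    ("aligning", "subClassOf", "bioinformatics_task"),
    ("pairwise_aligning", "subClassOf", "aligning"),
}
annotations = {
    ("blastp_op", "performsTask", "pairwise_aligning"),
    ("fetch_op", "performsTask", "retrieving"),
}

# Merge the classified ontology and the service annotations into a single graph.
graph = ontology | annotations


def subclasses_of(graph, cls):
    """Return cls plus every class reachable from it through subClassOf edges."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == "subClassOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def operations_for_task(graph, task):
    """Canned query: operations whose task is `task` or any subclass of it."""
    tasks = subclasses_of(graph, task)
    return sorted(s for s, p, o in graph if p == "performsTask" and o in tasks)
```

Asking for `operations_for_task(graph, "aligning")` finds `blastp_op` even though it is annotated with the more specific `pairwise_aligning`, which is the benefit of querying over the classified ontology and the annotations together.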
- Feta Data Model
- Operation (name, description, task (from a bio service ontology), method (particular type of algorithm/code, also from the ontology but not used much), resource, application, hasInput : Parameter, hasOutput : Parameter)
- Parameter (name, desc, semantic type, format, transport type, collection type, collection format)
- Service (name, description, author, organizations)
- WSDL based operation is a subclass of Operation
- WSDL based Web Service is a subclass of Service (hasOperation : WSDL based operation)
- workflow, bioMoby service, soaplab service, local java code subclasses of Service and Operation
- seqHound service is an operation
- Each parameter can have a semantic type, stating that the parameter is an instance of a class, and the operation can have a "task" which is also a "semantic type" and "method"
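The data model above could be written down roughly as the following Python dataclasses. This is a sketch only: the field names follow the notes, but the Python class names, the defaults, and the choice of strings for ontology terms are assumptions, not Feta's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    name: str
    description: str = ""
    semantic_type: Optional[str] = None  # ontology class the value is an instance of
    format: Optional[str] = None
    transport_type: Optional[str] = None
    collection_type: Optional[str] = None
    collection_format: Optional[str] = None


@dataclass
class Operation:
    name: str
    description: str = ""
    task: Optional[str] = None    # term from the service ontology
    method: Optional[str] = None  # algorithm/code term, rarely used
    resource: Optional[str] = None
    application: Optional[str] = None
    has_input: List[Parameter] = field(default_factory=list)
    has_output: List[Parameter] = field(default_factory=list)


@dataclass
class Service:
    name: str
    description: str = ""
    author: str = ""
    organizations: List[str] = field(default_factory=list)


@dataclass
class WSDLBasedOperation(Operation):
    """A WSDL based operation is a subclass of Operation."""


@dataclass
class WSDLBasedWebService(Service):
    """A WSDL based Web Service is a subclass of Service."""
    has_operation: List[WSDLBasedOperation] = field(default_factory=list)
```

For example, `WSDLBasedOperation(name="blastp", has_input=[Parameter(name="query", semantic_type="protein_sequence")])` describes an operation whose single input is annotated with a semantic type; the operation and parameter names here are illustrative.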
- SHIM (need acronym)
- semantically compatible, syntactically incompatible services
- uniprot database (uniprot_record) -> parser and filter shim -> blastp analysis (protein_sequence)
- working definition: a software component whose main purpose is to syntactically match otherwise incompatible resources. It takes some input, performs some task, and produces an output. Depending on usage, a shim can be semantically neutral ...
- in myGrid, basically doing type manipulations (mapping between abstract and concrete types), e.g., embl, genbank, fasta are concrete types, dna_sequence is an abstract type
- examples:
- parser / filter
- dereferencer
- syntax translator
- mapper
- iterator
- dereferencer
- service a (genbank id) -> dereferencer -> service b (genbank record)
- retrieves information from a URL
- syntax translator
- service a (dna seq; bsml) -> syntax translator -> service b (dna seq; agave)
- mapper
- service a (genbank id) -> mapper -> service b (embl id)
- iterator
- service a (collection of x) -> iterator -> service b (a single x)
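Three of the shim kinds listed above can be sketched as small adapter functions. This is illustrative only: the function names, the toy `key: value` record layout, and the identifiers in the usage note are assumptions, not real UniProt, GenBank, or EMBL formats.

```python
def iterator_shim(service_b):
    """Adapt a service that takes a single x so it accepts a collection of x."""
    def apply(collection):
        return [service_b(x) for x in collection]
    return apply


def mapper_shim(id_table):
    """Map identifiers from one scheme to another via a lookup table."""
    def apply(identifier):
        return id_table[identifier]
    return apply


def parser_filter_shim(record):
    """Extract the sequence field from a toy 'key: value' record."""
    for line in record.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "sequence":
            return value.strip()
    raise ValueError("record has no sequence field")
```

For instance, `iterator_shim(len)(["ab", "c"])` applies a single-item service to each member of a collection, and `mapper_shim({"GB0001": "EM0001"})("GB0001")` translates one made-up identifier into another. In each case the shim changes only the shape or encoding of the data, not what it denotes.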
- seven steps to shim "nirvana"
- recognize 2 services are not compatible (syntactically, possibly semantically)
- recognize the degree of mismatch
- everything connected to everything
- identify what type of shim(s) is/are needed
- find or manufacture the shim
- advise user on "semantic safety" of the shim
- not clear what this means ...
- invoke the shim
- record provenance
- my (Shawn's) proposal: a shim is an actor/service whose input semantic type is the same or more general than the output semantic type
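The proposed definition can be checked mechanically once semantic types sit in a subsumption hierarchy. The sketch below assumes a single-parent hierarchy stored as a child-to-parent dictionary; the `sequence` root and the helper names are hypothetical, chosen only to illustrate the test.

```python
# Hypothetical child -> parent links; not taken from the myGrid ontology.
PARENT = {
    "dna_sequence": "sequence",
    "protein_sequence": "sequence",
}


def is_same_or_more_general(general, specific, parent=PARENT):
    """True if `general` equals `specific` or is one of its ancestors."""
    current = specific
    while current is not None:
        if current == general:
            return True
        current = parent.get(current)
    return False


def is_shim(input_type, output_type, parent=PARENT):
    """Proposed test: the input semantic type is the same as, or more general than, the output's."""
    return is_same_or_more_general(input_type, output_type, parent)
```

Under these assumptions, an actor from `sequence` to `dna_sequence` counts as a shim, as does one from `dna_sequence` to `dna_sequence`, while an actor from `dna_sequence` to `protein_sequence` does not.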
- Motivation
- workflows in grid-using communities
- challenges in supporting workflow management
- research on workflow planning at USC/ISI
- using AI techniques in Pegasus to generate executable grid workflows
- using metadata descriptions as first step, to get away from the file encodings of VDL and Pegasus
- an operator is specified generally as an (if preconditions then add <stuff>) form, in Lisp/Scheme syntax
- example: user can say: I want the results of a pulsar search at this time and location
- the generation of the operation defs is done by hand ... began looking at how to construct them automatically
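The "if preconditions then add" operator form can be illustrated with a toy forward-chaining planner. This is a sketch in Python rather than the Lisp/Scheme syntax mentioned above, and it is not how Pegasus is implemented; the `Operator` class, the `plan` function, and the fact names in the usage note are all assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: FrozenSet[str]  # facts that must hold before the operator applies
    adds: FrozenSet[str]           # facts that hold after it has run


def plan(initial, goal, operators) -> Optional[List[str]]:
    """Greedy forward chaining: apply any applicable operator that adds a new fact."""
    state = set(initial)
    steps = []
    progress = True
    while goal not in state and progress:
        progress = False
        for op in operators:
            if op.preconditions <= state and not op.adds <= state:
                state |= op.adds
                steps.append(op.name)
                progress = True
                break
    return steps if goal in state else None
```

Given two hypothetical operators, `fft` (needs `raw_data`, adds `fft_output`) and `search` (needs `fft_output`, adds `pulsar_candidates`), asking for `pulsar_candidates` from a state containing only `raw_data` yields the steps `fft` then `search`. The user states the desired result and the planner works out which operations produce it.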
- The information model
- Organization of people, projects, experiments, and so on
- Operations, ... (Pinar)
- every data item can be annotated with various type information ... some slides
- mime types
- primary objective is to model escience processes, not the domain -- capturing the process provides added value: facilitates contextualization, data-model contracts between components, visualize integrated result object (as a result of a workflow), ...
- data fusion/integration not guided by this model
- The aim
- providing more direct support for the implementation of e-Science processes by:
- increasing the synergy between components
- facilitating data-model contracts between myGrid components
- defining a coherent myGrid architecture
- Some benefits:
- automatically capturing provenance and context information that is relevant to the interpretation and sharing of the results of the e-science experiments
- facilitating personalization and collaboration
- Implementation
- a database with a web service interface ... as canned queries
- generic interface, i.e., sql query
- performance penalty -- overhead, access calls, etc.
- Questions
- Does the model support "synthetic" versus "raw/natural" data?
- What about the set-up and calibration of tools?
- Also, predicted data versus experimentally observed
- The model is based on CCRC model
- There are also a lot of standards that should be incorporated, so need some kind of extensibility
- There need to be placeholders for these within the information model
- Related issue is where the results should be stored
- three stores: one is the third-party databases (e.g., arrayexpress gene expression database ...) and link back
- this is encompassed by the MIR -- myGrid Info. Repository; like a notebook
- First thing done with information model
- Workbench: MIR browser, metadata browser, WF model editor/explorer, feta search gui
- Taverna execution environment: freefluo, and various plug-ins for MIR, Metadata Storage, and Feta
- MIR external
- Interestingly, the information model is "viewed" through a tree browser
- The Mediator
- Application oriented
- directly supports the e-Scientist by:
- providing pre-configured e-Science process templates (i.e., system level workflows)
- helping capture and maintain context information that is relevant to the interpretation and sharing of the results of the e-Science experiments
- facilitating personalization and collaboration
- middleware-oriented
- contributes to the synergy between myGrid services by
- acting as a sink for e-Science events initiated by myGrid components
- interpreting the intercepted events and triggering interactions w/ other related components entailed by the semantics of those events
- compensating for possible impedance mismatches with other services both in terms of data types and interaction protocols
- not really an issue -- won't do much here -- but might be some other components that want to participate, and would need to have this service
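The middleware role described above (a sink for events that triggers follow-on interactions) resembles a publish/subscribe dispatcher. The sketch below is an assumption about shape only: the `Mediator` class, its methods, and the event name in the usage note are hypothetical and not taken from the myGrid code.

```python
from collections import defaultdict


class Mediator:
    """Collects events from components and triggers the registered reactions."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.log = []  # every event received, in arrival order

    def subscribe(self, event_type, handler):
        """Register a callable to run whenever `event_type` is published."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Record the event, then call each handler and collect the results."""
        self.log.append((event_type, payload))
        return [handler(payload) for handler in self._handlers[event_type]]
```

A component could, for example, subscribe a handler to a made-up `workflow_finished` event that stores the result in the MIR; components that publish events never need to know which other services react to them.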