Science Environment for Ecological Knowledge

E Science Link Up Oct 04




Meeting notes and updates on the e-Science Link-Up Meeting

(The following notes taken by S. Bowers)

Semantic Registration in Taverna (Pinar Alper)

    • Feta Architecture
      • Ontologist (Chris Wroe) -> Ontology Editor -> DL Reasoner -> Classification (in RDF(S)) -> obtain classification -> Feta, PeDRo
      • Store WSDL descriptions (in a special XML schema), then annotate them and give them to Feta
      • The classified ontology and the annotated WSDL are merged into a single graph
      • Taverna Workflow Workbench issues "semantic discovery via conceptual descriptions" requests against Feta, as a set of canned queries
    • Feta Engine
      • Feta Loader uses the myGrid service ontology and domain ontology
      • uses Jena, e.g., to run RDQL queries, etc.
    • Feta Data Model
      • Operation (name; description; task, drawn from a bio service ontology; method, the particular type of algorithm or code, also drawn from the ontology but not used much; resource; application; hasInput : Parameter; hasOutput : Parameter)
      • Parameter (name, desc, semantic type, format, transport type, collection type, collection format)
      • Service (name, description, author, organizations)
      • WSDL-based operation is a subclass of Operation
      • WSDL-based Web Service is a subclass of Service (hasOperation : WSDL-based operation)
      • workflow, bioMoby service, soaplab service, and local Java code are subclasses of Service and Operation
      • seqHound service is an operation
        • Each parameter can have a semantic type, stating that the parameter is an instance of a class, and the operation can have a "task" which is also a "semantic type" and "method"
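
A minimal Python sketch of the data model and one "canned" discovery query, added to make the notes above concrete. It is not Feta's implementation (which, per the notes, works over an RDF graph with Jena). The class names follow the notes; the toy ontology, the service names, and the `find_operations` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical toy domain ontology: child class -> parent class.
SUBCLASS_OF: Dict[str, str] = {
    "protein_sequence": "sequence",
    "dna_sequence": "sequence",
}


def is_subclass(cls: str, ancestor: str) -> bool:
    """True if cls equals ancestor or is a transitive subclass of it."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


@dataclass
class Parameter:
    name: str
    semantic_type: str  # class in the domain ontology
    format: str = ""


@dataclass
class Operation:
    name: str
    task: str  # class in the service ontology
    inputs: List[Parameter] = field(default_factory=list)
    outputs: List[Parameter] = field(default_factory=list)


@dataclass
class Service:
    name: str
    operations: List[Operation] = field(default_factory=list)


def find_operations(services: List[Service], task: str, input_type: str) -> List[Operation]:
    """Canned query: operations performing `task` that accept `input_type`.

    An operation accepts the type if one of its input parameters is
    annotated with that type or with a more general one.
    """
    return [
        op
        for svc in services
        for op in svc.operations
        if op.task == task
        and any(is_subclass(input_type, p.semantic_type) for p in op.inputs)
    ]


# Hypothetical registry contents, for illustration only.
registry = [
    Service("blast", [
        Operation("blastp", "alignment", [Parameter("query", "protein_sequence", "fasta")]),
        Operation("blastn", "alignment", [Parameter("query", "dna_sequence", "fasta")]),
    ]),
    Service("tools", [
        Operation("align_any", "alignment", [Parameter("query", "sequence")]),
    ]),
]

matches = find_operations(registry, "alignment", "protein_sequence")
print([op.name for op in matches])
```

The subsumption check is what separates semantic discovery from a plain keyword match: a query for protein_sequence also finds operations annotated with the more general sequence type.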

SHIM breakout (Jim leads discussion)

    • SHIM (need acronym)
      • semantically compatible, syntactically incompatible services
      • uniprot database (uniprot_record) -> parser and filter shim -> blastp analysis (protein_sequence)
      • working definition: a software component whose main purpose is to syntactically match otherwise incompatible resources. It takes some input, performs some task, and produces an output. Depending on usage, a shim can be semantically neutral ...
      • in myGrid, basically doing type manipulations (mapping between abstract and concrete types), e.g., embl, genbank, and fasta are concrete types; dna_sequence is an abstract type
      • examples:
        • parser / filter
        • de-referencer
        • syntax translator
        • mapper
        • iterator
      • dereferencer
        • service a (genbank id) -> dereferencer -> service b (genbank record)
        • retrieves information from a URL
      • syntax translator
        • service a (dna seq; bsml) -> syntax translator -> service b (dna seq; agave)
      • mapper
        • service a (genbank id) -> mapper -> service b (embl id)
      • iterator
        • service a (collection of x) -> iterator -> service b (a single x)
      • seven steps to shim "nirvana"
        • recognize that 2 services are not compatible (syntactically, possibly semantically)
        • recognize the degree of mismatch
          • everything connected to everything
        • identify what type of shim(s) is/are needed
        • find or manufacture the shim
        • advise user on "semantic safety" of the shim
          • not clear what this means ...
        • invoke the shim
        • record provenance
      • my (Shawn's) proposal: a shim is an actor/service whose input semantic type is the same or more general than the output semantic type
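
A small Python sketch of two ideas from this breakout: the iterator shim, and the proposed test that an actor is a shim when its input semantic type is the same as or more general than its output semantic type. The type hierarchy, the `Actor` class, and all actor names are hypothetical; nothing here is myGrid or SEEK code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")

# Hypothetical semantic-type hierarchy: child -> parent.
PARENT: Dict[str, str] = {
    "dna_sequence": "sequence",
    "protein_sequence": "sequence",
}


def is_same_or_more_general(general: str, specific: str) -> bool:
    """True if `general` equals `specific` or is one of its ancestors."""
    current: Optional[str] = specific
    while current is not None:
        if current == general:
            return True
        current = PARENT.get(current)
    return False


@dataclass(frozen=True)
class Actor:
    name: str
    input_type: str   # semantic type of the input
    output_type: str  # semantic type of the output


def is_shim(actor: Actor) -> bool:
    """Proposal from the notes: input type is the same as or more general than output type."""
    return is_same_or_more_general(actor.input_type, actor.output_type)


def iterator_shim(items: Iterable[X], service_b: Callable[[X], Y]) -> List[Y]:
    """Iterator shim: feed a collection of x to a service that takes a single x."""
    return [service_b(item) for item in items]


# A syntax translator keeps the semantic type (bsml -> agave), so it qualifies.
translator = Actor("bsml_to_agave", "dna_sequence", "dna_sequence")
# A filter that narrows sequence to dna_sequence also qualifies.
dna_filter = Actor("dna_filter", "sequence", "dna_sequence")
# Translating DNA to protein changes the semantic type, so it does not.
translate = Actor("translate", "dna_sequence", "protein_sequence")

print([(a.name, is_shim(a)) for a in (translator, dna_filter, translate)])
print(iterator_shim(["acgt", "ttga"], str.upper))
```

Under this test a mapper such as genbank id -> embl id only counts as a shim if both ids are annotated with a shared semantic type, which may be one way to read the "semantic safety" question in step five.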

Workflow management and AI Planning (Jim Blythe)

  • Motivation
    • workflows in grid-using communities
    • challenges in supporting workflow management
  • research on workflow planning at USC/ISI
    • using AI techniques in Pegasus to generate executable grid workflows
  • using metadata descriptions as first step, to get away from the file encodings of VDL and Pegasus
  • an operator is specified generally as an (if preconditions then add <stuff>) form, in Lisp/Scheme syntax
    • example: user can say: I want the results of a pulsar search at this time and location
  • the operator definitions are currently written by hand ... they have begun looking at how to construct them automatically
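
To illustrate the "if preconditions then add" operator form, here is a minimal Python sketch of forward chaining over such operators. It is not the Pegasus planner, and it uses Python rather than the Lisp/Scheme syntax mentioned above. The operator names and facts are hypothetical, loosely modeled on the pulsar-search example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: FrozenSet[str]  # facts that must hold before the step runs
    adds: FrozenSet[str]           # facts that hold after the step runs


def plan(initial: Iterable[str], goal: Set[str], operators: List[Operator]) -> Optional[List[str]]:
    """Greedy forward chaining.

    Repeatedly apply the first operator whose preconditions hold and that
    adds at least one new fact, until the goal holds or nothing applies.
    Returns the operator names in order, or None if the goal is unreachable.
    """
    state = set(initial)
    steps: List[str] = []
    progress = True
    while not goal <= state and progress:
        progress = False
        for op in operators:
            if op.preconditions <= state and not op.adds <= state:
                state |= op.adds
                steps.append(op.name)
                progress = True
                break
    return steps if goal <= state else None


# Hypothetical operators for "results of a pulsar search at this time and location".
operators = [
    Operator("pulsar_search", frozenset({"raw_data"}), frozenset({"pulsar_search_results"})),
    Operator("fetch_raw_data", frozenset({"time_and_location"}), frozenset({"raw_data"})),
]

workflow = plan({"time_and_location"}, {"pulsar_search_results"}, operators)
print(workflow)
```

This greedy loop may include steps that the goal does not need; a real planner would search backward from the goal or prune the result.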

Access Grid Meeting

  • The information model
    • Organization of people, projects, experiments, and so on
    • Operations, ... (Pinar)
    • every data item can be annotated with various type information ... some slides
    • mime types
    • primary objective is to model e-Science processes, not the domain -- capturing the process provides added value: it facilitates contextualization, data-model contracts between components, visualization of the integrated result object (as the result of a workflow), ...
    • data fusion/integration not guided by this model
  • The aim
    • providing more direct support for the implementation of e-Science processes by:
      • increasing the synergy between components
      • facilitating data-model contracts between myGrid components
      • defining a coherent myGrid architecture
  • Some benefits:
    • automatically capturing provenance and context information that is relevant to the interpretation and sharing of the results of the e-science experiments
    • facilitating personalization and collaboration
  • Implementation
    • a database with a web service interface ... as canned queries
    • generic interface, i.e., sql query
    • performance penalty -- overhead, access calls, etc.
  • Questions
    • Does the model support "synthetic" versus "raw/natural" data?
    • What about the set-up and calibration of tools?
    • Also, predicted data versus experimentally observed data?
    • The model is based on CCRC model
    • There are also a lot of standards that should be incorporated, so need some kind of extensibility
    • There needs to be place-holders for these within the information model
    • Related issue is where the results should be stored
    • three stores: one is the third-party databases (e.g., arrayexpress gene expression database ...) and link back
    • this is encompassed by the MIR -- myGrid Info. Repository; like a notebook
  • First thing done with information model
    • Workbench: MIR browser, metadata browser, WF model editor/explorer, feta search gui
    • Taverna execution environment: freefluo, and various plug-ins for MIR, Metadata Storage, and Feta
    • MIR external
    • Interestingly, the information model is "viewed" through a tree browser
  • The Mediator
    • Application oriented
      • directly supports the e-Scientist by:
        • providing pre-configured e-Science process templates (i.e., system-level workflows)
        • helping capture and maintain context information that is relevant to the interpretation and sharing of the results of the e-Science experiments
        • facilitating personalization and collaboration
    • middleware-oriented
      • contributes to the synergy between myGrid services by
        • acting as a sink for e-Science events initiated by myGrid components
        • interpreting the intercepted events and triggering interactions w/ other related components entailed by the semantics of those events
        • compensating for possible impedance mismatches with other services both in terms of data types and interaction protocols
          • not really an issue -- won't do much here -- but might be some other components that want to participate, and would need to have this service
        • inspired, etc., by WSMF, WSMO, WSMX, WSML, ..., the DERI web-services work -- Dieter Fensel, et al.
  • Supporting the e-Scientist
    • recurring use-cases can be captured
    • find workflows use-case
    • etc.
  • mediating between services
    • fully service based approach
      • the whole myGrid as a service
      • all communication done through web services (the mediator acts as the front door / gateway)
    • the name "mediator" is taken from the Gang of Four pattern of the same name
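
For readers unfamiliar with the pattern, a minimal Python sketch of a mediator acting as an event sink: components send events to the mediator, which triggers the interested components, so the components never call each other directly. The event name, the `ProvenanceLog` component, and the method names are hypothetical and are not taken from myGrid.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class Mediator:
    """Central sink for events; components know only the mediator."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def register(self, event_type: str, handler: Handler) -> None:
        """Declare that `handler` should run whenever `event_type` occurs."""
        self._handlers[event_type].append(handler)

    def notify(self, event_type: str, event: Event) -> int:
        """Deliver an event to every registered handler; return how many ran."""
        handlers = list(self._handlers.get(event_type, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


class ProvenanceLog:
    """Hypothetical component that records what produced each result."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def on_workflow_completed(self, event: Event) -> None:
        self.records.append(f"{event['workflow']} produced {event['result']}")


mediator = Mediator()
provenance = ProvenanceLog()
notebook: List[str] = []  # stands in for a result store

mediator.register("workflow_completed", provenance.on_workflow_completed)
mediator.register("workflow_completed", lambda e: notebook.append(e["result"]))

# The enactor reports completion once; the mediator fans it out.
delivered = mediator.notify("workflow_completed", {"workflow": "wf-1", "result": "result-1"})
print(delivered, provenance.records, notebook)
```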



This particular version was published on 20-Oct-2004 09:17:04 PDT by SDSC.bowers.