Science Environment for Ecological Knowledge
E Science Link Up Oct 04

Difference between version 53 and version 47:

Line 302 was replaced by lines 302-375
- *
+ * Williams workflow B ...
+ ** large amounts of data (or datatypes)
+ ** data is implicitly linked within itself
+ ** data is implicitly linked outside of itself
+ ** genomic sequence is the central co-ordinating point, but there are a number of different co-ordinate systems
+ ** some "biological", some artifacts of the workflow
+ * what's the problem
+ ** we don't have a domain model
+ ** we need a model for visualization
+ ** but, domain models are hard
+ ** it's not clear that the domain model should be in the middleware
+ * what have we done!?
+ ** bioinformatics pm (pre myGrid)
+ ** one big distributed data heterogeneity and integration problem
+ ** still a big distributed data heterogeneity and integration problem
+ * how do we solve the problem
+ ** take the data, use something (Perl or an MSc student) to map the data into a (partial) data model
+ ** visualize this ...
+ ** but what if the workflow changes?
+ * second solution
+ ** large quantities of data are already available with rich mark up in a visualizable form
+ ** this is unparsable, so also get the flat-file representation
+ ** start to build visualization information into the workflow using BeanShell
+ ** linked data from output -- domain model = scripts that hack these things together
+ * summary
+ ** domain models are hard
+ ** workflows can obfuscate the model
+ ** visualization requires one
+ ** we can build some knowledge of a domain model into the workflow and steal the rest.
+ ** is there a better way?
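As a rough illustration of the "map the data into a (partial) data model" step above, here is a minimal Python sketch. It assumes, purely for illustration, that every datum can be anchored on a genomic sequence and that incoming intervals are 1-based and inclusive; the notes do not say which co-ordinate systems were actually involved. The `Feature` class, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotation anchored on a genomic sequence (hypothetical model).

    Coordinates are stored 0-based and half-open: [start, end).
    """
    sequence_id: str
    start: int
    end: int
    kind: str    # e.g. "gene" or "repeat"
    source: str  # which workflow step produced the feature


def from_one_based_inclusive(sequence_id, start, end, kind, source):
    """Normalise a 1-based, inclusive interval into the model's convention."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based interval: {start}..{end}")
    return Feature(sequence_id, start - 1, end, kind, source)


def overlaps(a, b):
    """True when two features share at least one base on the same sequence."""
    return a.sequence_id == b.sequence_id and a.start < b.end and b.start < a.end


# Made-up example values: two features arriving in different conventions.
gene = from_one_based_inclusive("chr7", 101, 200, "gene", "gene_prediction")
repeat = Feature("chr7", 150, 180, "repeat", "repeat_masking")
print(overlaps(gene, repeat))
```

Normalising every interval once, at the point where it enters the model, is one way to make the implicit links between outputs explicit. It does not address the harder question raised above of what happens when the workflow changes.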
+
+ !Breakout: myGrid "Data Model" (schema) for capturing Metadata and Semantics
+
+ * common.xsd
+ ** service description
+ *** serviceName
+ *** organisation
+ **** UDDI fields, e.g., organization name, etc.
+ *** author
+ *** locationURL
+ *** interfaceWSDL
+ *** serviceDescriptionText
+ *** operations (units of functionality)
+ **** service operation
+ ***** operation name
+ ***** portName
+ ***** operationDescriptionText
+ ***** operationInputs
+ ****** parameter
+ ******* parameterName
+ ******* messageName
+ ******* parameterDescription
+ ******* defaultValue
+ ******* semanticType
+ ******* XMLSchemaURI
+ ******* isConfigurationParameter
+ ***** operationOutputs
+ ***** operationTask (the "what", i.e., what the operation does -- the verb or action -- e.g., "aligning, ncbi_blast_local_aligning, etc.")
+ ***** operationResource (underlying resources that the operation may use, like a database, coming from an ontology...)
+ ***** operationMethod
+ ***** operationApplication (software application)
+ *** serviceType
+ **** one of: "Soaplab service", "WSDL service", "Workflow service"
+ ** pedro
+ *** uses this schema to drive the user interface for annotation
+ *** also uses an external XML file to state which XML Schema elements are to be filled in with semantic types, and where to look in the ontologies for those concepts
+ *** [http://www.cs.man.ac.uk/~penpecip/feta/misc] for files ...
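The nesting outlined above can be mirrored as plain data classes. The following Python sketch is only an approximation of common.xsd: the field names come from the outline, but the types, the defaults, the assumption that operationOutputs holds the same parameter structure as operationInputs, and the `unannotated_parameters` helper are guesses for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The three service types listed in the outline.
SERVICE_TYPES = {"Soaplab service", "WSDL service", "Workflow service"}


@dataclass
class Parameter:
    parameterName: str
    messageName: str = ""
    parameterDescription: str = ""
    defaultValue: Optional[str] = None
    semanticType: Optional[str] = None  # ontology concept, if annotated
    XMLSchemaURI: Optional[str] = None
    isConfigurationParameter: bool = False


@dataclass
class ServiceOperation:
    operationName: str
    portName: str = ""
    operationDescriptionText: str = ""
    operationInputs: List[Parameter] = field(default_factory=list)
    # Assumption: outputs reuse the parameter structure of the inputs.
    operationOutputs: List[Parameter] = field(default_factory=list)
    operationTask: Optional[str] = None         # the "what", e.g. "aligning"
    operationResource: Optional[str] = None     # e.g. an underlying database
    operationMethod: Optional[str] = None
    operationApplication: Optional[str] = None  # software application


@dataclass
class ServiceDescription:
    serviceName: str
    serviceType: str
    organisation: str = ""
    author: str = ""
    locationURL: str = ""
    interfaceWSDL: str = ""
    serviceDescriptionText: str = ""
    operations: List[ServiceOperation] = field(default_factory=list)

    def __post_init__(self):
        if self.serviceType not in SERVICE_TYPES:
            raise ValueError(f"unknown serviceType: {self.serviceType!r}")


def unannotated_parameters(service):
    """List 'operation.parameter' names that still lack a semanticType."""
    missing = []
    for op in service.operations:
        for param in op.operationInputs + op.operationOutputs:
            if param.semanticType is None:
                missing.append(f"{op.operationName}.{param.parameterName}")
    return missing


# Hypothetical service, used only to exercise the structure.
blast = ServiceDescription(
    serviceName="example_blast",
    serviceType="Soaplab service",
    operations=[
        ServiceOperation(
            operationName="run",
            operationTask="aligning",
            operationInputs=[
                Parameter("query_sequence", semanticType="protein_sequence"),
                Parameter("database"),
            ],
        )
    ],
)
print(unannotated_parameters(blast))
```

A helper like `unannotated_parameters` hints at the kind of check an annotation tool could run to find the elements that still need a semantic type. It is not a description of how Pedro works.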
+
+ !More on SHIMs and Planning
+
+ * Shims in detail: UniProt database to BLASTp analysis
+ ** UniProt produces concrete type: UniProt_record
+ ** contains protein_sequence
+ **
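To make the shim idea concrete, here is a minimal Python sketch of a shim that pulls the protein_sequence out of a UniProt_record so that it can be handed to a BLASTp step. It assumes the record arrives as UniProt flat-file text (an `SQ` header line, sequence lines, and a `//` terminator) and that the downstream step accepts FASTA; neither is stated in the notes. The function names and the toy record are made up for illustration.

```python
def extract_protein_sequence(uniprot_record: str) -> str:
    """Return the residues found between the SQ line and the // terminator."""
    residues = []
    in_sequence = False
    for line in uniprot_record.splitlines():
        if line.startswith("SQ "):
            in_sequence = True
            continue
        if line.startswith("//"):
            break
        if in_sequence:
            # Sequence lines are split into blocks by spaces; rejoin them.
            residues.append("".join(line.split()))
    if not residues:
        raise ValueError("no sequence section found in record")
    return "".join(residues)


def to_fasta(identifier: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as FASTA text, wrapped at `width` characters."""
    body = "\n".join(
        sequence[i:i + width] for i in range(0, len(sequence), width)
    )
    return f">{identifier}\n{body}"


# Toy record with a made-up 12-residue sequence; not a real UniProt entry.
TOY_RECORD = "\n".join([
    "ID   TOY_EXAMPLE             Reviewed;          12 AA.",
    "SQ   SEQUENCE   12 AA;  0 MW;  0000000000000000 CRC64;",
    "     MKTAYIAKQR QI",
    "//",
])

print(to_fasta("TOY_EXAMPLE", extract_protein_sequence(TOY_RECORD)))
```

The shim carries no biological logic of its own. It only converts the concrete type produced by one service into the input expected by the next.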
