Line 302 was replaced by lines 302-331 |
- * |
+ * Williams workflow B ... |
+ ** large amounts of data (or datatypes) |
+ ** data is implicitly linked within itself |
+ ** data is implicitly linked outside of itself |
+ ** genomic sequence is the central co-ordinating point, but there are a number of different co-ordinate systems |
+ ** some "biological", some artifacts of the workflow |
+ * what's the problem |
+ ** we don't have a domain model |
+ ** we need a model for visualization |
+ ** but, domain models are hard |
+ ** it's not clear that the domain model should be in the middleware |
+ * what have we done!? |
+ ** bioinformatics pm (pre myGrid) |
+ ** one big distributed data heterogeneity and integration problem |
+ ** still a big distributed data heterogeneity and integration problem |
+ * how do we solve the problem |
+ ** take the data, use something (Perl or an MSc student) to map the data into a (partial) data model |
+ ** visualize this ... |
+ ** but what if the workflow changes? |
+ * second solution |
+ ** large quantities of data are already available with rich mark up in a visualizable form |
+ ** this is unparsable, so also get the flat-file representation |
+ ** start to build visualization information into the workflow using beanshell |
+ ** linked data from output -- domain model = scripts that hack these things together |
+ * summary |
+ ** domain models are hard |
+ ** workflows can obfuscate the model |
+ ** visualization requires one |
+ ** we can build some knowledge of a domain model into the workflow and steal the rest |
+ ** is there a better way? |
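As a rough illustration of the "map the data into a (partial) data model" step in the notes above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Feature` class, the tab-separated record layout, and the `region_offset` parameter are hypothetical and are not taken from the Williams workflow or from myGrid. It shows one way that flat-file output in a workflow-local co-ordinate system might be shifted onto the genomic sequence, the central co-ordinating point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated region in a (partial, hypothetical) domain model."""
    name: str
    start: int   # 1-based, inclusive, in genomic co-ordinates
    end: int     # 1-based, inclusive, in genomic co-ordinates
    source: str  # which workflow step produced the record


def to_genomic(local_start, local_end, region_offset):
    """Shift 1-based co-ordinates local to an extracted region onto the genome.

    region_offset is the 1-based genomic position of the region's first base
    (an assumed convention; real tools differ on 0- vs 1-based positions).
    """
    shift = region_offset - 1
    return local_start + shift, local_end + shift


def parse_flat_record(line, region_offset, source):
    """Parse an assumed tab-separated record: name, local start, local end."""
    name, start, end = line.rstrip("\n").split("\t")
    g_start, g_end = to_genomic(int(start), int(end), region_offset)
    return Feature(name, g_start, g_end, source)


if __name__ == "__main__":
    # Hypothetical record from a step that analysed a region starting at base 1001.
    print(parse_flat_record("geneA\t5\t20\n", region_offset=1001, source="step1"))
```

The fragility the notes point out is visible here: the record layout and the offset convention are hard-coded, so a change to the workflow silently invalidates the mapping.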