| 
      
         
      
      
         
 
 
 
 The problems: 
 Versioning 
 Conflicts in Dependency Chains (underlying classes and their dependencies)
 instead of class names, use lsids
  no way to uniformly manage objects (workflows, data, libraries/dependencies, etc.)
 no way to "seamlessly" transport objects
 both publish and download
  no way to manage local versus remote objects
 lsids plus object cache ("intelligent" object cache)
 dynamically load classes into current Java Classloader
 no central repository(ies)
 periodic indexing of HDD in background
  no good way to create functional groups
 e.g., the core group, plus packages for geospatial, rexpression, etc.
 a lighter weight kepler
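
 The "lsids plus object cache" idea in the list above could be sketched roughly as follows. This is only an illustration under assumptions: the names ObjectCache, putLocal, get and isLocal are hypothetical (not existing Kepler APIs), the remote repository is stood in for by a plain lookup function, and LSIDs are treated as strings of the form urn:lsid:authority:namespace:object[:revision].

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical LSID-keyed cache: check local objects first, then ask a remote repository. */
class ObjectCache {
    private final Map<String, byte[]> local = new HashMap<>();
    // Assumption: a remote repository is modelled as a simple lookup function.
    private final Function<String, Optional<byte[]>> remote;

    ObjectCache(Function<String, Optional<byte[]>> remote) {
        this.remote = remote;
    }

    /** Shape check only: urn:lsid:authority:namespace:object[:revision]. */
    static boolean isLsid(String id) {
        String[] parts = id.split(":", -1);
        if (parts.length < 5 || parts.length > 6) return false;
        if (!parts[0].equalsIgnoreCase("urn") || !parts[1].equalsIgnoreCase("lsid")) return false;
        for (String p : parts) {
            if (p.isEmpty()) return false;
        }
        return true;
    }

    /** Registers an object that is already on the local disk. */
    void putLocal(String lsid, byte[] content) {
        if (!isLsid(lsid)) throw new IllegalArgumentException("not an LSID: " + lsid);
        local.put(lsid, content);
    }

    /** Returns the local copy if present; otherwise fetches remotely and caches the result. */
    Optional<byte[]> get(String lsid) {
        if (!isLsid(lsid)) throw new IllegalArgumentException("not an LSID: " + lsid);
        byte[] hit = local.get(lsid);
        if (hit != null) return Optional.of(hit);
        Optional<byte[]> fetched = remote.apply(lsid);
        fetched.ifPresent(bytes -> local.put(lsid, bytes));
        return fetched;
    }

    boolean isLocal(String lsid) {
        return local.containsKey(lsid);
    }
}
```

 Because the revision is part of the identifier, two versions of the same actor or library get different keys and can sit side by side in the cache, which is the point of using lsids instead of class names.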
 
 
 
 Topics
 Actor browsing/searching
 Data browsing/searching
 Unify them
 Helping understand why a result was included
 
 
 Ideas
 go with the tree approach for actors and data
 introduce a "shopping cart" 
 experiment (via mockups) with different approaches -- laura et al
 advanced search function
 separate window for searching / results
 
 
 
 For actors and datasets, we are planning on the following
 hijack the port configuration dialog by adding a new column called "semantic type"
 keep the unit column
 add a (...) button for the semtype column, which will open up a new annotation dialog
 the new dialog will open a list of the ontology "variables", which will let one select a "class" and then show the direct relationships of that class, e.g., subclasses, superclasses, and properties
 
 
 
 User interface mockup: What should the interface look like for validating connections?
 Every time a connection is made
 Need six (?) types of connections: datatype valid, invalid, unknown; and semtype valid, invalid, unknown. 
 Laura will look at the various ways to do this, and try to find some good approaches
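
 One way to read the six connection types above is as two independent three-valued checks per connection (three indicator values for datatype, three for semtype, so nine combinations in total). A minimal sketch of that model follows; the names Validity and ConnectionStatus are hypothetical, and the "worst check wins" summary rule is an assumption for illustration, not something decided in the meeting.

```java
import java.util.Locale;

/** The three outcomes listed in the notes for each kind of check. */
enum Validity { VALID, INVALID, UNKNOWN }

/** Hypothetical holder for the two checks made on one connection. */
final class ConnectionStatus {
    final Validity datatype;
    final Validity semtype;

    ConnectionStatus(Validity datatype, Validity semtype) {
        this.datatype = datatype;
        this.semtype = semtype;
    }

    /** Assumption: summarize by the worst check, INVALID over UNKNOWN over VALID. */
    Validity overall() {
        if (datatype == Validity.INVALID || semtype == Validity.INVALID) return Validity.INVALID;
        if (datatype == Validity.UNKNOWN || semtype == Validity.UNKNOWN) return Validity.UNKNOWN;
        return Validity.VALID;
    }

    /** Text a UI could show next to the relation, e.g. in a tooltip. */
    String describe() {
        return "datatype " + datatype.name().toLowerCase(Locale.ROOT)
                + ", semtype " + semtype.name().toLowerCase(Locale.ROOT);
    }
}
```

 Keeping the two checks separate leaves the mockups free to show them as two indicators or as one combined indicator.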
 
 
 SMS backend services
 when a relation is connected to an input and output port, a change event will be caught, and an sms method will be called, passing in the output actor-port pair, the input actor-port pair, and the workflow itself. 
 the sms method will reply with one of three states: valid, invalid, or unknown, which the event handler will use to code the relation (to show these cases). Also, a textual description of why the match is invalid may be useful ... but we may get to this more in the next step, which is helping a user "fix" the connection
 it would be useful to do the same thing with datatypes
 
   public SemtypeValidityCheckResult isSemtypeCompatibleConnection(LSID workflowId, Connection connection); // connection within wf
 
 one issue is that if we consider the use of constraints (like in units), then one connection may affect other connections (later down in the workflow), in which case, we would need to extend the return type.  Also, this is the reason for including the workflow in the call. The Connection object has four components: OutputActorId, OutputPortName, InputActorId, InputPortName. 
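
 To make the signature above concrete, here is a minimal sketch of the types it implies. Only the method name, its two parameters, the three states, and the four Connection components come from the notes. Everything else is an assumption for illustration: the LSID class is a string wrapper, SemtypeState, SemtypeChecker and ExactMatchChecker are hypothetical names, and the toy checker compares annotation names for equality where a real implementation would reason over the ontology.

```java
import java.util.HashMap;
import java.util.Map;

/** Placeholder for the LSID type in the signature; here just a wrapped string. */
final class LSID {
    final String value;
    LSID(String value) { this.value = value; }
}

/** The four components listed in the notes. */
final class Connection {
    final String outputActorId, outputPortName, inputActorId, inputPortName;
    Connection(String outputActorId, String outputPortName,
               String inputActorId, String inputPortName) {
        this.outputActorId = outputActorId;
        this.outputPortName = outputPortName;
        this.inputActorId = inputActorId;
        this.inputPortName = inputPortName;
    }
}

enum SemtypeState { VALID, INVALID, UNKNOWN }

final class SemtypeValidityCheckResult {
    final SemtypeState state;
    final String explanation; // optional text, e.g. why a match is invalid
    SemtypeValidityCheckResult(SemtypeState state, String explanation) {
        this.state = state;
        this.explanation = explanation;
    }
}

interface SemtypeChecker {
    SemtypeValidityCheckResult isSemtypeCompatibleConnection(LSID workflowId, Connection connection);
}

/** Toy checker (assumption): each port has one class name; equal names are compatible. */
final class ExactMatchChecker implements SemtypeChecker {
    private final Map<String, String> semtypeByPort = new HashMap<>();

    void annotate(String actorId, String portName, String semtype) {
        semtypeByPort.put(actorId + "#" + portName, semtype);
    }

    @Override
    public SemtypeValidityCheckResult isSemtypeCompatibleConnection(LSID workflowId, Connection c) {
        // workflowId is unused here; it would matter once constraints propagate downstream.
        String out = semtypeByPort.get(c.outputActorId + "#" + c.outputPortName);
        String in = semtypeByPort.get(c.inputActorId + "#" + c.inputPortName);
        if (out == null || in == null) {
            return new SemtypeValidityCheckResult(SemtypeState.UNKNOWN, "missing annotation");
        }
        if (out.equals(in)) {
            return new SemtypeValidityCheckResult(SemtypeState.VALID, "");
        }
        return new SemtypeValidityCheckResult(SemtypeState.INVALID,
                out + " is not compatible with " + in);
    }
}
```

 If constraint propagation is added later, the result type is the natural place to extend, for example by carrying a list of other connections whose state changed.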
 
 
 
 
 Mark presented slides (ref should go here) going over:
 Gene Ontology
 Sem. web. comm. references (CO-ODE group); common errors, practical difficulties, rules of thumb.
 terminological overload: stick to class/property/individual/restriction
 restrictions: 
 primitive/complete; partial/defined; necessary/necessary and sufficient
 existential (some) vs universal (only) quantifiers
 disjointness/overlap; exclusive vs. exhaustive
 negation not thru absence, but contradiction ("all" quantifier is trivially satisfied if there is "no value" for the property in question)
 natural language "paraphrasing" to assist
  Rich's ontologies in terms of protege, and growl
 kr's current direction: keep ontologies simple, intuitive with clear goals
 implement deana's biodiv and productivity concept maps in OWL
 test capability of Units.OWL to do simple units conversion (X g -> .00X kg) in Kepler
 we are going to adopt Rich's approach for units which is to mimic the uml unit dictionary model
 sms will provide an operation that takes two annotations (i.e., actor metadata), and returns a properly configured expression actor that can perform the necessary unit conversion (an adapter)
 there are some special cases for "unit reasoning", in particular, when it is appropriate to do conversion (matt will find Rich's example) and precision
  develop simple actor ontology for parameterizing garp and {gam/nn?}
 simple data integration use cases (measurement, experimental design)
 need a mechanism to "integrate" merge operations into search, either as part of the search, or else being able to identify results from a search and then tell kepler those are the ones to be merged
 the "merge set" should automatically be put, e.g., into a composite actor, connected to the "merge" actors
  kr postdoc successor for rich williams
  ferdinando: keep it simple, but not simpler
 put "google" into GrOWL, so one can search annotation properties, concept names, etc., 
  shawn/matt: need a CAD summarization view for users, and for giving default "views" to help summarize, simplify ontologies 
  More issues:
 deana and mark need to become competent at one of protege, growl, or swoop
 need user guide to ontologies for ecologists
 growl interface suggestions (incorporate into kepler), for exploration and annotation selection for domain scientists
 biodiversity wf-simple data integration, simple dataset "transpose", r-scripts
 community wiki to discuss ontologies
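
 The unit conversion adapter mentioned above (X g -> .00X kg) could work along the following lines. This is a minimal sketch under assumptions: UnitDictionary, define and adapter are hypothetical names, the dictionary is a plain in-memory map and not Units.OWL, and the adapter is returned as a Java function where the notes call for a configured expression actor. Only purely multiplicative conversions are covered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical unit dictionary: each unit names its base unit and a multiplier to it. */
final class UnitDictionary {
    private static final class Unit {
        final String base;
        final double toBase;
        Unit(String base, double toBase) {
            this.base = base;
            this.toBase = toBase;
        }
    }

    private final Map<String, Unit> units = new HashMap<>();

    /** Example: define("g", "kg", 0.001) says 1 g = 0.001 kg. */
    void define(String name, String baseUnit, double multiplierToBase) {
        units.put(name, new Unit(baseUnit, multiplierToBase));
    }

    /**
     * Returns a converter from one unit to another, or empty if either unit is
     * unknown or the two units do not share a base unit (e.g. grams to metres).
     */
    Optional<DoubleUnaryOperator> adapter(String from, String to) {
        Unit f = units.get(from);
        Unit t = units.get(to);
        if (f == null || t == null || !f.base.equals(t.base)) {
            return Optional.empty();
        }
        double factor = f.toBase / t.toBase;
        return Optional.of(x -> x * factor);
    }
}
```

 Returning an empty result when no conversion applies leaves the caller room to report the connection as invalid or unknown. Questions of precision, and of when conversion is appropriate at all, are not addressed by this sketch.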
 
 
 
 Jenny presented slides (ref here) on SCIA:
 the schema mapping problem
 schema matching (inter-schema correspondences) versus schema mapping (view generation)
 schemas: dtd, xml schema, relational schema, OWL-full
 automatic matching impossible
  requirements for schema mapping
 correctness depends on application
 needs to be quick
 semantics easier with interactive approaches
  approach:
 scia, semi-automatic approach
 critical points; users do critical points (hardest ones); system does the rest (simplest ones); iterate until satisfied
 ...
  critical points
 occurs when a core context has either no good matches, or else has more than one good one-to-one match
 core contexts are most important contextualizing elements for tags within subtrees
 correctness of core context matches greatly affects correctness of matches for all nodes under them
  examples
 (n:1) order contact might match contact for billing, shipping and/or supplier
 (n:1) concat of first name and last name; species count divided by area for density
 if bib/book/author is found as best match candidate to arts/article/author
  incorporate similarity flooding, but make small change
 use cases
 bibliography use case
 Note: the examples use a thesaurus, which derives name matches, e.g., site -> station; the name matches are weighted to provide 2/10 of the overall matching accuracy calculation
 ...
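
 The critical point rule described above (a core context with no good match, or with more than one good one-to-one match) can be sketched as a simple test over candidate scores. This is an illustration and not SCIA's actual code: the class name CriticalPoints, the score map, and the threshold that defines a "good" match are all assumptions.

```java
import java.util.Map;

/** Hypothetical helper deciding whether a core context needs user input. */
final class CriticalPoints {
    private CriticalPoints() {}

    /**
     * @param candidateScores similarity score per candidate target element
     * @param threshold       assumed cut-off at or above which a match counts as "good"
     * @return true if there is no good match or more than one, i.e. the user should decide
     */
    static boolean isCriticalPoint(Map<String, Double> candidateScores, double threshold) {
        long good = candidateScores.values().stream()
                .filter(score -> score >= threshold)
                .count();
        return good != 1;
    }
}
```

 With exactly one good candidate the system can proceed on its own; in the other two cases the semi-automatic loop would hand the decision to the user and then iterate.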
 
 
 
 
 
 
SMS backend services
User interface mockup
Ontology needs assessment
Tasks and milestones
 
 
 |