+ Present: |
+ * Jim Beach |
+ * Jessie Kennedy |
+ * Bob Peet |
+ * Martin Pullan |
+ * Susan Gauch |
+ * Dave Thau |
+ * Nico Franz |
+ * 2 guys from Paris |
+ * Aimee Stewart |
+ |
+ Agenda |
+ # potential seek role in TDWG concept standard activities |
+ a. Nico - couple of slides relating to TDWG concepts |
+ # Dave Thau finish SMS/Taxon possibilities - ontologies |
+ # update from Susan and Aimee and concept matching |
+ # Bob - use cases working towards overall architecture |
+ # Plan 2-3 month activities, Jan mtg |
+ Ask Matt - week of Jan 19 |
+ Bob - want use cases (maybe a subset stay longer and work on) |
+ # Things that Nico and Liu(?) will be doing |
+ # Bob - what are our goals? need a blueprint |
+ ** How does this help an ecologist looking around? |
+ ** what does a dataset markup look like |
+ ** We need tools - prompt user to markup data nicely |
+ ** registry for datasources? |
+ ** how do we deal with old data |
+ ** many other projects have pieces that would work for us |
+ ** would like to talk through use case - fringed file fish |
+ ** do we do the markup for approval by ecologist, supplement |
+ |
+ Our potential role |
+ take the lead in standards development |
+ would look for additional money - 1 year time frame, |
+ in parallel with rest of our tasks |
+ Walter, GBIF, Frank Bisby are all interested |
+ Bob - concept based system that many institutions could use. institution that would serve and use concepts. Standards alone are necessary but not enough. We need something to test in one year |
+ |
+ Jim - we need to be application driven. But prototype tests to interact with other systems Vegbank, prometheus, etc. Needs to be funded |
+ |
+ Jessie - must show that it works, not just a seek model |
+ |
+ Martin - risk management - what is the risk to seek goals if we get involved with TDWG standards. |
+ |
+ Jim |
+ # money - maybe allocate 20%, but most external funded |
+ # can we rationalize this within seek goals |
+ |
+ Bob |
+ would get ecologists to markup right |
+ |
+ Martin |
+ GBIF has different immediate goals that are not necessarily the same as SEEK's. |
+ |
+ Jessie - do it our way in a broader context, then hope others like it. Can't guarantee that others will buy in. We must do it anyway |
+ |
+ Jim - others may drop out, what is our risk in spending time and money? Could be major disagreements between different perspectives |
+ |
+ Susan |
+ can make work together with wrappers even if somewhat different |
+ |
+ Jim - need all of: |
+ schema, representation methods, documentation of schema, transfer protocol. To prove our stuff, others will want software; we are committing to having stuff for others. Formalizing the schema is the only additional work. We needed to do this anyway |
+ |
+ Jessie - we need our own schema which is subset of federation schema. We will need to be able to anticipate others' needs for interfaces - no it's others' problem |
+ |
+ Bob - must ask for more money to interface with others |
+ |
+ Jessie - extra work - must really study/communicate with other systems to collaborate |
+ |
+ Jim - MOU with GBIF to say how far we will go - whoever is materially involved. Everyone should understand, umbrella of TDWG, but they have no material investment. understanding with TDWG, not vested interests of it. They must have buy-in. Self-nomination for standards - if no stake in it, then just noise, must have money, people, something involved. |
+ |
+ Bob - supplemental request 20% of SEEK taxon money, need to submit to stinger in 30 days (over 100K). Walter |
+ |
+ who is involved in management, if we get financial/in-kind investments, what is our commitment to them |
+ |
+ Susan - we are to develop union of existing schema and how it maps to existing datasets |
+ |
+ Bob - need buy in from datasources |
+ |
+ Susan - no just need a portal |
+ |
+ Jessie - we will contribute % of existing work to broader effort - what will others contribute - only commit year by year. Need $ for travel, indiv. meetings |
+ |
+ Jim - must meet with Frank, Walter, GBIF - we are prepared to do it, see if they will contribute materially |
+ |
+ Jessie - meet with people with models, try to extend our model to workable schema, produce something, discuss proposal with everyone, meet with everyone |
+ Need money for Jessie and one hire to come with, learn on job, write up. Discuss in Edinburgh. |
+ |
+ Susan - in year 2 or 3, money for wrapper, generic tool for dataset merging, |
+ Thau: |
+ |
+ SMS, OWL, Taxon |
+ \\mock up fringed file fish, Jessie's schema, merge with OWL |
+ \\Protege - ontology editing app |
+ \\ITIS - Species2000 |
+ \\OWL built on rdf built on xml - enhanced xml |
+ \\OWL specifies to merge |
+ \\any OWL reader will do these functions |
+ \\can query merged ontology |
+ \\instance based approach - if instances are defined, reasoning tools can merge |
+ \\GBIF - missed the point that reasoning about instances is the big missing link |
+ \\description _____ - extension of predicate logic |
+ \\Genna reasons over OWL files |
+ \\not great at probabilistic reasoning |
+ \\Susan - need to be able to put in complex rules - criteria for merging on the fly |
+ \\what is the performance with 100k nodes - does it die? |
+ \\should things be represented as subclasses vs instances |
+ \\genus is a union of these species |
+ \\taxon is resource with properties, property/values are instances |
+ \\OWL - built for people who are building ontologies and want to view them as they work. Just now, starting to look at what you can do with owl after you have ontology in OWL. Overlay rules, etc. Just now accepted as a standard. could spit stuff out into OWL. |
+ |
+ Context - why does OWL make sense in Taxon and other working groups. SMS group is using OWL and these tools, (Genna - Geon too) This is the language of the semantic web |
+ |
+ Jessie - nobody else is doing things in OWL at this scale. Taxonomists do not want computers to infer anything. Internal representations of our classes in OWL would be death. |
+ |
+ Susan - could dump out small pieces into owl, then play and reason with them |
+ |
+ Jim - would inferences be more important with other stuff? |
+ |
+ Nico - could represent with indentation - nobody looks at 1000s at a time, so not a problem |
+ |
+ Dave - when someone registers datasets and enters tax info, with SMS finding datasources, could use the language to constrain the way that ecologists enter info. Language for computers to talk to each other |
+ |
+ Jessie - OWL/ ontologies to represent other data (did I get this point?) don't know how we can represent everything in OWL |
+ |
+ Susan - could spit out small piece in owl then SMS can do other work to compare pieces |
+ |
+ How do you compare models to each other - how do you represent the models |
+ |
+ OWL is good: |
+ standard for describing ontologies |
+ * concept providers can share tax info stdly |
+ * has std tools to view, query, add to |
+ * mechanisms to combine, check, and reason over taxonomies |
+ * SMS can use the repr to |
+ ** register datasets |
+ ** plug tax info into their work flows |
+ Data providers can easily control access to data - data is updated dynamically |
+ * to publish - put a text file on web server |
+ * tax compilations point to their sources |
+ * can archive by controlling file naming |
+ * access to data is logged via the web |
+ |
+ OWL is bad: |
+ * tags are too constraining |
+ * no need to move beyond xml |
+ * combining / reasoning, etc ...? |
+ |
+ Next: |
+ \\try doing something like GEON in tax domain? |
+ * register data to one tax (ITIS) |
+ * map itis to another tax |
+ * see if it scales |
+ and more ? |
+ |
+ Other possibilities: |
+ * work out with KR/SMS group needs from taxon (unique ids, defined ops) |
+ * focus on ontol repr of schema and compare to xml |
+ * ?? |
+ |
+ Dave- Useful aspects of OWL/ontologies that you can't get from XML schema? |
+ |
+ Jessie - taxonomic classification =? ontology (or is it a very narrow subset). If we allow others to build sub-ontologies, OWL might be a good tool to create them, then store in another way. Other - need to reason between other hierarchies and tie in with SMS |
+ |
+ could just give SMS unique id's, use RDF |
+ |
+ RDF - subject --> property --> value, standard format for unique id, resource |
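A minimal sketch of the subject --> property --> value idea, using plain Python tuples rather than any RDF library. The identifiers (`urn:example:...`), the property names, and the placeholder taxon name are all made up for illustration; nothing here reflects an agreed SEEK or SMS format.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject --> property --> value."""
    subject: str  # unique id of the resource (hypothetical URN below)
    prop: str     # property name (hypothetical vocabulary)
    value: str    # literal value or the id of another resource


# Hypothetical statements about one taxon concept.
triples = [
    Triple("urn:example:concept:1", "hasName", "Aus bus"),
    Triple("urn:example:concept:1", "accordingTo", "urn:example:publication:7"),
    Triple("urn:example:concept:1", "hasRank", "species"),
]


def values_of(statements: List[Triple], subject: str, prop: str) -> List[str]:
    """Return every value recorded for (subject, prop)."""
    return [t.value for t in statements if t.subject == subject and t.prop == prop]


print(values_of(triples, "urn:example:concept:1", "hasName"))
```

The point of the sketch is only that a unique id plus property/value pairs is enough for another group to refer to a concept without knowing the internal schema.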
+ |
+ G - ontology is not a distribution mechanism, |
+ |
+ Dave - yes, just makes it easy/convenient |
+ RDF schema - fundamental to semantic web |
+ |
+ Nico: |
+ TODO: GET SLIDES FROM NICO |
+ |
+ # Need clarity of vision, conceptual unity, hard problems ahead of us |
+ # In systematics, 60s-70s: difference in conceptual outlook on what systematics is, related, phylogenic vs cladistic. Cladistics won; it closely followed what happens in the real world. |
+ |
+ Need to work on YET another model |
+ |
+ Aim - 3 dim past "Genospace" / "Phenospace" soup |
+ --> Evolution --> present non-homogeneous distribution of clusters |
+ |
+ The (typically imperfect) stability of clusters of properties in space and time (kinds) is a precondition for taxonomic names. |
+ |
+ Discover properties of unique names |
+ |
+ aim of systematics is to discover and name Kinds |
+ |
+ The continuity of reference in the absence of an exact relation between predicate (name) and referent (species) is possible because of a causal chain among the communicators (taxonomists) that stretches from the initial naming event to the present. |
+ |
+ Must model frequency or stability of use. |
+ |
+ Susan - would be good to see differences between versions of taxonomies |
+ |
+ Martin - revisions can take small portions based on groups one is familiar with - there are ways to decide what makes up the subset of revision, could be based on character data |
+ |
+ Bob - find something as example of best practice |
+ |
+ Martin - cannot do cladistics in a vacuum, no way to tell whether your patent is better/worse than another, just different |
+ |
+ Jessie - Nico should come up with what is wrong with our model |
+ Susan - give us an alternate proposal |
+ |
+ Jessie - keep in mind the difference between individual and species. Species is an idea. This is an individual taxonomist's view of what they are trying to do. Not taxonomists as a whole. This is what the individual thinks he has achieved. |
+ |
+ Jim - How does this fit in with Tree of life , cladistic model, intermediate nodes without names, could just have diagnoses, characteristics |
+ |
+ Martin - only need to name terminal nodes, internal nodes (clades?) only need internal reference. cladogram is not a formal classification with??? |
+ |
+ Jim - Most phylogeneticists are not doing this post-classification. Can our system handle this in 10 years when ??? |
+ |
+ Martin - Concepts are in terminal taxa (cladistic classification) |
+ |
+ Susan - store the links between the nodes not a problem ??? can handle them as long as there is a link to something that we recognize |
+ |
+ Jim - most systematists do not go back and name all the leaf nodes |
+ |
+ Nico - ??? clarification of our mission |
+ |
+ Jim - formal classification is fading away - breaking down on the edges, we decide whether we can deal with it with our model |
+ |
+ --------------------- |
+ |
+ Jessie - get info from ecologist what field guide is used |
+ |
+ Susan - get a 2nd hierarchy, then |
+ |
+ Implications of new taxonomic schema |
+ |
+ Must get a data source with real concepts to fill new schema - better test |
+ * Nico can do full data for weevils for test case |
+ * Walter - German mosses as test |
+ * Bob can put together some plant data w/in 2 months |
+ * New data |
+ ** tools to import programmatically |
+ ** simple user interface form to put data in our system for testing |
+ |
+ datasets - Birds of Mexico |
+ does Town have bird concepts too? |
+ |
+ Bob will export his data in XML |
+ Jessie will give us input on mapping of fields |
+ Dave will write XSLT translator to go to our latest schema |
+ |
+ January meeting: |
+ |
+ Jim- |
+ How can we interact more effectively between meetings |
+ Use cases - haven't done that |
+ |
+ Susan - |
+ Have a mission for each meeting |
+ \\ What's the goal |
+ * assign to a person |
+ * have a plan unless they are stopped |
+ |
+ Maybe we should look at a tool that takes a dataset with taxa names and produces EML (taxonomic coverage only) |
+ |
+ Jessie - Do we ever record just genus or family? If so, we should never expand it (in the EML) |
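A rough sketch of what such a tool could look like, assuming the dataset's taxa have already been pulled out as (rank, name) pairs. The function name `taxonomic_coverage` and the input shape are hypothetical; the element names follow EML's `taxonomicCoverage` / `taxonomicClassification` / `taxonRankName` / `taxonRankValue` structure. Following Jessie's point, a record that only gives a genus or family is written out as-is and never expanded to lower ranks.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def taxonomic_coverage(taxa: Iterable[Tuple[str, str]]) -> ET.Element:
    """Build an EML-style taxonomicCoverage element from (rank, name) pairs.

    Each pair is emitted exactly as recorded: a genus-only record stays a
    genus and is not expanded to its species. Exact duplicates are skipped.
    """
    coverage = ET.Element("taxonomicCoverage")
    seen = set()
    for rank, name in taxa:
        key = (rank.strip(), name.strip())
        if key in seen:
            continue
        seen.add(key)
        classification = ET.SubElement(coverage, "taxonomicClassification")
        ET.SubElement(classification, "taxonRankName").text = key[0]
        ET.SubElement(classification, "taxonRankValue").text = key[1]
    return coverage


# Illustrative input only; not taken from any real dataset.
records = [("Genus", "Quercus"), ("Species", "Quercus alba"), ("Genus", "Quercus")]
print(ET.tostring(taxonomic_coverage(records), encoding="unicode"))
```

This writes a flat list of classifications; nesting species under their genus would be a later decision once the concept schema is settled.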
+ |
+ Susan - need great data on German mosses and datasets with great EML detailing tax coverage of German mosses, good |
+ |
+ If we have another taxonomy - it is fine if we have |
+ |
+ Use case - |
+ ITIS has full tree of weak concepts |
+ have another tree of full concepts, small tree |
+ tools for importing conceptual data |
+ map concepts between trees |
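As a first pass at the last step, the sketch below proposes candidate links between two trees by exact (normalized) name. This is only an assumption about how mapping might start: a shared name does not mean a shared concept, so the output is a list of candidates for an expert to confirm, not a concept match. The tree representation (a dict from concept id to name) and all ids and names are hypothetical.

```python
from typing import Dict, List, Tuple


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def candidate_matches(
    tree_a: Dict[str, str], tree_b: Dict[str, str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Pair concept ids from tree_a with ids in tree_b that carry the same name.

    Returns (candidate pairs, ids from tree_a with no name match).
    """
    by_name: Dict[str, List[str]] = {}
    for concept_id, name in tree_b.items():
        by_name.setdefault(normalize(name), []).append(concept_id)

    matches: List[Tuple[str, str]] = []
    unmatched: List[str] = []
    for concept_id, name in tree_a.items():
        hits = by_name.get(normalize(name), [])
        if hits:
            matches.extend((concept_id, hit) for hit in hits)
        else:
            unmatched.append(concept_id)
    return matches, unmatched


# Placeholder trees: a large weak-concept tree and a small full-concept tree.
big_tree = {"a1": "Aus bus", "a2": "Aus cus"}
small_tree = {"b1": "aus  bus"}
print(candidate_matches(big_tree, small_tree))
```

The unmatched list is where the expansion, weighting, and business rules from the experts would come in.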
+ |
+ Dave tasks: |
+ look at use cases? |
+ * what kind of questions will be asked by users |
+ * types of matches to be done |
+ * what matches are experts interested in |
+ * how are they represented |
+ * expansion, weighting |
+ * business rules from Bob and Nico |
+ * semantic tools can't take probabilistic stuff into account |
+ * talk to Joanna about problems, weaknesses strengths |
+ * instances vs subclasses |
+ * be able to export our stuff to OWL |
+ |
+ Jessie - will work on visualization tools to do the mapping between trees (for someone who knows the data) |
+ |
+ Nico - Treemap - maps topologies - hypothetical extensions, doubling of trees, minimizes steps between trees |
+ |
+ We should all email more often |
+ |
+ Commitment for others within management process (decision making): |
+ \\virtual project parallel to SEEK taxon |
+ \\$$$ for a |
+ \\ half-time person to follow jessie, learn and write |
+ \\ travel money |
+ \\jessie visit all the places |
+ \\then create a proposal and |
+ \\get a meeting to get consensus |