Science Environment for Ecological Knowledge

All Hands Meeting 2005 Taxon Agenda And Notes

Difference between version 37 and version 4:

At line 0 added 2 lines.
+ !! Monday Afternoon
+ * Jessie Kennedy presented the SEEK Taxon update and plans (the PowerPoint will be put in CVS as part of the plenary session)
At line 1 added 2 lines.
+ !! Tuesday Morning
+ * Plenary Session
Line 3 was replaced by line 7
- !!! Tuesday Afternoon
+ !! Tuesday Afternoon
Line 5 was replaced by lines 9-15
- !!! Wednesday Morning
+ * Laura Downey, Going Forward Presentation (see attachment)
+ ** Reviewed Group Plans and RoadMap from Winter 2005 EstesPark meeting
+ * Xianhua Liu, Concept Mapping and authoring tool: demo and discussion
+ ** Laura identified usability issues of interest and will follow-up
+ ** Discussion of next steps, including generation of TCS and object model considerations
+ * Martin Graham, Collections visualization tool: review of prototype
+ ** Limited discussion of next steps, first impressions
Removed line 7
- * Kepler-Taxon work flow discussion including Higgins, Pennington, integration of DiGIR and Kepler in use case work flow, responsibilities, where is the user interaction? TOS Actor-how would it be used in workflows? What are the usage scenarios for TOS within Kepler? How else might the TOS be embedded in Kepler?
Line 9 was replaced by line 18
- ** High-level Taxon-Kepler Interaction Design and Engineering Issues with Dan Higgins
+ !! Wednesday Morning
Line 11 was replaced by line 20
- *** The 'selection problem' How do users *find* and then *select* the focal concept for searching, parameterizing an actor, with the concept one wants to work with. 1st generation, little interactivity, fill in a Kepler parameter file.
+ Taxon-Kepler Interaction Design and Engineering Discussion with Dan Higgins
Line 13 was replaced by lines 22-25
- Wrap Robs current work flow into a custom actor, interactivity would come later, and maybe then passing parameter files to the actor from an application outside of GARP.
+ * Integration of DiGIR and Kepler in two SEEK use case work flows
+ * Responsibilities of TOS versus those of Kepler
+ * Need for user interaction
+ * Usage scenarios for TOS within Kepler
At line 14 added 1 line.
+ The 'selection problem': How do users *find* and then *select* the focal concept they want to work with, for searching and for parameterizing an actor? 1st generation: no interactivity beyond filling in an Actor parameter file. Rob has wrapped the current work flow into a custom actor; interactivity would come later, and passing parameter files to the actor from an application outside of Kepler might then be possible.
Line 16 was replaced by line 29
- ***** Problem with names being mapped to more than one concept for GARP work flow
+ Problem with names coming out of the TOS query being mapped to more than one of the concepts specified for the DiGIR query for GARP input.
Line 18 was replaced by line 31
- ***** Merging data sets with concepts that overlap, how do we integrate them once they come back
+ Looping issue in GARP usecase workflow: going through list of concepts to find synonyms to identify overlaps (names with 2 or more concepts), Gales & Jones, Action: make a custom TOS actor instead of modifying web services actor (which came from GEON)
Line 20 was replaced by line 33
- ***** Looping issue in GARP usecase workflow: going through list of concepts to find synonyms to identify overlaps (names with 2 or more concepts), Gales & Jones, Action: make a custom TOS actor instead of modifying web services actor (which came from GEON)
+ SEEK-Taxon would like to see more possibility for an interactive UI within Kepler for TOS query and selection tasks. Alternative is to stop, restart and repeat short workflows as the way to introduce 'interactivity' for the user to test steps before running the complete workflow.
Line 22 was replaced by line 35
- ***** SEEK-Taxon would like to see more possibility for an interactive UI within Kepler for TOS query and selection tasks. Alternative is to stop, restart and repeat short workflows as the 'interactivity'
+ SEEK Use Case #2 needs a taxon concept merging tool. SMS is working on merging other types of parameters across site data sets. (Get a list of species from LTER data sets, assume they are in EML, input to TOS, output unique list of merged names.)
At line 23 added 1 line.
+ * Must mark up taxon names as concepts with GUIDs in the EML data sets first, GUIDs would be the concept IDs. Need a tool to do that, Morpho is the likely app. If no GUID in the data set, a call to TOS could match on data set taxon name and a taxon reference.
At line 24 added 1 line.
+ * Kepler work flow scenario: Kepler has Actors that can actually merge the data, need some user interaction with TOS to decide which level of lumping the user wants, include synonyms, concept overlaps, go up a level, etc.
At line 25 added 4 lines.
+ ** if the GUIDs match between data sets, we merge
+ ** if the names from two data sets match (derived from GUIDs in the data set) in TOS, we combine them
+ ** if concepts match in TOS (other TOS operations)
+ ** (Incomplete, Jessie and Bob discussed)
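
A minimal sketch of the first two merge rules above, in Python. Everything here is an assumption for illustration: `TaxonEntry`, `match_rule`, and the `tos_names` dictionary (standing in for a TOS lookup from GUID to name) are hypothetical, not part of TOS, Kepler, or EML. The third rule is left open, as it is in the notes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TaxonEntry:
    """One taxon entry in a data set (hypothetical structure, not EML)."""
    name: str
    guid: Optional[str] = None  # concept GUID, if the data set was marked up


def match_rule(a: TaxonEntry, b: TaxonEntry,
               tos_names: Dict[str, str]) -> Optional[str]:
    """Return the rule under which a and b can be merged, or None.

    tos_names is a hypothetical stand-in for a TOS lookup: GUID -> name.
    """
    if a.guid is None or b.guid is None:
        return None  # no GUID: would need a name + reference call to TOS
    # Rule 1: the GUIDs match between data sets.
    if a.guid == b.guid:
        return "guid"
    # Rule 2: the names derived from the GUIDs match in TOS.
    name_a = tos_names.get(a.guid)
    name_b = tos_names.get(b.guid)
    if name_a is not None and name_a == name_b:
        return "name"
    # Rule 3 (concepts match via other TOS operations) is left open here,
    # as it is in the notes.
    return None
```

Returning the name of the rule rather than a boolean would let a user choose the level of lumping, for example accepting only GUID matches.
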
Removed line 27
- *** SEEK Use Case #2 need a taxon concept merging tool. SMS is working on merging other types of parameters across site data sets. (Get a list of species from LTER data sets, assume they are in EML, input to TOS [GetBestConcept], output unique list of merged names,
Line 29 was replaced by line 47
- **** Must mark up taxon names as concepts with GUIDs in the EML data sets first, GUIDs would be the concept IDs. Need a tool to do that, Morpho is the likely app. If no GUID in the data set, a call to TOS could match on data set taxon name and a taxon reference.
+ !! Wednesday Afternoon
Line 31 was replaced by lines 49-51
- **** Kepler work flow scenario: Kepler has Actors that can actually merge the data, need some user interaction with TOS to decide which level of lumping the user wants, include synonyms, concept overlaps, go up a level, etc.
+ * Extracting concepts from online and monographic sources, just mammals?, (Susan, Aravind)
+ ** Currently: 1600 PDF documents obtained, 100 are bat taxonomy papers, extracting data next
+ ** Desired: extract parent-child hierarchies, descriptions, synonyms,
At line 32 added 1 line.
+ * TOS Data acquisition roadblocks, process, role of software tools, getting data into TOS with an import tool for usability testers.
At line 33 added 1 line.
+ ** Currently: ITIS ("relational" concepts), Bats from MSW 2005 (MSW concepts from original pubs and synonyms without bib references), MSW 1993, FNA in TCS, German Mosses (concepts with concept maps),
Lines 35-37 were replaced by line 57
- ***** if the GUIDs match between data sets, we merge
- ***** if the names from two data sets match (derived from GUIDs in the data set) in TOS, we combine them
- ***** if concepts match in TOS (other TOS operations)
+ ** Requirements: TOS Actor could be tested with plant data,
At line 38 added 1 line.
+ ** Possible: Use Bob Peet's plant concept data (44,000 with relationships?) with PLANTS county level distribution data as a supplement to bat use case scenario, Ranunculus data set, 8 classifications, NA & Mexico.
At line 39 added 1 line.
+ ** Action: Stinger Guala to identify in 2 weeks the number of versions of PLANTS, send to KU to DiGIRize, Bob will send rich treatment of plant concepts for the US Southeast. Available now: Ranunculus data, it needs to be upgraded to latest TCS, Xianhua will do that in 1 week. Jan or Feb: All USDA Plants in version 4, mapped against all plants in FNA, and also all of the Alan Weakley collaboration, his version mapped against 8 different classifications of plants.
At line 40 added 1 line.
+ * Broader TDWG and community issues for TCS and TOS. What are their expectations? Our responsibilities? Can SEEK demonstrate the utility of TCS and TOS within the next year? What tools do we need to complete and harden for the community to buy into TCS and TOS? What can we expect from the GBIF community? Short term versus ultimate objectives
At line 41 added 6 lines.
+ ** S-T TOS/TCS External Roles/Personas
+ *** S-T might be a concept provider for other people to test their concept applications,
+ *** S-T might be a concept repository for projects looking for a place to store them, might need a batch import process if requested.
+ *** S-T *must* use GUIDs because there will be multiple concept object servers
+ *** S-T if TOS is a global reference implementation, then we need to implement the whole TCS schema in TOS, we could not implement just some TCS fields, we would need to input and output 100% standard TCS documents.
+ *** Schema changes over time, do we need to maintain records in each version forever?
At line 43 added 2 lines.
+ *** S-T TOS/TCS Internal Roles/Personas
+ *** See SEEK use cases 1 & 2 and other related logic
At line 44 added 1 line.
+ Discussion of the data independence problem with DiGIR queries that have the same name for 2 or more concepts. Solutions: Give user option to allow duplicates, eliminate duplicates, combine overlaps or go interactive and give user alert that name used for multiple concepts, or just log errors in a sideband pipeline.
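
A minimal sketch of three of the options above (allow duplicates, eliminate duplicates, log to a sideband), in Python. The function name, the policy strings, and the (name, concept ID) pair format are all hypothetical. "Eliminate" is read here as dropping every name that maps to more than one concept, which is only one possible reading. Combining overlaps and the interactive alert are not covered.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def resolve_ambiguous_names(pairs: List[Tuple[str, str]],
                            policy: str = "log") -> Tuple[List[Tuple[str, str]], List[str]]:
    """Apply a duplicate-handling policy to (name, concept_id) pairs.

    Returns (kept_pairs, sideband_messages).
    """
    concepts_by_name: Dict[str, Set[str]] = defaultdict(set)
    for name, concept_id in pairs:
        concepts_by_name[name].add(concept_id)
    # A name is ambiguous if it is used for two or more concepts.
    ambiguous = {n for n, ids in concepts_by_name.items() if len(ids) > 1}

    sideband: List[str] = []
    if policy == "allow":
        return list(pairs), sideband
    if policy == "eliminate":
        kept = [(n, c) for n, c in pairs if n not in ambiguous]
        return kept, sideband
    if policy == "log":
        # Keep everything, but report each ambiguous name on the sideband.
        for name in sorted(ambiguous):
            ids = sorted(concepts_by_name[name])
            sideband.append(f"name {name!r} maps to concepts {ids}")
        return list(pairs), sideband
    raise ValueError(f"unknown policy: {policy}")
```
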
Line 46 was replaced by lines 78-80
- and how to identify concepts that match user's query concept 'expectation', (e.g. query TOS, return concepts to Kepler, do a EcoGrid query). How to build a UI that is meaningful to users for such tasks.
+ * Status and review of Usability process: (Laura Downey with status PowerPoint slides)
+ Discussed plans for interviewing Alan Weakley and Mouse guy.
+ Discussed plans for getting user feedback on mapping tools and on visualization tools
Line 48 was replaced by lines 82-83
- ****
+ * May 2006 EOT Workshop Planning (Pennington, Beach)
+ Reviewed Beach draft agenda, majority felt it was too broad, too many topics with time for limited detail, and all topics would not be of interest to a diverse workshop audience. Suggestion that we pick a research theme like doing niche modelling and targeting a workshop for just niche modellers, or alternatively having a 2-day workshop on just taxon tools and concept data management.
At line 50 added 1 line.
+ !! Thursday Morning
At line 51 added 3 lines.
+ * Modifying EML to accommodate taxonomic concepts with Matt Jones
+ ** Issue: Need to add concept metadata to EML, so that Kepler Actors could work with concept data in data sets. Could be just the GUID, if available, or at least a name and a reference or a relation of the concept in the data set to a known concept in TOS.
+ ** Matt: context of current EML handling of taxon data. EML is already designed to contain species name information. 9000 EML data records now, planning version upgrade now. Matt: We need a full proposal that is well fleshed out before Matt can put the proposed changes into the EML maintenance process. US Federal Biological Data Profile (BDP) wanted classification tree in the metadata. BDP does not allow repeatable taxonomic classifications, but EML does, to allow for taxa in different trees (e.g. animals and plants) in one record. Matt has asked them to allow for multiple classifications but the BDP committee is now largely defunct, agencies not meeting anymore on this.
At line 53 added 1 line.
+ ** 1 possible plan (Matt): Taxon would need to tell the EML group what is minimally needed to meet the needs of the SEEK project.
At line 54 added 1 line.
+ Discussion of other standards that handle names, DC, ABCD, SDD should handle concepts. Standards need to be crosswalked. ABCD overlaps with EML on collections metadata. Natureserve Observation group overlaps with occurrence data. Needs to be some coordination. Standards need the GUID bit and the human readable reference. A small common concept citation (reference) schema, for a few fields, across standards would be very useful, in addition to the taxon name
At line 55 added 1 line.
+ Two options: modify EML to support minimally required needs or work with the community to get an agreement across projects.
At line 56 added 1 line.
+ We could really use a poster which describes the overlap and activity of these various infrastructure projects.
At line 57 added 1 line.
+ Discussion about using TSNs, and about ITIS and PLANTS lacking versioning and consistency in their IDs, making them less valuable.
Line 59 was replaced by line 103
- !!! Wednesday Afternoon
+ Whatever is proposed as required fields must be useful to end users. Ecologists must see added-value. Morpho has an ITIS plugin (ITIS-lib) to look up names in ITIS, and grabs the synonyms when the record is stored. Could be added functionality for the end user at cataloging time to get the synonyms then.
Line 61 was replaced by line 105
- !!! Thursday Morning
+ Morpho: we still need a way in Morpho to add concept data. Matt, problem is how to convince people to add the data, right now code definitions (e.g. taxon name codes for a study, maps data set codes to names in the data set) can be put into EML. The section on taxon coverage in EML does not currently handle mapping information of any kind.
Line 63 was replaced by line 107
- !!! Thursday Afternoon
+ Morpho: Adding TOS lookup to Morpho (Java, Swing) 4 weeks? Simple plugin, to use GetBestConcept, from a lookup popup in the Taxon data entry table. Xianhua might be able to work on this, his tool also needs to work with data from the TOS, that upgrade needs to be added.
At line 64 added 1 line.
+ UBER-Discussion
At line 65 added 1 line.
+ Actions:
At line 66 added 2 lines.
+ * Morpho
+ ** Nico will analyze the functional design requirement for modifying Morpho to add a TOS lookup during meta record creation. Description of placement, function, insertion, behavior of such a function (End of November)
At line 67 added 113 lines.
+ ** Xianhua will nominally consider working on the implementation of that using existing GetBestConcept service. (dependency on knowing requirements from Nico, and after other tasks)
+
+ ** Xianhua will upgrade Ranunculus data to latest TCS (1-2 weeks). Bob will make his data available:
+
+ ** (Bob) In Jan or Feb 2006, all USDA Plants in version 4, mapped against all plants in FNA, and also all of the Alan Weakley collaboration, his version mapped against 8 different classifications of plants.
+
+ * Bat Data Subproject for Demonstration Purposes
+
+ ** Objective: to exercise TOS and mapping, and to produce a nice demo.
+ ** Tasks and Subprojects:
+
+ *** Jessie: It would be useful to have bat treatments before 1993 to map them to 1993 treatment. Nico: we need to have a person to do that, we need a taxonomic expert to make those mappings.
+
+ *** MSW Bat data need to be mapped between 1993 and 2005 versions. Kate Jones demonstrated how they did the mapping in an interview with Nico and Laura, the actual mappings can be identified from explicit annotations in the 2005 publication. But publication notes will likely not be explicit as to the exact mapping operator. We need someone to read the treatment and try to extract the mappings. We do not currently have a copy of the treatment. DeeAnn Reeder and Don Wilson are the editors of MSW and created the source documents.
+
+ **** Beach will meet with Don Wilson November 2nd, explain our need and interest. Will ask him for a copy of the 2005 MSW bat treatment.
+
+ **** Nico will then use Xianhua's mapping tool to author the relationships. Reserves the right to pass on the task if overly complex. Nico will use the TCS-included subset of Nico's symbolic annotation codes. (dependent on receipt of MSW Bats 2005 text from Beach)
+
+
+ *** Susan's project will look at the 100 treatments she has since 1993 and will try to see if the concepts in those were adopted and/or mapped in the 2005 treatment. Then we can compare the actual 2005 tree with pieces that would have been predicted from the treatments, then we can evaluate how close we get to truth. (January 2006)
+
+ * PLANTS (Stinger)
+ ** Stinger will send PLANTS database with county records. (October 31)
+ ** Vieglais will put the county records into a DiGIR server. (Dependent on receipt of data from Stinger.)
+ ** Updated PLANTS nomenclature for 4.0 to be given to Bob. (October 31)
+ ** Stinger will get archived version of PLANTS. (October 31)
+
+ * Kepler-TOS Objectives (Gales)
+
+ ** Tasks
+ *** Rob will produce a GET-TAXA description of actors he is developing (November).
+ *** Version upgrade of TOS 1.01 (Gales and Stewart)
+ *** Rob will work with Matt to identify parameters for the ecological niche workflow.
+ *** Write one large "GET TAXA" TOS actor to replace the three actors Rob now has (without a lot of parameters, just the basic workflow needs) (January).
+ *** The GET TAXA actor will have added parameters to handle matching scenarios discussed on Wednesday. Concept overlaps, transitive links, etc.
+ *** Additional actors need to be specified based on requirements of the other workflows.
+
+ *** Setup SEEK-ITTC server for TOS.
+ *** Hibernate upgrade, TOS deployment, also see Aimee and Rob list from week before meeting.
+ ** Web application to take a GUID and output a subtree of all related concepts and descendants, for ConceptMapper queries on TOS.
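
A minimal sketch of the subtree lookup such a web application might perform, in Python. The `children` dictionary (GUID to list of child GUIDs) is a hypothetical in-memory stand-in for TOS data, and `concept_subtree` is an illustrative name. The sketch covers descendants only; other kinds of related concepts are not modelled.

```python
from collections import deque
from typing import Dict, List


def concept_subtree(root_guid: str, children: Dict[str, List[str]]) -> List[str]:
    """Return root_guid and all its descendants in breadth-first order."""
    order: List[str] = []
    seen = {root_guid}  # guards against cycles in the relationship data
    queue = deque([root_guid])
    while queue:
        guid = queue.popleft()
        order.append(guid)
        for child in children.get(guid, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order
```
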
+
+
+ * ConceptMapper Tool Objectives (Xianhua)
+
+ ** Translate Alan Weakley's Excel data into TCS (Timing UNKNOWN), sends to KU, import it into TOS
+ ** Incorporating Laura's heuristic evaluation comments, and results of the task analysis not currently in the tool (Nico and Laura have information) (2 weeks)
+ ** ConceptMapper modified to import TCS documents
+ ** ConceptMapper modified to export TCS documents
+ ** ConceptMapper modified to create concepts
+
+ ** User testing planned for Q1 2006 (Laura, Xianhua, Bob?, Alan, Brett, Bat Person, Nico?) Nico will organize schedule. Laura will organize the entire testing script, Xianhua will organize software, Bob will need to make local arrangements. Jim will confirm with Michener and Griego.
+
+ ** Load up Ranunculus data and prepare a scenario and demonstration for taxonomists/ecologists to compare taxa to help in the resolving of concepts.(Dependent on receiving data in TCS.)
+
+ ** Explore the effect of the granularity of matching. If you match only on concepts, what do you discover? If you match on names, what knowledge do you discover? Etc. Pass that document to Laura and Bob, to plan an evaluation session with Bob in early 2006.
+
+ * GUID Issues
+ ** TDWG Workshop in February, Attending: Peet, Gales or Perry, Spears, Kennedy(?)
+ ** Need to discuss how to configure GUID services for SEEK, which data get returned?
+
+
+ * Future Meetings
+
+ ** __Next SEEK-Taxon Conference Call__ November 22, 2005, 4PM GMT, 11AM EST, 10 CST, 9 MST, 8 PST
+
+
+ ** __SEEK Early Faculty Ecoinformatics Training for Ecologists__, January 2006.
+ *** Nico presenting SEEK taxon concept talk
+
+ ** __SEEK Developers Meeting__, April 30--May 5, 2006
+
+ ** __SPNHC Meeting__, May 23-27, Albuquerque
+ *** good opportunity for outreach to collections managers and museum directors
+
+ ** __Workshop/Feedback Event__
+ *** Tentatively planned for May 2006, to overlap with SPNHC? Specify, Usability interviews and testing, not a 5 day general informatics meeting
+ ** DEADLINE to decide: two weeks
+
+ ** __Ecological Society of America Annual Meeting__, 6-11 August 2006.
+ *** 3500-4000 scientists, Memphis, Tennessee
+ *** Things to potentially demonstrate: Taxon Comparison tool, Kepler ENM use case with two different classifications, Peterson, two bird classifications, biological reserve planning, demonstration, ConceptMapper. Need a very large graphic display! Try Bat data 93-05 comparison with distribution data
+
+ ** __TDWG__ October 2006, Somewhere in the U.S., St. Louis, or Durham.
+
+
+ !! Thursday Afternoon
+
+ Review of objectives and timetables (above).
+ Stinger discussed availability of concepts with character data for TOS, grasses from various projects, family lists,
+
+ !! Taxon-related Plenary Notes
+ * Need a set of scenarios for the different tasks we are trying to improve and/or enable for our various user groups. They are in people's heads and discussed but not really written down anywhere.
+
+ * TOS user interaction:
+ ** could be embedded within data search so user can make decision about what data set to get.
+ ** or use to extract the rows from a data set?
+ ** What do the user interfaces look like for user to interact with?
+ *** What about asking user to pick the best match?
+ *** What about asking user to rank the authoritative sources then having the system do the concept resolution based on that?
+ *** Have user specify some level of precision for matching/concept resolution etc.?
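
A minimal sketch of the "rank the authoritative sources" idea above, in Python. The function name and the (concept ID, source) pair format are hypothetical. The system picks the candidate concept whose source the user ranked highest; ties within one source fall back to concept ID order, which is an arbitrary choice made only for this sketch.

```python
from typing import List, Optional, Tuple


def resolve_by_source_rank(candidates: List[Tuple[str, str]],
                           ranked_sources: List[str]) -> Optional[str]:
    """Pick a concept ID from (concept_id, source) candidates.

    ranked_sources lists the user's preferred sources, best first.
    Candidates from unranked sources are ignored.
    """
    rank = {source: i for i, source in enumerate(ranked_sources)}
    ranked = [(rank[source], concept_id)
              for concept_id, source in candidates if source in rank]
    if not ranked:
        return None  # nothing from a trusted source: ask the user instead
    return min(ranked)[1]
```
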
+
+ * Establish plan for showing Martin's visualization to collection managers. This is a near term activity. We will structure the feedback and conduct the feedback most likely remotely using technology. General plan is demo the product, demonstrate the current tasks it can support, then get user feedback on tasks they would like to do that it doesn't support.
+
+ * ConceptMapper - change from connecting to the DB to reading and writing TCS documents
+
+ * Kepler actors:
+ ** There is an actor that does the querying of data and returns concepts (for several species) so this would replace the user using the data tab and getting results and then dragging that data set on to the canvas. (Laura's question). This would then feed into the ENM workflow.
+ ** Should we have a tool outside or within Kepler for users to configure the data (based on TOS)? Users could configure the actor to fire automatically or manually.
+ ** Jessie feels strongly that the searching should be part of the workflow; why is there a separate data tab?
+ *** One reason is because a search returns multiple objects and then a decision needs to be made of which data sets to use. This was seen as a separate step since the workflow objects/actors are seen as configurable but not necessarily interactive.
+ *** There are some technical issues within Kepler that have prevented more interactive actors.
+
