Science Environment for Ecological Knowledge

All Hands Meeting 2006 Notes

SEEK-Taxon Breakout at SEEK All Hands Meeting, May 2, 2006, UNM, Albuquerque, New Mexico.

Agenda

Tuesday, May 2, 2006

Meeting Objectives:

  1. Assess status and review current activities, since last teleconference.
  2. Clarify a vision of how SEEK-Taxon could collaborate with other projects with taxon concept data.
  3. Reach an understanding of what we have not achieved so far with SEEK Taxon, including things left undone, and identify priorities for the next 18 months.
  4. Understand the issues of long-term maintenance of SEEK Taxon products: data resources, TOS, software tools.

Round the Table Updates:

  • Rob Gales, Aimee Stewart
    • TOS is operational and online. Two versions of Mammal Species of the World are in TOS now, as is ITIS. Bob offered plant data sets.
    • Currently there is no supporting web service to import TCS records from other applications. File-based imports can be done manually.
    • There is a Kepler actor in the ENM workflow, but Rob has not heard from Dan Higgins on next steps for integration with Kepler.

  • Xianhua
    • Testing ConceptMapper, working with Bob and Laura on usability engineering

  • Jessie Kennedy
    • Has TDWG funding to research and draft a core ontology for TDWG data. Rob Gales and Robert Kukla are to take that core ontology and implement a domain ontology, to be used for converting an existing data set (possibly Hexacorallians: taxon names, concepts, specimens) using LSIDs and RDF technologies to link them all. Also an LSID community authority architecture for applications.

  • Martin Graham
    • Implemented the capability to drop Darwin Core records into the renamed TaxViz tool from any DiGIR provider using the MaNIS schema.

  • Bob Peet
    • Scheduled to meet prior to TDWG 2006 to create standards for plot data. Taxon concept data would be embedded in the standard and in the data sets.
    • FGDC Veg Committee is promoting a standard that would include concepts. Meeting with USFS regarding their long-term support interests.

  • Susan Gauch
    • GRA Masters project: extracting concept data from literature, by first running a spider to find candidate documents and then filtering that set down to actual taxonomic documents. Have a spider; still working on the classification algorithm.
    • Given a taxonomic document, extract the relationships among taxa in the document, then (future work) go back to the TOS server and ask whether these are new concepts or already in TOS. If new, insert them into the TOS.

  • Jim Beach
    • Released Specify 5.0 last week; work is underway on a modular Java release in 2007 that would use concepts from TOS.
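
The extract-then-check step in Susan's update above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes a concept can be identified by a name plus the publication it is "according to", and it uses an in-memory dictionary as a stand-in for the TOS server. The function names and the placeholder references are hypothetical.

```python
# Minimal sketch: decide whether extracted concepts are new, insert if so.
# ASSUMPTION: a concept is keyed by (name, "according to" reference).
# ASSUMPTION: `store` is a plain dict standing in for the TOS server.

def concept_key(name, according_to):
    """Normalize case and surrounding whitespace so trivial variants match."""
    return (name.strip().lower(), according_to.strip().lower())


def insert_new_concepts(store, extracted):
    """Add concepts not already in `store`; return the keys that were added."""
    added = []
    for name, according_to in extracted:
        key = concept_key(name, according_to)
        if key not in store:
            store[key] = {"name": name.strip(), "according_to": according_to.strip()}
            added.append(key)
    return added


# Hypothetical extraction output; the references are placeholders.
extracted = [
    ("Ranunculus acris", "Reference A"),
    ("Ranunculus acris", "Reference B"),    # same name, different concept
    ("ranunculus acris ", "reference a"),   # variant of the first entry
]

store = {}
added = insert_new_concepts(store, extracted)
print(len(added))  # 2
```

The same name under two different references is kept as two concepts, which is the distinction between names and concepts that the notes stress.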

Discussion of tasks to be finished from original objectives

  • Bat data mapping: Kate Jones; Stinger’s search for a bat survey data person, to obtain bat data sets.
  • Revisiting the Kepler use case: does mapping bat data figure into a Kepler use case? What are the software consequences for Kepler of using (1) concept mapping and (2) queries that use more than string comparisons? What dependencies within the use case must be met to make it work end-to-end?

Review of PowerPoint Presentation preparation for NSF review by Aimee Stewart

  • Jessie, need a strong statement of the significance of dealing with concepts rather than names, e.g. the slide on lumping and splitting and the consequences of misinterpreting name lists. Changes in names over time can create errors in analysis that are artifacts of the choice of names and concepts used.

  • Say why this problem is important. In ecology you tend to look at data sets over a period of time, or over different geographical areas, and both dimensions introduce different names and concepts. Integrative and synthetic activities need to respect these changes and disambiguate the labels. Analyzing data through time and space.

  • TOS architecture slide, providers will be added up and to the left.

Software Demonstration

  • Jingtao demonstrated county-level occurrence maps and set-based spatial analysis operations on county-level plant occurrence data. Functionality closely mirrors the Kartesz and Meacham application for North American flora.

  • Laura Downey reviewed her NSF review PowerPoint presentation on Taxon usability work
    • Inquired as to SEEK-Taxon's interest in more usability analysis over the next 18 months
    • Asked about the ESA Annual Meeting booth (second week in August): what will Taxon do to support it?

Possible Future Collaborations for SEEK Taxon

  • Demonstration Project

  1. SEEK workflow idea: creating a new classification, export into TCS, import to TOS, export data to ConceptMapper and to TaxViz.
  2. An application, or (Kepler?) workflow, for mapping relationships between TOS and new data.
  3. Marking up data with EML and GUIDs.
  4. Using GARP to show the impact of different classifications on niche models, with Ranunculus.
  5. Adding common names and using them for queries might be good.

  • Collaborations with others
    • Other people "doing concepts" -- uBIO, Rod Paige, New Zealand research lab, GBIF plans for concepts are unclear.
    • USDA Plants, Stinger wants to do concepts.
    • Specify could implement an interface to the TOS for collections management using concepts.
    • Bob, authoring tools that would allow people to contribute concepts to TOS and take ownership of concepts. A way for people to author new concepts and get instant gratification.

  • Laura – what problems are we solving, and for whom?
    • Laura going back to four taxonomists. Prime problem is that the data they want is not online. Solution was to create a taxonomic tool to capture taxon information.
    • Wanted a collaborative tool for authoring inventories of major groups, wanted literature online.

  • Jessie, that's fine, but we are working on a solution for our own problems, not those.
    • Next steps, Jessie: getting people who have concepts and manage concepts on board, and getting them to start managing concepts. Letting us serve their data.

  • Bob, we should use the demonstration project first to sell the vision to the community. Take something to them and convince them to use it.

Discussion about things that we have not done. (Suggested by Jessie)

  • Jessie
    • It was an oversight not to get active researchers using concepts on board with us earlier.
    • We really need a good demonstration project in the short term, one that demonstrates our capabilities but is not directed at solving any particular outreach problem for any particular group; it is too late in the project for that.

  • Laura, what problem is being solved? Can you tell me in three sentences?
    • Review of the comments from the four taxonomists we interviewed on usability issues in Santa Barbara.
    • Jessie, the real issue for us is that ecologists are our users. We need to serve them, and to convince them that ignoring concepts is wrong. We need to convince them to take on these problems without adding much, if any, workload; then they will collaborate. We have to make their lives easier.

  • Laura, they have to see the perceived benefit to play
  • Bob, or they have to play by the rules.

  • Bob, the problem is that the data providers and the data consumers are two different communities, and that the providers are not entering enough metadata into the system for the data synthesis project.

  • Jim still likes the Planetary Biodiversity Inventory Catfish of the World project as an example of a potential collaborator that must manage historical and new taxon data for project needs.

  • Jessie likes the idea of hooking up with real-world science, but thinks it is too late in the project to link up to another project and learn how they manage data.

  • Bob suggested the Appalachian Trail project funded by the park service as a way to integrate data. Lots of data sets.

Big Unresolved SEEK-Taxon Issues as illustrated by dry-erase board diagram by Jessie. (incomplete notes here -- ed., who has the photograph?)

  1. Getting data from data providers in TCS: (a) getting concepts, e.g. by scraping, and (b) mapping among concepts. Mapping source materials into something meaningful as TOS concepts is also a problem: mapping from the DB to the TOS essentially means making concepts out of names. (But we support nominal concepts.) Need to get data providers “on board”.
  2. Generating LSIDs: we do not do that yet.
  3. We CAN output TCS into ConceptMapper and into the visualization tool but it will take some work to maintain both.
  4. Need a tool for resolving concepts in TOS.
  5. Need a tool to take a TOS concept in TCS and import it into a tool that will mark them up in EML. Morpho might be extended, but maybe not. The LSIDs need to be put in the ecological data sets. Martin is going to be working ‘with one of these other guys’ to look into developing a tool for this with other grant funding.
  6. Also need to deal with the algorithms for matching. The more complete data we have in TOS the more sophisticated the algorithm can be.
  7. Questions:
    1. Who would work on these pieces to complete the picture (i.e. data flow and functionalities)?
    2. How would the process work in SEEK Taxon?

  • Matt, a good use of TOS would be to filter the data that comes back from the DiGIR providers.
    • Right now Mephitis strings are grouped by string matching; it would be better to use the TOS to find the matching concepts and to group them that way.
    • All the sidebar does now is query all of the collections registered in GBIF. We should put TOS in between, so that a TOS query from the left panel in Kepler filters and validates concepts.

    • EML does not have the unrestricted value space problem. The structured data are very well defined and constrained. Structured queries are not implemented in Kepler yet; the left pane just does a string search on title, abstract, etc., even though the EML has the structured data.

  • Matt, looking for someone to develop a tool for marking up EML data records. Mark, Shawn and Josh are researching a global core ontology for other kinds of concepts.

  • Jessie, looking for a graduate student to develop visualization tools for comparing graphs. With any luck it would be useful for comparing ontologies.
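
Matt's point above about grouping Mephitis records by concept rather than by name string can be illustrated with a small sketch. It is a hedged example only: the lookup table stands in for a TOS query, the concept identifiers and record ids are invented, and the function names are hypothetical. No real TOS, DiGIR or Kepler API is used.

```python
from collections import defaultdict

# ASSUMPTION: this table stands in for a TOS lookup from a name string to a
# concept identifier. The identifiers are invented for illustration.
NAME_TO_CONCEPT = {
    "Mephitis mephitis": "concept:A",
    "Mephitis mephitis (Schreber, 1776)": "concept:A",
    "Mephitis macroura": "concept:B",
}


def group_by_string(records):
    """Current behaviour described in the notes: exact name strings only."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["name"]].append(rec["id"])
    return dict(groups)


def group_by_concept(records, name_to_concept):
    """Proposed behaviour: resolve each name to a concept, then group."""
    groups = defaultdict(list)
    for rec in records:
        key = name_to_concept.get(rec["name"], "unresolved")
        groups[key].append(rec["id"])
    return dict(groups)


# Hypothetical occurrence records as they might come back from providers.
records = [
    {"id": 1, "name": "Mephitis mephitis"},
    {"id": 2, "name": "Mephitis mephitis (Schreber, 1776)"},
    {"id": 3, "name": "Mephitis macroura"},
    {"id": 4, "name": "Mephitis sp."},
]

by_string = group_by_string(records)
by_concept = group_by_concept(records, NAME_TO_CONCEPT)
print(len(by_string), len(by_concept))  # 4 3
```

String grouping splits records 1 and 2 because their name strings differ, while concept grouping puts them together and flags record 4 as unresolved instead of silently treating it as its own taxon.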

Wednesday, May 3, 2006

  • Practice Demonstrations of NSF site review talks

Thursday, May 4, 2006, 8-11 AM.

SEEK-Taxon Breakout Session to discuss priorities and next steps

  • Make a stronger connection to Kepler
  • Need to get more data, and to get the bat data mapped. Bob offered plant concept data also.
  • Jim offered to pursue the identification and engagement of a mammalogist to help with bat concept data mapping.

  • Discussion about the need for sample data to be able to show end-to-end connectivity and workflow with concepts.
    • Bob, Alan Weakley has 6500 taxa and 8 classifications. Having just two classifications (MSW) with just names is likely not enough; we need comprehensive data sets with many characters, over time.
    • Bob, we have all of the data for the classifications of Juglandaceae (trees). Also have complete Ranunculus data, with full concept information. Bob committed to producing some data for the TOS by about the end of June.
    • Discussion of bat data, bat people, and the value in finding some bat survey and observation data sets that could be marked up.
    • Jim offered to pursue some bat leads. Talk to Bob Timm (KU), Don Wilson (USNM), the MSW bat treatment author, possibly the Texas Tech lab.
    • Dave, what about searching EML data sets for taxonomic names? Aimee did that; not much there.
    • Rich Pyle once offered (we think) angelfish concept data; we could check whether that is still on the table.
    • Dave Thau, someone mused, likely also has ant concept data from the AntWeb project.



This page last changed on 16-May-2006 15:08:38 PDT by KU.beach.