Science Environment for Ecological Knowledge

All Hands Meeting 2006 Notes



SEEK-Taxon Breakout at SEEK All Hands Meeting, May 2, 2006, UNM, Albuquerque, New Mexico.

Agenda

Meeting Objectives:

  1. Clearer vision of how to collaborate with other projects, and of which projects will be the bases for those collaborations.
  2. Understanding of what we have not achieved so far with SEEK Taxon: things left undone, and whether they can be achieved in the next 18 months.
  3. Understand the issues of long-term maintenance of SEEK Taxon products: data resources, TOS, software tools.
  4. Identify priorities for the next 18 months.

Round the table updates

  • Rob Gales, Aimee Stewart
    • TOS operational, MSW 2 versions, ITIS, Bob offered plant data sets
    • No supporting web service to import TCS records from other applications, currently. Can do file-based imports manually.
    • Kepler Actor in the ENM workflow.
  • Xianhua
    • Testing concept mapper
  • Jessie Kennedy
    • TDWG funding to research and draft a core ontology for TDWG data. Rob Gales and Robert Kukla to take that core ontology and implement a domain ontology, to be used for converting an existing data set (possibly Hexacorallians: taxon names, concepts, specimens) using LSIDs and RDF technologies to link them all. LSID community authority architecture for applications.
  • Martin Graham
    • Able to drop in DC records into the Viz tool, e.g. from any DiGIR provider using the MaNIS schema.
  • Bob Peet
    • Meeting before TDWG 2006 to create standards for plot data. Taxon concept data would be embedded in the standard and in the data sets.
    • FGDC Veg Committee is promoting a standard that would include concepts. Meeting with USFS regarding their long-term support interests.
  • Susan Gauch
    • GRA Master’s project: extracting concept data from literature, by first running a spider to find candidate documents and then filtering that set down to actual taxonomic documents. Have a spider; still working on the classification algorithm.
    • Given a taxonomic document, extract the relationships among taxa in the document; then (future work) go back to the TOS server and ask whether these are new concepts or already in TOS. If new, insert them into the TOS.
  • Jim Beach
    • New Specify and Specify in 2007. Dave Remsen, uBIO interest

Tasks to be finished from original objectives

  • Bat data mapping: Kate Jones, Stinger’s search for a bat survey data person, to obtain bat data sets.
  • Revisiting the Kepler Use Case: does mapping bat data figure into a Kepler use case? What are the software consequences for Kepler of using (1) concept mapping and (2) queries that use more than string comparisons? Dependencies within the Use Case to make it work end-to-end.

PowerPoint Presentation preparation

Beginning: SEEK Taxon graphic slide. Show them what names are and how they are used; illustrate that names are not unique. Talk about the problems with current usage. Matt: does he say in the intro that data sets need to be integrated based on concepts? If he uses that, we could reuse the slide and point out what we are doing. The issue of data integration is not as simple as matching names; bring in a book of the rules of nomenclature to illustrate the rules.

What is the difference between a concept and a name? A concept is a name as used according to a particular reference; the reference defines the concept.
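The distinction between a name and a concept can be sketched as a small data structure. This is only an illustration, not the TOS or TCS data model: the class name, field names, placeholder taxon name and references are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    """A name as used according to one reference (hypothetical structure)."""
    name: str          # the scientific name string
    according_to: str  # the reference that defines this usage of the name


# Placeholder name and references, for illustration only.
older = TaxonConcept(name="Aus bus", according_to="Smith 1950")
newer = TaxonConcept(name="Aus bus", according_to="Jones 1990")

same_name = older.name == newer.name  # the name strings are identical
same_concept = older == newer         # the concepts differ, because the references differ
```

Two records can therefore share a name string and still refer to different concepts, which is why matching on names alone is not enough for data integration.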

Jessie: Significance statement for dealing with concepts and not with names: the slide on lumping and splitting, and the consequences of misinterpreting name lists. Changes in names over time can create errors in analysis that are artifacts of the names and concepts used through time.

Say why this problem is important. In ecology you tend to look at data sets over a period of time, or over different geographical areas, and both dimensions introduce different names and concepts. Integrative and synthetic activities need to respect these changes and disambiguate the labels. Analyzing data through time and space.

TOS architecture, providers up and to the left.

Software Demonstration 3:30

EML Discussion (4 PM)

Laura’s Usability PowerPoints on Taxon Usability

Future SEEK Taxon Activities

  • Emerging projects (next 18 months): Jessie’s new funding (P*&& C%$#), visualization tools?
  • Long-term vision, reason for being (beyond 18 months): see October All Hands meeting notes
  • Usability interactions over the next 18 months

ESA booth, second week in August: what will Taxon do to support it?

SPNHC Meeting, SEEK Taxon Poster

Taxon Breakouts: Laura, with Bob and Xianhua, to review ConceptMapper usability

Future Collaborations for SEEK Taxon

 Demonstration Project

  1. SEEK workflow idea: create a new classification, export it into TCS, import it to TOS, export data to ConceptMapper and taxon viz.
  2. Map relationships between TOS and new data.
  3. Mark up data with EML and GUIDs.

 Using GARP to show the impact of different classifications on niche models with Ranunculus.

  1. Adding common names and using them for queries might be good.

uBIO and Rod Paige, GBIF. Jessie: no idea what plans GBIF or the Europeans have for concepts. GBIF has the Catalog of Life.

USDA Plants, Stinger wants to do concepts.

Collections Specify Concepts, should we pursue that.

Jessie, we should not look so far into the future, the broader impacts.

Bob, authoring tools that would allow people to contribute concepts to TOS and take ownership of concepts. A way for people to author new concepts and get instant gratification.

Jessie, the NHM has an EU funded project to develop two different taxonomies online. (Does not know of anyone in Europe or in the UK working with taxon concepts.)

Laura – what problems are we solving, and for whom?

Laura going back to four taxonomists. Prime problem is that the data they want is not online. Solution was to create a taxonomic tool to capture taxon information.

Wanted a collaborative tool for authoring inventories of major groups, want literature online.

Jessie, we have a solution for our own problems.

Next steps. Jessie: getting people who have and manage concepts on board, getting them to start managing concepts, and letting us serve their data. MSW may have more data.

Bob: we should use the demonstration project first to sell the vision to the community. Take something to them and convince them to use it.

Discussion about things that we have not done. (Suggested by Jessie).

Where we failed: not getting decent concept people on board with us earlier. What does ‘on board’ mean? No clear idea.

Jessie: we really need a good demonstration project in the short term, one that demonstrates our capabilities but is not directed at solving any particular outreach problem for any particular group; it is too late in the project for that.

Laura, what problem is being solved? Can you tell me in three sentences?

Review of the four taxonomists’ comments

Jessie: the real issue for us is that ecologists are our users. We need to serve them, and to convince them that ignoring concepts is wrong. We need to convince them to take on these problems without adding much, or any, other workload; then they will collaborate. We have to make their lives easier.

They have to see the perceived benefit to play -- Laura, or They have to play by the rules -- Bob.

Bob: the problem is that the data providers and the data consumers are two different communities, and the providers are not entering enough metadata into the system for the data synthesis project.

Catfish project. Jessie likes the idea of doing real-world science, but it is too late in the project to link up with another project and learn how they manage data.

Bob suggested the Appalachian Trail project funded by the park service as a way to integrate data. Lots of data sets.

Jessie -- Big Unresolved SEEK-Taxon Issues (diagram on the dry-erase board). (Incomplete notes here.)

  1. Getting data from data providers in TCS: (a) getting concepts, e.g. by scraping, and (b) mapping among concepts. Mapping source materials into something meaningful as TOS concepts is also a problem: mapping from the DB to the TOS essentially means making concepts out of names. (But we support nominal concepts.) Need to get data providers “on board”.
  2. Generating LSIDs: we do not do that yet.
  3. We CAN output TCS into ConceptMapper and into the visualization tool, but it will take some work to maintain both.
  4. Need a tool for resolving concepts in TOS.
  5. Need a tool to take a TOS concept in TCS and import it into a tool that will mark them up in EML. Morpho might be extended, but maybe not. The LSIDs need to be put in the ecological data sets. Martin is going to be working ‘with one of these other guys’ to look into developing a tool for this with other grant funding.
  6. Also need to deal with the algorithms for matching. The more complete the data we have in TOS, the more sophisticated the algorithm can be.
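On item 2, a minimal sketch of what generating LSIDs could look like, assuming the standard LSID layout urn:lsid:<authority>:<namespace>:<object>[:<revision>]. The function names and the example authority and namespace are hypothetical, and nothing here reflects how TOS would actually assign identifiers.

```python
def make_lsid(authority, namespace, object_id, revision=None):
    """Build an LSID string: urn:lsid:<authority>:<namespace>:<object>[:<revision>]."""
    parts = [authority, namespace, object_id]
    if revision is not None:
        parts.append(revision)
    for part in parts:
        # Colons separate the components, so they cannot appear inside one.
        if not part or ":" in part:
            raise ValueError("LSID components must be non-empty and contain no colon")
    return "urn:lsid:" + ":".join(parts)


def parse_lsid(lsid):
    """Split an LSID string back into its named components."""
    fields = lsid.split(":")
    if len(fields) not in (5, 6) or [f.lower() for f in fields[:2]] != ["urn", "lsid"]:
        raise ValueError("not an LSID: " + lsid)
    keys = ["authority", "namespace", "object", "revision"]
    return dict(zip(keys, fields[2:]))


# Hypothetical authority and namespace, for illustration only.
example = make_lsid("example.org", "concepts", "42")
```

An identifier of this form is what would need to be written into the ecological data sets mentioned in item 5.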

Questions: 1) Who would work on these pieces? 2) How would the process work in SEEK Taxon?

Matt -- A good use of TOS would be to filter the data that comes back from the DiGIR providers. Right now Mephitis strings are grouped by string matching; it would be better to use the TOS to find the matching concepts and group them that way.
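The difference between the two groupings can be sketched as follows. The lookup table is a hypothetical stand-in for a TOS query, and the concept identifiers, record layout and name-to-concept assignments are invented for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for a TOS lookup: name string -> concept identifier.
NAME_TO_CONCEPT = {
    "Mephitis mephitis": "concept:1",
    "M. mephitis": "concept:1",
    "Mephitis macroura": "concept:2",
}


def group_by_string(records):
    """Group record ids by the exact name string (the current behaviour)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["name"]].append(rec["id"])
    return dict(groups)


def group_by_concept(records, lookup):
    """Group record ids by resolved concept; unresolved names keep their own group."""
    groups = defaultdict(list)
    for rec in records:
        key = lookup.get(rec["name"], "unresolved:" + rec["name"])
        groups[key].append(rec["id"])
    return dict(groups)


records = [
    {"id": 1, "name": "Mephitis mephitis"},
    {"id": 2, "name": "M. mephitis"},
    {"id": 3, "name": "Mephitis macroura"},
]
```

With these sample records, string grouping yields three groups, while concept grouping puts records 1 and 2 together.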

All the sidebar does now is query all of the collections that are registered in GBIF. We should put TOS in between, so that a TOS query from the left panel in Kepler filters and validates concepts.

EML does not have the unrestricted value space problem. The structured data are very well defined and constrained. Structured queries are not implemented in Kepler yet; the left pane just does a string search on title, abstract, etc., even though the EML has the structured data.
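A small sketch of why a structured query differs from the current free-text search. The record layout below is a simplified, hypothetical stand-in for EML metadata (it is not the EML schema), and neither function is Kepler code.

```python
def string_search(records, term):
    """Free-text search over title and abstract only."""
    term = term.lower()
    return [r["id"] for r in records
            if term in r["title"].lower() or term in r["abstract"].lower()]


def structured_search(records, rank, value):
    """Query the structured taxonomic field instead of the free text."""
    return [r["id"] for r in records
            if any(t["rank"] == rank and t["value"] == value for t in r["taxa"])]


# Hypothetical metadata records, for illustration only.
records = [
    {"id": 1, "title": "Skunk survey", "abstract": "Counts of striped skunks",
     "taxa": [{"rank": "genus", "value": "Mephitis"}]},
    {"id": 2, "title": "Notes on Mephitis records", "abstract": "A review of labelling",
     "taxa": []},
]
```

Here a string search for "Mephitis" finds only record 2, which merely mentions the name, while the structured query on genus finds record 1, which actually carries the taxon in its structured metadata.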

Should Specify produce TCS records for its taxon data? Should Specify produce EML for the collections databases?

Roger Hyam in the U.K. and LandCare New Zealand did an implementation of TCS.

Matt -- looking for someone to develop a tool for marking up EML data records. Mark, Shawn and Josh are researching a global core ontology for other kinds of concepts.

Jessie: looking for a graduate student to develop visualization tools for comparing graphs. With any luck it would be useful for comparing ontologies.

Discussion at 4:30 PM of embedding TCS data into an EML record.



This particular version was published on 16-May-2006 12:59:30 PDT by KU.beach.