Science Environment for Ecological Knowledge

SEEK Taxon Tools

Overview of SEEK Taxon's Editing and Visualization Tools

  • Nico Franz, Xianhua Liu & Robert Peet – November 24, 2004


→ A very similar (slightly less polished) MS Word version of this document is available here: TaxToolsOverview.doc

Purpose and structure: This document brings together our thoughts about browsing, editing, and visualization Tools that SEEK Taxon needs to build for potential users. It is related to SEEK Taxon's Use Cases (see http://seek.ecoinformatics.org/Wiki.jsp?page=TaxonUseCasesTemplatesMarch04) and also describes additional functionalities (e.g. EML-connection, peer review) we must provide. As was done for the Use Cases, the Tools are defined mainly according to the functions they perform instead of the users who will need them. This seems preferable because in any particular workflow, ecologists or taxonomists may access multiple Tools. Some of these will be functional in other workflows as well.

Each Tool should ultimately have its own scoping document, specifying the user requirements and technical demands (including assessment of any need for refinement of the TOS APIs). Our goal is to prepare all these documents for group discussion, refinement, and prioritization at the SEEK Taxon meeting of February 15-17, 2005. We anticipate considerable interaction with members of SEEK Taxon during the development of the scoping documents prior to the meeting.


I. SEEK Taxon Concept Users Manual and Tutorial

  • Potential users: Ecologists and taxonomists; anyone interacting with taxonomic concept tools and services of the SEEK infrastructure.
  • Functionality: (1) Introduces ecologists to the importance and relevance of using concepts in ecology, and specifically explains how to understand and operate the Concept Mark-Up Tool; (2) introduces taxonomists to the importance and relevance of using concepts in taxonomy, and specifically explains how to understand and operate the Concept/Relationship Creation Tool.
  • Use Cases: Partly applicable to all Use Cases/SEEK Taxon Tools.
  • Integration: (1) Could pop up as a window when a user starts several of the other Tools; and (2) should be accessible like a "help function" on the interfaces of various other Tools.
  • Design needs: Whatever is required to design a "typical" software tutorial; overview of content; search for specific functions and terms; text boxes; display boxes; even graphic animation.
  • Further details: It will be best to develop at least core portions of this document simultaneously with the design/building of the various Tools, thereby documenting the work accomplished. The draft can then be reviewed for usability and improvement towards the end of the project.
  • Priority/development: Documentation of work accomplished is high priority; but a finished, polished document is of relatively low priority at this stage in the project. This product will be developed as the functionalities of other Tools become real and ready to be released to the user community; in the meantime, SEEK Taxon's publications and personal interactions can substitute for portions of the Manual.

II. Batch-Data Import Tool

  • Potential users: Taxonomic concept providers and aggregators; entities with larger sets of concepts/relationships stored in digital format and compatible with the TCS.
  • Functionality: Allows providers to "configure" their data holdings so that they can be imported as large blocks into the TCS/TOS (SEEK's taxonomic database). Ideally we would use this Tool to read data in database or spreadsheet formats and map them painlessly either into the TCS format (XML) for export to other parties, or directly into the TOS. There are potentially two development steps here: (1) spreadsheet/user database to XML, and (2) XML to TOS. This Tool might be a desktop application that exports to TCS or TOS only when compilation of a package is complete. How complex this Tool becomes will depend on the extent to which we expect the input information to already be in a form consistent with the TCS. The minimal Tool simply functions as a funnel into the TOS, whereas the more complex version also handles data reformatting and manipulation. We should also have some form of rectification function that checks whether similar or identical content is already in the TOS and sends warnings, so as to avoid unnecessary redundancy. The Tool should import not only concepts, but also concept relationships.
  • Use Cases: Use Case 2: acquire concept from another database location (automated).
  • Integration: Seemingly not part of other Tools or workflows; possibly linked to Map-to-TCS Tool. Substitutes for Concept creation Tool V to increase user efficiency.
  • Design needs: We need help from our developers here as to what this takes.
  • Further details: None yet; except we should keep in mind the kinds of software programs taxonomists use. Specifically, we might want to provide a Tool or plug-in to import trees from PAUP and equivalent software, but such a plug-in should be compatible with Tool V. Finally, we need to develop business rules for who may submit blocks of data to the TOS and under what conditions.
  • Priority/development: High priority; has been explored in the process of creating/using the TCS, and also when SEEK KU acquired data from ITIS, etc.; is necessary for SEEK Taxon to quickly acquire a critical mass of concepts. The Tool can be modularized in its construction, starting with the core function of simple mass import of well-formatted concepts, and then adding other functions as time and prioritization permit.
  • SBA Feb comment: need to pay attention to how well concepts are defined
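
The first development step described above (spreadsheet to XML) and the rectification check could be sketched roughly as follows. This is only an illustration under stated assumptions: the column names, the XML element names, and the rule for what counts as a duplicate (same name and same reference) are hypothetical and do not reproduce the actual TCS schema or any TOS API.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical spreadsheet export; the column names are assumptions.
SAMPLE_CSV = """name,according_to,parent
Quercus alba,Flora A 1990,Quercus
Quercus rubra,Flora A 1990,Quercus
Quercus alba,Flora A 1990,Quercus
"""


def rows_to_xml(csv_text, existing_keys=frozenset()):
    """Convert spreadsheet rows to an XML package and warn about duplicates.

    existing_keys stands in for a lookup against concepts already held
    in the TOS; here it is just a set of (name, reference) pairs.
    """
    root = ET.Element("ConceptPackage")  # hypothetical element name
    seen = set(existing_keys)
    warnings = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["name"].strip().lower(), row["according_to"].strip().lower())
        if key in seen:
            # Rectification: report instead of importing a redundant record.
            warnings.append(
                f"possible duplicate: {row['name']} ({row['according_to']})"
            )
            continue
        seen.add(key)
        concept = ET.SubElement(root, "Concept")
        ET.SubElement(concept, "Name").text = row["name"].strip()
        ET.SubElement(concept, "AccordingTo").text = row["according_to"].strip()
        # A parent/child relationship is carried along with the concept.
        ET.SubElement(concept, "Parent").text = row["parent"].strip()
    return root, warnings


root, warnings = rows_to_xml(SAMPLE_CSV)
print(ET.tostring(root, encoding="unicode"))
print(warnings)
```

A real Tool would need a richer notion of "similar" than an exact match on name and reference, and would have to map onto the agreed TCS elements; both are left open here.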

III. Concept/Relationship/Status Exploration Tool

  • Potential users: Ecologists engaged in marking up their names with concepts and taxonomists exploring the concept holdings of the TOS.
  • Functionality: Users should be able to determine whether a particular concept or view is represented in SEEK or whether it needs to be added. If it is present, they will want to know how to reference it (obtain its GUID). The Tool should allow users to query concepts by entering a name (simple query) or any other information tied to concepts (advanced query) and receive back a list of candidates, with some flexibility as to how widely the net is cast for candidates. Options should include the ability to select preferred views and dates and to view parents and children. Once a concept is identified, the user should be able to identify related concepts via various filters.
  • Use Cases: Use Case 3: select preferred view on concepts; Use Case 4: display concepts associated with names (isolated, statically); Use Case 5: display connections to synonymous concepts (statically); Use Case 6: display connections to parent or child concepts (statically); Use Case 9: display concepts associated with information (references, specimens, etc.) other than names (alternative query); and Use Case 10: display information (references, specimens, etc.) associated with concepts (reverse query).
  • Integration: See Potential users above – the Tool is part of the mark-up process for ecologists; and assists taxonomists in exploring concepts; can interact with the Multi-Classification Visualization Tool.
  • Design needs: Can represent the structure of the TCS; should be based on text boxes and tables, not fancy visual displays.
  • Further details: Various similar tools (Euro+Med, etc.) already exist and offer potential design lessons.
  • Priority/development: High priority; a lot of Use Cases are represented here; and a lot of workflows depend on this Tool. Because of the range of functionality required, we anticipate that the Tool will be built in stages with functionality added incrementally.
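
To make the simple query and the idea of "how widely the net is cast" more concrete, here is a minimal sketch. It assumes an in-memory list of concepts; the Concept fields, the GUID strings, and the similarity threshold are hypothetical, and the string comparison uses Python's standard difflib rather than any matching method chosen by SEEK Taxon.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Concept:
    guid: str          # hypothetical identifier format
    name: str
    according_to: str  # the reference (view) in which the concept is defined
    parent: str = ""


# Hypothetical holdings standing in for the TOS.
CONCEPTS = [
    Concept("guid:1", "Quercus alba", "Flora A 1990", "Quercus"),
    Concept("guid:2", "Quercus alba", "Checklist B 2001", "Quercus"),
    Concept("guid:3", "Quercus rubra", "Flora A 1990", "Quercus"),
]


def find_candidates(query, concepts, min_similarity=1.0, according_to=None):
    """Return (score, concept) pairs for a name query.

    min_similarity=1.0 accepts only exact (case-insensitive) name matches;
    lower values cast a wider net. according_to restricts to one view.
    """
    hits = []
    for concept in concepts:
        if according_to is not None and concept.according_to != according_to:
            continue
        score = SequenceMatcher(None, query.lower(), concept.name.lower()).ratio()
        if score >= min_similarity:
            hits.append((score, concept))
    # Best matches first; ties keep their original order.
    return sorted(hits, key=lambda hit: -hit[0])


for score, concept in find_candidates("Quercus albba", CONCEPTS, min_similarity=0.9):
    print(round(score, 2), concept.guid, concept.name, concept.according_to)
```

The advanced query (by reference, specimen, and so on) would follow the same pattern with additional filters; it is not shown here.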

IV. Multi-Classification Visualization Tool

  • Potential users: As with Tool III; and ecologists and taxonomists who wish to compare alternative perspectives.
  • Functionality: Represents visually the parent/child (vertical) relationships among concepts within one or more hierarchical classifications; as well as the (lateral) set-theory relations among concepts contained in alternative classifications.
  • Use Cases: Use Case 7: display concept connections dynamically – individual concept lineages; and Use Case 8: display concept connections dynamically – partial or entire classifications.
  • Integration: Integrates with Tool III.
  • Design needs: We should refer to Martin Graham's work and expertise here.
  • Further details: See http://www.dcs.napier.ac.uk/~marting/
  • Priority/development: Relatively low – only in the sense that MG has already created a Tool that works, e.g. with the moss concepts; pending resources we can expect the Tool to evolve and support the TCS structure; integration with Tool III is needed, though.
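
The lateral set-theory relations mentioned above can be illustrated with a small sketch. It assumes, purely for illustration, that each concept can be reduced to a set of members (for example included child concepts or specimens) and that the relation is computed from those sets; in practice such relations may instead be asserted by experts. The relation labels and taxon names are hypothetical.

```python
def set_relation(first, second):
    """Name the set-theory relation between two concepts' member sets."""
    first, second = set(first), set(second)
    if first == second:
        return "congruent"
    if first > second:
        return "includes"
    if first < second:
        return "is included in"
    if first & second:
        return "overlaps"
    return "excludes"


def compare_classifications(first, second):
    """Relate every concept in one classification to every concept in another."""
    return {
        (name_a, name_b): set_relation(members_a, members_b)
        for name_a, members_a in first.items()
        for name_b, members_b in second.items()
    }


# Two hypothetical classifications of the same four members.
classification_1 = {"Aus": {"a1", "a2", "a3"}, "Bus": {"b1"}}
classification_2 = {"Aus s.str.": {"a1", "a2"}, "Cus": {"a3", "b1"}}

for pair, relation in compare_classifications(classification_1, classification_2).items():
    print(pair, relation)
```

A visualization Tool would draw these lateral relations between two hierarchies; the sketch only shows the data that such a display would need.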

V. Concept/Relationship/Status Creation Tool

  • Potential users: Ecologists engaged in the mark-up process who realize that their concept is not adequately represented in the SEEK Taxon environment; taxonomists and aggregators contributing old or new concepts, relationships, and status assignments.
  • Functionality: Allows users to enter information concerning concepts, their relationships, and their statuses in an interactive manner. (RKP comment: this is not the Tool for dealing with unusual concepts that are used by field ecologists and lack taxonomic standing → see Tool VIII.) The Tool should include both basic functionality, where concepts and concept relationships are entered from a simple set of forms, and advanced visualization tools for more sophisticated applications.
  • Use Cases: Use Case 1: assign status to concept; Use Case 11: acquire existing concept (manually); Use Case 12: acquire existing synonymy or parent/child connection (manually); Use Case 13: acquire new concept from expert; and Use Case 14: acquire new synonymy or parent/child connection from expert.
  • Integration: Close integration with Tools III and IV; once information has been entered, it is passed on to Tools VI and VII.
  • Design needs: Xianhua's TCBE is approaching some of the needs; see also Nico's comments at http://seek.ecoinformatics.org/Wiki.jsp?page=NicoImplementationThoughts
  • Further details: Potentially this will be a sophisticated tool. It needs both to support the structure of the TCS (see Tool III) and to represent at least two classifications and the relationships among their concepts visually, so that experts can "see" what concepts they are connecting to each other. However, as with Tool III, development can be staged, with the simple functionality added first and the visualization component added later as time and resources permit. Also, we need to incorporate a mechanism by which work in progress is not published (integrated into the public side of the TOS) until a project is completed. This could be accomplished by storing the work in progress on the TOS and flagging it as private, or by keeping the work in progress on a desktop client and only porting it to the TOS when the project is complete.
  • Priority/development: High priority for basic functionality and lower priority for the visualization process; similar to Tool III.

VI. Taxonomic Peer-Review Tool

  • Potential users: Taxonomists, other authorities who can review the merits of actions performed in Tool V.
  • Functionality: Any user can add concepts and relationships via Tool V. However, as concepts and relationships proliferate, sets of preferred views will be needed. Given the magnitude of the job, the greater taxonomic community will need to be involved. We should provide the ability for authority groups to become stewards of specific sets of taxa. Newly proposed concepts and relationships would be sent to these joint authorities for peer review and eventual inclusion in the perspective maintained in the TOS. With this process in place, the naïve user will have the option to select one of a few authoritative perspectives. This approach is modeled after the CI project of Crispin Wilson and might be viewed as closely related to an ITIS model with peer review. By empowering taxonomists to provide authoritative perspectives, the Tool will encourage greater participation.
  • Use Cases: None specifically, though see Tool V.
  • Integration: Is an "export option" in Tool V; once new information has been entered, it goes out for review for possible incorporation as a preferred perspective.
  • Design needs: See Xianhua's efforts; also IPNI, various other on-line peer-review projects.
  • Further details: See Tool VII.
  • Priority/development: We can build on a related Tool already under development for submitted plant communities, so it should be easily completed for linkage to Tool V.

VII. GUID Request and Assignment Tool

  • Potential users: "Authorities" – this is done by whoever is allowed to issue GUIDs to taxonomic concept information.
  • Functionality: Anytime someone has made it through Tool II or V and submitted one or more concepts, names, references, relationships, etc. for inclusion in the TOS, GUIDs need to be obtained. The GUID request Tool could be a plug-in to Tools II and V, or it could be integral to the workings of the TOS and therefore unnecessary as a stand-alone Tool. It is unclear whether we want a GUID creation Tool for concepts and relationships etc. that are not submitted to the TOS.
  • Use Cases: Use Case 15: assign GUID to taxonomic concept information (provisional).
  • Integration: Assign GUIDs to concepts and relationships entered through Tools II and V.
  • Design needs: See Dave Thau's prototype GUID server at http://www-124.ibm.com/developerworks/downloads/detail.php?group_id=124&what=rele&id=725
  • Further details: None yet.
  • Priority/development: Already under development (see above).
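
As a rough illustration of the assignment step, the sketch below gives each submitted record an identifier exactly once. It assumes random UUIDs from Python's standard library as a stand-in; the identifier scheme, the registry dictionary, and the record format are hypothetical and do not reflect the prototype GUID server mentioned above.

```python
import uuid


def assign_guids(records, registry):
    """Assign a GUID to each submitted record; known records keep theirs.

    registry is a plain dict standing in for whatever authority issues
    and remembers identifiers.
    """
    assigned = {}
    for record in records:
        if record not in registry:
            # uuid4().urn yields a string of the form 'urn:uuid:...'.
            registry[record] = uuid.uuid4().urn
        assigned[record] = registry[record]
    return assigned


# Hypothetical submission from Tool II or Tool V: (kind, label) pairs.
submission = [
    ("concept", "Quercus alba according to Flora A 1990"),
    ("relationship", "Quercus alba is child of Quercus"),
]
registry = {}
first_pass = assign_guids(submission, registry)
second_pass = assign_guids(submission, registry)
print(first_pass == second_pass)
```

Resubmitting the same records returns the same identifiers, which is the property a request Tool would need regardless of the scheme finally chosen.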

VIII. Concept Mark-Up Tool

  • Potential users: Ecologists using EML, submitting datasets with formal or informal taxonomic names.
  • Functionality: Assists ecologists to move from names in their datasets to concepts referenced in EML. Some close affinities with Tool III, perhaps applied in batch mode. For each name in a dataset (names with or without a reference) candidate concepts would be provided following a preferred perspective, and those names with issues of resolution would be flagged. This Tool must also provide functionality for documenting a field ecologist's concepts that lack taxonomic standing by describing them and linking them to established concepts that reside on the TOS.
  • Use Cases: See Tool III.
  • Integration: Needs to initiate as part of EML, possibly Kepler; at this point the Concept Introduction Manual (Tool I) comes into play as well. Ecologists have two core choices: (1) automated mark-up, and (2) manual mark-up. For choice (1), they can select one or a few preferred reference taxonomies (such as FNA) in which to search for concepts matching the names contained in their datasets; they can also interactively specify certain algorithmic preferences for performing the name/concept match. Once this is done, the SEEK Taxon service could automatically match the names to concepts and fill in a short-hand annotation of the best-matching concept in EML. For choice (2), the ecologist has access to Tools III and IV and, if necessary, to Tool V (add concepts/relationships, etc.), in case no stored concept in the SEEK Taxon database matches his/her concept in the dataset. Once the exploration/creation phase is over and a suitable concept has been located or newly entered, the ecologist must again paste a short-hand annotation of it into EML.
  • Design needs: Basically a hub between EML/Kepler and the SEEK Taxon service; must support Tool I and assist in choosing automated vs. manual name-to-concept resolution.
  • Further details: None yet.
  • Priority/development: High priority, although Tool III, and preferably also Tool V, need to be in place before this Tool can be applied; in the meantime, the taxonomic/voucher section of EML might need expansion/revision for compatibility with the TCS.
  • Gathering the information necessary for the mark-up occurs by interacting with the TOS, but storing the information will happen in Morpho/EcoGRID.
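
The automated mark-up choice described above might look roughly like the following sketch. It assumes exact, case-insensitive name matching against the concepts of the selected reference taxonomies; the GUID strings and names are hypothetical, and writing the annotation into EML is not shown.

```python
def mark_up(names, reference_concepts):
    """Match dataset names to concepts, flagging names that need manual work.

    reference_concepts is a list of (guid, name) pairs drawn from the
    preferred reference taxonomies. A name is annotated only when exactly
    one concept matches; otherwise it is flagged with its candidates.
    """
    index = {}
    for guid, name in reference_concepts:
        index.setdefault(name.lower(), []).append(guid)

    annotations, flagged = {}, {}
    for name in names:
        candidates = index.get(name.strip().lower(), [])
        if len(candidates) == 1:
            annotations[name] = candidates[0]
        else:
            # No match (e.g. an informal field name) or an ambiguous match.
            flagged[name] = candidates
    return annotations, flagged


# Hypothetical reference concepts and dataset names.
reference = [
    ("guid:1", "Quercus alba"),
    ("guid:3", "Quercus rubra"),
    ("guid:4", "Quercus rubra"),
]
annotations, flagged = mark_up(["Quercus alba", "Quercus rubra", "Oak sp. 1"], reference)
print(annotations)
print(flagged)
```

Flagged names are the ones that would send the ecologist to manual mark-up (Tools III, IV, and possibly V), including informal concepts that lack taxonomic standing.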

IX. Concept Merge Tool

  • Potential users: Ecologists who are conducting meta-analyses.
  • Functionality: Merges concepts that are similar so that ecological information can be pooled. This is actually the main functionality that ecologists need: what do the names mean, and is it admissible to pool the information pertaining to multiple names/concepts?
  • Use Cases: None within SEEK Taxon directly, but our Use Cases and services are needed to support this scenario. We should develop more specific Use Case scenarios here!
  • Integration: Not sure; presumably the names contained in the many independent ecological datasets have already been marked up as concepts. This might be handled through Kepler after all.
  • Design needs: Ability to import sets of concepts and find matches; suggesting when the ecological information can be merged and how to merge ambiguous matches. This is a complex task commonly undertaken in meta-analyses. We need to develop a scoping document that sets out in some detail the kinds of decisions that need to be made and the information needed to support those decisions. We anticipate providing requirements in some detail to SMS/Kepler for implementation as part of other data integration tools.
  • Further details: Need input from ecologists here…; SEEK Taxon will design the taxon-specific workflows/integration issues, but right now we think that SMS should end up completing the Tool.
  • Priority/development: Lower priority than Tool VIII as we need datasets before we can merge them; still, supposedly high → see Mammal Use Case!
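
One possible way to frame the merge decision is sketched below, pending the scoping document. It assumes that relationships between marked-up concepts are already available as (concept, relation, concept) triples, that only concepts related as "congruent" are pooled automatically, and that any other non-exclusive relation is reported as ambiguous for a human decision. These rules and labels are our assumptions for illustration, not agreed requirements.

```python
def pooling_groups(concepts, relations):
    """Group concepts whose data could be pooled, and list ambiguous matches."""
    parent = {concept: concept for concept in concepts}

    def find(concept):
        # Follow links to the group representative, shortening the path.
        while parent[concept] != concept:
            parent[concept] = parent[parent[concept]]
            concept = parent[concept]
        return concept

    ambiguous = []
    for first, relation, second in relations:
        if relation == "congruent":
            parent[find(first)] = find(second)
        elif relation != "excludes":
            ambiguous.append((first, relation, second))

    groups = {}
    for concept in concepts:
        groups.setdefault(find(concept), []).append(concept)
    return sorted(sorted(group) for group in groups.values()), ambiguous


# Hypothetical concepts from three datasets and their relationships.
concepts = ["A1", "A2", "A3", "B1"]
relations = [
    ("A1", "congruent", "A2"),
    ("A2", "congruent", "A3"),
    ("A3", "overlaps", "B1"),
]
groups, ambiguous = pooling_groups(concepts, relations)
print(groups)
print(ambiguous)
```

How ambiguous matches should be resolved (for example, pooling at a higher taxonomic level) is exactly the kind of decision the scoping document still needs to set out.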

X. Map-to-TCS Tool

  • Potential users: Taxonomic concept providers who understand and wish to document the relations of objects in their preferred database to those represented in the TCS.
  • Functionality: Assists in understanding the object relationships ("what are concepts?", etc.) according to different database structures.
  • Use Cases: None specifically (but see below).
  • Integration: Not clear; the Tool is sufficiently independent at this point.
  • Design needs: → Trevor Paterson, MapForce…
  • Further details: See above; also, this Tool could eventually plug in to Tool II.
  • Priority/development: Is in a sense already underway at Napier; the work started when Jessie and Robert tried to understand how other databases represent concepts. This Tool would now assist in storing and visualizing that information.


Next Steps

  • Post a Wiki version + a Word document, and advertise to SEEK Taxon.
  • Franz, Liu, and Peet will continue scoping Tools II, III and V; starting with Tool III in particular.


  • SBA February Taxon meeting: Susan Gauch suggests the creation of an additional tool, similar to the DOI system. The idea is that once someone finds a GUID in an (on-line) article, there is a service that will guide that person to the original concept definition. Susan will fill in the details!


Attachments:
TaxToolsOverview.doc (64512 bytes)


This page last changed on 15-Feb-2005 12:17:57 PST by NCEAS.franz.