Attendees
Links
Crispin Wilson
Multiple taxonomy resolution with annotation. Primarily for browsing; it should not be too hard to add a programmatic interface. Data is stored within a single database, a centralized repository.
Use Cases
There is now a TaxonUseCases page, which contains the current list of use cases for the SEEK taxon group.
Questions
Revised Scope
Services:
Architecture:
Why Distributed?
Populating the Resource:
concept = Full name + (author + publication, date) + usage reference [to taxonomic work]
Names by themselves are a useful entity.
Names are not substitutes for concepts.
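The concept definition above could be modeled as a small record type, with an index by name string showing why a name alone does not identify a concept. This is only a minimal sketch: the class name, the field names, and the placeholder names and works used as data are all hypothetical and are not part of the SEEK design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    """A name as used in a particular taxonomic work (hypothetical fields)."""
    full_name: str        # scientific name string
    author: str           # author of the name
    publication: str      # publication in which the name appeared
    date: int             # year of that publication
    usage_reference: str  # taxonomic work that uses the name in this sense


def index_by_name(concepts):
    """Group concepts by name string; one name may map to many concepts."""
    index = defaultdict(list)
    for concept in concepts:
        index[concept.full_name].append(concept)
    return dict(index)


# Placeholder data: the same name used in two different taxonomic works.
concepts = [
    TaxonConcept("Aus bus", "Author A", "Publication P", 1900, "Work X"),
    TaxonConcept("Aus bus", "Author A", "Publication P", 1900, "Work Y"),
    TaxonConcept("Aus cus", "Author B", "Publication Q", 1950, "Work Y"),
]
by_name = index_by_name(concepts)
```

Here the name "Aus bus" resolves to two distinct concepts because the usage reference differs, so a lookup by name has to return a list rather than a single record.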
From the Proposal
An Internet taxonomic concept (assertion) resolution service employing a semantic mediation engine would exploit the SEEK architecture to enable precise species concept based data discovery and integration. This specific concept identity resolution problem is representative of a large class of problems (e.g. with classifications for biotic communities, soils, rocks, places) where there exist many-to-many relationships between concepts and names. The solutions we develop should have fundamental utility far beyond biological nomenclature and biodiversity.
IT Research Challenges
Deliverables
+++ Added during workshop +++
Milestones
Year 1
Year 2
+++
From CVS Document
Scope
Includes any type of analysis or model in ecology and biodiversity science. Goal is to massively streamline the analysis and modeling process, and provide for archiving analyses and their outputs. Includes support for analyses in SAS, Matlab, R, SysStat and custom models written in various languages (e.g., C). The system should allow the addition of various back-end analytical engines as they become available or as new versions are released. The system as a whole should not be tied to any one metadata standard, back-end system or operating system/platform. Flexibility should be a major concern in the design process due to the heterogeneous makeup of the ecological scientific community.
The system should include features that assist users in determining the appropriateness of combining various analytical steps and data sources based on semantic mediation. Semantic mediation should occur in three areas. First, to determine whether it is appropriate to link together particular analytic steps. Second, to mediate between multiple data sets to determine in what ways they can be combined. Third, to determine whether the selected data sources are appropriate inputs for the selected analysis.
Functional requirements
FR1: Analyses and models documented in declarative language (e.g., XML)
FR2: Must support 'pipelining' of models in a graph
FR3: Ability to archive analyses and their outputs
FR4: Ability to version analyses and their outputs
FR5: Must have an easy-to-use front end GUI to assist scientists with building and executing pipelines
FR6: Allows the sharing of analytical processes amongst scientists
FR7: Flexibility in input, processing and output, e.g., not binding the system to one metadata standard, back-end system or platform
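As an illustration of FR2, a pipeline can be treated as a directed acyclic graph in which each step runs only after the steps it depends on. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to order the steps. The function name `run_pipeline`, the step names, and the sample data are assumptions made for the example, not part of any SEEK specification.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies, inputs):
    """Run each step after the steps or inputs it depends on.

    steps: step name -> function taking a dict of upstream results
    dependencies: step name -> set of names it depends on
    inputs: data bound to the pipeline, keyed by name
    """
    results = dict(inputs)
    # static_order() yields every node with its dependencies first and
    # raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dependencies).static_order():
        if name in steps:  # bound inputs have no function to run
            upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
            results[name] = steps[name](upstream)
    return results


# Hypothetical two-step pipeline: drop missing values, then average.
steps = {
    "clean": lambda up: [x for x in up["raw"] if x is not None],
    "mean": lambda up: sum(up["clean"]) / len(up["clean"]),
}
dependencies = {"clean": {"raw"}, "mean": {"clean"}}
results = run_pipeline(steps, dependencies, {"raw": [1.0, None, 3.0]})
```

Because every intermediate result is kept in `results`, the same structure could also feed the archiving called for in FR3, although how outputs would actually be stored is left open here.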
Use cases
UC1: Scientist can create new analytic steps
UC2: Scientist can use a graphical interface to arrange analytical steps into a pipeline, save it, bind data to the inputs, and execute it
UC3: Scientist can execute an analysis or model described in a declarative language
UC4: Scientist can archive various intermediate and endpoint results of an analytical process
UC5: Scientist can create new versions of analytical steps, and can return to old versions
UC6: Scientist can share coded pipelines or sub-pipeline steps and results of pipeline analyses with other scientists
UC7: Administrators can add support for additional metadata processors and back-end systems when needed
UC8: Scientist can work backwards through a pipeline of interest: starting from knowledge of the semantics of the desired result, the scientist can determine the type of data needed as inputs to the pipeline
UC9: Given a particular data set and set of pipelines, the scientist can use the semantic mediation system to determine the types of analyses that are possible to carry out on the data set
Software components
SW1: Metadata language for formal description of analyses
SW2: Metadata language for the formal description of data and model semantics
SW3: Server-side system for execution of analyses and models
SW4: Server-side system for processing semantic metadata
SW5: Client interface for creating and executing analyses and models
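Use case UC9 above can be pictured, in a very reduced form, as matching the semantic types a data set provides against the types each pipeline requires. The sketch below stands in for real semantic mediation with a plain set comparison over labels. The function name and every type and pipeline label in it are hypothetical, and an actual mediation system would presumably reason over richer metadata than exact label matches.

```python
def applicable_pipelines(dataset_types, pipeline_requirements):
    """Return the names of pipelines whose required input types are all present.

    dataset_types: semantic type labels the data set provides
    pipeline_requirements: pipeline name -> required input type labels
    """
    available = set(dataset_types)
    return sorted(
        name
        for name, required in pipeline_requirements.items()
        if set(required) <= available  # every required type is available
    )


# Hypothetical labels, for illustration only.
dataset_types = {"species_count", "plot_area"}
pipeline_requirements = {
    "density": {"species_count", "plot_area"},
    "biomass_model": {"species_count", "biomass"},
    "richness": {"species_count"},
}
possible = applicable_pipelines(dataset_types, pipeline_requirements)
```

With these placeholder labels, `possible` contains "density" and "richness" but not "biomass_model", since the data set provides no "biomass" type.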
This material is based upon work supported by the National Science Foundation under award 0225676. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF). Copyright 2004 Partnership for Biodiversity Informatics, University of New Mexico, The Regents of the University of California, and University of Kansas