Beam Knowledge Rep SMS Mar 05

Date: March 7-10, 2005
Location: UC Davis, Genome Center (http://genomecenter.ucdavis.edu)

Participants

Steve Cox (Stephen.Cox@tiehh.ttu.edu)
David Chalcraft (chalcraftd@mail.ecu.edu)
Shawn Bowers (bowers@sdsc.edu)
Bertram Ludaescher (ludaesch@ucdavis.edu)
Deana Pennington (dpennington@lternet.edu)
Mark Schildhauer (schild@nceas.ucsb.edu)
Katy Suding (ksuding@uci.edu)
Evan Weiher (weiher@uwec.edu)
Dan Higgins (higgins@nceas.ucsb.edu)
Manu Jayal (manu.jayal@asu.edu)
Aimee Stewart (astewart@ku.edu)
Nico Franz (franz@nceas.ucsb.edu)
Elsa Cleland (elsac@stanford.edu)

Goals

High-level ecological ontologies: develop detailed ontologies for Biodiversity & Productivity in terms of relevant concepts and relationships, from the very general theoretical level all the way down to the operational level (algorithms and measurements used in quantifying biodiversity and productivity).
Ontologies of data sets and analyses: develop detailed ontologies of the data sets and analyses needed for some "past" Biodiversity/Productivity research, including clarifying the semantics and data transformations carried out to merge/integrate/summarize data, as well as "describing" the analytical components so that their Inputs/Outputs and "functions" are sufficiently exposed to facilitate discovery and re-use of these components in alternative scientific workflows.
Ontologies of ecological methodologies: develop detailed ontologies that capture the essential features and differentiating nuances of the field and other methodologies employed in capturing the data to be dealt with in these investigations (overlaps with task 2 above). (We must not lose sight of the need for "spatiotemporal ontologies", and the specific capabilities these will provide us with regard to 1-3 here.)
Scientific workflows: develop detailed scientific workflows for some "past" Biodiversity/Productivity research, to formally capture at a fine grain the steps and operations, with sufficient semantic annotation to facilitate their transparency and reusability by other researchers.
Taxonomic capabilities: identify the specific types of services that "ecologists" might need for the selected use case to improve the ability to deal with taxonomic names, especially in the context of historical, long-term and globally distributed biodiversity information.
Ecological research challenge: conceive of a "challenging" Biodiversity/Productivity analysis that could be enabled via a scientific workflow management system (Kepler), that demonstrates the power of accessing distributed data and computing resources, and that would otherwise be highly inefficient or intractable for a "typical" individual researcher.

Agenda

Monday, March 7th

8:30-9:30 Introductions, Overview of Goals, Status Update (Shawn, Deana, Mark)
9:30-10:00 Linkages with Taxon (Aimee, Deana, Mark)
10:00-10:30 Update from Manu about Spatiotemporal ontology work
10:30-10:45 Break
10:45-12:30 Work on Domain Ontologies (evaluate Rich's ontologies, GrOWL, SPARROW)
12:30-1:30 Lunch
1:30-3:30 Cont. Domain Ontologies
3:30-4:00 Break
4:00-5:30 Discussion and Preview for tomorrow (past analyses and data)

Tuesday, March 8th

8:30-10:30 Develop Scientific Workflows
10:30-10:45 Break
10:45-12:30 Cont. with Scientific Workflows
12:30-1:30 Lunch
1:30-3:30 Develop Ontologies of Scientific Methods
3:30-4:00 Break
4:00-5:30 Discussion and Preview for tomorrow (new scientific challenge; new analysis and data needs)

Wednesday, March 9th

8:30-10:30 Cont. with Ontologies for Scientific Methods (including presentation by Nico Franz about Use Case involving Taxonomic Name Resolution)
10:30-10:45 Break
10:45-12:30 Data Integration Ontologies
12:30-1:30 Lunch
1:30-3:30 Conceive of new BiodivProd challenge addressable through Kepler/KR/SMS
3:30-4:00 Break
4:00-5:30 Further define next step challenge

Thursday, March 10th

8:30-10:30 Discussion (ontologies, workflows, next steps, assignments)
10:30-10:45 Break
10:45-12:00 Continue wrapup, next meeting?, and adjourn

Notes

March 7th

Introduction, etc., Mark

Went over the SEEK acronym soup
Goals of the meeting / Agenda

Aimee on Taxon

Now trying to get data into the system
Presentation from TDWG (Taxonomic Databases Working Group)

solution

connect data and ideas in dbs
allow searches and retrievals
be customizable by the user

architecture
allow alternative algorithms to weight matches
allow taxonomists to: author new ideas, make new connections

along with a pub., publish the new connections, etc.

German moss data most comprehensive so far; 24000 concepts
ITIS concepts are shallow/incomplete
How to fit in with functional traits?

Some dbs in Europe; not in US
Can't derive functional traits to taxon; but a lot of constraints are taxonomic
Characteristic data ... hard to obtain
Ancillary db mined from literature on characteristics of species, mining data from these Floras
Building an analysis that works off of gene sequencing ... e.g., by incorporating GenBank

Functional Traits Databases

USDA Plants (tons of plants, taxonomic subspecies, ...) http://plants.usda.gov/

Manu

Reconciling datasets

Why don't datasets match, how to bring them "to the same level"

Observations

entities, context, methodology, unit, ..., representation
e.g., observations of overhunting
certain "functions" are called: e.g., rescaling, reprojecting, etc.
how to represent these rules? ...

A lot of discussion on transformation
Mark wants to know how to do unit conversion

Shawn

Onto stuff

Some datasets

arc: lgcover2001
gce: PLT-GCED-0409
Niwot and Konza sites have data, e.g., that require multiple datasets to create an "integratable" source

Use case: Reconstructing an Experiment

Data Request Form:

Selected sites with plants with little or no secondary growth (shrubs) ... herbaceous-dominated
Replicated Plots (to compute means)
Experimental manipulation of resources
Species specific abundance in some way
At first, wanted time series, but then couldn't do it -- chose the most recent year

Data integration

Do a column-by-column merge

for example r1(A, B) and r2(C, D) creates: r(A, B, C, D)
or, if you know, e.g., that A and C are similar (e.g., Area), then: r(AC, S, B, D), where S is a lineage column
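The two merge variants above could be sketched roughly as follows. This is only an illustration, not something worked out at the meeting: the notes do not say how rows are combined, so the sketch assumes rows are stacked (an outer union, with missing columns filled with None) and that the lineage column S simply records which source relation each row came from. The function names and sample values are hypothetical.

```python
def outer_union(r1, r2):
    """Stack the rows of r1 and r2 under the union of their columns.

    Columns a row does not have are filled with None (assumed behavior).
    """
    rows = r1 + r2
    # dict.fromkeys keeps first-seen column order while dropping duplicates
    columns = list(dict.fromkeys(c for row in rows for c in row))
    return [{c: row.get(c) for c in columns} for row in rows]


def merge_with_lineage(r1, r2, col1, col2, merged, lineage="S"):
    """Fold col1 (from r1) and col2 (from r2) into one column `merged`.

    The `lineage` column records the source relation of each row
    (assumed meaning of S in the notes).
    """
    out = []
    for name, rows, col in (("r1", r1, col1), ("r2", r2, col2)):
        for row in rows:
            new_row = {merged: row[col], lineage: name}
            new_row.update({k: v for k, v in row.items() if k != col})
            out.append(new_row)
    # Reuse outer_union to pad the columns that only one source has
    return outer_union(out, [])


# Hypothetical sample data: A and C are both "Area"-like columns
r1 = [{"A": 10.0, "B": "plot-1"}]
r2 = [{"C": 25.0, "D": "fertilized"}]

plain = outer_union(r1, r2)                          # r(A, B, C, D)
merged = merge_with_lineage(r1, r2, "A", "C", "AC")  # r(AC, S, B, D)
```

Keeping S alongside the merged column means a later step can still tell which dataset (and therefore which methodology or unit) each value came from, which matters if A and C are only "similar" rather than identical.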

Tuesday 8 March 2005

KR/SMS/Taxon Integration Breakout


This page last changed on 11-Mar-2005 08:03:30 PST by LTER.dpennington.