Science Environment for Ecological Knowledge

Range Prediction

Use Case Niche Modeling Scenario I

Flow of Events

A scientist is interested in the native range of an oak species, for example to discover unknown populations, study patterns of distribution, prioritize areas for conservation, or identify suitable areas for re-introductions.

  1. Scientist issues a query for occurrences of 'Quercus rubrum' in California after 1950. (Space, time, taxon)
  2. Call taxon server on 'Quercus rubrum' to get the set of synonyms; convert term 'California' into bounding polygon.
  3. Query ecogrid for occurrence data sources (e.g., species=S1 or S2 ..., year>1950, bbox=x1,y1,x2,y2), which returns a set of matching metadata records for available datasets (?). Scientist selects relevant datasets based on the metadata records returned. (This search process is an ecogrid-sms-kr sticking point.)
  4. Concatenate the returned datasets into a single dataset. Note that these datasets may be heterogeneous: different fields and different resolutions for "data points" (which may have different accuracies and different metrics for location). We should keep all the fields.
  5. Query ecogrid for all data layers for the 'California' bounding box for the time period, which returns a set of metadata records for each layer available. (Note that this set may be very large!) We then need a mechanism that allows the scientist to select the appropriate datasets by grouping and filtering, for example to show only the climate-related datasets. A controlled vocabulary, as well as various metadata, can be used to group and filter the items.

  • Possibly before step 5, return the types of layers available for California within the time period, and allow the user to filter the types before returning metadata records (which would then be returned only for the types selected).
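
The grouping and filtering described in step 5 can be sketched as follows. This is only an illustration, not the ecogrid interface: the record fields ("id", "type", "title"), the vocabulary terms, and the function names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical layer metadata records; the field names and the
# controlled-vocabulary terms in "type" are assumptions.
LAYERS = [
    {"id": "L1", "type": "climate", "title": "Mean annual precipitation"},
    {"id": "L2", "type": "climate", "title": "Mean January temperature"},
    {"id": "L3", "type": "topography", "title": "Elevation"},
    {"id": "L4", "type": "soil", "title": "Soil pH"},
]


def layer_types(records):
    """Count the metadata records available under each vocabulary term."""
    counts = defaultdict(int)
    for record in records:
        counts[record["type"]] += 1
    return dict(counts)


def filter_layers(records, wanted_types):
    """Keep only the records whose type the scientist selected."""
    wanted = set(wanted_types)
    return [record for record in records if record["type"] in wanted]


# First show the available types, then return records only for the chosen ones.
available = layer_types(LAYERS)
climate_only = filter_layers(LAYERS, ["climate"])
```

Returning the type summary first keeps the initial response small even when the full set of metadata records is very large.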

Description

The original scenario (from Matt in SB)

The scientist first creates a semantic query -- a query posed against ontological information -- requesting (ecogrid) datasets that can be used as occurrence data for a particular oak species ('Quercus rubrum'), over a specific spatial footprint, and over a specific time period. (This example is expressed over the space, time, and taxon context of measurement.)

The scientist then issues the query using the semantic mediation system, which performs a series of steps to construct the necessary underlying queries (query rewritings) to the ecogrid. The underlying queries return a set of datasets. These returned datasets are then further manipulated by the mediation system. For example, the datasets returned may need to be joined (to extract the occurrence data), pruned to fit into the desired footprint, converted to the correct presence measure (for example, the value '1' for presence), and irrelevant fields removed. At this point, the scientist may wish to remove some of the candidate datasets from further analysis. The datasets are then combined (unioned) to form a single, uniform input table.
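
A minimal sketch of the pruning, presence conversion, field removal, and union described above, assuming each dataset is already a list of records with "lon", "lat", "year", and "observed" fields. Those field names, the accepted presence encodings, and the function names are hypothetical, and the join step is left out.

```python
def in_bbox(record, bbox):
    """True if the record's point lies inside bbox = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = bbox
    return x1 <= record["lon"] <= x2 and y1 <= record["lat"] <= y2


def to_presence(value):
    """Map assumed source encodings to 1 (present) or 0 (absent)."""
    if isinstance(value, str):
        return 1 if value.strip().lower() in {"present", "yes", "y", "1"} else 0
    return 1 if value else 0


def build_input_table(datasets, bbox, keep=("lon", "lat", "year")):
    """Prune to the footprint, normalize presence, drop other fields, union."""
    table = []
    for name, records in datasets.items():
        for record in records:
            if not in_bbox(record, bbox):
                continue  # outside the desired footprint
            row = {field: record[field] for field in keep}
            row["presence"] = to_presence(record["observed"])
            row["source"] = name  # remember which dataset the row came from
            table.append(row)
    return table
```

Keeping a "source" column lets the scientist drop a candidate dataset later without rebuilding the table from scratch.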

Next, the mediation system uses the implied footprint of the input table to query the ecogrid again for relevant environmental layers. The resulting layers are then integrated, which involves clipping the returned layers to the implied footprint, re-gridding the datasets to the same scale (based on the density of the presence/absence point datasets and environmental layers), and re-projecting the datasets to a common projection scheme (so that points? are correctly placed on a flat map).
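
The clipping and re-gridding steps might look like the sketch below. It assumes a very simple layer representation (a dict with a lower-left origin "x0", "y0", a square cell size "cell", and rows of "values" ordered from south to north) and uses nearest-cell sampling. The representation and function names are assumptions for illustration, and re-projection is not covered.

```python
def sample(layer, x, y):
    """Value of the layer cell containing (x, y), or None outside the layer."""
    col = int((x - layer["x0"]) // layer["cell"])
    row = int((y - layer["y0"]) // layer["cell"])
    values = layer["values"]
    if 0 <= row < len(values) and 0 <= col < len(values[0]):
        return values[row][col]
    return None


def clip_and_regrid(layer, bbox, cell):
    """Resample a layer onto a common grid covering bbox = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = bbox
    ncols = int(round((x2 - x1) / cell))
    nrows = int(round((y2 - y1) / cell))
    values = [
        # Sample at the center of each target cell.
        [sample(layer, x1 + (c + 0.5) * cell, y1 + (r + 0.5) * cell)
         for c in range(ncols)]
        for r in range(nrows)
    ]
    return {"x0": x1, "y0": y1, "cell": cell, "values": values}
```

Applying clip_and_regrid with the same bbox and cell size to every layer gives grids of identical shape, so values can be compared cell by cell.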

Finally, the rule set and resulting prediction map are stored (in the ecogrid), with appropriate metadata.

Assumptions

Notes

History

<Date>
<person>



This page last changed on 07-Jul-2004 08:51:23 PDT by LTER.stekell.