Science Environment for Ecological Knowledge

Semantic Search Examples

Semantic searches have been posited to significantly improve the ability of scientists to locate ecological and environmental data that are relevant to their particular purposes. This page provides some example searches that it would be useful to be able to express and execute in the SEEK SMS system.

Current metadata searches

Current metadata-driven search systems allow searches against EML to retrieve relevant documents. Such searches include simple keyword matching against the full text of the metadata document, against particular metadata fields (e.g., to find the Creator of a data set), and compound searches. The compound searches use the structure of the metadata document to create a search based on space, time, biological taxonomy, or other criteria that are provided in the metadata content.

Some example searches one can conduct include:

  • Find all data sets with the term 'soil' in the metadata
  • Find all data sets with the term 'soil' in the abstract
  • Find all data sets with the term 'soil' and the term 'horizon' in the abstract
  • Find all data sets with the term 'marine' in the metadata located west of 120 degrees longitude
  • Find all data sets with the named place 'Santa Barbara County' in the geographic description
  • Find all data sets that list 'Abies lasiocarpa' as a covered taxon
  • Find all data sets that were collected between 1990 and 2000
  • Find all data sets located west of 120 degrees longitude that were collected between 1990 and 2000 that list 'Mytilus californianus' as a covered taxon
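
As a minimal sketch of how compound searches like these combine criteria, the following Python models metadata records as simple objects and filters them by keyword, longitude, date range, and taxon. The record layout, field names, document identifiers, and sample values are all hypothetical; this is not the EML schema or any KNB interface. Longitudes are assumed to be signed decimal degrees with west negative, and "collected between" is assumed to mean that the whole temporal coverage falls inside the range.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical, simplified stand-in for a metadata document."""
    doc_id: str
    abstract: str
    west_lon: float   # westernmost longitude, signed degrees (west is negative)
    east_lon: float   # easternmost longitude, signed degrees
    begin_year: int
    end_year: int
    taxa: set = field(default_factory=set)


def has_terms(rec, *terms):
    """True if every term occurs (case-insensitively) in the abstract."""
    text = rec.abstract.lower()
    return all(t.lower() in text for t in terms)


def west_of(rec, lon):
    """True if the record's whole extent lies west of the given longitude."""
    return rec.east_lon < lon


def collected_between(rec, start, end):
    """True if the temporal coverage falls entirely within [start, end]."""
    return rec.begin_year >= start and rec.end_year <= end


def search(records, *predicates):
    """Return ids of records that satisfy all predicates (a compound search)."""
    return [r.doc_id for r in records if all(p(r) for p in predicates)]


# Made-up sample records for illustration only.
RECORDS = [
    Record("example.1", "Soil horizon depth survey", -121.5, -120.5, 1992, 1998),
    Record("example.2", "Marine intertidal mussel counts", -124.0, -122.0,
           1995, 1999, {"Mytilus californianus"}),
    Record("example.3", "Marine mussel growth", -119.0, -118.0,
           2001, 2003, {"Mytilus californianus"}),
]

soil_horizon = search(RECORDS, lambda r: has_terms(r, "soil", "horizon"))
compound = search(
    RECORDS,
    lambda r: west_of(r, -120.0),
    lambda r: collected_between(r, 1990, 2000),
    lambda r: "Mytilus californianus" in r.taxa,
)
```

Each criterion is an independent predicate over the structured fields, so a compound search is simply their conjunction; nothing in it knows what the terms mean.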

Semantic searches

The above searches are useful, but they tend to be neither particularly precise nor very complete. A search for 'soil' in the KNB returns over 3000 data sets, many of which may not be relevant to a study on, for example, soil chemistry. On the other hand, a search on 'soil chemistry' might match only a few data sets, because that exact term occurs in only a few metadata records. Instead, relevant metadata records might include terms such as 'acidity', 'salt content', 'Nitrogen', 'Potassium', 'K', 'N', etc. A semantic search system that could recognize that 'N' and 'Nitrogen' are the same concept, and that both are related to 'SoilChemistry', would be a substantial improvement. In addition, scientists frequently search for data in which particular measurements have been made, such as 'Density estimates for abalone' or 'Solar radiation at 1m above ground'. These searches are not possible with the current metadata-based search system, and instead require a far more semantically aware search system. Some of the searches that would be useful include (an example data set that should match the query is included in parentheses):

  • Find all data sets with measurements of DissolvedCarbon concentration (knb-lter-gce.259.16)
  • Find all data sets with measurements of WaterChemistry (knb-lter-gce.259.16, knb-lter-gce.199.7)
  • Find all data sets of Abundance of Invertebrate organisms (knb-lter-gce.24.16)
  • Find all data sets in the intertidal zone of the Atlantic coast of the United States (knb-lter-gce.24.16, many others)
  • Find all data sets with LeafArea and OrganicMass measurements in Grasses (knb-lter-gce.4.8)
  • Find all data sets with Height measurements for 'Armases cinereum' sensu Abele 1992 (knb-lter-gce.23.16)
  • Find all data sets with Abundance measurements for crabs where sampling occurred within sampling units with a minimum size of 0.5 square meters (knb-lter-gce.23.16)
  • Find all data sets in which Temperature and Salinity measurements of Water were made in the same Observation (knb-lter-gce.139.8)
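
The 'N' and 'Nitrogen' example above can be sketched in a few lines of Python: labels found in metadata are first mapped to concepts, and a query for a concept also matches anything beneath it in a small concept hierarchy. The synonym table, the parent links, the concept names other than those quoted on this page, and the data set identifiers are all hypothetical illustrations, not the SEEK ontologies or the SMS interface.

```python
# Hypothetical synonym table: lower-cased metadata label -> concept name.
SYNONYMS = {
    "n": "Nitrogen",
    "nitrogen": "Nitrogen",
    "k": "Potassium",
    "potassium": "Potassium",
    "acidity": "Acidity",
    "salt content": "SaltContent",
}

# Hypothetical hierarchy: concept -> broader concept.
PARENT = {
    "Nitrogen": "SoilChemistry",
    "Potassium": "SoilChemistry",
    "Acidity": "SoilChemistry",
    "SaltContent": "SoilChemistry",
}


def to_concept(label):
    """Map a free-text label to a concept, or None if it is unknown."""
    return SYNONYMS.get(label.strip().lower())


def is_a(concept, ancestor):
    """True if concept equals ancestor or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def semantic_search(datasets, concept):
    """Return ids of data sets with at least one label under the concept."""
    hits = []
    for doc_id, labels in datasets.items():
        concepts = {to_concept(label) for label in labels}
        if any(is_a(c, concept) for c in concepts):
            hits.append(doc_id)
    return sorted(hits)


# Made-up data sets, each described only by the labels in its metadata.
DATASETS = {
    "example.a": ["N", "pH"],
    "example.b": ["Potassium"],
    "example.c": ["Solar radiation"],
}

soil_chem = semantic_search(DATASETS, "SoilChemistry")
nitrogen = semantic_search(DATASETS, "Nitrogen")
```

Here a query for 'SoilChemistry' finds both the data set labelled 'N' and the one labelled 'Potassium', even though neither record contains the words 'soil chemistry'. A keyword search would miss both.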

Hopefully these represent a good discussion starter on semantic searches.



This page last changed on 17-Jan-2008 11:31:37 PST by uid=jones,o=NCEAS.