Science Environment for Ecological Knowledge

Beam Knowledge Rep Sept 04



Beam Knowledge Representation Meeting, Sept. 21-23, 2004


Participants

September 21st

  • Introductions and Presentations
    • Mark presentation on SEEK background
    • Deana presentation on Biodiversity, etc.
    • Discussion
      • Species traits
      • Data sets, availability
      • Integration
    • Chad / Dan presented on Kepler
      • Much interface discussion
      • Showed Pred/Prey and Bio index models
    • Rich Williams on ontologies
      • Restricted to certain axes; spatial patterns (naturally occurring gradients); abundance; temporal
      • In a particular control plot, how are things changing, and are those changes in the same trajectory (in the same gradient change)?
      • Ton of data in range management
      • Q: Are most of these datasets freely available on the web; are people sharing them? A: Few available on web ... Q: But is it possible to get them? A: It is a very divergent / diverse community ...
    • Shawn Bowers on semantic integration

September 22nd

  • Agenda Setting
    • Some overlapping stuff on every biodiversity analysis
    • Methods / Design focus
    • Analysis focus
    • Mark: Goal is to compartmentalize so as to provide general utility for the next project or for some other analysis; capture the standard functions we want and describe them so others can use them
    • Bob: At one point there was a database with various scripts that provided the full spectrum of data integration
      • They aren't there anymore (they were on KNB bio)
      • Everything was done using scripts, no manual work
      • From 16 or so grasslands
      • Raw data were read in by the scripts
      • Project technically still ongoing
      • Aug 2002 first working group
      • Then six months of work after that
      • It would be a good test case / use case to look into
      • Steve: Jornada would be the test case

  • Data Integration and Analysis
    • Katy (note, the following mixes discussion with Katy's presentation)
      • Predicting species response to increased resource availability (history, questions, dataset)
      • what happens when you increase productivity (experimentally), and then look at what happens to diversity
      • N (nitrogen) Fertilization experiments (KBS oldfields, ARC heath)
        • if you add nitrogen, it generally increases productivity (Gough et al. 2000)
        • Every experiment found that as you increase productivity it decreases diversity, which doesn't follow the natural productivity/species-richness curve (this is gaining interest in the community: we are increasing the productivity of systems in general through ongoing environmental change, e.g., urbanization increases nitrogen/fertilization, and there is a desire to know the effect on diversity)
        • What is Primary Productivity "can of worms" discussion
          • Want to aggregate data at many different scales/communities to get many types of graphs; you want to ultimately go from smallest possible scale to largest scale (infer curve of graph)
          • What decision making occurs before any analysis happens? This goes into the data discovery/integration. Deana: can we build a repository of methodologies?
          • You do a broad category of the dataset, but not the details (a little bit of woody stuff, and so on...)
          • Bob: need a step where someone can look at the methodology ...
          • Rich: Need to capture what it is you are measuring; it isn't as much a methodology issue
          • Bob: Clark and Clark paper covers some of this (Bob said he'd dig up the ref)
        • Most of the time, productivity scales well, i.e., it is a linear scaling, e.g., anpp vs. area is a linear relationship. So for example, even though they are smaller plots, you can get g/m^2 measures. Basically, productivity doubles as area doubles...
        • N addition decreases species diversity: plot at LTER sites of ANPP (above-ground net primary productivity) versus relative species density; species density is the number of species observed in a given area
        • Two studies, measures at different scales: either you simulate or measure a species area curve, and extrapolate to different areas.
        • N fertilization positively affects productivity; negatively affects diversity
          • N fert. interacts with environment; species sorting (dominance composition) negatively affects diversity
        • Species response to fertilization:
            • In the case of increased fertility, can we predict what species we will lose? What species will become dominant?
            • Are these responses contingent on system characteristics?
        • Dataset: 8 lter sites, 28 community types (mainly vegetation classifications, based on the experiments/manipulations done at the sites); 831 species
        • Dataset characteristics
          • N added (g/m^2/yr), 10 (ARC), 9.5 (CDR), 60/wk (GCE), etc.
          • Form of N: NH4-NO3 pellets (ARC), Liquid Urea-N, NH4-NO3 pellets, etc.
          • Treatment plot size (m^2): differs from 900 to .25
          • Sample plot size (m^2): .25 to 10 (these don't differ that much really: .32, .30, .25, 1) ... it isn't, however, appropriate to compare directly, without the species area curves, the .3's with the 1's
          • Replication: 2 to 10
          • Duration (yrs): 2 to 13
        • Often, a matrix like this is constructed at the beginning of doing a "synthesis", and the most important points to track for each site are: treatment, sample size, duration, and replication.
      • Data "Request" (this is basically a data procurement request/query)
        • Contacted LTER and asked for the data in a particular format (basically the matrices)
        • At each site, asked for N-fertilization experiments: abundance, species, measures of productivity, treatment plots, un-treated plots, herbaceous systems, and to give latest sample time
        • List of vegetational forms; growth forms (secondary growth); herbaceous is a property of plants
        • Herbaceous term applied to a nonwoody stem/plant with minimal secondary growth
        • Example dataset:
          • atts: site, comm, species name, RA_Control, V_Control, RankC, PRankC, n, V_Naddition, RankN, PRankN, Cot, Dur, LF, DLF, HT, CLN, Origin, Family, Response, ImmExt, InRR, Change,
          • comm is a subset of n-fertilization of sites (e.g., tiled and untilled in KBS), thus <Site,Comm> denotes the actual place
          • each row constitutes an observation within an experiment
          • RA_control is the mean, V_control is the variance, the rank is derived from control, ...
      • Lots of discussion about separation of syntax and semantics issues; and of excel details
      • "Generic" stuff
        • Species/attribute matrix: to compute trait responses, functional attributes
        • Measure traits in as many species as possible, then throw away points
    • Both projects pretty much took from the same original, "raw" data sets
      • 34 from Katy's project
      • 13 sites from NCEAS project
    • Brainstorming:
      • Six-month view: we know diversity is made up of species, and we have all that data but don't use it to its full potential; productivity/diversity data needs to be integrated with community structure, and integrated across a lot of sites.
      • General tasks: Identifying data that is relevant (talking with people), permission to obtain the data, understanding the structure and content of the data (sampling design, how or what attributes mean), and then determining which can be appropriately integrated to do an analysis. From Jornada: http://jornada-www.nmsu.edu/ (go to "Research Data" > "LTER Data" > "Plant")
      • ANPP versus r and a bunch of data points: how can the data points be linked back to a table of other features of the species involved in each point? A point represents the set of species in an area (the species "richness"); the auxiliary table holds characteristics of the species
      • As another example, compare what happens to the species between the points (as area increases); and more importantly the functional traits of the change
        • Comment: These are just visualization things
      • Null model, and the null expectation
      • Abundance vs loss prob for n-fix and not n-fix
      • Standard toolset
      • Measuring functional diversity is a big problem -- a computationally complex problem
      • For loss-probability and N-fixers you need to do data integration: how many sites are needed, and so on ...
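
The point above that sample plots of different sizes (e.g., .25 vs. 1 m^2) should not be compared directly without species-area curves can be sketched in code. This is only an illustration, not the method used in either project: it assumes a power-law species-area curve S = c * A^z (the notes do not say which curve form was used), and the function names and the nested-quadrat numbers are invented for the example.

```python
import math

def fit_power_law(areas, richness):
    """Least-squares fit of log S = log c + z * log A; returns (c, z).

    Assumes the species-area curve is a power law, which is an
    assumption of this sketch and not something stated in the notes.
    """
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    c = math.exp(mean_y - z * mean_x)
    return c, z

def rescale_richness(s_observed, area_observed, area_target, z):
    """Scale richness from one sample area to another, given exponent z."""
    return s_observed * (area_target / area_observed) ** z

# Invented nested-quadrat counts (area in m^2, species observed).
areas = [0.25, 0.5, 1.0, 2.0]
richness = [4, 5, 6, 8]

c, z = fit_power_law(areas, richness)
# Bring a richness of 5 species seen in a 0.3 m^2 sample to a 1 m^2 basis.
print(round(rescale_richness(5, 0.3, 1.0, z), 2))
```

With an exponent fitted per community type, richness from the .25, .30, and .32 m^2 samples could be put on the same 1 m^2 footing before plotting against ANPP. Whether a single exponent is appropriate across sites is exactly the kind of decision the notes say happens before any analysis.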

  • Biodiversity Change Analysis
    • How to understand change in biodiversity: what are the factors causing the change in biodiversity
      • Given an attribute matrix, which are changing the most or least at particular segments in the species-area curve
      • Focus on Community structure
      • Niche breadth
      • Predict why diversity would decline / change
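
One way to read the attribute-matrix idea above is: take two points on the species-area curve, find the species that appear only at the larger area, and look up their traits in an auxiliary species table. The sketch below shows that lookup under stated assumptions: the species names, the trait names (`growth_form`, `n_fixer`), and all values are hypothetical and are not drawn from any dataset discussed at the meeting.

```python
from collections import Counter

# Hypothetical species-by-trait table; every name and value is invented.
TRAITS = {
    "sp_a": {"growth_form": "herbaceous", "n_fixer": False},
    "sp_b": {"growth_form": "herbaceous", "n_fixer": True},
    "sp_c": {"growth_form": "woody", "n_fixer": False},
    "sp_d": {"growth_form": "herbaceous", "n_fixer": True},
}

def species_gained(smaller_point, larger_point):
    """Species present at the larger area but absent at the smaller one."""
    return set(larger_point) - set(smaller_point)

def trait_counts(species, traits, trait_name):
    """Tally one trait over a set of species, skipping species with no trait record."""
    return Counter(traits[sp][trait_name] for sp in species if sp in traits)

# Each point on the curve is represented by the set of species observed there.
small = {"sp_a", "sp_b"}
large = {"sp_a", "sp_b", "sp_c", "sp_d"}

gained = species_gained(small, large)
print(sorted(gained))
print(trait_counts(gained, TRAITS, "growth_form"))
print(trait_counts(gained, TRAITS, "n_fixer"))
```

Repeating the tally for each segment of the curve would show which attributes change most or least along it; comparing against a null expectation is left out of this sketch.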

  • Deana's Example
    • Query
      • Some data by keyword: "biodiversity", "species counts", "abundance", species names, functional trait
      • Location, e.g., place name, coordinates, bounding box
      • Sample method
      • Analysis type, e.g., counts, abundance
    • Construct logically and semantically equivalent views
    • Group on sample methods (transect vs plot)
    • Data
      • data1: transect 20m -> rarefaction (species-area curve) -> scale to 1 m interpolation
      • data3: transect 20m -> rarefaction (species-area curve) -> scale to 1 m interpolation
      • data2: plot 5m^2 -> species-area curve -> scale to 1 m interpolation
      • data4: plot 1m^2
    • Integrate transect data
    • Integrate plot data
    • Construct graphs
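
The first steps of the workflow above (group datasets by sample method, then rarefy) can be sketched as follows. This is a hedged illustration only: the dataset records mirror data1 to data4 as listed, but the dictionary layout and function names are invented, and the rarefaction shown is the standard individual-based (Hurlbert) expectation, which may differ from the sample-based species-area rarefaction the group had in mind.

```python
from collections import defaultdict
from math import comb

# The four datasets from the notes; the field names are assumptions.
DATASETS = [
    {"name": "data1", "method": "transect", "size": "20 m"},
    {"name": "data2", "method": "plot", "size": "5 m^2"},
    {"name": "data3", "method": "transect", "size": "20 m"},
    {"name": "data4", "method": "plot", "size": "1 m^2"},
]

def group_by_method(datasets):
    """Map each sample method to the names of the datasets that used it."""
    groups = defaultdict(list)
    for d in datasets:
        groups[d["method"]].append(d["name"])
    return dict(groups)

def rarefied_richness(abundances, n):
    """Expected species count in a random subsample of n individuals.

    Hurlbert's formula: sum over species of 1 - C(N - N_i, n) / C(N, n),
    where N is the total number of individuals and N_i the count of species i.
    """
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the total count")
    return sum(1 - comb(total - a, n) / comb(total, n) for a in abundances)

print(group_by_method(DATASETS))
# Invented abundance vector for one transect, rarefied to 10 individuals.
print(round(rarefied_richness([12, 7, 3, 1, 1], 10), 2))
```

Transect and plot groups would each be rarefied and rescaled separately before the integration steps, since the notes treat them as separate views.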

September 23rd

  • Agenda Setting
    • Next meeting
    • Convert Deana's starting workflow to a specific analysis ...

  • Next Meeting
    • February or March time frame
    • What we might speak about / do for next meeting
      • Get data, scripts, etc.
      • Get the biodiv workflow into Kepler
      • Maybe get familiar with Kepler / ecological tutorial for feedback
      • Data integration examples
      • Given the technology -- what do you want to do?
      • IRC channel for BEAM? Weekly, bi-weekly scheduled meeting?

  • Group Breakouts
    • Biodiversity ontology breakout
      • Participants: Deana Pennington, Kristin Vanderbilt, Evan Weiher and Rich Williams.
    • Biodiversity workflow breakout
      • Participants: Steve Cox, David Chalcraft, Shawn Bowers, Bertram Ludaescher, Mark Schildhauer, Chad Berkley, Dan Higgins, Jianting Zhang (jzhang@lternet.edu)

  • Notes from ontology breakout
    • The discussion broadly covered traits of plant communities important in biodiversity and productivity experiments and experimental methodologies. The following notes are raw and will require considerable work to formalize. As such, the categorizations suggested by the indented formatting should be regarded as preliminary.
      • Traits of a Population (aggregated group of individuals) of Plant Species:

  • Notes from workflow breakout
    • Workflow based on part of Steve and David's Jornada analysis



This particular version was published on 28-Sep-2004 11:16:52 PDT by SDSC.bowers.