Science Environment for Ecological Knowledge

ENM Pipeline Conference Call 20 Oct 2004


Participants

Jones, Zhang, Pereira, Higgins, Tao, Schildhauer, Spears, Berkley

Discussion

  • Dan gave overview of pipeline refactoring
    • Discussion of how to handle distributed execution
      • Dan has "bulletin-board" model in mind right now
      • Matt would prefer a more "cluster"-oriented approach with a controller
    • Ricardo suggests that we should reduce granularity of inputs to make it computationally tractable and then address the parallelization more comprehensively in a second iteration
    • Ricardo: cluster at Kansas should be examined to discover issues relevant to the parallelization effort for Kepler
  • Rod gave overview of DiGIR/DarwinCore data sources in Kepler
    • Exposes data as fields, rows, tables
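
The two distributed-execution models raised in the discussion can be contrasted with a small sketch. This is only an illustration under assumptions: the notes do not define either model, so "bulletin-board" is read here as workers pulling tasks from a shared board, and "controller" as a central component assigning tasks to workers up front. The function names and the use of threads in place of cluster nodes are hypothetical and are not part of Kepler or the ENM pipeline.

```python
import queue
import threading


def run_bulletin_board(tasks, n_workers, work):
    """Assumed 'bulletin-board' reading: workers pull tasks from a shared board."""
    board = queue.Queue()
    for task in tasks:
        board.put(task)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = board.get_nowait()  # take whatever task is posted next
            except queue.Empty:
                return  # board is empty, so this worker stops
            out = work(task)
            with lock:
                results[task] = out

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results


def run_controller(tasks, n_workers, work):
    """Assumed 'controller' reading: tasks are assigned to workers in advance."""
    # Round-robin assignment decided centrally before any work starts.
    assignments = [tasks[i::n_workers] for i in range(n_workers)]
    results = {}
    lock = threading.Lock()

    def worker(my_tasks):
        for task in my_tasks:
            out = work(task)
            with lock:
                results[task] = out

    threads = [threading.Thread(target=worker, args=(a,)) for a in assignments]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

Both functions return the same mapping from task to result; they differ only in who decides which worker handles which task. Pulling from a board balances uneven task durations without central planning, while a controller keeps scheduling decisions in one place.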

Decisions

  • Delay parallelization effort until we can do it right
  • Implement a simpler GARP workflow that can run in one day on one machine
    • Preprocessing to a reasonable grid density
    • Choose fewer species (e.g., 10-50)
    • Possibly eliminate the best-subsets approach to reduce computational demand
    • Implement a whole end-to-end iteration in the workflow for demonstration purposes
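
As a rough illustration of the preprocessing decision above, the sketch below reduces grid density by averaging square blocks of cells and caps the number of species processed. It is a minimal sketch under assumptions: the notes do not say how layers are stored or how species are chosen, so the list-of-lists grid, the block-averaging method, and the names `coarsen` and `limit_species` are all hypothetical.

```python
def coarsen(grid, factor):
    """Average factor x factor blocks of a 2D grid (list of equal-length rows).

    Trailing rows or columns that do not fill a whole block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def limit_species(species, limit):
    """Keep at most `limit` species (assumption: first N in sorted order)."""
    return sorted(species)[:limit]
```

For example, coarsening a 4 x 4 layer by a factor of 2 yields a 2 x 2 layer, which cuts the number of cells the model has to process by a factor of four.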

Action items



This particular version was published on 20-Oct-2004 11:49:16 PDT by NCEAS.jones.