Science Environment for Ecological Knowledge

ENM Pipeline Conference Call 20 Oct 2004

Difference between version 15 and version 1:

Line 2 was replaced by line 2
- Jones, Zhang, Pereira, Higgins, Tao, Schildhauer, Spears
+ Jones, Zhang, Pereira, Higgins, Tao, Schildhauer, Spears, Berkley
At line 5 added 24 lines.
+ ** Discussion of how to handle distributed execution
+ *** Dan has a "bulletin-board" model in mind right now
+ *** Matt would prefer a more cluster-oriented approach with a controller
+ ** Ricardo suggests reducing the granularity of the inputs to make the problem computationally tractable, then addressing parallelization more comprehensively in a second iteration
+ ** Ricardo: the cluster at Kansas should be examined to discover issues relevant to the parallelization effort for Kepler
+ * Rod gave an overview of DiGIR/DarwinCore data sources in Kepler
+ ** Exposes data as fields, rows, tables
+ * Jianting's progress on the GIS actors
+ ** Convex hull, rasterization, buffering
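As an illustration of the first GIS operation listed above, here is a minimal convex hull sketch in Python using Andrew's monotone chain method. It is not the Kepler actor itself; the function names and the point representation (plain `(x, y)` tuples) are assumptions made for this example.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain).

    `points` is an iterable of (x, y) tuples; this representation is an
    assumption for the sketch, not the format used by the GIS actors.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull, left to right.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull, right to left.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # Drop the last point of each half; it repeats the start of the other.
    return lower[:-1] + upper[:-1]


occurrences = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(occurrences)
```

For the five sample points, the interior point `(1, 1)` is discarded and the four corners remain.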
+
+ !! Decisions
+ * Delay parallelization effort until we can do it right
+ * Implement a simpler GARP workflow that can run in one day on one machine
+ ** Preprocessing to a reasonable grid density
+ ** Choose fewer species (e.g., 10-50)
+ ** Possibly eliminate the best subsets approach to reduce computational demand
+ ** Implement whole end-to-end iteration in the workflow for demonstration purposes
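One possible reading of "preprocessing to a reasonable grid density" is coarsening the environmental layers before modeling. The Python sketch below averages non-overlapping blocks of cells. The function name, the list-of-lists grid, and the choice of block averaging are all assumptions for illustration; the notes do not say which resampling method was intended.

```python
def coarsen(grid, factor):
    """Reduce grid density by averaging non-overlapping factor x factor blocks.

    `grid` is a list of equal-length rows of numbers (hypothetical layout).
    Trailing rows or columns that do not fill a whole block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


layer = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = coarsen(layer, 2)
```

Halving the resolution in each dimension cuts the cell count by a factor of four, which is the kind of reduction that could help a run fit on one machine.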
+
+ !! Action items
+ * Rod will create another option in the DarwinCore data source to allow aggregations by species (across providers)
+ * Dan and Chad will handle format conversion from the .raw files to ASCII
+ * Dan will enumerate the tasks needed to complete the ENM pipeline in Bugzilla, with an overall tracker bug
+ ** Will assign developers as needed to get these steps done
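The aggregation option in the first action item can be pictured as grouping occurrence records from several providers under a single species key. The Python sketch below is only an illustration: the record keys (`provider`, `species`, `lon`, `lat`) and the sample values are hypothetical and are not the DarwinCore data source's actual field names or output.

```python
from collections import defaultdict


def aggregate_by_species(records):
    """Group occurrence records by species, regardless of provider.

    Each record is a dict with hypothetical keys 'provider', 'species',
    'lon' and 'lat'. Returns {species: [(lon, lat, provider), ...]}.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["species"]].append(
            (rec["lon"], rec["lat"], rec["provider"])
        )
    return dict(grouped)


# Made-up sample records from two made-up providers.
records = [
    {"provider": "A", "species": "Species one", "lon": -95.2, "lat": 38.9},
    {"provider": "B", "species": "Species one", "lon": -96.0, "lat": 39.1},
    {"provider": "B", "species": "Species two", "lon": -97.3, "lat": 37.7},
]
by_species = aggregate_by_species(records)
```

Keeping the provider in each tuple preserves provenance after the records are merged.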
+
