At line 5 added 9 lines. |
+ ** Discussion of how to handle distributed execution |
+ *** Dan has a "bulletin-board" model in mind right now |
+ *** Matt would like a more cluster-oriented approach with a controller |
+ ** Ricardo suggests that we reduce the granularity of the inputs to make the computation tractable, then address parallelization more comprehensively in a second iteration |
+ ** Ricardo: the cluster at Kansas should be examined to discover issues relevant to the Kepler parallelization effort |
+ * Rod gave an overview of DiGIR/DarwinCore data sources in Kepler |
+ ** Exposes data as fields, rows, tables |
+ * Jianting's progress on the GIS actors |
+ ** Convex hull, rasterization, buffering |
At line 6 added 7 lines. |
+ !! Decisions |
+ * Delay parallelization effort until we can do it right |
+ * Implement a simpler GARP workflow that can run in one day on one machine |
+ ** Preprocessing to a reasonable grid density |
+ ** Choose fewer species (e.g., 10-50) |
+ ** Possibly eliminate the best subsets approach to reduce computational demand |
+ ** Implement a whole end-to-end iteration in the workflow for demonstration purposes |
Removed lines 8-9 |
- |
- |
At line 10 added 5 lines. |
+ * Rod will create another option in the DarwinCore data source to allow aggregations by species (across providers) |
+ * Dan and Chad will handle format conversion from the .raw files to ASCII |
+ * Dan will enumerate the tasks needed to complete the ENM pipeline in Bugzilla, with an overall tracker bug |
+ ** Will assign developers as needed to get these steps done |
+ |