Science Environment for Ecological Knowledge

NU Taxon Update_Sept 2005

This is version 3. It is not the current version, and thus it cannot be edited.


Napier Update May-Sept 2005

  • Produced TCS data sets for multiple-taxonomy visualisation
    • Kopersky Moss (14 × 10-10,000 taxa and species)
    • MANIS (15 × 1,000-9,000 taxa and species)
    • ITIS (7 × 200,000-250,000 taxa)
    • The ITIS TCS XML file for seven entire hierarchies is huge: 650MB
      • Unwieldy for large data sets?
      • Takes 2.5 minutes to read in and parse (slightly faster with a StAX parser than with a SAX parser) and uses over 350MB of memory
      • Two-pass parsing uses less memory but takes 4 minutes to parse
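
The trade-off between parse time and memory noted above depends on how much of the document is held in memory at once. As an illustration only, here is a minimal sketch of event-driven parsing in Python, where the standard library's `iterparse` stands in for the Java StAX and SAX parsers mentioned in the update. The element names (`DataSet`, `TaxonConcepts`, `TaxonConcept`, `Name`) are a simplified, assumed subset of TCS, and `iter_concepts` is a hypothetical helper, not the project's code.

```python
import io
import xml.etree.ElementTree as ET


def iter_concepts(source, tag="TaxonConcept"):
    """Yield (id, name) for each `tag` element without keeping the whole tree.

    A stack of open elements tracks each element's parent, so a finished
    concept can be detached from its parent as soon as it has been read.
    """
    stack = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem)
            continue
        stack.pop()  # elem has just closed; stack[-1] is now its parent
        if elem.tag == tag:
            yield elem.get("id"), elem.findtext("Name")
            if stack:
                stack[-1].remove(elem)  # release the processed concept


# Tiny hand-written sample; real TCS files use namespaces and more structure.
SAMPLE = b"""<DataSet>
  <TaxonConcepts>
    <TaxonConcept id="c1"><Name>Bryum</Name></TaxonConcept>
    <TaxonConcept id="c2"><Name>Bryum argenteum</Name></TaxonConcept>
  </TaxonConcepts>
</DataSet>"""

concepts = list(iter_concepts(io.BytesIO(SAMPLE)))
```

Because each concept is detached once it has been read, memory use depends on the largest single concept rather than on the size of the file. Whether this would help with the 650MB ITIS file is not something this sketch measures.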

  • Additions/Refinements to visualisation
    • Ability to find uniquely named taxa out of a set of classifications
    • Sped up large tree comparisons: 2.5s to find structural changes between classifications
    • Simple hash functions that reference names across classifications reduce the memory footprint for sparser data sets without detriment to speed
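
The three refinements above can be illustrated together in a small Python sketch. It assumes each classification is given as (child name, parent name) pairs, and it uses a shared dictionary that maps each name to a small integer as a stand-in for the hash functions the update mentions. `NameIndex`, `load`, `unique_names` and `moved` are hypothetical names, the sample data is illustrative, and only changes of parent are detected as structural changes.

```python
from collections import Counter


class NameIndex:
    """Maps each distinct name string to a small integer key, stored once."""

    def __init__(self):
        self._keys = {}
        self.names = []

    def key(self, name):
        k = self._keys.get(name)
        if k is None:
            k = len(self.names)
            self._keys[name] = k
            self.names.append(name)
        return k


def load(index, child_parent_pairs):
    """Store one classification as {name key: parent key, or None for a root}."""
    return {
        index.key(child): (None if parent is None else index.key(parent))
        for child, parent in child_parent_pairs
    }


def unique_names(index, classifications):
    """Names that occur in exactly one of the given classifications."""
    counts = Counter(k for c in classifications for k in c)
    return sorted(index.names[k] for k, n in counts.items() if n == 1)


def moved(index, a, b):
    """Names present in both classifications but placed under different parents."""
    return sorted(index.names[k] for k in a.keys() & b.keys() if a[k] != b[k])


# Illustrative data only, not taken from the data sets listed above.
index = NameIndex()
c1 = load(index, [("Bryaceae", None), ("Bryum", "Bryaceae"),
                  ("Pohlia", "Bryaceae")])
c2 = load(index, [("Bryaceae", None), ("Mniaceae", None),
                  ("Bryum", "Bryaceae"), ("Pohlia", "Mniaceae")])

only_once = unique_names(index, [c1, c2])
changed = moved(index, c1, c2)
```

Here `only_once` holds the names found in a single classification and `changed` holds the names whose parent differs between the two. Each name string is stored once in the shared index, and each classification holds only integer keys.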



This particular version was published on 20-Oct-2005 09:41:22 PDT by NAPIER.graham.