Science Environment for Ecological Knowledge

KU Taxon Update_Sept 2005

KU Development Update

  • Explored a mechanism for automatically generating byte-compiled Java classes from a WSDL.

    • Found that it could be done using Apache's BCEL library and Javassist.
    • Determined that it would not be useful for the Web Services actor to support complex objects, because any downstream actors that use the Web Services actor would then need to understand those complex objects.

  • After discussion with Matt, it was decided that the best route would be to add the API methods for the ENM use case in a manner that used only simple Java types or arrays of them
    • The WSDL was changed to add the necessary API functions
    • The old SOAP interface layer was discarded, then redesigned and redeveloped to isolate it completely from the business logic. This involved:
      • Creating adapter classes for the concept objects to ensure that Java collection objects are serialized as arrays
      • Adding a class to manage the interface between the SOAP layer and the business logic layer, which has fewer API methods, all of which use complex objects
      • Modifying the client-side code
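
The adapter idea described above can be sketched roughly as follows. This is only an illustration, not the TOS code: the `Concept`, `ConceptAdapter` and `ConceptSoapFacade` classes, their fields and the placeholder names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical business-layer object: free to use Java collections.
class Concept {
    private final String id;
    private final String name;
    private final List<String> synonyms;

    Concept(String id, String name, List<String> synonyms) {
        this.id = id;
        this.name = name;
        this.synonyms = new ArrayList<>(synonyms);
    }

    String getId() { return id; }
    String getName() { return name; }
    List<String> getSynonyms() { return synonyms; }
}

// Hypothetical SOAP-facing adapter: exposes only simple types and arrays,
// so the binding layer never sees a java.util collection.
class ConceptAdapter {
    private final Concept concept;

    ConceptAdapter(Concept concept) { this.concept = concept; }

    public String getId() { return concept.getId(); }
    public String getName() { return concept.getName(); }

    // The collection is copied into a plain String[] on the way out.
    public String[] getSynonyms() {
        return concept.getSynonyms().toArray(new String[0]);
    }
}

// Hypothetical class sitting between the business logic and the SOAP layer.
class ConceptSoapFacade {
    static ConceptAdapter[] adaptAll(List<Concept> concepts) {
        ConceptAdapter[] result = new ConceptAdapter[concepts.size()];
        for (int i = 0; i < concepts.size(); i++) {
            result[i] = new ConceptAdapter(concepts.get(i));
        }
        return result;
    }
}
```

With this shape, the business layer keeps its richer objects, and only the facade and adapters would need to change if the WSDL changes.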

  • A couple of the new API methods needed graph-like methods (getAuthoritativeList, getHigherTaxon)
    • Node/Edge/Graph classes were created that add all the edges necessary for rapid tree/graph traversal at the database level
    • Hibernate mappings to those classes were created
    • The idea of an authoritative list was added back to the system
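
One way to read "all edges necessary for rapid traversal" is that every node stores an edge to each of its ancestors, not just to its parent, so that rank lookups become a single scan of stored edges. The sketch below works in memory and rests on that assumption; the class shapes, the `distance` field and the method signatures are hypothetical, and the real classes are Hibernate-mapped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node: one taxon concept with a rank.
class Node {
    final String name;
    final String rank;
    Node(String name, String rank) { this.name = name; this.rank = rank; }
}

// Hypothetical edge from a descendant to one of its ancestors.
class Edge {
    final Node descendant;
    final Node ancestor;
    final int distance; // 1 means direct parent
    Edge(Node descendant, Node ancestor, int distance) {
        this.descendant = descendant;
        this.ancestor = ancestor;
        this.distance = distance;
    }
}

class Graph {
    private final Map<Node, List<Edge>> up = new HashMap<>();   // edges keyed by descendant
    private final Map<Node, List<Edge>> down = new HashMap<>(); // edges keyed by ancestor

    // Adds a leaf under parent and stores an edge to every ancestor.
    // Assumes the tree is built top-down, so the parent's edges already exist.
    void addChild(Node parent, Node child) {
        link(child, parent, 1);
        List<Edge> parentEdges = up.getOrDefault(parent, Collections.<Edge>emptyList());
        for (Edge e : new ArrayList<>(parentEdges)) {
            link(child, e.ancestor, e.distance + 1);
        }
    }

    private void link(Node descendant, Node ancestor, int distance) {
        Edge e = new Edge(descendant, ancestor, distance);
        up.computeIfAbsent(descendant, k -> new ArrayList<>()).add(e);
        down.computeIfAbsent(ancestor, k -> new ArrayList<>()).add(e);
    }

    // Sketch of getHigherTaxon: the ancestor of a concept at the given rank, or null.
    Node getHigherTaxon(Node concept, String rank) {
        for (Edge e : up.getOrDefault(concept, Collections.<Edge>emptyList())) {
            if (e.ancestor.rank.equals(rank)) return e.ancestor;
        }
        return null;
    }

    // Sketch of getAuthoritativeList: all descendants of root at the given rank.
    List<Node> getAuthoritativeList(Node root, String rank) {
        List<Node> result = new ArrayList<>();
        for (Edge e : down.getOrDefault(root, Collections.<Edge>emptyList())) {
            if (e.descendant.rank.equals(rank)) result.add(e.descendant);
        }
        return result;
    }
}
```

The trade-off in this design is more stored edges in exchange for lookups that need no recursive walk, which matches the stated goal of rapid traversal at the database level.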

  • Modified the object model at the same time to handle reference objects from TCS 0.95.3 more simply and to improve performance
    • Modified the business code, the database mappings and related code

  • Added an n-gram matching algorithm to implement a dictionary for matching arbitrary strings to database concepts. This is currently being used in getBestConcept, but will be adopted for other search algorithms
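
A minimal sketch of n-gram string matching is given below. The notes above do not say which n or which similarity measure the TOS uses, so this example assumes character trigrams and the Dice coefficient; the `NGramMatcher` class and its methods are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NGramMatcher {
    private static final int N = 3; // assumed n-gram size

    // Splits a string into its set of character n-grams.
    // Padding with spaces lets the first and last letters contribute n-grams too.
    static Set<String> ngrams(String s) {
        String padded = "  " + s.toLowerCase().trim() + "  ";
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + N <= padded.length(); i++) {
            grams.add(padded.substring(i, i + N));
        }
        return grams;
    }

    // Dice coefficient: 2 * |shared n-grams| / (|n-grams of a| + |n-grams of b|).
    static double similarity(String a, String b) {
        Set<String> ga = ngrams(a);
        Set<String> gb = ngrams(b);
        Set<String> shared = new HashSet<>(ga);
        shared.retainAll(gb);
        return 2.0 * shared.size() / (ga.size() + gb.size());
    }

    // Returns the dictionary entry most similar to the query, or null if the dictionary is empty.
    static String bestMatch(String query, List<String> dictionary) {
        String best = null;
        double bestScore = -1.0;
        for (String candidate : dictionary) {
            double score = similarity(query, candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }
}
```

Because the score depends on shared substrings and not on exact equality, a misspelled query such as "Quercus albba" would still score highest against "Quercus alba" in a small dictionary of names.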

  • Added implementations of the following API methods:
    • getAuthoritativeList: returns all concepts at a particular rank from a particular subtree
    • getSynonymousNames: returns the list of all synonymous name strings for a concept
    • getHigherTaxon: returns the higher taxon at a specified rank for a concept below it
    • getBestConcept: returns a list (ideally of only one) of the concepts best matching a particular name within a particular authority

  • Modifications to the build process
    • WAR file generation of the SOAP service for easier deployment
    • Automatic generation of Java class files containing the SOAP <-> Java binding rules from the WSDL

  • Wrote a harvester for ITIS data that takes in a root TSN and generates a TCS 0.95.3 instance document from that root to the leaf concepts
    • Parses an ITIS XML instance using XPath to generate concepts, which are then marshalled using the existing JAXB 0.95.3 marshaller
    • Incorporated tests for this
    • Found that many ITIS XML instances do not conform to the XML specification because some characters are not represented by their appropriate entity codes. For those instances, the entity codes are fixed on the fly prior to parsing.
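
The fix-then-parse step can be illustrated with the standard Java XML APIs. The sketch below handles only one kind of problem, a bare `&` that should have been `&amp;`, and the element names `taxon` and `name` are invented for the example; the notes above do not say which characters were affected or what the ITIS XML looks like.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ItisXmlSketch {
    // Escapes every '&' that does not already start something shaped like an
    // entity or character reference (for example &amp; or &#233;).
    static String fixBareAmpersands(String xml) {
        return xml.replaceAll(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)", "&amp;");
    }

    // Repairs the raw text, parses it, and pulls out names with XPath.
    // The path //taxon/name is a made-up structure for illustration.
    static List<String> extractNames(String rawXml) {
        try {
            String fixed = fixBareAmpersands(rawXml);
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(fixed)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "//taxon/name", doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse instance", e);
        }
    }
}
```

Repairing the text before it reaches the parser keeps the XPath code unaware of the malformed input, at the cost of a pass over the raw string.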

  • Wrote a generic import tool that takes in any instance of the TCS (in 0.88b or 0.95.3) and populates the database
    • Changed the way the unmarshalling works so that it uses a chain of responsibility rather than factory methods
    • Numerous bugfixes to the 0.88b and 0.95.3 marshallers/unmarshallers
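
For readers unfamiliar with the pattern, a chain of responsibility for unmarshalling might look roughly like the sketch below. The handler classes, the way a handler recognises its version (a `version="..."` marker in the text) and the string return values are all assumptions for illustration; the real unmarshallers produce concept objects.

```java
// Hypothetical base handler: each link either handles the document
// or passes it to the next link in the chain.
abstract class UnmarshalHandler {
    private UnmarshalHandler next;

    // Appends a handler and returns it, so calls can be chained.
    UnmarshalHandler setNext(UnmarshalHandler next) {
        this.next = next;
        return next;
    }

    final String handle(String document) {
        if (accepts(document)) {
            return unmarshal(document);
        }
        if (next == null) {
            throw new IllegalArgumentException("No handler recognises this document");
        }
        return next.handle(document);
    }

    protected abstract boolean accepts(String document);
    protected abstract String unmarshal(String document);
}

// Hypothetical handler for TCS 0.88b instances.
class Tcs088bHandler extends UnmarshalHandler {
    protected boolean accepts(String document) {
        return document.contains("version=\"0.88b\""); // assumed marker
    }
    protected String unmarshal(String document) {
        return "parsed as TCS 0.88b"; // placeholder for real unmarshalling
    }
}

// Hypothetical handler for TCS 0.95.3 instances.
class Tcs0953Handler extends UnmarshalHandler {
    protected boolean accepts(String document) {
        return document.contains("version=\"0.95.3\""); // assumed marker
    }
    protected String unmarshal(String document) {
        return "parsed as TCS 0.95.3"; // placeholder for real unmarshalling
    }
}
```

Compared with a factory method, the caller here does not need to know the version up front: it hands any instance to the head of the chain, and supporting a new TCS version means adding one more link.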

  • Modified the TOS to use the Spring framework to configure the application by composition of Java bean objects via dependency injection
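
The composition style that dependency injection encourages can be shown without the container itself. In the sketch below the wiring is done by hand in `ManualWiring`, standing in for what the Spring configuration would do; all class names and the placeholder data are hypothetical and no Spring API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical data-access bean.
interface ConceptDao {
    String findName(String conceptId);
}

class InMemoryConceptDao implements ConceptDao {
    private final Map<String, String> names = new HashMap<>();

    InMemoryConceptDao() {
        names.put("c1", "Aus bus"); // placeholder data
    }

    public String findName(String conceptId) {
        return names.get(conceptId);
    }
}

// Hypothetical service bean: it never constructs its own collaborator,
// it only exposes a setter through which one is injected.
class ConceptService {
    private ConceptDao conceptDao;

    public void setConceptDao(ConceptDao conceptDao) {
        this.conceptDao = conceptDao;
    }

    public String nameOf(String conceptId) {
        return conceptDao.findName(conceptId);
    }
}

// Stand-in for the container: assembles the beans by composition.
class ManualWiring {
    static ConceptService assemble() {
        ConceptService service = new ConceptService();
        service.setConceptDao(new InMemoryConceptDao());
        return service;
    }
}
```

Because `ConceptService` depends only on the `ConceptDao` interface, a different implementation can be substituted in the configuration without touching the service code.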

  • Hibernate training to tune TOS performance

CIPRes Collaboration update

  • Data
    • Both projects are creating XML schemas (CIPRes for the creation and storage of phylogenetic trees)
      • CIPRes could adopt taxonomic model requirements from SEEK and inform SEEK of its own requirements so that the model fits CIPRes needs (there need not be a 1-to-1 mapping between our object model and TCS)
      • SEEK could accept CIPRes recommendations for incorporating sequence analysis in our data model

    • CIPRes must create an archival db; SEEK need not
      • CIPRes could develop and implement an archival Treebase repository informed by the needs of the SEEK TOS
      • Saves resources and expands user base on both projects
      • Jenny Wang (SEEK) and Rami Rifaieh (SDSC ontology) could meet to work on merging ontologies
      • Formal gathering to merge requirements
    • CIPRes may be able to improve and adapt SEEK tools for CIPRes and other projects
      • adopt SCIA (schema matching tool)
      • create workflow functionalities in Kepler

    • Interface
      • SEEK - Usability
      • CIPRes - dynamic interface generation and user configuration based on PISE strategy

    • Visualization
      • CIPRes is involved in tree visualization but has no dedicated people on it now; can they help SEEK?

This page last changed on 30-Sep-2005 09:47:14 PDT by KU.stewart.