KU Taxon Update_Sept 2005

KU Development Update

Explored mechanism for automatically generating byte-compiled Java classes from a WSDL.

Found that it could be done using Apache's BCEL library and Javassist.
Determined that it would not be useful for the Web Services actor to support complex objects, because any downstream actors consuming its output would then need to understand those complex objects

After discussing it with Matt, it was decided that the best route would be to add the API methods for the ENM use case in a manner that uses simple Java types or arrays of them

The WSDL was changed to add the necessary API functions
The old SOAP interface layer was tossed out and redesigned and redeveloped to completely isolate it from the business logic

This involved creating adapter classes for the concept objects to ensure that Java collection objects are serialized as arrays
An additional class was added to manage the interface between the business logic layer, which has fewer API methods, all of which use complex objects, and the SOAP layer
The client-side code was modified
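
As a rough illustration of the adapter idea above, the sketch below wraps a business-layer object that holds a Java collection and exposes it to the SOAP layer using only simple types and arrays. The class names `Concept` and `ConceptAdapter` and their fields are hypothetical stand-ins; the actual TOS classes are not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical business-layer concept: holds its synonyms in a Java collection.
class Concept {
    private final String id;
    private final String name;
    private final List<String> synonyms;

    Concept(String id, String name, List<String> synonyms) {
        this.id = id;
        this.name = name;
        this.synonyms = new ArrayList<>(synonyms);
    }

    String getId() { return id; }
    String getName() { return name; }
    List<String> getSynonyms() { return synonyms; }
}

// Hypothetical adapter used by the SOAP layer: exposes only simple types and arrays.
class ConceptAdapter {
    private final Concept concept;

    ConceptAdapter(Concept concept) {
        this.concept = concept;
    }

    String getId() { return concept.getId(); }
    String getName() { return concept.getName(); }

    // The collection is copied into a String[] so the SOAP binding sees an array.
    String[] getSynonyms() {
        return concept.getSynonyms().toArray(new String[0]);
    }
}
```

Keeping the conversion in the adapter means the business classes can continue to use collections internally while downstream actors only ever see strings and arrays.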

A couple of the new API methods needed graph-like operations (getAuthoritativeList, getHigherTaxon)

Node/Edge/Graph classes were created that add all edges necessary for rapid tree/graph traversal at the database level
Hibernate mappings to those classes were created
Added back to the system the idea of an authoritative list
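
One way to read "all edges necessary for rapid traversal" is a transitive closure, where an edge is stored for every ancestor/descendant pair so that lookups need no recursive walk. The sketch below shows that idea in memory, with maps standing in for database tables. This interpretation, and the names `TaxonNode` and `TaxonGraph`, are assumptions and not the actual TOS classes or Hibernate mappings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node: a taxon concept with a name and a rank.
class TaxonNode {
    final String name;
    final String rank;

    TaxonNode(String name, String rank) {
        this.name = name;
        this.rank = rank;
    }
}

// Hypothetical graph that records one edge per ancestor/descendant pair.
class TaxonGraph {
    // descendant -> all of its ancestors, nearest first
    private final Map<TaxonNode, List<TaxonNode>> ancestors = new HashMap<>();
    // ancestor -> all of its descendants, in insertion order
    private final Map<TaxonNode, List<TaxonNode>> descendants = new HashMap<>();

    // Adds a node under a parent (null for the root). The parent must already be in the graph.
    void add(TaxonNode node, TaxonNode parent) {
        List<TaxonNode> up = new ArrayList<>();
        if (parent != null) {
            up.add(parent);
            up.addAll(ancestors.get(parent));
        }
        ancestors.put(node, up);
        descendants.put(node, new ArrayList<>());
        for (TaxonNode ancestor : up) {
            descendants.get(ancestor).add(node);
        }
    }

    // Returns the nearest ancestor of the node at the given rank, or null if there is none.
    TaxonNode getHigherTaxon(TaxonNode node, String rank) {
        for (TaxonNode ancestor : ancestors.get(node)) {
            if (ancestor.rank.equals(rank)) {
                return ancestor;
            }
        }
        return null;
    }

    // Returns every descendant of the root that has the given rank.
    List<TaxonNode> getAuthoritativeList(TaxonNode root, String rank) {
        List<TaxonNode> result = new ArrayList<>();
        for (TaxonNode descendant : descendants.get(root)) {
            if (descendant.rank.equals(rank)) {
                result.add(descendant);
            }
        }
        return result;
    }
}
```

The trade-off is extra storage and slower inserts in exchange for lookups that scan a single precomputed list per node.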

Modified the object model at the same time to more simply handle reference objects from TCS 0.95.3 and improve performance

Modified business code, database mappings and code

Added an n-gram matching algorithm to implement a dictionary for matching arbitrary strings to database concepts. This is currently being used in getBestConcept, but will be adopted for other search algorithms
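
For readers unfamiliar with n-gram matching, the sketch below scores two strings by the overlap of their character trigrams (a Dice coefficient) and picks the dictionary entry with the highest score. It is a generic illustration only: the class `NGramMatcher`, the padding scheme and the scoring formula are assumptions, and the algorithm actually used in getBestConcept may differ.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper illustrating n-gram string matching.
class NGramMatcher {

    // Splits a string into its set of character n-grams, after lower-casing
    // and padding with one space on each side so word boundaries count.
    static Set<String> ngrams(String s, int n) {
        String padded = " " + s.toLowerCase(Locale.ROOT).trim() + " ";
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= padded.length(); i++) {
            grams.add(padded.substring(i, i + n));
        }
        return grams;
    }

    // Dice coefficient over the two n-gram sets: 1.0 means identical sets, 0.0 means no overlap.
    static double similarity(String a, String b, int n) {
        Set<String> ga = ngrams(a, n);
        Set<String> gb = ngrams(b, n);
        if (ga.isEmpty() && gb.isEmpty()) {
            return 1.0;
        }
        Set<String> shared = new HashSet<>(ga);
        shared.retainAll(gb);
        return 2.0 * shared.size() / (ga.size() + gb.size());
    }

    // Returns the dictionary entry most similar to the query, or null for an empty dictionary.
    static String bestMatch(String query, String[] dictionary, int n) {
        String best = null;
        double bestScore = -1.0;
        for (String candidate : dictionary) {
            double score = similarity(query, candidate, n);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }
}
```

Because the score depends on shared fragments and not on exact equality, a misspelled query such as "Quercus albba" can still resolve to "Quercus alba".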

Added implementations of the following API methods:

getAuthoritativeList: returns all concepts at a particular rank from a particular subtree
getSynonymousNames: returns the list of all synonymous name strings for a concept
getHigherTaxon: returns the higher taxon at a specified rank for a concept below it
getBestConcept: returns a list (hopefully containing only one entry) of the concepts best matching a particular name within a particular authority

Modifications to the build process

War file generation of the SOAP service for easier deployment
Automatic generation of Java class files containing the SOAP <-> Java binding rules from the WSDL

Wrote a harvester for ITIS data that takes in a root TSN and generates a TCS 0.95.3 instance document from that root to the leaf concepts

Parses an ITIS XML instance using XPath to generate concepts which are then marshalled using the already present JAXB 0.95.3 marshaller.
Incorporated tests for this
Found that many ITIS XML instances do not conform to the XML specification because some characters are not represented by their appropriate entity codes. For those, the entity codes are fixed on the fly prior to parsing.
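
The repair-then-parse step can be sketched as below. The page does not say which characters were affected, so this example assumes the common case of a bare ampersand and escapes it before handing the text to the parser. The class `ItisXmlRepair` and the element names used with it are hypothetical and do not reflect the real ITIS schema.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: repairs unescaped ampersands, then queries the document with XPath.
class ItisXmlRepair {

    // Matches an ampersand that does not already start an entity or character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Replaces each bare ampersand with the &amp; entity; existing references are left alone.
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    // Repairs the raw text, parses it, and returns the text of every node the XPath selects.
    static String[] extractText(String rawXml, String xpathExpr) {
        try {
            String repaired = escapeBareAmpersands(rawXml);
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(repaired)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, doc, XPathConstants.NODESET);
            String[] values = new String[nodes.getLength()];
            for (int i = 0; i < values.length; i++) {
                values[i] = nodes.item(i).getTextContent();
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse repaired XML", e);
        }
    }
}
```

This sketch does not handle named entities that XML does not predefine (for example `&nbsp;`); those would still need their own replacement rule.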

Wrote a generic import tool that takes in any instance of the TCS (in 0.88b or 0.95.3) and populates the database

Changed the way that the unmarshalling works so that it uses a chain of responsibility rather than factory methods
Numerous bugfixes to the 0.88b and 0.95.3 marshallers/unmarshallers
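
A minimal sketch of the chain-of-responsibility approach follows: each handler is asked in turn whether it recognises the document, and the first one that does performs the unmarshalling. The names `UnmarshalHandler`, `VersionHandler` and `UnmarshalChain` are hypothetical, and detecting the version from a `version="..."` marker is an assumption made only to keep the example short; the real unmarshallers are not shown on this page.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical link in the chain: either handles a document or declines it.
interface UnmarshalHandler {
    boolean canHandle(String xml);
    String unmarshal(String xml);
}

// Hypothetical handler that recognises one TCS version by a version marker in the text.
class VersionHandler implements UnmarshalHandler {
    private final String version;

    VersionHandler(String version) {
        this.version = version;
    }

    public boolean canHandle(String xml) {
        return xml.contains("version=\"" + version + "\"");
    }

    // A real handler would build concept objects; this one only reports what it would do.
    public String unmarshal(String xml) {
        return "parsed as TCS " + version;
    }
}

// Passes the document along the chain until a handler accepts it.
class UnmarshalChain {
    private final List<UnmarshalHandler> handlers;

    UnmarshalChain(UnmarshalHandler... handlers) {
        this.handlers = Arrays.asList(handlers);
    }

    String unmarshal(String xml) {
        for (UnmarshalHandler handler : handlers) {
            if (handler.canHandle(xml)) {
                return handler.unmarshal(xml);
            }
        }
        throw new IllegalArgumentException("No handler recognised the document");
    }
}
```

Compared with factory methods, supporting another schema version only means adding one more handler to the chain; the calling code does not change.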

Modified the TOS to use the Spring framework to configure the application by composition of Java bean objects via dependency injection

Hibernate training to tune TOS performance

CIPRes Collaboration update

Data

Both projects are creating XML schemas; CIPRes is creating them for the creation and storage of phylogenetic trees.

CIPRes could adopt taxonomic model requirements from SEEK and inform SEEK of its own requirements so that the model fits CIPRes needs (there need not be a 1-to-1 mapping between our object model and TCS)
SEEK could accept CIPRes recommendations for incorporating sequence analysis in our data model

CIPRes must create an archival database; SEEK need not

CIPRes could develop and implement an archival Treebase repository informed by the needs of the SEEK TOS
Saves resources and expands user base on both projects
Jenny Wang (SEEK) and Rami Rifaieh (SDSC ontology) could meet to work on merging ontologies
Formal gathering to merge requirements

CIPRes may be able to improve and adapt SEEK tools for CIPRes and other projects

adopt SCIA (schema matching tool)
create workflow functionalities in Kepler

Interface

SEEK - Usability
CIPRes - dynamic interface generation and user configuration based on PISE strategy

Visualization

CIPRes is involved in tree visualization but has no dedicated people on it now; can they help SEEK?

This page last changed on 30-Sep-2005 09:47:14 PDT by KU.stewart.

This material is based upon work supported by the National Science Foundation under award 0225676. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).


Long Term Ecological Research Network, UNM	National Center for Ecological Analysis and Synthesis, UCSB	Biodiversity Research Center, KU	San Diego Supercomputer Center, UCSD


Arizona State University	Napier University	University of North Carolina	University of Vermont


UC Davis Genome Center

Copyright 2004 Partnership for Biodiversity Informatics, University of New Mexico, The Regents of the University of California, and University of Kansas