Nico Implementation Thoughts

SEEK Taxon "Tools" – Graphical User Interfaces (NMF - Nov. 4, 2004)

Introduction – why are we designing what?

This is an attempt to summarize the results of our discussion (morning of November 3rd) about the "tools" we need to develop in the near future. The tools are meant to be used in two main workflow processes:

(1) assist ecologists in the taxonomic marking up of their submitted information, i.e. in the interactive process of identifying a suitable concept to tie to the names originally present in the data set (Canis lupus (name as in ecologist's database) is transformed into Canis lupus sec. NA Mammal Checklist, 2000 (concept now marked up in SEEK database));

(2) assist taxonomic specialists in adding, editing, and relating concept information which stems from various classifications.
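The mark-up step in workflow (1) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Concept` class, the in-memory `CONCEPTS` list, and the functions `candidate_concepts` and `mark_up` are hypothetical names, not part of any existing SEEK component, and the second entry in the list is a made-up placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A taxonomic concept: a name plus the reference that defines it."""
    name: str  # scientific name as it appears in the source
    sec: str   # "according to" (sec.) reference

    def label(self) -> str:
        return f"{self.name} sec. {self.sec}"


# Hypothetical stand-in for the SEEK concept database.
CONCEPTS = [
    Concept("Canis lupus", "NA Mammal Checklist, 2000"),
    Concept("Canis lupus", "Placeholder Checklist B"),  # invented entry
]


def candidate_concepts(name, concepts=CONCEPTS):
    """Return every concept whose name matches the ecologist's name string."""
    wanted = name.strip().lower()
    return [c for c in concepts if c.name.lower() == wanted]


def mark_up(name, choice, concepts=CONCEPTS):
    """Replace a bare name with the label of the concept the user picked."""
    return candidate_concepts(name, concepts)[choice].label()


if __name__ == "__main__":
    for i, c in enumerate(candidate_concepts("Canis lupus")):
        print(i, c.label())
    print(mark_up("Canis lupus", 0))
```

The interactive part of the tool would sit between `candidate_concepts` (what the user is shown) and `mark_up` (what gets stored once a choice is made).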

Although the two workflow processes neatly separate two kinds of SEEK users, the functionalities and graphics they need partially overlap. Much of the information that ecologists will only want to see, specialists will also want to edit. We have to be aware of usability issues, in particular:

(1) our tools and displays should not look too awkward in the eyes of users (although the content of some displays is likely to be new to them);

(2) hierarchical displays have scalability issues; for editing one needs to look at a relatively small subset of concepts in (likely only two) associated classifications, but other displays for broad overviews of long-term taxonomic dynamics must accommodate a multitude of classifications, each with possibly thousands of concepts. Specialists need to see enough information to understand how two sets of concepts are related to each other and still have space on the screen to enter relationships among them; and

(3) a tool used only to enter new concept sets might not need many visual displays at all; the resulting trees could be displayed once the information has been entered in full.

So what do we have right now?

(1) Martin's visualization tools have display functions, but no editing options (yet). They handle the scaling from one to many classifications, and from a few to many concepts per classification, remarkably well. The tool supports (in principle?) data sets aggregated (as a file) through the Taxonomic Concept Schema (TCS). Due to its original ties with Prometheus I, it has a particular (color-based) way of interpreting concept relationships, which rests on the actual overlap in included specimens rather than on an independent (and usually textual) assessment by a specialist. But Martin & co. are working to relax this restriction and allow the display of other kinds of relationships represented in a database. The adding/editing of concepts and relationships may be incorporated later but is not currently planned.

(2) Xianhua's Tax Browser & Editor uses graph (ontology-like) visualizations to see and connect concepts and relationships. It accommodates (mostly) the TCS structures and information. Its display options do not scale as well as those of Martin's tool, and only one internally consistent hierarchy is fully featured; external (point) concepts can be related to that hierarchy using drag-and-drop functions. On the other hand, editing, adding, and reconnecting concepts is possible.
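To make the specimen-overlap interpretation mentioned under (1) concrete, here is a minimal sketch of how a relationship between two concepts could be derived from the sets of specimens each includes. The function name `relate` and the five relationship labels are my own assumptions for illustration; they are not taken from Martin's tool or from the TCS.

```python
def relate(specimens_a, specimens_b):
    """Classify how two concepts relate, based only on specimen overlap."""
    a, b = set(specimens_a), set(specimens_b)
    if a == b:
        return "congruent"        # identical specimen sets
    if a > b:
        return "includes"         # A holds everything in B, and more
    if a < b:
        return "is included in"   # B holds everything in A, and more
    if a & b:
        return "overlaps"         # some shared specimens, some not
    return "excludes"             # no specimens in common


if __name__ == "__main__":
    # Hypothetical specimen identifiers.
    print(relate({"s1", "s2", "s3"}, {"s1", "s2"}))
    print(relate({"s1", "s2"}, {"s2", "s3"}))
```

A specialist's independent (textual) assessment would have to be stored alongside such a computed value rather than replace it, since the two can disagree.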

How can we use our existing tools, and what is still missing?

This is only my take on things. Martin's tool is probably closer to being usable than Xianhua's (Xianhua started later). In any case, there is room for each within our plans. Martin's tool could satisfy the needs of ecologists as well as of taxonomic specialists who wish to browse through few or many classifications with few or many concepts, and to explore their internal relationships and any connections of concepts among classifications (these are all Use Cases on our Wiki page). In the process, one can efficiently acquire a visual impression (aided by things like "percentage of similarity" displays) of the overall extent of taxonomic dynamics and the taxonomic domains where they manifest themselves. The tool should not be used for adding/editing concepts/relationships, but it should support name synonymies and concept relationships (the latter possibly to the level of Nico & Bob's paper; that way the Prometheus comparisons (ost.) and those based on the diagnoses (int.) could stand next to each other; the moss data (ext.) could be stored as well). We have to assess whether users are comfortable with viewing the new sorts of displays (according to Trevor, much of this has been done already and the current perspective is preferable to others).

Let's assume that ecologists will be satisfied with the visualization options from Martin's tool. These options help them see the extent of variation among multiple classifications, and thus assist in the marking up of names to concepts as part of their Morpho/Kepler data set submission/analysis activities.

Taxonomists will need Martin's tool and more. What I envision is that they will at some point explore multiple classifications, and then locate pairs of concepts/subtrees that they might want to (re)connect according to their insights on how these should be matched with each other. They might then select (through clicking) two concepts (or concept subtrees), and take them into the successor of Xianhua's tool. This could mean calling up another pop-up window. The TCBE successor would not use graph displays, which (as I and apparently many others think) are counter-intuitive for specialists. Taxonomists tend to display their results as trees (not nets or stars) or indented (ranked) lists which actually convey a sense of temporal succession, with an old "root" at the bottom/left and the younger apices on the top/right. All Linnean-compatible classifications can be accommodated with these tree displays; I assume they make up > 99% of what we will manage (and that includes PhyloCode results…).

The Smithsonian (NA) mammals Flash display is rather neat in that it retains the well-established tree display form which most taxonomists use (as opposed to the displays Martin had to opt for in order to show more information on a single screen), yet it also handles (to a degree) the scalability challenges, which makes it convenient to use. The Flash display could be adopted for the TCBE update, though it is not a must. If adopted, it would need to be able to display polytomies (i.e. three or more branches at any level), not just strictly bifurcating trees. It would also (ideally) have the option to display more than two levels at a time, i.e. not just Family to Genus and then Genus to Species, but also Family to Genus to Species (to Subspecies; 3-4 levels may be enough). We would then need two display windows showing those trees, one on the left and the other on the right.
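The two display requirements just named (polytomies and more than two levels at a time) can be illustrated with a small sketch that renders a concept tree as an indented, ranked list. The `Node` class, the `render` function, and the `max_levels` parameter are hypothetical names chosen for this example; the taxa are only sample data, and nothing here reflects how the Flash display or the TCBE is actually built.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    rank: str
    name: str
    children: list = field(default_factory=list)  # any number, so polytomies are allowed


def render(node, max_levels=3, depth=0):
    """Return indented lines for `node`, showing at most `max_levels` ranks."""
    if depth >= max_levels:
        return []
    lines = ["  " * depth + f"{node.rank}: {node.name}"]
    for child in node.children:
        lines.extend(render(child, max_levels, depth + 1))
    return lines


# Sample tree: one family with three genera (a polytomy), one genus expanded.
canidae = Node("Family", "Canidae", [
    Node("Genus", "Canis", [
        Node("Species", "Canis lupus"),
        Node("Species", "Canis latrans"),
    ]),
    Node("Genus", "Vulpes"),
    Node("Genus", "Urocyon"),
])

if __name__ == "__main__":
    print("\n".join(render(canidae, max_levels=3)))
```

Calling `render` twice, once per classification, would give the contents of the left and right windows; limiting `max_levels` to 3 or 4 keeps each window readable.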

Once a specialist has passed from Martin's to Xianhua's module and is viewing just the parts of two trees (left and right) of interest, it ought to be possible to select a pair of concepts, for which text boxes then come up (perhaps) underneath the two tree displays. These contain the kind of detailed information which the KU Taxonomic Object Server supports. They also allow one to specify new concept relationships à la TCS and Nico & Bob's paper. This is how we establish concept relationships. The same kind of interface is then also used to explore previously entered concept relationships in more detail than Martin's tool provides.

Finally, we need a third module/tool just to enter new concepts and their internal (child/parent) relationships, based on the TCS/TOS structure. This tool may have only text boxes, but once a set of concepts/relationships has been entered, visual confirmation should be provided through each of the previous two tools.

One of the things we should seriously consider, at least with Xianhua's tool, which provides displays very similar to those systematists use anyway, is the compatibility of our tools with the programs/tree file formats used in PAUP, MacClade, Mesquite, WinClada, etc. This will allow us to visualize many trees deposited on-line (see e.g. www.treebase.org). Ultimately Martin's and Xianhua's tools must also be integrated well.
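As a rough idea of what such compatibility involves: the trees in NEXUS files are written as parenthesized (Newick-style) strings, so a first step would be to read those strings into a nested structure. The sketch below is a deliberately minimal parser written for this note, not a replacement for an existing library; `parse_newick` is a hypothetical name, branch lengths are discarded, and quoted labels and comments are not handled.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (label, children) tuples."""
    text = text.strip()
    if text.endswith(";"):
        text = text[:-1]
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # step over "("
            while True:
                children.append(parse_node())
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group of siblings
                    break
                else:
                    raise ValueError(f"unexpected character at {pos}")
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        # Keep the label only; anything after ":" is a branch length.
        label = text[start:pos].split(":")[0].strip()
        return (label, children)

    tree = parse_node()
    if pos != len(text):
        raise ValueError("trailing characters after tree")
    return tree


if __name__ == "__main__":
    print(parse_newick("((lupus,latrans)Canis,Vulpes,Urocyon)Canidae;"))
```

Because each node keeps a plain list of children, polytomies in the input need no special handling.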

The tools above are specifically made for visualization and entering tasks involving concept trees – not just species. Some of the searching/display functions which ecologists (and others) will use while marking up their data may not require hierarchical visualization tools, especially if they work exclusively at the species level (or any other single level). I am thinking of functions that simply return lists of things. Those tools (display concepts, display related concepts, display concepts in full, etc.) need to be built as well, but maybe Aimee et al. could take the lead on that. Even though Xianhua-type graph displays would work for these kinds of display, I think we might get away without them; star-like or reticulate displays may be less scalable and informative than just plain lists of concepts with relationships (see the printed moss checklist, which works well for what it tries to achieve).

Nico Franz, Nov. 4, 2004


This page last changed on 04-Nov-2004 13:54:07 PST by NCEAS.franz.