All Hands Meeting 2005

Logistics

Where: San Diego, CA
Venue: San Diego Supercomputer Center

When: Oct 24-28, 2005
Weather: Historical, Current

The SEEK 2005 All Hands meeting will be a forum for discussing progress on the EcoGrid, SMS, KR, Kepler, BEAM, Taxon, and EOT activities. We expect that all funded SEEK participants will attend in order to stimulate communication and collaboration across the project. We will also be inviting colleagues from the CIPRES, NESCENT, and GEON projects to discuss opportunities for collaboration with allied disciplines. The basic agenda is presented below, although it will be adjusted as the meeting approaches to reflect the rapid advances made by the SEEK project.

Agenda

Monday, October 24

Morning arrivals from airport
1:00pm Introductions (Jones)
1:05pm Welcome to SDSC (Vijay Samalam, Executive Director of SDSC)
1:15pm Plenary (Jones)

Half hour update talks on:

SEEK overall and Kepler (Jones)
BEAM (Pennington)
Taxon (Kennedy)
Semantic Mediation (Ludaescher)
Knowledge Representation (Schildhauer)
EcoGrid (Jones)
EOT (Romanello)

Tuesday, October 25

8:00am Coffee
8:30 - 9:30 am Invited presentations from visiting projects

CIPRES (30 minutes)
NESCENT (30 minutes)

9:30 - 10:30 Questions and Discussion on synergy and collaboration with NESCENT and CIPRES
10:30 - 10:45 Break

10:45 - 12:00 Data Integration as 2006 goal

Merging data sets tagged with taxonomic concepts (Peet, 15 minutes presentation)
Discuss Taxon/SMS/Kepler Integration

How can and should these subprojects work together to solve the data integration problem?

12:00pm - 1:00pm Lunch

Synthesis Center demo at SDSC, 30 minutes (see http://syncenter.org)

1:00 pm - 5:30pm

Breakout: Data supporting case studies (Pennington)
EcoGrid Breakout (Jones)
Overview of incomplete tasks (Jones)

Reorganization of EcoGrid code, refactoring clients
Authentication via GAMA and GSI (report and discussion)
EcoGrid portal
Ontology-based searching in EcoGrid

Taxon Breakout (Beach)

Laura Downey, Going Forward Presentation from Estes Park
Xianhua Liu, Concept Mapping and authoring tool: review of prototype
Martin Graham, Collections visualization tool: review of prototype

7:00pm Group Dinner, Location: Rock Bottom

Wednesday, October 26

8:00am Coffee
8:30 am - 12:00 pm

8:30 Joint Session

9:30 Kepler breakout

9:30 Taxon Breakout

Taxon-Kepler Interaction Design and Engineering Discussion with Dan Higgins

Use case 1: GARP
Use case 2: Biodiversity

12:00pm - 1:00pm Lunch

SEEK-Taxon lunch (Franz, Stewart, Gales with CIPRES: Miller, Jin, Lucie): Common ground

1:00 pm - 5:30 pm

Kepler breakout

Taxon Breakout

Extracting concepts from online and monographic sources; just mammals? (Susan, Aravind)

Currently: 1600 PDF documents obtained, 100 are bat taxonomy papers, extracting data next
Desired: extract parent-child hierarchies, descriptions, synonyms

TOS Data acquisition roadblocks, process, role of software tools, getting data into TOS with an import tool for usability testers.

Currently: ITIS ("relational" concepts), Bats from MSW 2005 (MSW concepts from original pubs and synonyms without bib references), MSW 1993, FNA in TCS, German Mosses (concepts with concept maps)

Requirements: TOS Actor could be tested with plant data

Possible: Use Bob Peet's plant concept data (44,000 with relationships?) with PLANTS county-level distribution data as a supplement to the bat use case scenario; Ranunculus data set, 8 classifications, NA & Mexico.

Action: Stinger Guala to identify the number of versions of PLANTS within 2 weeks and send to KU to DiGIRize; Bob will send a rich treatment of plant concepts for the US Southeast. Available now: Ranunculus data, which needs to be upgraded to the latest TCS; Xianhua will do that in 1 week. Jan or Feb: all USDA PLANTS in version 4, mapped against all plants in FNA, and also all of the Alan Weakley collaboration, his version mapped against 8 different classifications of plants.

Broader TDWG and community issues for TCS and TOS. What are their expectations? Our responsibilities? Can SEEK demonstrate the utility of TCS and TOS within the next year? What tools do we need to complete and harden for the community to buy into TCS and TOS? What can we expect from the GBIF community? Short term versus ultimate objectives

S-T TOS/TCS External Roles/Personas

S-T might be a concept provider for other people to test their concept applications
S-T might be a concept repository for projects looking for a place to store them, might need a batch import process if requested.
S-T *must* use GUIDs because there will be multiple concept object servers
S-T: if TOS is a global reference implementation, then we need to implement the whole TCS schema in TOS. We could not implement just some TCS fields; we would need to input and output 100% standard TCS documents.
Schema changes over time, do we need to maintain records in each version forever?

S-T TOS/TCS Internal Roles/Personas

See SEEK use cases 1 & 2 and other related logic

Discussion of the data independence problem with DiGIR queries that have the same name for 2 or more concepts. Solutions: give the user the option to allow duplicates, eliminate duplicates, or combine overlaps; go interactive and alert the user that a name is used for multiple concepts; or just log errors in a sideband pipeline.
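
The candidate solutions above can be made concrete with a small sketch. This is only an illustration, not SEEK, Kepler, or DiGIR code: the record shape (a list of name and concept-ID pairs), the function name, and the policy names "allow", "eliminate", and "log" are all hypothetical, and "eliminate" is assumed here to mean keeping the first concept seen for a name. The "combine overlaps" and interactive options are not covered.

```python
from collections import defaultdict

# Hypothetical policy names; none of these come from DiGIR or Kepler.
POLICIES = {"allow", "eliminate", "log"}


def resolve_name_collisions(records, policy="allow"):
    """Apply one collision policy to (name, concept_id) pairs.

    Returns (kept, sideband): the records passed downstream, and the
    name collisions diverted to a side log instead.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")

    # Collect the distinct concept IDs seen for each name, in input order.
    concepts_by_name = defaultdict(list)
    for name, concept_id in records:
        if concept_id not in concepts_by_name[name]:
            concepts_by_name[name].append(concept_id)

    kept, sideband = [], []
    for name, concept_ids in concepts_by_name.items():
        if len(concept_ids) == 1 or policy == "allow":
            # No collision, or the user chose to let duplicates through.
            kept.extend((name, cid) for cid in concept_ids)
        elif policy == "eliminate":
            # Assumption: keep only the first concept seen for the name.
            kept.append((name, concept_ids[0]))
        else:  # policy == "log"
            # Divert the collision to the sideband and pass nothing on.
            sideband.append((name, concept_ids))
    return kept, sideband


# Made-up example data: one name that maps to two concepts.
records = [("Myotis", "c1"), ("Myotis", "c2"), ("Eptesicus", "c3")]
kept, sideband = resolve_name_collisions(records, policy="log")
```

With the "log" policy, the ambiguous name is withheld from the main output and recorded in the sideband list, so the downstream pipeline only sees unambiguous names.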

Usability thread: which users assigned to which evaluations, next steps, roadmap (Laura w/ slides)

Uber-discussion: next two year vision, deliverables, roadmap, engineering dependencies, end-user tasks to be supported, data acquisition for the prototypes, decomposing Kepler engineering steps to accomplish usage scenarios. How do all the SEEK-Taxon components fit together now?

4:00 May Tutorial Workshop Planning, Pennington for Romanello-Katz, Beach

Thursday, October 27

8:00am Coffee
8:30 am - 12:00 pm

Knowledge Representation / Annotation Breakout

Overview of charge (Schildhauer, 15 minutes + 15 minutes discussion)
GrOWL ontology editor (Krivov, 15 minute presentation + 15 minute discussion)
Status of existing ontologies (Madin, 30 minutes presentation + 30 minutes discussion)

10:30-10:45 Break

Ontologies in Kepler

Storing and Managing Ontologies in Kepler

Support for versioning, accessing, classifying, etc.
Central versus Distributed storage

Conventions for Kepler/SEEK ontologies

LSIDs for identifying ontologies
Labels (for concept and role names) and comments
Storing ontologies in multiple versus one file (what should go in a single file?)

Ontology curation and population in Kepler

Allowing users to upload/provide new ontologies to Kepler
Supporting multiple ontologies for actor quick search

Ontology editing w/in Kepler
Making Ontologies convenient for Kepler users

Support for "views" and "templates", i.e., interfaces and methods for filtering, organizing, summarizing, and visualizing Ontologies

Tools for SEEK ontology development

Glossary support
Documentation
Browsing, familiarization
Organization; "templates" and "views" (e.g., biodiversity templates, etc.)

Improving the Annotation Tool

Filling in Annotations via ontology constraints
Defining what makes a "good" annotation, and how to guide users towards these
Improvements to the UI, including selecting ontologies and "templated" annotations

Working exercise: building annotations

Target case studies: ENM and Biodiversity
Start as whole group, discuss actor/data annotation needs and differences
Possibly break into subgroups to work on annotations

Taxon Breakout

Update of EML to accommodate Taxon Concepts (Peet, Franz, Jones)
Future Plans: events, objectives, deliverables

12:00pm - 1:00pm Lunch
1:00 pm - 5:30 pm

Semantic Mediation / Data Integration Breakout

Overview of charge (Ludaescher, 15 minutes + 15 minutes discussion)
SCIA integration in Kepler (Wang, 15 minute presentation + 30 minute discussion)
Working exercise: data integration approaches

Case study requirements for integration
Simple "smart" concatenation
Extending to more complex (and realistic) examples
Adequacy of annotations and ontologies for integration

Data Management support for SMS

The EML actor as entry point for data integration
How do we deal with multiple output formats from EML?
How do we deal with null values?
How do we carry out integration specifications given by SMS?

For example, using "R" to load instance data from multiple voluminous data sets (e.g., three 100 MB data files) and executing data integration "recipes"
A database-style approach? (e.g., SQL)

What approach do we use for representing data conversions?

Examples: unit conversions, count/area = density, rescaling/projection
Represent conversions within ontology?
Represent conversions as actors (searched via annotations, e.g.)
Define a new or leverage an existing representation for conversions (Prolog, SWRL, XML syntax a la STMML, MoML attributes)
Use potpourri approach (i.e., try to accommodate multiple mechanisms)
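
As a point of reference for the discussion, the count/area = density example can be sketched with conversions held as plain data plus a small function. This is only one of the options listed above, roughly the "conversions as actors" idea reduced to a function call. The unit names, the lookup table, and the function names are hypothetical and are not taken from any SEEK ontology or Kepler actor.

```python
# Hypothetical lookup table: (source_unit, target_unit) -> multiplicative factor.
UNIT_FACTORS = {
    ("hectare", "square_meter"): 10_000.0,
    ("square_kilometer", "square_meter"): 1_000_000.0,
}


def convert(value, source_unit, target_unit):
    """Convert a value between two units listed in UNIT_FACTORS."""
    if source_unit == target_unit:
        return value
    try:
        factor = UNIT_FACTORS[(source_unit, target_unit)]
    except KeyError:
        raise ValueError(f"no conversion from {source_unit} to {target_unit}")
    return value * factor


def density(count, area, area_unit, target_area_unit="square_meter"):
    """Derive density = count / area after rescaling the area unit."""
    area_in_target = convert(area, area_unit, target_area_unit)
    if area_in_target <= 0:
        raise ValueError("area must be positive")
    return count / area_in_target


# Made-up example: 50 individuals counted on a 2 hectare plot.
individuals_per_square_meter = density(50, 2, "hectare")
```

Keeping the factors in a table rather than in code is what would let the same information be stored in an ontology or an XML syntax instead, which is the trade-off the options above are weighing.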

Taxon Breakout

Bob Peet leaves; Less critical items
1:00 PM Taxon meets SMS and KR 1 Hr. Bertram, Mark, et al.
NESCENT workshop proposal
ESA Booth

Friday, October 28

8:00am Coffee

Plenary

BEAM Use cases: what needs to be accomplished

12:00pm - Adjourn
Travel back to home institutions; only schedule flights that leave after 2:00 pm

Participants

All SEEK participants
Invited participants from other projects

This particular version was published on 26-Oct-2005 23:37:58 PDT by KU.beach.