Position Paper by Harry Hochheiser, National Institute on Aging

For the Workshop on "Information Visualization Software Infrastructures" at IEEE Visualization 2004,
Organized by Katy Börner, Indiana University, USA and Jean-Daniel Fekete, INRIA, France

Part I

I.1) What functionality should a general InfoVis infrastructure provide?

A general InfoVis infrastructure might include some or all of the following components:

I.2) What do you see as the main technical challenges for creating a central but flexible and universally useful (information) visualization software infrastructure (as opposed to 100 different ones)?

Part II

Please describe the (information) visualization software infrastructure you are working on.

II.1) Project Name and Web Address

The Open Microscopy Environment http://www.openmicroscopy.org

II.2) Core Team Members (Please list in order: Role of Project Member, Full Name, E-mail; e.g., Developer, John Doe, jdoe@univ.edu)

Developer: Harry Hochheiser, hsh@nih.gov
Developer: Chris Allan, callan@blackcat.ca
Developer: Jean-Marie Burel, j.burel@dundee.ac.uk
Developer: Doug Creager, dcreager@alum.mit.edu
Developer: Andrea Falconi, a.falconi@dundee.ac.uk
Developer: Josiah Johnston, siah@nih.gov
Developer: Jeff Mellen, jeffm@mit.edu
Principal Investigator: Ilya Goldberg, igg@nih.gov
Principal Investigator: Jason Swedlow, jason@lifesci.dundee.ac.uk
Principal Investigator: Peter Sorger, psorger@mit.edu

II.3) Project Start Date

1999

II.4) Targeted User Group

Cell and Molecular Biologists

II.5) Supported User Tasks

Importation, categorization, retrieval, and management of microscopy images. Execution of image analysis algorithms. Display and visualization of analysis results.

II.6) Major Features of the System Architecture

The Open Microscopy Environment (OME) is a platform for supporting digital microscopy research. A series of clients built upon a common architecture supports visualization of microscopy data and metadata. Thus, OME is a domain-specific application environment that includes an extensible visualization model.

OME is a client-server system that supports an XML-based data model for describing digital microscopy data, analysis workflows, and results of the application of those workflows to image sets. OME contains a history of all data elements, allowing derivation of the provenance of any analytic result. The data model can be extended as needed through semantic types: custom data structures defined in XML.

The data model is stored in a relational database system. A Perl layer provides an object-relational mapping, which can be accessed by remote clients via XML-RPC and HTTP, or used directly to generate a web-based front end.
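As a rough illustration of the remote-access path, the sketch below marshals a client call as an XML-RPC request using Python's standard library, then decodes it again. The method name getDatasetImages and its argument are hypothetical and are not taken from OME's actual remote interface; no server is contacted.

```python
import xmlrpc.client

# Hypothetical remote method and argument; OME's real remote API may differ.
METHOD = "getDatasetImages"
DATASET_ID = 42


def encode_call(method, *params):
    """Serialize a method call into an XML-RPC request body."""
    return xmlrpc.client.dumps(tuple(params), methodname=method)


def decode_call(request_xml):
    """Parse an XML-RPC request body back into (method, params)."""
    params, method = xmlrpc.client.loads(request_xml)
    return method, params


# Round trip: what a client would send over HTTP, and what a server would see.
request_body = encode_call(METHOD, DATASET_ID)
method, params = decode_call(request_body)
```

In a real deployment the request body would be POSTed over HTTP to the server, which would dispatch the call to the object-relational layer.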

Java visualization tools run in an extensible, configurable client environment. Visualization clients must implement a common interface, and share a common event bus for inter-client communication. Events passed over the bus can be used to open views in specific clients, or for inter-client communication needed for tight coupling between alternative views of data.
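The client environment described above can be sketched in a few lines. The OME clients are written in Java; this Python sketch only illustrates the pattern, and every class and method name in it (EventBus, Client, SelectionEvent, and so on) is an assumption made for illustration rather than part of the OME code base.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionEvent:
    """Hypothetical event: an image was selected in some client."""
    image_id: int


class EventBus:
    """Routes each posted event to the handlers registered for its type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def post(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


class Client:
    """Common interface that every visualization client implements."""

    def attach(self, bus):
        raise NotImplementedError


class BrowserClient(Client):
    """Posts a selection event when the user picks an image."""

    def attach(self, bus):
        self.bus = bus

    def select(self, image_id):
        self.bus.post(SelectionEvent(image_id))


class ViewerClient(Client):
    """Opens whichever image was most recently selected elsewhere."""

    def __init__(self):
        self.shown = None

    def attach(self, bus):
        bus.subscribe(SelectionEvent, self.on_selection)

    def on_selection(self, event):
        self.shown = event.image_id


bus = EventBus()
browser, viewer = BrowserClient(), ViewerClient()
for client in (browser, viewer):
    client.attach(bus)
browser.select(7)  # the viewer reacts without the browser knowing it exists
```

Because clients communicate only through event types, a new view can be coupled to existing ones by subscribing to the events it cares about.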

The extensibility of OME's data model poses a challenge for visualization: can a generic framework be built to support effective coordinated visualization of arbitrary compound data types?

II.7) Algorithms Provided

Treemaps and zooming (via Piccolo) for image collections, directed graph layout for analysis "chains", tight coupling for multiple coordinated windows, dynamic queries for filters, drag-and-drop construction of analysis chains.
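For readers unfamiliar with treemaps, the sketch below shows slice-and-dice layout, the simplest treemap variant. The source does not say which treemap layout OME uses, so the choice of algorithm, the nested-dictionary input format, and the sample collection are all assumptions.

```python
def total_size(node):
    """Sum of leaf sizes under a node (sizes are assumed to be positive)."""
    if "children" in node:
        return sum(total_size(child) for child in node["children"])
    return node["size"]


def slice_and_dice(node, x, y, w, h, depth=0):
    """Return (name, x, y, w, h) rectangles for every leaf under node.

    The split direction alternates with depth: side by side at even
    depths, stacked vertically at odd depths.
    """
    if "children" not in node:
        return [(node["name"], x, y, w, h)]
    rects = []
    total = total_size(node)
    offset = 0.0  # fraction of the parent already consumed
    for child in node["children"]:
        frac = total_size(child) / total
        if depth % 2 == 0:
            rects += slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1)
        else:
            rects += slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1)
        offset += frac
    return rects


# Hypothetical collection: one project, two datasets, three images.
collection = {"name": "project", "children": [
    {"name": "dataset A", "children": [
        {"name": "img1", "size": 1},
        {"name": "img2", "size": 3},
    ]},
    {"name": "dataset B", "children": [
        {"name": "img3", "size": 4},
    ]},
]}

layout = slice_and_dice(collection, 0.0, 0.0, 8.0, 4.0)
```

Each leaf receives an area proportional to its size, and the leaf rectangles tile the full 8 by 4 canvas.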

II.8) Snapshot of the Interface

Project/Dataset browser: In OME, a project contains zero or more datasets, each of which contains images. Each dataset may in turn be part of one or more projects. This browser provides a coordinated overview of the many-to-many relationships between projects and datasets, along with the ability to zoom in on individual images.
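The many-to-many relationship that the browser displays can be modeled with a pair of indexes, one per direction. This is a minimal sketch; the Catalog class and the sample names are hypothetical and do not reflect OME's actual schema.

```python
from collections import defaultdict


class Catalog:
    """Keeps both directions of the project/dataset relationship in sync."""

    def __init__(self):
        self._datasets_by_project = defaultdict(set)
        self._projects_by_dataset = defaultdict(set)

    def link(self, project, dataset):
        self._datasets_by_project[project].add(dataset)
        self._projects_by_dataset[dataset].add(project)

    def datasets_of(self, project):
        # .get avoids creating empty entries for unknown projects
        return sorted(self._datasets_by_project.get(project, ()))

    def projects_of(self, dataset):
        return sorted(self._projects_by_dataset.get(dataset, ()))


catalog = Catalog()
catalog.link("mitosis study", "plate 1")
catalog.link("mitosis study", "plate 2")
catalog.link("drug screen", "plate 2")  # plate 2 belongs to two projects
```

A coordinated overview can then highlight, for any selected dataset, every project returned by projects_of, and the reverse for a selected project.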

Other visualizations not shown include palettes of modules and analysis chains that can be used to create new chains, chain creation canvases, an image viewer, and a 3D viewer for trajectories of cellular bodies.

II.9) Development Platform

Server: Unix (Linux, OS X)
Clients: Unix (Linux, OS X), Windows

II.10) Supported Operating Systems

Server: Unix (Linux, OS X)
Client: Any OS running Java 1.4.2

II.5) Software Dependencies/Required Libraries

II.5) Current License

LGPL

II.5) Number of Users/Downloads

Over 200 downloads since Spring 2004.

II.5) Pros and Cons

From a development standpoint, the visualization architecture allows for straightforward communication between clients within a common environment. This provides a starting point for building visualizations that are coordinated yet distinct.

Although OME's data model is extensible, the visualizations currently implemented are closely tied to the details of both this model and the underlying system architecture. Significant re-engineering work would be needed to use these tools with a different data source.

II.5) Planned Work

Additional Visualizations: OME's image analysis model involves "chains" (DAGs) of cascading analysis modules, with the output of one computational module providing parameters for the next. All results are stored, producing a data history in which the derivation of any result can be traced back to its source. This creates a need for visualizations that show the results of various executions in context. Specific components include:

These views will be coordinated where appropriate, with a particular focus on linking all visualizations back to the original images.
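The chain model and its data history can be illustrated with a small sketch: modules run in dependency order, every result is stored, and the inputs used for each result are recorded so that its derivation can be traced back to the source image. The module names, the toy computations, and the in-memory history are all assumptions for illustration; OME itself stores results in its database. The sketch uses graphlib, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

SOURCE = "image"  # name given to the raw input in this sketch


def run_chain(chain, source_value):
    """Run every module in dependency order, keeping all results.

    chain maps a module name to (function, [names of its inputs]).
    Returns (results, history), where history records each result's inputs.
    """
    deps = {name: set(upstream) for name, (_, upstream) in chain.items()}
    results = {SOURCE: source_value}
    history = {SOURCE: []}
    for name in TopologicalSorter(deps).static_order():
        if name == SOURCE:
            continue
        func, upstream = chain[name]
        results[name] = func(*(results[u] for u in upstream))
        history[name] = list(upstream)
    return results, history


def trace(history, name):
    """List everything a result was derived from, source first."""
    seen = []

    def visit(item):
        for upstream in history[item]:
            visit(upstream)
        if item not in seen:
            seen.append(item)

    visit(name)
    return seen


# Hypothetical analysis chain over a tiny "image" (a list of pixel values).
chain = {
    "threshold": (lambda img: [p > 2 for p in img], [SOURCE]),
    "count": (lambda mask: sum(mask), ["threshold"]),
    "mean": (lambda img: sum(img) / len(img), [SOURCE]),
    "ratio": (lambda count, mean: count / mean, ["count", "mean"]),
}

results, history = run_chain(chain, [1, 3, 4, 4])
lineage = trace(history, "ratio")
```

A visualization that shows results in context could use a lineage list like this one to link any displayed value back to the original image.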

Part III

OME's extensible data model presents interesting challenges for effective visualization. As new semantic types are defined to support new types of analyses, users will need visualizations that will both display these data types and coordinate these displays with other views of related data. Currently, this requires the construction of custom visualizations for each new semantic type, a practice that is limited in scalability and extensibility. An information visualization infrastructure could simplify this task significantly.

Ideally, this infrastructure would support the use of a declarative language to define visualizations and coordinations. Just as semantic types are currently defined in an XML Schema, a visualization description for a data type might describe the type of visualization to be used, and the interactions between that visualization and other views.
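To suggest what such a declarative description might look like, the sketch below parses a small XML document that names the views for a semantic type and the coordinations between them. The element names, attribute names, and the CellTrajectory type are invented for illustration; no such schema exists in OME today.

```python
import xml.etree.ElementTree as ET

# Hypothetical visualization description for a hypothetical semantic type.
SPEC = """
<VisualizationSpec semanticType="CellTrajectory">
  <View id="plot" kind="scatterplot"/>
  <View id="viewer" kind="imageViewer"/>
  <Coordination source="plot" event="select" target="viewer" action="showImage"/>
</VisualizationSpec>
"""


def parse_spec(xml_text):
    """Return (semantic_type, views, links) from a visualization description.

    views maps a view id to its kind; links is a list of
    (source, event, target, action) tuples.
    """
    root = ET.fromstring(xml_text.strip())
    views = {view.get("id"): view.get("kind") for view in root.findall("View")}
    links = [
        (c.get("source"), c.get("event"), c.get("target"), c.get("action"))
        for c in root.findall("Coordination")
    ]
    # Reject coordinations that refer to views the description never declares.
    for source, _, target, _ in links:
        if source not in views or target not in views:
            raise ValueError(f"coordination refers to unknown view: {source} -> {target}")
    return root.get("semanticType"), views, links


semantic_type, views, links = parse_spec(SPEC)
```

An infrastructure could read a description like this, instantiate the named views, and wire the coordinations together, so that supporting a new semantic type would require a new description rather than new visualization code.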

In addition to being of immediate use to our work with OME, an information visualization infrastructure that provides this degree of flexibility would be generalizable to a wide variety of applications. As a concrete application domain, OME might provide an interesting testing ground for validating any infrastructure design.