Position Paper by James Slack

For the Workshop on "Information Visualization Software Infrastructures" at IEEE Visualization 2004,
Organized by Katy Börner, Indiana University, USA and Jean-Daniel Fekete, INRIA, France

Part I

I.1) What functionality should a general InfoVis infrastructure provide?

A general InfoVis infrastructure should provide a simple, well-documented API so that existing applications can interact with it after only minor integration changes. The API should also be extensible, so that modules can be shared by users working with different datasets in similar domains. The API is perhaps the most important part of the infrastructure: application developers need to understand the standard APIs of existing applications before developing future tools.

Modularity should be provided by any generic application. As with the modules of applications such as the Firefox web browser, the modules should interact with each other, be updatable, be available in a central shared repository, and be sufficiently documented.
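As a minimal sketch of these module requirements, the following Java fragment shows one way a registry could accept modules and replace an installed module only when a newer version arrives. The names VisModule, ModuleRegistry, and SimpleModule are hypothetical and are not part of Olduvai or any existing toolkit. The in-memory map merely stands in for a central shared repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical contract that every plug-in module would implement. */
interface VisModule {
    String name();   // unique identifier used for lookup
    int version();   // a higher version replaces a lower one
}

/** Hypothetical in-memory stand-in for a central shared repository. */
class ModuleRegistry {
    private final Map<String, VisModule> modules = new LinkedHashMap<>();

    /** Installs the module, or updates an older copy; returns false if nothing changed. */
    boolean register(VisModule module) {
        VisModule existing = modules.get(module.name());
        if (existing != null && existing.version() >= module.version()) {
            return false; // the installed version is the same or newer
        }
        modules.put(module.name(), module);
        return true;
    }

    /** Lets one module find another by name so that modules can interact. */
    Optional<VisModule> lookup(String name) {
        return Optional.ofNullable(modules.get(name));
    }
}

/** Simple immutable module description used only for illustration. */
class SimpleModule implements VisModule {
    private final String name;
    private final int version;

    SimpleModule(String name, int version) {
        this.name = name;
        this.version = version;
    }

    @Override public String name() { return name; }
    @Override public int version() { return version; }
}
```

Registering SimpleModule("tree-layout", 1) and then SimpleModule("tree-layout", 2) leaves version 2 installed, and a later attempt to register version 1 again is rejected.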

I.2) What do you see as the main technical challenges for creating a central but flexible and universally useful information visualization software infrastructure, as opposed to 100 different ones?

The first step in creating one central infrastructure is getting everyone to agree on standards. Although this can be difficult, perhaps a compromise is to build a bare-bones application framework that requires specific modules for each intended purpose.

Although a modular framework provides flexibility, it may be the source of headaches if not properly moderated and maintained. Perhaps a good model to follow in developing a generic, universally acceptable infrastructure would be a package-based Linux distribution such as Debian, which provides flexibility for many possible system configurations with both common and specific modules.

Not every application has the same dataset requirements, so the infrastructure must be able to adapt to the specific needs of applications, or nobody will use it. Getting people to change their optimized applications to use a generic toolkit is also quite tricky, but an open development environment might persuade developers to create modules for their own applications.

Part II

II.1) Project Name and Web Address

The project has the generic name Olduvai, but the visual metaphor is referred to as Accordion Drawer.

SourceForge-hosted site for Olduvai (currently only the TreeJuxtaposer and SequenceJuxtaposer applications):
http://olduvai.sf.net

Soon PowerSetViewer, BacJuxtaposer, and other such applications currently in various stages of development will be added to the site. All applications, several datasets, and published papers for each application are included on this site. Additions such as a tutorial about the interface and navigation capabilities of TreeJuxtaposer are in progress.

II.2) Core Team Members
Team leader/Grad student supervisor/Developer, Tamara Munzner, tmm@cs.ubc.ca
Grad student/Developer, James Slack, jslack@cs.ubc.ca
Grad student/Developer, Kristian Hildebrand, hilde@cs.ubc.ca
Grad student/Developer, Qiang Kong, qkong@cs.ubc.ca
Contributor, François Guimbretière, francois@cs.umd.edu

Former Project Members
Developer, Li Zhang, l.zhang@hp.com
Developer, Serdar Taşıran, stasiran@ku.edu.tr
Developer, Yunhong Zhou, yunhong.zhou@hp.com

II.3) Project Start Date

2001

II.4) Targeted User Group

For TreeJuxtaposer, we target systematic biologists and others who need to browse phylogenetic trees. SequenceJuxtaposer is aimed at biologists and bioinformaticians who need to browse genomic sequences. Oncologists are the target audience of BacJuxtaposer, while PowerSetViewer is aimed at data mining of transaction databases.

II.5) Supported User Tasks

Depending on the application, we support user tasks in visualization and navigation of tree structures (TreeJuxtaposer), genomic sequences (SequenceJuxtaposer), cancer data (BacJuxtaposer), and power sets (PowerSetViewer) with an accordion drawing interface.

II.6) Major Features of the System Architecture

Our architecture provides guaranteed visibility of important data with global Focus+Context. Our accordion drawing infrastructure provides smooth animated transitions with progressive rendering approaches to reduce the animation frame rendering time for large datasets.
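To illustrate the Focus+Context idea, here is a minimal one-dimensional sketch of an accordion-style stretch: a chosen range of cells is enlarged, and the remaining cells are compressed proportionally so that every cell stays on screen. The class AccordionAxis and its methods are hypothetical and simplified. This is not the Olduvai implementation, which the text describes as working on a grid hierarchy.

```java
/** Hypothetical 1D accordion axis: cells share the unit interval [0, 1]. */
class AccordionAxis {
    private final double[] splits; // splits[0] = 0 and splits[cells] = 1

    /** Starts with all cells the same size. */
    AccordionAxis(int cells) {
        splits = new double[cells + 1];
        for (int i = 0; i <= cells; i++) {
            splits[i] = (double) i / cells;
        }
    }

    /** Current on-screen extent of one cell, as a fraction of the axis. */
    double cellSize(int i) {
        return splits[i + 1] - splits[i];
    }

    /** Resizes cells first..last to occupy newSize; all other cells shrink or grow to fit. */
    void grow(int first, int last, double newSize) {
        int cells = splits.length - 1;
        if (first < 0 || last >= cells || first > last) {
            throw new IllegalArgumentException("bad cell range");
        }
        double oldSize = splits[last + 1] - splits[first];
        double oldRest = 1.0 - oldSize;
        if (newSize <= 0.0 || newSize >= 1.0 || oldRest <= 0.0) {
            throw new IllegalArgumentException("focus must leave room for context");
        }
        double focusScale = newSize / oldSize;
        double contextScale = (1.0 - newSize) / oldRest;

        // Compute every new cell size before overwriting the split positions.
        double[] sizes = new double[cells];
        for (int i = 0; i < cells; i++) {
            boolean inFocus = i >= first && i <= last;
            sizes[i] = cellSize(i) * (inFocus ? focusScale : contextScale);
        }
        for (int i = 0; i < cells; i++) {
            splits[i + 1] = splits[i] + sizes[i];
        }
        splits[cells] = 1.0; // remove accumulated rounding error at the border
    }
}
```

With four equal cells, growing cell 1 to 0.7 of the axis leaves the other three cells at roughly 0.1 each, so the context is compressed but never removed.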

The rendering algorithms we use are all linear in the total number of nodes drawn with respect to the number of pixels used to display the data. Rendering approaches for existing applications are optimized to use the topology of the dataset when available. Applications with limited or non-existent topologies have an imposed topological structure, currently piggybacked on the accordion drawing grid hierarchy.
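The relationship between drawn items and available pixels can be illustrated with a deliberately naive sketch. It picks at most one item per pixel row and lets a marked item win over an unmarked one, so the number of items actually drawn never exceeds the number of rows. The class PixelCuller is hypothetical. Unlike the topology-aware approaches described above, this sketch scans every item, so only the drawing step is bounded by the pixel count.

```java
import java.util.Arrays;

/** Hypothetical culling step: at most one item is kept per pixel row. */
class PixelCuller {
    /**
     * positions[i] is the item's location on the axis in [0, 1].
     * Returns, for each pixel row, the index of the item to draw, or -1 if the row is empty.
     */
    static int[] select(double[] positions, boolean[] marked, int pixelRows) {
        int[] chosen = new int[pixelRows];
        Arrays.fill(chosen, -1);
        for (int i = 0; i < positions.length; i++) {
            int row = (int) (positions[i] * pixelRows);
            row = Math.max(0, Math.min(pixelRows - 1, row)); // clamp to the screen
            int current = chosen[row];
            // A marked item replaces an unmarked one, so marks stay visible.
            if (current == -1 || (marked[i] && !marked[current])) {
                chosen[row] = i;
            }
        }
        return chosen;
    }
}
```

For example, if three items fall into the same pixel row and only the third is marked, the third is the one selected for that row.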

Current development is aimed at providing dynamic allocation of accordion drawing structures to support editing, adding and deleting nodes in a sparse layout, and forms of semantic zooming that could provide topological structure to datasets in applications such as SequenceJuxtaposer and BacJuxtaposer.

II.7) Algorithms Provided

We provide several algorithms for our accordion drawing infrastructure, such as: guaranteed visibility of marked items; Focus+Context in an accordion drawing layout; progressive rendering for smooth, animated transitions; and several application-specific rendering and node-marking algorithms for tree, regular grid, overlapping vertically offset ranges, and sparsely laid out data in a large domain.
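Progressive rendering can be sketched as drawing a scene in bounded slices, so that each animation frame returns quickly and the picture fills in over several frames. The class ProgressiveRenderer below is hypothetical. Two details are assumptions made for illustration and are not taken from the source: the budget is counted in items rather than milliseconds, and marked items are queued first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical progressive renderer: each frame draws a bounded slice of the scene. */
class ProgressiveRenderer {
    private final Deque<Integer> pending = new ArrayDeque<>();
    private final List<Integer> drawn = new ArrayList<>();

    /** Starts a new scene; marked items are queued ahead of unmarked ones (an assumption). */
    void startScene(boolean[] marked) {
        pending.clear();
        drawn.clear();
        for (int i = 0; i < marked.length; i++) {
            if (marked[i]) pending.addLast(i);
        }
        for (int i = 0; i < marked.length; i++) {
            if (!marked[i]) pending.addLast(i);
        }
    }

    /** Draws at most budget items; returns true once the whole scene has been drawn. */
    boolean renderFrame(int budget) {
        for (int n = 0; n < budget && !pending.isEmpty(); n++) {
            drawn.add(pending.pollFirst()); // stand-in for a real draw call
        }
        return pending.isEmpty();
    }

    /** Item indices drawn so far, in drawing order. */
    List<Integer> drawnSoFar() {
        return drawn;
    }
}
```

A caller would invoke renderFrame once per animation frame and restart the scene with startScene whenever the user navigates, so interaction is never blocked by a full redraw.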

II.8) Snapshots of the Interface

The following are snapshots of TreeJuxtaposer, SequenceJuxtaposer, and BacJuxtaposer:
[Snapshot: TreeJuxtaposer]
[Snapshot: SequenceJuxtaposer]
[Snapshot: BacJuxtaposer]

II.9) Development Platform

Linux, Mac OS X, Windows

II.10) Supported Operating Systems

Linux, Mac OS X, Windows

II.11) Software Dependencies/Required Libraries

Java, GL4Java

II.12) Current License

BSD

II.13) Number of Users/Downloads

unknown

II.14) Pros and Cons

Pros: Java portability, easy to use interface, extremely efficient integration of features, high performance
Cons: distribution of Java libraries, producing installers, fairly monolithic codebase with only a small number of application domains currently supported

II.15) Planned Work

Combining TreeJuxtaposer and SequenceJuxtaposer, further developing BacJuxtaposer with semantic zooming, new tools in different application domains, logging functionality for simple undo and replay actions, user studies with target groups, editing datasets, and analysis tools for specific biological tasks.

Part III

My main interest in attending the workshop is to see how several other non-toolkit applications have been integrated into toolkits. Our approach has been to develop specific applications, and recently we have been extracting generic components that might be of some use to existing toolkits. Furthermore, discussing with others how to integrate our existing applications with potentially useful components may reduce the work required to implement new functionality in our future applications.