Position Paper by Jean-Daniel Fekete, INRIA Futurs

For the Workshop on "Information Visualization Software Infrastructures" at IEEE Visualization 2004,
Organized by Katy Börner, Indiana University, USA and Jean-Daniel Fekete, INRIA, France

Part I

I.1) What functionality should a general InfoVis infrastructure provide?

There are several levels of infrastructure needed and important communication issues between these levels.

Closest to the user, information visualization requires an interactive environment where several visualizations can be created, combined, synchronized and configured quickly to encourage exploration of datasets in the richest way;

Accessing datasets requires another level of infrastructure providing a wide range of mechanisms to load data from various file formats, various sources like relational databases or continuous data streams, or more sophisticated tools producing data using data-processing techniques like KDD;

Tailoring visualization algorithms is a third level of infrastructure. Some visualization techniques can be richly tailored. Tailoring them should be made simple by using a visual system that avoids programming when possible;

There will always be new visualization techniques; they should be as easy as possible to integrate into the infrastructure: convenience should extend not only to the end user but also to the visualization programmer.

Combining data-access, data-processing and visualization techniques should be easy and visual.

Visualization can be the front-end to the main application or can be the application by itself. It could also be a component inside an already existing application. The infrastructure should be flexible enough to integrate into an already existing application without placing too many demands on that application.

The infrastructure should provide several access levels, depending on the intended users' skills. No programming should be required for simple users. Visual programming should be accessible to power users. General programming should be easy for programmers.

I.2) What do you see as the main technical challenges for creating a central but flexible and universally useful (information) visualization software infrastructure (as opposed to 100 different ones)?

Avoiding fragmentation seems the most serious challenge now. We can see fragmentation in the GUI world, and it benefits neither users nor programmers. On the contrary, there has been no important progress since the 1990s in GUI evaluation tools or GUI building tools, mainly because of this fragmentation.

On the other hand, there is a successful experience of coordinated platform development in the field of Computational Geometry, for example with CGAL (www.cgal.org). This environment contains all the important data structures and algorithms used in computational geometry, sparing researchers the burden of implementing them again and again. CGAL is continuously updated by the research community.

Technically, there are several small problems to address: programming languages, data structures and communication mechanisms, but none seems impossible to solve. The main issues are related to programming languages (C++, Java, C#), graphics APIs (Windows/GDI, Java/Graphics, OpenGL, C#) and communication mechanisms (common file formats, common inter-process communication mechanisms, common language bindings).

Part II

Please describe the (information) visualization software infrastructure you are working on.

The InfoVis Toolkit is an Interactive Graphics Toolkit written in Java to ease the development of Information Visualization applications and components. The main characteristics of the InfoVis Toolkit are:

Unified data structure
The base data structure is a table of columns. Columns contain objects of homogeneous types, such as integers or strings. Trees and Graphs are derived from Tables.
Small memory footprint
Using homogeneous columns instead of compound types dramatically reduces the memory required to store large tables, trees or graphs, and generally the time needed to manage them.
Unified set of interactive components
Interactive filtering (a.k.a. dynamic queries) can be performed with the same control objects and components regardless of the data structure, simplifying the reuse of existing components and the design of generic ones. Magic lenses such as fisheye lenses or excentric labels can be applied to all the existing and added visualizations.
Fast
The InfoVis Toolkit can use accelerated graphics provided by Agile2D, an implementation of Java2D based on the OpenGL API for hardware-accelerated graphics. On machines with hardware acceleration, some visualizations redisplay 100 times faster than with the standard Java2D implementation.
Extensible
The InfoVis Toolkit is meant to incorporate new information visualization techniques and is distributed with the full sources and with a very liberal license. It could be a base for student projects, research projects or commercial products.
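
To make the "table of columns" idea concrete, here is a minimal sketch, not the toolkit's actual API: every class and method name below (IntColumn, TableTree, addNode, and so on) is hypothetical and chosen for illustration only. It assumes a column is backed by a primitive array and that a tree is simply a table whose "parent" column stores the row index of each node's parent.

```java
import java.util.Arrays;

/** Hypothetical column of ints backed by a primitive array (no boxed objects). */
final class IntColumn {
    private final String name;
    private int[] values = new int[8];
    private int size;

    IntColumn(String name) { this.name = name; }

    String name() { return name; }
    int size() { return size; }

    void add(int v) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2); // grow geometrically
        }
        values[size++] = v;
    }

    int get(int row) {
        if (row < 0 || row >= size) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        return values[row];
    }
}

/** Hypothetical tree stored as a table: row i is node i. */
final class TableTree {
    static final int NO_PARENT = -1;
    private final IntColumn parent = new IntColumn("parent");
    private final IntColumn weight = new IntColumn("weight");

    /** Adds a node under an existing parent row and returns the new node's row index. */
    int addNode(int parentRow, int w) {
        if (parentRow != NO_PARENT && (parentRow < 0 || parentRow >= parent.size())) {
            throw new IllegalArgumentException("unknown parent row " + parentRow);
        }
        parent.add(parentRow);
        weight.add(w);
        return parent.size() - 1;
    }

    int parentOf(int node) { return parent.get(node); }

    /** Number of edges between the node and the root. */
    int depth(int node) {
        int d = 0;
        for (int p = parent.get(node); p != NO_PARENT; p = parent.get(p)) {
            d++;
        }
        return d;
    }

    /** Weight of the node plus all its descendants (parents always precede children). */
    int subtreeWeight(int node) {
        int[] total = new int[parent.size()];
        for (int i = parent.size() - 1; i >= 0; i--) {
            total[i] += weight.get(i);
            int p = parent.get(i);
            if (p != NO_PARENT) {
                total[p] += total[i];
            }
        }
        return total[node];
    }
}
```

In this sketch a node costs two ints (one per column) rather than one heap object with its header and references, which is the kind of saving the "small memory footprint" item refers to.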

II.1) Project Name and Web Address

The InfoVis Toolkit (www.lri.fr/~fekete/InfovisToolkit)

II.2) Core Team Members (Please list in order, Role of Project Member, Full Name, E-mail. eg: Developer, John Doe, jdoe@univ.edu)

Project Manager and main Developer: Jean-Daniel Fekete, IN-SITU Project, INRIA Futurs and LRI, Université Paris-Sud, France.

II.3) Project Start Date

January 2003

II.4) Targeted User Group

Professional programmers interested in adding Information Visualization capabilities to their applications

Students for projects or as course support to understand algorithms, data-structures or interaction techniques and experiment with them

Researchers in Information Visualization to simplify the development of new visualization or interaction techniques

II.5) Supported User Tasks

The users of the Toolkit are programmers, as well as end-users.

For programmers, InfoVis implements a strong separation of concerns between data structures, visualization techniques, dynamic query techniques and navigation techniques. Programmers can:

For end-users, the toolkit provides a set of sample applications that can be used to visualize existing datasets in various file formats using several visualization techniques, zoom, filter and apply "magic lenses" on all the visualizations.

II.6) Major Features of the System Architecture

InfoVis implements data structures with high performance and a low memory footprint, specially crafted for Information Visualization and dynamic queries, where interaction is essential. The data structures are based on tables made out of columns. Contrary to most other InfoVis systems, a data record is not represented as a Java Object but as an index (an integer). This implementation leads to much higher performance but surprises students familiar with "pure" object-oriented programming.
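
As an illustration of records being plain integer indices, the following sketch shows how a dynamic query could be evaluated over columns. It is an assumption-laden example rather than the toolkit's code: RangeQuery, select and DynamicQueryDemo are hypothetical names, columns are modeled as bare double arrays, and the result of a filter is a java.util.BitSet of row indices.

```java
import java.util.BitSet;

/** Hypothetical range filter: a record is just a row index into each column. */
final class RangeQuery {
    private RangeQuery() {}

    /** Returns the row indices whose value lies in the closed range [min, max]. */
    static BitSet select(double[] column, double min, double max) {
        BitSet selected = new BitSet(column.length);
        for (int row = 0; row < column.length; row++) {
            double v = column[row];
            if (v >= min && v <= max) {
                selected.set(row);
            }
        }
        return selected;
    }
}

/** Small demonstration combining two range sliders on the same table. */
final class DynamicQueryDemo {
    public static void main(String[] args) {
        // Two columns of the same (made-up) table; row i describes record i.
        double[] price  = {12.0, 48.5, 30.0, 7.25};
        double[] rating = {4.5, 3.0, 4.8, 2.0};

        BitSet visible = RangeQuery.select(price, 10, 40);
        visible.and(RangeQuery.select(rating, 4.0, 5.0)); // intersect both filters

        System.out.println(visible); // prints {0, 2}
    }
}
```

Because a filter only produces a set of integers, the same result can drive any visualization of the table without knowing what kind of objects the rows stand for.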

InfoVis heavily uses the "Factory" design pattern to allow extensions at several places (file formats, visualizations, dynamic query components, interaction panels, etc.)
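
The following is a minimal sketch of how a factory can serve as an extension point for file formats. It does not reproduce the toolkit's classes: TableReader, DelimitedReader and ReaderFactory are hypothetical names, and keying the registry on the file extension is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical reader interface: turns text into rows of fields. */
interface TableReader {
    List<String[]> read(String text);
}

/** Hypothetical reader for delimiter-separated text such as CSV or TSV. */
final class DelimitedReader implements TableReader {
    private final String separatorRegex;

    DelimitedReader(String separatorRegex) {
        this.separatorRegex = separatorRegex;
    }

    @Override
    public List<String[]> read(String text) {
        List<String[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {      // split on any line break
            if (!line.isEmpty()) {
                rows.add(line.split(separatorRegex, -1)); // keep trailing empty fields
            }
        }
        return rows;
    }
}

/** Hypothetical factory: new formats are added by registration, not by editing callers. */
final class ReaderFactory {
    private final Map<String, Supplier<TableReader>> creators = new LinkedHashMap<>();

    void register(String extension, Supplier<TableReader> creator) {
        creators.put(extension.toLowerCase(Locale.ROOT), creator);
    }

    TableReader create(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Supplier<TableReader> creator = creators.get(ext);
        if (creator == null) {
            throw new IllegalArgumentException("No reader registered for: " + fileName);
        }
        return creator.get();
    }
}
```

A caller would register formats once, for example `factory.register("csv", () -> new DelimitedReader(","))`, and then ask `factory.create("data.csv")` for a reader. Supporting a new format then means one more registration rather than a change to the code that opens files.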

II.7) Algorithms Provided

The InfoVis Toolkit, as of version 0.6, implements nine (9) types of visualization: Scatter Plots, Time Series and Parallel Coordinates for tables; Node-Link diagrams (Cartesian and Polar), Icicle trees and Treemaps for trees; Adjacency Matrices and Node-Link diagrams for graphs. These visualization algorithms are configurable internally.

II.8) Snapshot of the Interface

Architecture of the InfoVis Toolkit
Internal structure of the InfoVis Toolkit. Squares represent data structures whereas ellipses represent functions.

II.9) Development Platform

This software is being developed using J2SDK1.4, the Eclipse IDE, CVS for versioning and Ant as the build tool.

II.10) Supported Operating Systems

Any operating system that can run Java™.

II.11) Software Dependencies/Required Libraries

The InfoVis Toolkit is written in 100% pure Java™. It relies on some external packages included in the distribution: ANTLR, used by some readers, and XMLWriter by David Megginson. It can optionally use Agile2D for accelerated rendering using OpenGL.

II.12) Current License

The InfoVis Toolkit is currently distributed under the Q Public License but will eventually change to BSD for the core, with contributions free to use whatever license their authors wish.

II.13) Number of Users/Downloads

300+ since Feb 2004

II.14) Pros and Cons

Pros:

Cons:

II.15) Planned Work

Part III

Please describe your main interest in participating in the workshop

Determining the feasibility of combining efforts to create one common, shared Information Visualization infrastructure as opposed to 100s of underfunded or proprietary toolkits, platforms and frameworks.

Scouring for ideas for a common metadata format suited for saving/loading visualized datasets.

Eliciting feedback about the InfoVis Toolkit software architecture with regard to extensibility and ensuring that it is future-proof.



Created by Jean-Daniel Fekete on Wed Jul 28 23:15:27 2004
Modified by Shashikant Penumarthy, Bruce Herr and Katy Börner on Mon Aug 02 18:00:00 2004