Pajek Tutorial

Pajek is a program for analyzing and drawing large networks, and arguably one of the best network drawing programs available.
 
Download Information:
Pajek is freeware and can be downloaded from the following URL:
http://vlado.fmf.uni-lj.si/pub/networks/pajek/.
 
On the web page, click on the first link on the first line, labeled “Pajek 0.91”. This offers the option to download “pajek91.exe” to your local hard drive. Run the executable file and follow the installation instructions. After Pajek has been installed successfully, a shortcut to run Pajek and a folder named “Pajek” containing all the relevant libraries for the program are available.
 
Below, we discuss Pajek's input file format, parsers that help you generate that format, and how to use Pajek for simple data layout.
 
File Data Format:
The file format accepted by Pajek provides information on the vertices, arcs (directed edges), and undirected edges. A short example showing the file format is given below:
-------------------------------------
*Vertices 3
1 "Doc1" 0.0 0.0 0.0 ic Green bc Brown
2 "Doc2" 0.0 0.0 0.0 ic Green bc Brown
3 "Doc3" 0.0 0.0 0.0 ic Green bc Brown 
*Arcs
1 2 3 c Green
2 3 5 c Black
*Edges
1 3 4 c Green
-------------------------------------
Here there are 3 vertices, Doc1, Doc2 and Doc3, denoted by the numbers 1, 2 and 3. The (fill) color of these nodes is Green and the border color is Brown. The initial layout location of each node is (0,0,0). Note that the (x,y,z) values can be changed interactively after drawing.
 
There are two arcs (directed edges). The first goes from node 1 (Doc1) to node 2 (Doc2) with a weight of 3 and in color Green. The second goes from node 2 (Doc2) to node 3 (Doc3) with a weight of 5 and in color Black.
 
There is one undirected edge, between node 1 (Doc1) and node 3 (Doc3), with a weight of 4 and in color Green.
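The file layout described above can also be produced programmatically. The following is a minimal Python sketch, not part of Pajek or of the course scripts: the function name format_pajek and its arguments are hypothetical, and the fixed coordinates and colors simply mirror the example file.

```python
def format_pajek(labels, arcs, edges):
    """Return the text of a Pajek network file.

    labels: vertex labels; vertices are numbered from 1 in list order.
    arcs, edges: lists of (source, target, weight, color) tuples.
    (Hypothetical helper for illustration; colors and coordinates are
    hard-coded to match the example above.)
    """
    lines = [f"*Vertices {len(labels)}"]
    for number, label in enumerate(labels, start=1):
        lines.append(f'{number} "{label}" 0.0 0.0 0.0 ic Green bc Brown')
    lines.append("*Arcs")
    for source, target, weight, color in arcs:
        lines.append(f"{source} {target} {weight} c {color}")
    lines.append("*Edges")
    for source, target, weight, color in edges:
        lines.append(f"{source} {target} {weight} c {color}")
    return "\n".join(lines) + "\n"


# Rebuild the three-document example shown above.
network_text = format_pajek(
    ["Doc1", "Doc2", "Doc3"],
    arcs=[(1, 2, 3, "Green"), (2, 3, 5, "Black")],
    edges=[(1, 3, 4, "Green")],
)
print(network_text, end="")
```

Writing network_text to a file gives a network that can be opened from the Pajek interface.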
 
Imagine you want to lay out a set of nodes according to a given similarity matrix. Given the similarity matrix, e.g., the sample file generated using Latent Semantic Analysis, you can use the Perl parser pajekConv.pl to generate the Pajek input file. To execute the Perl script on 'ella' or 'iuni', simply type at the command prompt:
perl pajekConv.pl inputFileName outputFileName 
Make sure to replace inputFileName with the name of the similarity matrix file. The generated Pajek input file will be named outputFileName.
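To give an idea of what such a conversion involves, here is a rough Python sketch. It is not the pajekConv.pl script, whose input format is not described here: the function name similarity_to_pajek is hypothetical, and the sketch assumes a symmetric similarity matrix held in memory as a list of rows plus a list of labels. Each pair of nodes with a similarity above zero becomes one undirected edge weighted by that similarity.

```python
def similarity_to_pajek(labels, matrix):
    """Convert a symmetric similarity matrix to Pajek network text.

    Assumptions (not taken from pajekConv.pl): matrix[i][j] holds the
    similarity of item i and item j, and a value of zero means "no edge".
    """
    lines = [f"*Vertices {len(labels)}"]
    for number, label in enumerate(labels, start=1):
        lines.append(f'{number} "{label}" 0.0 0.0 0.0')
    lines.append("*Edges")
    # Visit only the upper triangle so each pair is written once.
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            weight = matrix[i][j]
            if weight > 0:
                lines.append(f"{i + 1} {j + 1} {weight}")
    return "\n".join(lines) + "\n"


example_labels = ["Doc1", "Doc2", "Doc3"]
example_matrix = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.3],
    [0.0, 0.3, 1.0],
]
converted = similarity_to_pajek(example_labels, example_matrix)
print(converted, end="")
```

In this made-up example, Doc1 and Doc3 have a similarity of zero, so no edge is written between them.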
 
Pajek Execution:
Start the Pajek program by clicking on the “Pajek” shortcut icon. The interface shown in Figure 1 will pop up.
 
Figure 1: Initial interface for Pajek to perform file reading operation
 
Specify the Pajek input 'Network' file by clicking on the yellow folder icon under 'Network' (see area highlighted in red on the left in Figure 1). Reading of the Network will be confirmed as shown in Figure 2.
 
Figure 2: Initial file read feedback interface
 
To display the network, go to the Draw (Ctrl + G) menu option (highlighted on the top right in Figure 1). The resulting initial layout is shown in Figure 3.
 
Figure 3: Initial layout of nodes
 
In order to lay out the nodes according to their similarity - as given in the similarity matrix discussed above - you can apply the different layout algorithms available via the “Layout” menu option. You can also choose the starting positions for the algorithm (random, circular, etc.), see Figure 4.
 
Figure 4: Algorithms and starting positions available for the data
 
In addition, you can choose to show/hide the node information, edges etc. using the “Options” menu shown in Figure 5.
 
Figure 5: Options menu
 
Explore the options for lines, vertices, color etc.
 
Set Threshold Value:
You will often come across a situation where a big mesh of lines connects the nodes. The graphics can then be improved by running the pajek_setThreshold.pl script on the similarity matrix that you generated previously. The script sets all values up to and including the threshold value to zero, which removes the edge between the corresponding two nodes. After this step, run pajekConv.pl again to get the Pajek input file.
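The thresholding step can be sketched in a few lines of Python. This is an illustration only, not the pajek_setThreshold.pl script: the function name apply_threshold is hypothetical, the matrix is assumed to be a list of rows, and "including the threshold value" is read as meaning that values less than or equal to the threshold are set to zero.

```python
def apply_threshold(matrix, threshold):
    """Return a copy of the matrix with weak similarities zeroed.

    Assumption: every value less than or equal to the threshold becomes
    0.0, so the corresponding edge disappears after conversion.
    """
    return [
        [0.0 if value <= threshold else value for value in row]
        for row in matrix
    ]


similarities = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
# With a threshold of 0.5, only the 0.8 similarity (and the diagonal) survives.
thresholded = apply_threshold(similarities, 0.5)
for row in thresholded:
    print(row)
```

A higher threshold removes more edges and gives a sparser drawing, so it can be worth trying a few values.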
 
Useful Tip:
For a very large data file, algorithm processing and layout readability can be improved by hiding the vertex labels and the edges among vertices.
 
If you want to make the graph more colorful, you can specify the vertex colors in the input file. The term "ic" denotes the internal (fill) color and "bc" the border color. All color options available in Pajek are given in Color.pdf. The input file then has the following format in the vertices section:
Example:
1 "MOLECULAR SEQUENCE DATA" 0.1 0.1 0.1 ic Orange bc Mahogany
 
In addition, if you want to increase the size of the vertices, you also have to define values in the vertices section using two variables: "x_fact" and "y_fact". The example below illustrates the declaration:
Example:
1 "MOLECULAR SEQUENCE DATA" 0.1 0.1 0.1 x_fact 7 y_fact 7 ic Orange bc Mahogany
 
Note: These changes may sometimes not be visible in the Pajek GUI. To activate different sizes in the Draw window, use Options/Mark vertices using/Real sizes On from the GUI menu.
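When many vertices need individual colors or sizes, the vertex lines can be generated rather than typed by hand. The Python sketch below is only an illustration: the function name vertex_line and its keyword arguments are hypothetical, and the attribute order simply follows the examples above.

```python
def vertex_line(number, label, x=0.0, y=0.0, z=0.0,
                x_fact=None, y_fact=None, ic=None, bc=None):
    """Build one line for the vertices section of a Pajek input file.

    Optional attributes are written only when a value is supplied.
    (Hypothetical helper; keyword names mirror the Pajek keywords.)
    """
    parts = [str(number), f'"{label}"', str(x), str(y), str(z)]
    if x_fact is not None:
        parts += ["x_fact", str(x_fact)]
    if y_fact is not None:
        parts += ["y_fact", str(y_fact)]
    if ic is not None:
        parts += ["ic", ic]
    if bc is not None:
        parts += ["bc", bc]
    return " ".join(parts)


# Reproduce the enlarged, colored vertex from the example above.
line = vertex_line(1, "MOLECULAR SEQUENCE DATA", 0.1, 0.1, 0.1,
                   x_fact=7, y_fact=7, ic="Orange", bc="Mahogany")
print(line)
```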
 
The snapshot shows the resulting visualization with variable node sizes, different edge widths, and color combinations. The visualization can be generated from the following Pajek input file.
 
Additional information on the different variables available for changing the visualization can be found in the following documentation.
 
If you wish to explore some of the additional features that Pajek has to offer, you might consider going through the extensive manual.
 

Authors are Katy Borner, Ketan Mane, Sidharth Thakur, Jeegar Maru