  • 5.1.4 Studying Four Major NetSci Researchers (ISI Data)

Word Co-Occurrence Network





The Extract Word Co-Occurrence Network algorithm has been updated. To run this workflow, you will need to update the plugin by downloading the edu.iu.nwb.composite.extractcowordfromtable_1.0.1.jar file and copying it into your plugins directory. Make sure to remove the old plugin file, "edu.iu.nwb.composite.extractcowordfromtable_1.0.0", from the plugins directory; otherwise the new plugin will not work. If you have not updated Sci2 by adding plugins before, there are brief directions on how to do so in 3.2 Additional Plugins.

In the Sci2 Tool, select "361 unique ISI Records" from the 'FourNetSciResearchers' dataset in the Data Manager. Run 'Preprocessing > Topical > Lowercase, Tokenize, Stem, and Stopword Text' using the following parameters:


Aggregate Function File

If you are working with ISI data, you can use the aggregate function file indicated in the image below. Aggregate function files can be found in sci2/sampledata/scientometrics/properties. If you are not working with ISI data and wish to create your own aggregate function file, you can find more information in 3.6 Property Files.

The outcome is a network in which nodes represent words and edges denote their joint appearance in a paper. Word co-occurrence networks are rather large and dense. Running 'Analysis > Networks > Network Analysis Toolkit (NAT)' reveals that the network has 2,821 word nodes and 242,385 co-occurrence edges.
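To make the structure of such a network concrete, here is a minimal Python sketch. It is not the Sci2 implementation: it assumes each paper has already been reduced to a list of normalized words, and the function name `build_cooccurrence` and the sample `papers` data are hypothetical. Each edge weight counts the number of papers in which the two words appear together.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(papers):
    """Return (nodes, edges) for a word co-occurrence network.

    nodes: set of all words seen in any paper.
    edges: Counter mapping a sorted word pair to the number of
           papers in which both words appear.
    """
    nodes = set()
    edges = Counter()
    for words in papers:
        unique = sorted(set(words))  # count each word once per paper
        nodes.update(unique)
        for pair in combinations(unique, 2):
            edges[pair] += 1
    return nodes, edges


# Hypothetical, already preprocessed input: one word list per paper.
papers = [
    ["network", "scale", "free"],
    ["network", "social", "scale"],
    ["citation"],  # a lone word co-occurs with nothing
]

nodes, edges = build_cooccurrence(papers)
print(len(nodes), "nodes,", len(edges), "edges")
print(edges[("network", "scale")])
```

Because every pair of distinct words in a paper yields an edge, the edge count grows quickly with the number of words per paper. For scale, a network with 2,821 nodes can have at most 2,821 × 2,820 / 2 = 3,977,610 undirected edges, so 242,385 edges correspond to a density of roughly 0.06.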


The following step applies only to the old, incorrect algorithm. If you have updated to the new Extract Word Co-Occurrence algorithm (see the note at the beginning of this workflow), you do not need to delete the isolates and can proceed directly to applying the DrL (VxOrd) layout.

There are 354 isolated nodes that can be removed by running 'Preprocessing > Networks > Delete Isolates' on the Co-Word Occurrence network. Note that when isolates are removed, papers without abstracts are removed along with the keywords.
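Conceptually, deleting isolates keeps only the nodes that are an endpoint of at least one edge. The following Python sketch illustrates the idea under the assumption that the network is available as a node set and an edge list; the function name `delete_isolates` and the sample data are hypothetical and do not reflect the Sci2 plugin's code.

```python
def delete_isolates(nodes, edges):
    """Return the nodes that are an endpoint of at least one edge."""
    connected = set()
    for source, target in edges:
        connected.add(source)
        connected.add(target)
    # Keep only nodes that belong to the original node set.
    return nodes & connected


# Hypothetical sample network: "citation" has no edges.
nodes = {"network", "scale", "free", "citation"}
edges = [("network", "scale"), ("free", "network")]

kept = delete_isolates(nodes, edges)
removed = nodes - kept
print(sorted(kept))
print(sorted(removed))
```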