


  1. Resize Linear > Nodes > globalcitationcount > From: 1 To: 50 > When the nodes have no 'globalcitationcount': 0.1 > Do Resize Linear
  2. Colorize > Nodes > globalcitationcount > From:   To:   > When the nodes have no 'globalcitationcount': 0.1 > Do Colorize
  3. Colorize > Edges > weight > From: (select the "RGB" tab) 127, 193, 65 To: (select the "RGB" tab) 0, 0, 0
  4. Type in Interpreter:
         > for n in g.nodes:
         [tab] n.strokecolor = n.color
    Or, select the 'Interpreter' tab at the bottom left-hand corner of the GUESS window, and enter the command lines:
         > resizeLinear(globalcitationcount,1,50)
         > colorize(globalcitationcount,gray,black)
         > for e in g.edges:
         [tab] e.color="127,193,65,255"
    Note: The Interpreter tab will have '>>>' as a prompt for these commands. It is not necessary to type '>' at the beginning of the line. Type each line individually and press "Enter" to submit the commands to the Interpreter.
    This will result in nodes that are linearly sized and color coded by their global citation count (GCC), connected by green directed edges, as shown in Figure 5.11 (left). Any numeric node attribute within the network can be used to code the nodes. To view the available attributes, mouse over a node. The GUESS interface supports pan and zoom, node selection, and details on demand. For more information, refer to the GUESS tutorial.

Figure 5.11: Directed, unweighted paper-paper citation network for 'FourNetSciResearchers' dataset with all papers and references in the GUESS user interface (left) and a pruned paper-paper citation network after removing all references and isolates (right)
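
To make the idea of linear sizing concrete, here is a minimal Python sketch of a min-max mapping from attribute values onto a size range such as 1 to 50. It is not the GUESS implementation: the function name `resize_linear`, the `default` argument, and the choice to use the fallback value 0.1 directly as the size of nodes without a 'globalcitationcount' are all assumptions made for illustration.

```python
def resize_linear(values, new_min, new_max, default=None):
    """Map each value linearly from [min, max] of the data onto [new_min, new_max].

    Entries that are None (attribute missing) receive `default` instead.
    This is an illustrative sketch, not GUESS's own resizeLinear.
    """
    known = [v for v in values if v is not None]
    lowest, highest = min(known), max(known)
    span = highest - lowest
    sizes = []
    for v in values:
        if v is None:
            sizes.append(default)       # assumed handling of missing attributes
        elif span == 0:
            sizes.append(new_min)       # all values equal: avoid division by zero
        else:
            sizes.append(new_min + (v - lowest) * (new_max - new_min) / span)
    return sizes


# Hypothetical citation counts for four nodes; the last has no count.
counts = [0, 10, 40, None]
sizes = resize_linear(counts, 1, 50, default=0.1)
```

The smallest count is drawn at the lower bound, the largest at the upper bound, and everything else falls proportionally in between.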


A merge table can be generated automatically by applying the Jaro distance metric (Jaro, 1989, 1995), available in the open source Similarity Measure Library, to identify potential duplicates. In the Sci2 Tool, select the co-author network and run 'Data Preparation > Text Files > Detect Duplicate Nodes' using the parameters:
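
As background on the metric itself, the following Python sketch computes the standard Jaro similarity between two strings, where 1.0 means identical and 0.0 means no matching characters. It is a textbook formulation written for illustration, not the code of the Similarity Measure Library or of the Sci2 algorithm; the function name `jaro_similarity` and the sample author names are assumptions.

```python
def jaro_similarity(s1, s2):
    """Return the Jaro similarity of two strings as a float in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and no farther
    # apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        start = max(0, i - window)
        end = min(len2, i + window + 1)
        for j in range(start, end):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different order,
    # counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0


# Hypothetical author-name variants that a merge table might flag.
score = jaro_similarity("barabasi, a", "barabasi, al")
```

Pairs whose score exceeds a chosen threshold would be treated as candidate duplicates; the threshold itself is a parameter of the tool and is not reproduced here.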


The updated co-authorship network can be visualized using 'Visualization > Networks > GUESS' (see section GUESS Visualizations for more information regarding GUESS).
Figure 5.13 shows the layout of the combined 'FourNetSciResearchers' dataset after it was modified using the following commands in the "Interpreter":
     > resizeLinear(numberofworks,1,50)
     > colorize(numberofworks,gray,black)
     > for n in g.nodes:
     [tab] n.strokecolor = n.color
     > resizeLinear(numberofcoauthoredworks, .25, 8)
     > colorize(numberofcoauthoredworks, "127,193,65,255", black)
     > nodesbynumworks = g.nodes[:]
     > def bynumworks(n1, n2):
     [tab] return cmp(n1.numberofworks, n2.numberofworks)
     > nodesbynumworks.sort(bynumworks)
     > nodesbynumworks.reverse()
     > for i in range(0, 50):
     [tab] nodesbynumworks[i].labelvisible = true
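
The last six commands sort the nodes by 'numberofworks' in descending order and switch on labels for the first 50. The same logic can be sketched in standalone Python 3, outside GUESS, to show what it does. The `Node` class below is a hypothetical stand-in for a GUESS node, and `label_top_nodes` is a name introduced only for this illustration. Slicing is used instead of `range(0, 50)` so that a network with fewer than 50 nodes does not raise an index error.

```python
class Node:
    """Hypothetical stand-in for a GUESS node with only the fields used here."""

    def __init__(self, name, numberofworks):
        self.name = name
        self.numberofworks = numberofworks
        self.labelvisible = False


def label_top_nodes(nodes, limit=50):
    """Show labels on the `limit` nodes with the most works; return the ranking."""
    # A key function replaces the cmp-style comparator used in the interpreter,
    # since Python 3 no longer has cmp().
    ranked = sorted(nodes, key=lambda n: n.numberofworks, reverse=True)
    for node in ranked[:limit]:
        node.labelvisible = True
    return ranked


# Three made-up authors; label only the two most productive.
nodes = [Node("a", 3), Node("b", 12), Node("c", 7)]
ranked = label_top_nodes(nodes, limit=2)
```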


In the "Bibliographic Coupling Similarity Network," edges with low weights can be eliminated by running 'Preprocessing > Networks > Extract Edges Above or Below Value' with the following parameter values:
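
The effect of such a cutoff can be sketched in a few lines of Python. This is an illustration only, not the Sci2 algorithm: the function name `extract_edges_above`, the edge representation as (source, target, weight) tuples, the sample threshold, and the strict "greater than" comparison are all assumptions.

```python
def extract_edges_above(edges, threshold):
    """Keep only edges whose weight is strictly greater than `threshold`."""
    return [(source, target, weight)
            for (source, target, weight) in edges
            if weight > threshold]


# Made-up bibliographic coupling weights between three papers.
edges = [("p1", "p2", 1), ("p1", "p3", 5), ("p2", "p3", 12)]
strong_edges = extract_edges_above(edges, threshold=4)
```

Nodes left without any edge after pruning become isolates and may need to be removed in a separate step.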


Text normalization utilizes the Standard Analyzer provided by Lucene. It separates text into word tokens, normalizes word tokens to lower case, removes "'s" from the end of words, removes dots from acronyms, deletes stop words, and applies the English Snowball stemmer, which is a version of the Porter2 stemmer designed for the English language.
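
The order of these steps can be illustrated with a small Python sketch that uses only the standard library. It is a rough approximation and not Lucene's analyzer: the regular expression, the tiny `STOP_WORDS` set, and the function name `normalize` are assumptions, the end-of-word removal is read here as stripping a possessive 's, and the stemming step is left out because it would require a third-party Snowball implementation.

```python
import re

# Illustrative subset only; Lucene ships its own stop word list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}


def normalize(text):
    """Tokenize, lower-case, strip possessives and dots, and drop stop words."""
    tokens = []
    # A token starts with a letter or digit and may contain dots and apostrophes.
    for raw in re.findall(r"[A-Za-z0-9][A-Za-z0-9.']*", text):
        token = raw.lower()
        if token.endswith("'s"):
            token = token[:-2]                       # possessive ending
        token = token.replace(".", "").strip("'")    # dots in acronyms, stray quotes
        if token and token not in STOP_WORDS:
            tokens.append(token)
    return tokens


terms = normalize("The U.S.A.'s networks and the author's papers.")
```

A stemmer applied afterwards would further reduce plural forms such as "networks" and "papers" to their stems.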