...

The horizontal bar graph visualizations in Figures 5.5, 5.6, and 5.7 make it easy to see the timespan of different researchers, as well as the types and volume of grants they generally receive (e.g., many small grants or a handful of large ones). From here, it may be useful to compare their Co-PI networks and look more closely at award totals. Select each dataset in the Data Manager window and run 'Data Preparation > Text Files > Extract Co-Occurrence Network' using these parameters (note that the Aggregation Function File is located in the 'yoursci2directory/sampledata/scientometrics/properties/' folder):

...

Run 'Visualization > Networks > GUESS' on each generated network to visualize the resulting Co-PI relationships. Select 'GEM' from the layout menu to organize the nodes and edges.

...

To produce a co-authorship network in the Sci2 Tool, select the table of all 361 unique ISI records from the 'FourNetSciResearchers' dataset in the Data Manager window. Run 'Data Preparation > Extract Co-Author Network' using the parameter:

The result is two derived files in the Data Manager window: the "Extracted Co-Authorship Network" and an "Author information" table (also known as a "merge table"), which lists unique authors. To manually examine and edit the list of unique authors, open the merge table in your default spreadsheet program. In the spreadsheet, select all records, including "label," "timesCited," "numberOfWorks," "uniqueIndex," and "combineValues," and sort by "label." Identify names that refer to the same person. To merge two names, first delete the asterisk ('*') in the "combineValues" column of the duplicate node's row. Then, copy the "uniqueIndex" of the name that should be kept and paste it into the "uniqueIndex" cell of the name that should be deleted. Resave the revised table as a .csv file and reload it. Select both the merge table and the network and run 'Data Preparation > Update Network by Merging Nodes.' Table 5.2 shows the result of merging "Albet, R" and "Albert, R": "Albet, R" will be deleted and all of its node linkages and citation counts will be added to "Albert, R".
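
The manual spreadsheet edit described above can also be scripted. The following is only a minimal sketch, assuming the merge table has been exported as a CSV file with the column names listed above; the function name `merge_author` and the sample rows are hypothetical, and the layout of the table the Sci2 Tool actually writes may differ.

```python
import csv
import io

def merge_author(rows, duplicate_label, keep_label):
    """Mark duplicate_label as a duplicate of keep_label in a merge table.

    Mirrors the manual steps: clear the asterisk in "combineValues" for the
    duplicate row, then copy the kept row's "uniqueIndex" into that row.
    """
    keep = next(r for r in rows if r["label"] == keep_label)
    for row in rows:
        if row["label"] == duplicate_label:
            row["combineValues"] = ""
            row["uniqueIndex"] = keep["uniqueIndex"]
    return rows

# Hypothetical sample rows; real values come from the "Author information" table.
SAMPLE = """label,timesCited,numberOfWorks,uniqueIndex,combineValues
"Albert, R",100,10,1,*
"Albet, R",5,1,2,*
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
rows = merge_author(rows, "Albet, R", "Albert, R")
```

After this step the revised rows would still need to be saved as a .csv file and reloaded into the tool, as in the manual workflow.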

...

In sum, unification of author names can be done manually or automatically, independently or in conjunction with other data manipulation. It is recommended that users create the initial merge table automatically and fine-tune it as needed. Note that the same procedure can be used to identify duplicate references – simply select a paper-citation network and run 'Data Preparation > Text Files > Detect Duplicate Nodes' using the same parameters as above and a merge table for references will be created.

...

Load the file using 'File > Load' and following this path: 'yoursci2directory/sampledata/scientometrics/isi/FourNetSciResearchers.isi'. A table of all records and a table of 361 records with unique ISI ids will appear in the Data Manager.

Select the "361 Unique ISI Records" in the Data Manager and run 'Data Preparation > Text Files > Extract Paper Citation Network.' Select "Extracted Paper Citation Network" and run 'Data Preparation > Text Files > Extract Reference Co-Occurrence (Bibliographic Coupling) Network.'

Running 'Analysis > Networks > Network Analysis Toolkit (NAT)' reveals that the network has 5,342 nodes (5,013 of which are isolate nodes) and 6,277 edges.
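
Conceptually, bibliographic coupling links two papers by the references they share. The sketch below illustrates that idea only; it is not the Sci2 Tool's implementation, and the function name and the sample papers are hypothetical.

```python
from itertools import combinations

def bibliographic_coupling(references):
    """Return {(paper_a, paper_b): weight}, where weight is the number of
    references the two papers have in common. Pairs with no shared
    reference get no edge."""
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared > 0:
            edges[(a, b)] = shared
    return edges

# Hypothetical papers and their reference lists.
papers = {
    "P1": ["R1", "R2", "R3"],
    "P2": ["R2", "R3"],
    "P3": ["R4"],
}
coupling = bibliographic_coupling(papers)
```

In this toy input, "P3" shares no reference with any other paper, so it has no edges and would be reported as an isolate.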

In the "Bibliographic Coupling Similarity Network," edges with low weights can be eliminated by running 'Preprocessing > Networks > Extract Edges Above or Below Value' with the following parameter values:

...

5.1.4.4 Document Co-Citation Network (DCA)

Load the file using 'File > Load' and following this path: 'yoursci2directory/sampledata/scientometrics/isi/FourNetSciResearchers.isi'. Choose "ISI scholarly format" in the pop-up 'Load' window. A table of all records and a table of 361 records with unique ISI ids will appear in the Data Manager.

Select the "361 Unique ISI Records" and run 'Data Preparation > Text Files > Extract Document Co-Citation Network.' The co-citation network will have 5,335 nodes (213 of which are isolates) and 193,039 edges. Isolates can be removed by running 'Preprocessing > Networks > Delete Isolates.' The resulting network has 5,122 nodes and 193,039 edges, which is too dense for display in GUESS. Edges with low weights can be eliminated by running 'Preprocessing > Networks > Extract Edges Above or Below Value' with parameter values:
     Extract from this number: 4
     Below?: # leave unchecked
     Numeric Attribute: weight
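
To make the two steps concrete, here is a rough sketch of co-citation counting followed by a weight cutoff. It is an illustration, not the tool's algorithm: the function names and sample data are hypothetical, and treating the cutoff as inclusive (weights of 4 and above are kept) is an assumption about how 'Extract Edges Above or Below Value' handles the boundary.

```python
from collections import Counter
from itertools import combinations

def co_citation(references):
    """Count, for every pair of cited documents, how many papers cite both."""
    weights = Counter()
    for cited in references.values():
        for pair in combinations(sorted(set(cited)), 2):
            weights[pair] += 1
    return weights

def extract_edges_above(edges, cutoff):
    """Keep edges whose weight is at or above cutoff (inclusive: an assumption)."""
    return {pair: w for pair, w in edges.items() if w >= cutoff}

# Hypothetical citing papers and the documents they cite.
citing = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["A", "B", "D"],
    "P4": ["A", "B"],
}
weights = co_citation(citing)
strong = extract_edges_above(weights, 4)
```

Only the pair cited together by all four papers survives the cutoff, which is how thresholding thins a dense network before visualization.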

...

In the Sci2 Tool, select "361 unique ISI Records" from the 'FourNetSciResearchers' dataset in the Data Manager. Run 'Preprocessing > Topical > Normalize Lowercase, Tokenize, Stem, and Stopword Text' using the following parameters:

...

The result is a derived table – "with normalized Abstract" – in which the text in the abstract column is normalized. Select this table and run 'Data Preparation > Text Files > Extract Word Co-Occurrence Network' using parameters:

The outcome is a network in which nodes represent words and edges denote their joint appearance in a paper. Word co-occurrence networks are rather large and dense. Running 'Analysis > Networks > Network Analysis Toolkit (NAT)' reveals that the network has 2,821 word nodes and 242,385 co-occurrence edges.

There are 354 isolated nodes that can be removed by running 'Preprocessing > Networks > Delete Isolates' on the Co-Word Occurrence network. Note that when isolates are removed, papers without abstracts are removed along with the keywords.
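
As a rough illustration of how word nodes, co-occurrence edges, and isolates arise, consider the sketch below. It assumes each abstract has already been normalized into a list of tokens; the function name and the sample token lists are hypothetical, and the tool's own extraction may differ in detail.

```python
from collections import Counter
from itertools import combinations

def word_cooccurrence(token_lists):
    """Return (nodes, edges): every distinct word is a node, and each edge
    weight counts the papers in which both words appear."""
    nodes = set()
    edges = Counter()
    for tokens in token_lists:
        words = sorted(set(tokens))
        nodes.update(words)
        for pair in combinations(words, 2):
            edges[pair] += 1
    return nodes, edges

# Hypothetical normalized abstracts, one token list per paper.
docs = [["network", "scale"], ["network", "growth"], ["percol"]]
nodes, edges = word_cooccurrence(docs)

# Words that never share a paper with another word have no edges.
connected = {word for pair in edges for word in pair}
isolates = sorted(nodes - connected)
```

Here the single-word abstract contributes a node without any edge, so that word is an isolate.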

...

Once edges have been removed, the network "top 1000 edges by weight" can be visualized by running 'Visualization > Networks > GUESS'. In GUESS, run the following commands in the Interpreter:
     > for node in g.nodes:
     [tab]  node.x = node.xpos * 40
     [tab]  node.y = node.ypos * 40
     [tab]
     > resizeLinear(references, 2, 40)
     > colorize(references,[200,200,200],[0,0,0])
     > resizeLinear(weight, .1, 2)
     > g.edges.color = "127,193,65,255"

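The resizing commands map an attribute onto a size range. The sketch below shows one plausible reading of that mapping as a plain linear interpolation between the smallest and largest attribute values; it is an assumption about the behavior, not GUESS source code, and `resize_linear` and the sample counts are hypothetical.

```python
def resize_linear(values, min_size, max_size):
    """Map each attribute value linearly onto [min_size, max_size].

    The smallest value gets min_size and the largest gets max_size.
    If all values are equal, every node gets min_size.
    """
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {node: float(min_size) for node in values}
    scale = (max_size - min_size) / (high - low)
    return {node: min_size + (v - low) * scale for node, v in values.items()}

# Hypothetical "references" attribute values for three word nodes.
reference_counts = {"word_a": 0, "word_b": 5, "word_c": 10}
sizes = resize_linear(reference_counts, 2, 40)
```

With the range 2 to 40 used in the commands above, a node halfway between the minimum and maximum attribute value ends up halfway between the two sizes.
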
...

Note that only the top 1000 edges (by weight) in this large network appear in the above visualization, creating the impression of isolate nodes. To remove nodes that are not connected by the top 1000 edges (by weight), run 'Preprocessing > Networks > Delete Isolates' on the "top 1000 edges by weight" network and visualize the result using the workflow described above.
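
The effect of keeping only the heaviest edges and then deleting isolates can be sketched as follows. This is an illustration under stated assumptions: the function names and the toy network are hypothetical, and how the tool breaks ties between edges of equal weight is not specified here.

```python
def top_edges_by_weight(edges, n):
    """Keep the n edges with the largest weights."""
    ranked = sorted(edges.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])

def delete_isolates(nodes, edges):
    """Drop nodes that are not an endpoint of any remaining edge."""
    connected = {node for pair in edges for node in pair}
    return [node for node in nodes if node in connected]

# Hypothetical toy network.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 5, ("b", "c"): 3, ("c", "d"): 1}

top = top_edges_by_weight(edges, 2)
remaining = delete_isolates(nodes, top)
```

Node "d" was connected in the full network but loses its only edge to the cutoff, so it only appears to be an isolate until it is removed.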

...