Network Extraction

Q: How are networks extracted from my data?

A: First, a data set is loaded into Sci2. The tool can handle a variety of data formats; to see all the formats supported by Sci2, check the data formats section of our documentation. Once a data set has been loaded, it can be viewed as a comma-separated value (CSV) file. Simply right-click on the table icon in the data manager of Sci2 and view the table in the spreadsheet program of your choice.

Once loaded into Sci2, all data assume the table format. In the CSV file pictured above, each column head represents a different attribute of the network and each row represents a node in the network. For example, if a directed network is extracted from the table above with NSF Organization as the source column and Award Number as the target column, the resulting network visualization will look like this:

Note that the above network was generated in Sci2 and then opened in Gephi, where the color and labels were added. Sci2 v1.0 alpha supports a bridge to Gephi.


In the network above, the green nodes represent the NSF organizations in this data set and the blue nodes represent the awards (labeled by award number) associated with those NSF organizations. Because the NSF organization was selected as the source node, each organization is represented in the network only once, even though it may appear multiple times in the original data set. This is because each row in the data set is associated with a specific award, the target in this extraction. In some cases this is a one-to-many relationship and in others a one-to-one relationship, depending on how many awards are associated with a particular organization. This is an example of just one type of network that can be extracted from data loaded into Sci2. You can create your own data set in Excel or any other spreadsheet program, save the data as a CSV (*.csv) file, load it into Sci2, and perform a variety of analyses and visualizations.
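The idea behind a source/target extraction can be illustrated outside of Sci2 with a minimal Python sketch. This is not how Sci2 implements the algorithm; the function name `extract_directed_network` and the sample rows are hypothetical and only mirror the column names used in the example above.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the example above.
SAMPLE_CSV = """Award Number,NSF Organization
0001,IIS
0002,IIS
0003,DUE
"""

def extract_directed_network(csv_text, source_column, target_column):
    """Return (nodes, edges); each edge points from a source value to a target value."""
    nodes = set()
    edges = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row[source_column].strip()
        target = row[target_column].strip()
        if not source or not target:
            continue  # skip rows with a missing endpoint
        # Sets deduplicate, so a repeated organization yields a single node.
        nodes.add(("source", source))
        nodes.add(("target", target))
        edges.add((source, target))
    return nodes, edges

nodes, edges = extract_directed_network(SAMPLE_CSV, "NSF Organization", "Award Number")
print(len(nodes), "nodes,", len(edges), "edges")
```

In this sketch the organization IIS appears in two rows but becomes one source node with two outgoing edges, which is the one-to-many case described above.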

You can also practice with the variety of sample data sets located here.  It is highly recommended that you follow the sample workflows provided on this documentation wiki.


Q: I am trying to extract a network by using more than two attributes of the data set as input parameters in a network extraction algorithm. Is this possible?

A: Yes, you will often want to have multiple attributes available in your network extraction. For example, you may want to extract a directed network from NSF principal investigators (PIs) to their awards and size the award nodes by amount awarded. However, you will notice that the Extract Directed Network algorithm only allows for two input parameters: source column and target column. More attributes can be added to the network by using property files (aggregate function files). These are simple text files that can be added to many network extractions to aggregate data, allowing you to have more than two attributes in a network with one extraction. More information, including how to create your own property files, can be found here.
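To make the notion of aggregation concrete, here is a small Python sketch of what an aggregate function does conceptually: it collapses a third column onto the nodes created by the extraction. It does not show the Sci2 property file syntax, and the column names "Principal Investigator" and "Awarded Amount", the sample rows, and the function name `extract_with_sum` are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows; the column names and values are assumed for illustration.
SAMPLE_CSV = """Principal Investigator,Award Number,Awarded Amount
A. Smith,0001,100000
A. Smith,0001,50000
A. Smith,0002,25000
"""

def extract_with_sum(csv_text, source_column, target_column, amount_column):
    """Return directed edges plus a summed amount for every target node."""
    edges = set()
    target_totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = row[target_column]
        edges.add((row[source_column], target))
        # Aggregate the extra attribute onto the target node (a "sum" function).
        target_totals[target] += float(row[amount_column])
    return edges, dict(target_totals)

edges, totals = extract_with_sum(
    SAMPLE_CSV, "Principal Investigator", "Award Number", "Awarded Amount"
)
print(totals)
```

The summed value per award node could then drive node size in a visualization, which is the kind of extra attribute the two-parameter extraction cannot carry on its own.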


Q: While trying to extract a network from a CSV file and using a comma “,” in the text delimiter field I got an error.

A: The parameter field will not accept a comma. This is due, in part, to the Prefuse library, which was created back in 2006; its CSV parser is still unable to handle a comma used as a delimiter inside a column. You can simply use a “;” or a “|” instead. If your data set contains many commas, consider opening the file in Excel and using a macro to replace the commas with another acceptable symbol.
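As an alternative to an Excel macro, the replacement can be scripted. The following Python sketch rewrites the commas inside one column to “|” while leaving the commas that separate columns untouched. The function name `replace_delimiter_in_column`, the column name "Authors", and the sample data are hypothetical, and the sketch assumes commas in that column only ever separate values (it would wrongly split a value such as "Smith, J.").

```python
import csv
import io

def replace_delimiter_in_column(csv_text, column, old=",", new="|"):
    """Return CSV text in which `old` inside `column` is replaced by `new`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Split on the old delimiter, trim whitespace, and rejoin with the new one.
        parts = [part.strip() for part in row[column].split(old)]
        row[column] = new.join(part for part in parts if part)
        writer.writerow(row)
    return out.getvalue()

# Hypothetical input: the Authors cell holds two comma-separated names.
SAMPLE_CSV = 'Title,Authors\nPaper A,"Smith, Jones"\nPaper B,Lee\n'
cleaned = replace_delimiter_in_column(SAMPLE_CSV, "Authors")
print(cleaned)
```

The cleaned file can then be loaded into Sci2 with “|” entered in the text delimiter field.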

Database Functionality

Q: Is there database functionality for Sci2 v1.0 alpha?

A: Currently, there are ISI, Medline, and Scopus databases for Sci2 v1.0. However, these databases are not ready for release yet. We need another sprint to get them ready. Be sure to check the Sci2 news page for updated information on current and upcoming releases of the tool.

Global Level Studies - Macro

 Q: I am interested in visualizing some of my geospatial data, but I only see a world map and a U.S. map. Are there country-specific maps on which I can overlay a country-specific geospatial network? 

A: Unfortunately, the U.S. map and the world map are currently the only maps we have available. However, if you would like to submit a feature request, you can do so by visiting the Sci2 development page.


Q: GUESS is taking a very long time to visualize my network. Is there any way to speed up this process? 

A: Unfortunately, GUESS uses only one CPU core instead of multiple cores, which means visualizing large networks can take some time. Since GUESS is a third-party visualization tool supported by Sci2, we don’t have control over how many cores are used in processing. However, the latest release, Sci2 v1.0 alpha, supports Gephi, a visualization tool that handles large networks well. For more information, see our documentation on Gephi.

Don't see what you are looking for in Sci2?

You can submit a feature request by visiting the Sci2 web site. Development services can also be performed by our team for a fee; prices vary according to the scope of work. Previous paid development and services performed by our center have included evaluation and monitoring, data compilation, visual interfaces for community data, VIVO researcher networking, and consulting/training services. To get an estimate for your project, contact our center:
