Q: I am having trouble installing and running Sci2. What should I do?
A: One common problem when initially installing Sci2 is not having a current version of Java. Sci2 runs on Java 1.6 (32 bit) or newer. If you are having trouble installing Sci2, you may want to install the latest 32 bit version of Java. Make sure you install the 32 bit version, as Sci2 will only run with a 32 bit version. You can have both 32 bit and 64 bit versions of Java installed at the same time. If you have the latest version of Java installed but it is not being detected by Sci2, you must specify the path the tool should use to locate the correct version of Java. You will want to target the javaw.exe file, which is likely located in the C:\Program Files (x86)\Java\jre7\bin directory, assuming you have installed Java in the default place.
Now, you will have to tell Sci2 to look in this directory. In your Sci2 installation, open sci2.ini (with Notepad or any other plain text editor), delete the contents, and replace them with this:
If your installation of Java is located in another directory on your system, you'll need to replace the path to javaw.exe with the one you find on your system. Save the file in the Sci2 installation and you should be able to start the tool.
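If you are unsure where javaw.exe lives on your machine, a small script can search for it. The following is a minimal sketch in Python, not part of Sci2; the function name find_javaw is hypothetical, and the only path it relies on is the default install directory mentioned above.

```python
from pathlib import Path


def find_javaw(search_roots):
    """Return every javaw.exe found beneath the given root directories."""
    matches = []
    for root in search_roots:
        root = Path(root)
        if not root.is_dir():
            continue  # skip roots that do not exist on this machine
        # rglob walks the whole directory tree below the root
        matches.extend(sorted(root.rglob("javaw.exe")))
    return matches
```

For example, find_javaw([r"C:\Program Files (x86)\Java"]) lists the candidates under the default 32 bit location; any path it returns is one you could point Sci2 at.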
Sci2 Locale Issues
Q: I am having trouble loading data into Sci2 and having trouble converting PostScript files to PDF files.
A: If you are outside of the United States, you may experience problems loading data, saving files from the data manager, converting file types, and running the Temporal Bar Graph algorithm, due to locale issues. If you are working on a Windows machine (for Mac and Linux, refer to the Changing System Locale guide) and you want to change the locale, go to the Start Menu, select Control Panel, then select Region and Language, and change the format to English (United States).
If you are working on a Mac, open System Preferences and choose Language and Text. Then go to the Formats tab and set the Region to the United States.
General Data Loading
Q: Does it matter that my data file has the same file format extension used in Sci2 documentation or may I just use the .txt and .csv extensions?
A: Sci2 documentation often references data formats through the use of file extensions (.isi, .nwb, .mat, .enw), which indicate the type of data being loaded into Sci2 for parsing. Yet data are often exported by other tools and databases as files with the .txt extension, which can indicate one of two possibilities: that the data are formatted as tab-separated columns, or that the data are written in a machine-readable format. For data saved in a tab-separated format, you will need to convert these tables into CSV format before you load the files into Sci2.
However, if the data are written in a machine-readable format, such as Web of Science (.isi) files and EndNote Export format (.enw) files, Sci2 will parse data loaded with the .txt extension, provided that you know the correct data format used in the file. For example, when you export data from EndNote Web or EndNote desktop in the EndNote Export format, files are given the .txt extension. You may keep this extension or change it to .enw. Changing the extension allows Sci2 to read in the data automatically without you selecting the data format parameter. If you keep the .txt extension, you will have to select the parameter “Refer Scholarly format” when loading the data into Sci2.
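For the tab-separated case described above, the conversion to CSV can also be scripted instead of done by hand in a spreadsheet. This is a minimal sketch using Python's standard csv module; the function name tsv_to_csv and the UTF-8 default encoding are assumptions, so adjust them to match your export.

```python
import csv


def tsv_to_csv(tsv_path, csv_path, encoding="utf-8"):
    """Rewrite a tab-separated text file as a comma-separated file."""
    rows = 0
    with open(tsv_path, newline="", encoding=encoding) as src, \
            open(csv_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst)  # quotes any cell that itself contains a comma
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows  # number of rows written, header included
```

The resulting .csv file can then be loaded into Sci2 in the usual way.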
ISI Format Loading questions
Q: I am having trouble loading an ISI file that I downloaded from Web of Science. What should I do?
A: Web of Science recently changed the format for their data. Try updating your version of Sci2 to the newest, v1.0 alpha. Also, see the page about the ISI format, which includes a work-around for loading ISI data in older versions of the tool.
Q: I want to import some ISI data into Sci2 but I can only download files in increments of 500 records from the Web of Science. How can I merge and load an ISI dataset larger than 500 records into Sci2?
A: It is typically best to perform this type of merge before you load the data into Sci2. Download the data from the Web of Science as plain text files and then open them in the text editor of your choice. It is usually best to open the files in a more advanced text editor, such as WordPad or Notepad++, because they will be formatted in a way that is easier to read. However, do not open the files in Word. You will want to copy and paste the records into one file. Make sure to paste the contents of the files after the ER (End Record) tag of the last record and before the EF (End File) tag that closes the ISI file. Once you have merged the files, you will want to save the result as a text (.txt) file. Then you should change the extension from .txt to .isi. If you are having trouble seeing file extensions on Windows, right-click on the file and select "Properties"; you should then be able to see the file name and extension. Simply delete the old extension and replace it with the desired one. On a Mac, you will need to right-click and select "Get Info" and then choose the "Name & Extension" tab to change the file extension.
Now you are ready to load the file into Sci2. Make sure to load the file as an "ISI flat format" file. If you are having trouble loading the ISI file, see the previous question in this section. For ISI data, Sci2 should automatically detect and remove duplicate records. However, you still may need to check over the data, as Web of Science data can sometimes be inconsistent.
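The manual copy-and-paste merge described above can also be sketched as a short script. This is an illustration only, not a Sci2 feature: the function name merge_isi_files is hypothetical, and it assumes each export starts with header lines beginning with "FN " and "VR " and ends with a line containing only EF. It keeps the first file's header, drops every EF tag, and writes a single EF at the end.

```python
def merge_isi_files(input_paths, output_path):
    """Concatenate ISI exports so the result has one header and one EF tag."""
    merged = []
    for index, path in enumerate(input_paths):
        # utf-8-sig strips a leading byte-order mark if the export has one
        with open(path, encoding="utf-8-sig") as handle:
            for line in handle:
                line = line.rstrip("\r\n")
                if line.strip() == "EF":
                    continue  # drop each End File tag; one is re-added below
                if index > 0 and line.startswith(("FN ", "VR ")):
                    continue  # assumed header lines; keep the first file's copy
                merged.append(line)
    merged.append("EF")
    with open(output_path, "w", encoding="utf-8", newline="\n") as out:
        out.write("\n".join(merged) + "\n")
    # number of records merged, counted by their ER (End Record) tags
    return sum(1 for line in merged if line.strip() == "ER")
```

Writing the output with the .isi extension directly, for example merge_isi_files(["part1.txt", "part2.txt"], "merged.isi"), avoids the separate renaming step.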
Q: The data I am working with is really messy. How much of the data cleaning can I do in Sci2?
A: It depends on the data format, but there are some options for tidying up your data under the Preprocessing menu in Sci2. If you are working with a large dataset in CSV format, you may need to do a lot of the data cleaning prior to loading the file into Sci2. You can do a lot of this data cleaning in Excel or with Google Refine, an extremely useful program for cleaning data.
Q: The ISI data I downloaded are not formatted in the same way as the data used in your workflows. Why?
A: Web of Science provides researchers with two ways to gather citation data. Citation Lists are detailed article records that may be generated through the search interface or the marked list tool. Citation Reports are simplified citation information and citation statistics for the articles selected by a user, which can be exported into tabular formats. Citation List text files are the format used in our network analysis workflows; in particular, this format can be acquired with cited reference lists, which are important for creating citation networks and document co-occurrence networks. These types of records can be loaded into Sci2 directly, and they are reformatted by the Sci2 program into CSV files for further analysis. Instructions on how to download Web of Science Citation List records in this format can be found in the Data Acquisition portion of the manual.
Q: How are networks extracted from my data?
A: First a data set is loaded into Sci2. The tool can handle a variety of different data formats; to see all the data formats supported by Sci2, check the data formats section of our documentation. Once a data set has been loaded, it can be viewed as a comma separated value (CSV) file. Simply right-click on the table icon in the data manager of Sci2 and view the table in the spreadsheet program of your choice.
Once loaded into Sci2, all data assume the table format. In the CSV file pictured above, each column head represents a different attribute of the network and each row represents a node in the network. For example, if a directed network were extracted from the table above with a source column of NSF Organization and a target column of Award Number, the resulting network visualization would look like this:
In the network above, the green nodes represent the NSF organizations in this data set and the blue nodes represent the awards (labeled by award number) associated with those NSF organizations. You will notice that since the NSF organization was selected as the source node, each organization is represented in the network only once, while it may appear multiple times in the original data set. This is because each row in the data set is associated with a specific award, the target in this extraction. In some cases this is a one-to-many relationship and in others a one-to-one relationship, depending on how many awards are associated with a particular organization. This is an example of just one type of network that can be extracted from data loaded into Sci2. You can create your own data set in Excel or any other spreadsheet program, save the data as a CSV (*.csv) file, load it into Sci2, and perform a variety of analyses and visualizations.
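To make the source/target idea concrete, here is a minimal sketch of how a directed edge list can be derived from such a table. It is not Sci2's implementation; the function name and the sample rows are hypothetical, and only the column names NSF Organization and Award Number come from the example above.

```python
import csv
import io


def extract_directed_network(csv_text, source_column, target_column):
    """Return (nodes, edges); each edge points from source to target."""
    nodes = []      # unique node labels, in first-seen order
    seen = set()
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        source, target = row[source_column], row[target_column]
        for label in (source, target):
            if label not in seen:
                seen.add(label)
                nodes.append(label)
        if (source, target) not in edges:
            edges.append((source, target))
    return nodes, edges


# Hypothetical sample rows: one award per row, organizations may repeat.
sample = """Award Number,NSF Organization
0000001,IIS
0000002,IIS
0000003,SBE
"""
nodes, edges = extract_directed_network(sample, "NSF Organization", "Award Number")
```

Each organization ends up as a single node even though it appears in several rows, while every award contributes its own node and one incoming edge.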
Q: I am trying to extract a network by using more than two attributes of the data set as input parameters in a network extraction algorithm. Is this possible?
A: Yes. Often you will want to have multiple attributes available in your network extraction. For example, you may want to extract a directed network from NSF principal investigators (PIs) to their awards and size the award nodes by amount awarded. However, you will notice that the Extract Directed Network algorithm only allows for two input parameters: source column and target column. More attributes can be added to the network by using property files (aggregate function files). These are simple text files that can be added to many network extractions to aggregate data, allowing you to have more than two attributes in a network with one extraction. More information, including how to create your own property files, can be found in the Sci2 documentation.
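The kind of aggregation a property file requests can be illustrated outside Sci2. The sketch below is not the property-file syntax and not Sci2 code; it only shows what summing a numeric column per node means. The function name, the column names, and the sample values are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def sum_attribute_by_node(csv_text, node_column, value_column):
    """Total a numeric column for each distinct value of node_column."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[node_column]] += float(row[value_column])
    return dict(totals)


# Hypothetical sample: two awards for one PI, one award for another.
awards = """Principal Investigator,Award Number,Awarded Amount
A. Rivera,0000001,100
A. Rivera,0000002,50
B. Chen,0000003,75
"""
totals = sum_attribute_by_node(awards, "Principal Investigator", "Awarded Amount")
```

The resulting per-node totals are the sort of extra attribute that could then be used to size nodes in a visualization.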
Q: While trying to extract a network from a CSV file and using a comma “,” in the text delimiter field I got an error.
A: The parameter field will not accept a comma. This is due, in part, to the Prefuse library, which was created back in 2006 and whose CSV parser still cannot handle a comma inside a column. You can simply use a “;” or a “|” instead. If your data set contains many commas, consider opening the file in Excel and using a macro to replace the commas with another acceptable symbol.
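As an alternative to an Excel macro, the commas inside cell values can be replaced with a short script. This is a minimal sketch using Python's csv module; the function name and the default replacement symbol "|" are assumptions. Note that it replaces every comma inside a cell, including commas that are part of a value such as "Smith, J", so check the output before loading it.

```python
import csv


def replace_commas_in_cells(in_path, out_path, replacement="|", encoding="utf-8"):
    """Copy a CSV file, swapping commas inside cell values for another symbol."""
    with open(in_path, newline="", encoding=encoding) as src, \
            open(out_path, "w", newline="", encoding=encoding) as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # the column separators are untouched; only cell contents change
            writer.writerow([cell.replace(",", replacement) for cell in row])
```

After the rewrite, the same symbol can be entered in the text delimiter field.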
Q: Is there database functionality for Sci2 v1.0 alpha?
A: Currently, there are ISI, Medline, and Scopus databases for Sci2 v1.0. However, the database plugins are not ready for release yet; we need another sprint to get them ready. The plugin that allows files to be loaded as databases is available for Sci2 v0.5.2 alpha or older. Be sure to check the Sci2 news page for updated information on current and upcoming releases of the tool: https://sci2.cns.iu.edu/user/news.php
Global Level Studies - Macro
Q: I am interested in visualizing some of my geospatial data, but I only see a world map and a U.S. map. Are there country-specific maps on which I can overlay a country-specific geospatial network?
A: Unfortunately the U.S. map and the world map are currently the only maps we have available. However, if you would like to submit a feature request it can be done by visiting the Sci2 development page. https://sci2.cns.iu.edu/user/developer.php
Saving and Visualizing Networks
Q: GUESS is taking a very long time to visualize my network. Is there any way to speed up this process?
A: Unfortunately, GUESS uses only one CPU core instead of multiple cores, which means visualizing large networks can take some time. Since GUESS is a third-party visualization tool supported by Sci2, we don’t have control over how many cores are used in processing. However, the latest release, Sci2 v1.0 alpha, supports Gephi (http://gephi.org/), a visualization tool that handles large networks well. For more information see our documentation on Gephi.
Q: I saved my visualization as a PostScript file, but I am having trouble converting the PostScript file to a PDF so it can be viewed.
A: Converting PostScript files can be tricky. If you have a version of the Adobe suite (or the Pro version of Adobe Acrobat) installed on your machine, the converter will come pre-installed. However, there are also some free options available for PostScript conversion: Ghostscript and GSview. Ghostscript is a PostScript interpreter, and GSview is a graphical interface for Ghostscript; you can download both from the Ghostscript, Ghostview and GSview page. Both are freeware. The GSview/Ghostscript combination is capable of opening PostScript level two files, which are the type generated by Sci2. See the following web page for a useful tutorial on how to install Ghostscript: http://pages.cs.wisc.edu/~ghost/doc/gnu/7.05/Install.htm
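Once Ghostscript is installed, the conversion can also be run from a script. The sketch below assumes the Ghostscript executable is on your PATH under the name gs (on Windows the console executable is typically gswin32c or gswin64c); the two function names are hypothetical. It uses Ghostscript's pdfwrite output device.

```python
import subprocess


def ghostscript_pdf_command(ps_path, pdf_path, executable="gs"):
    """Build the Ghostscript command line that writes ps_path out as a PDF."""
    return [
        executable,
        "-dBATCH",            # exit when the input has been processed
        "-dNOPAUSE",          # do not wait for a key press between pages
        "-sDEVICE=pdfwrite",  # select the PDF output device
        f"-sOutputFile={pdf_path}",
        ps_path,
    ]


def convert_ps_to_pdf(ps_path, pdf_path, executable="gs"):
    """Run Ghostscript; raises CalledProcessError if the conversion fails."""
    subprocess.run(ghostscript_pdf_command(ps_path, pdf_path, executable), check=True)
```

For example, convert_ps_to_pdf("network.ps", "network.pdf") would convert a visualization saved from Sci2, where both file names are placeholders.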
Q: I am having trouble entering commands in the Interpreter.
A: Typing commands into the Interpreter can be tricky, but once you learn how, it can be an extremely powerful way to manipulate a network visualization. Here are a few common mistakes to avoid:
- After typing a ":" you will need to hit "enter" and "tab" to start the new line, followed by another "enter" to execute the command. This is most common in the following scenario:
- When you type a command in the Interpreter you do not need to type ">" to begin; simply begin with the text of your command. To learn more about how to manipulate network visualizations see our documentation on GUESS.
Q: I loaded the network into GUESS, but no nodes are shown or the network is loaded without the Graph Modifier panel.
A: Some attribute names are reserved in GUESS, and the same attribute name in your data can cause a conflict. Rename the following attribute names in your data if they exist, then save the data and reload it into Sci2.
The following shows sample data issues in the NWB file format. Simply renaming the attributes to size_d, type_d, and count_d would solve the problem. See our documentation on GUESS to learn more about GUESS.
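The renaming can be done in a text editor, or scripted along the following lines. This is a sketch under assumptions: the reserved names size, type, and count are inferred from the replacement names suggested above rather than taken from a complete list, and the script assumes NWB schema lines consist of space-separated name*type pairs (for example "id*int label*string"). The function name is hypothetical.

```python
import re

# Assumed reserved names, inferred from the suggested replacements above.
RESERVED = ("size", "type", "count")

# A schema line consists only of name*type pairs separated by whitespace.
SCHEMA_LINE = re.compile(r"^\s*\w+\*\w+(\s+\w+\*\w+)*\s*$")


def rename_reserved_attributes(nwb_text, reserved=RESERVED, suffix="_d"):
    """Append a suffix to reserved attribute names in NWB schema lines."""
    names = "|".join(re.escape(name) for name in reserved)
    pattern = re.compile(r"\b(" + names + r")\*(?=\w)")
    out = []
    for line in nwb_text.splitlines():
        if SCHEMA_LINE.match(line):
            line = pattern.sub(lambda m: m.group(1) + suffix + "*", line)
        out.append(line)  # data rows and section headers pass through unchanged
    trailing = "\n" if nwb_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Only the attribute declarations are changed; the node and edge rows keep their values, so the renamed file can be saved and reloaded into Sci2.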
Third-Party Plugins and Licenses
A: Sci2 has been released under the Apache 2.0 license. Licenses for the various components used by Sci2 can be found under the "licenses" folder in your Sci2 installation directory.
The Sci2 v1.0 build released as of September 2013 will include additional third-party plugins:
- Google Scholar Citation Reader
- Bing Geocoder
- Flickr Reader
- Twitter Reader
- Facebook plugin only allows users to download their own data. For more information, see http://www.givememydata.com/.
- Twitter Reader plugin allows users to download the tweets of any Twitter user; however, the user and the application first have to be registered and authenticated by the API before performing any queries. For more information, please see: https://dev.twitter.com/terms/api-terms
- Google Scholar Citation Reader will fetch data indexed by the Google Scholar search engine. For data usage policies please review the Google license.
- Bing Geocoder plugin will require an authentication key (Basic Key or Enterprise Key). The usage policy depends on the type of key. For more information, see http://www.microsoft.com/maps/
- Flickr Reader plugin downloads image URLs through a REST service using an API key, without authentication. For more information, see http://www.flickr.com/services/api/
Please note that, if required, all plugins can be uninstalled separately.
Q: Does Sci2 perform any encryption or utilize any encryption processes?
A: Sci2 by itself does not perform or use encryption; however, plugins such as the Facebook plugin and the Twitter Reader plugin use encryption for user authentication and data downloads.
Q: Does Sci2 have an ECCN?
A: No, Sci2 does not have an ECCN.
Don't see what you are looking for in Sci2?
Development services can be performed by our team. You can submit a feature request by visiting the Sci2 web site: https://sci2.cns.iu.edu/user/developer.php. Development services can also be performed for a fee; prices vary according to the scope of work. Previous paid development and services performed by our center have included evaluation and monitoring, data compilation, visual interfaces for community data, VIVO researcher networking, and consulting/training services. To get an estimate for your project, contact our center: http://cns.slis.indiana.edu/.