NSF format (.nsf)

Basic statistics, evolving co-PI networks, and the geospatial coverage of certain research topics can all be determined from NSF data. To extract a co-authorship (really co-investigatorship) network, choose the "All Investigators" column when using the Extract Co-Occurrence Network from Table algorithm.
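To illustrate what a co-occurrence network over the "All Investigators" column amounts to, here is a minimal Python sketch. It is not the NWB algorithm itself: the function name `co_occurrence_edges`, the sample rows, and the `|` separator inside the column are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def co_occurrence_edges(rows, column="All Investigators", sep="|"):
    """Count how often each pair of investigators appears on the same award.

    `sep` is an assumed separator for names inside the column.
    """
    edges = Counter()
    for row in rows:
        # De-duplicate and sort so each pair is always stored in one order.
        names = sorted({n.strip() for n in row.get(column, "").split(sep) if n.strip()})
        for a, b in combinations(names, 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical sample data: two awards sharing two investigators.
rows = [
    {"All Investigators": "Ann Lee|Bob Ray|Cy Dow"},
    {"All Investigators": "Ann Lee|Bob Ray"},
]
edges = co_occurrence_edges(rows)
```

Each key in `edges` is an undirected edge between two investigators, and the count is the edge weight (the number of awards they share).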

What should I know about how NWB handles this format?

When an NSF csv file is read, the following actions are performed:

Convert NSF file into a table
Mark NSF file as "pre-2014" or "current" depending on format of headers
Handle duplicate columns
Replace '\"' with '""' (more standard method of escaping quotes in csv files)
Normalize CO-PI column (handle excessive spaces in names)
Normalize Primary PI column (make it look like the CO-PI format)
If it is a "current" format NSF:
	Normalize dollar amounts (go from strings to floats)
Else if it is a "pre-2014" format NSF:
	Normalize PI names (put them in "first name first" format with no comma)
Create "All Investigators" Column (merge CO-PI and Primary PI column)
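
Several of the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the NWB implementation: all function names are hypothetical, and the `|` separator used when merging investigators is assumed.

```python
import re

def fix_quotes(raw_text):
    # Replace backslash-escaped quotes (\") with doubled quotes ("").
    return raw_text.replace('\\"', '""')

def collapse_spaces(name):
    # Reduce runs of whitespace to one space and trim the ends.
    return re.sub(r"\s+", " ", name).strip()

def dollars_to_float(amount):
    # Turn a string such as "$1,234.50" into the float 1234.5.
    return float(amount.replace("$", "").replace(",", "").strip())

def first_name_first(name):
    # Turn "Last, First" into "First Last"; leave other names as they are.
    if "," in name:
        last, first = name.split(",", 1)
        return collapse_spaces(f"{first} {last}")
    return collapse_spaces(name)

def all_investigators(primary_pi, co_pis, sep="|"):
    # Merge the Primary PI and CO-PI values into one separated string.
    # `sep` is an assumed separator, not taken from the NWB source.
    names = [primary_pi] + co_pis.split(sep)
    cleaned = [collapse_spaces(n) for n in names if n.strip()]
    return sep.join(cleaned)
```

For example, `all_investigators("Ann Lee", "Bob  Ray|Cy Dow")` yields a single string with the primary PI first and the extra space in "Bob  Ray" removed.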


"NSF" files are really comma-separated value (CSV) files, but because they are horribly formatted, they require special processing by Network Workbench. For Network Workbench to treat them correctly, the file extension must be changed to ".nsf".
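
Changing the extension can be done by hand or with a small script. The sketch below uses Python's standard `pathlib`; the helper names `as_nsf_path` and `rename_to_nsf` and the file name `awards.csv` are hypothetical.

```python
from pathlib import Path

def as_nsf_path(csv_path):
    # Compute the new path with the ".nsf" extension, without touching the file.
    return Path(csv_path).with_suffix(".nsf")

def rename_to_nsf(csv_path):
    # Rename the file on disk so that its extension becomes ".nsf".
    source = Path(csv_path)
    target = as_nsf_path(source)
    source.rename(target)
    return target
```

Calling `rename_to_nsf("awards.csv")` would rename the file to `awards.nsf` in the same directory, ready to be loaded into Network Workbench.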

What is the difference between the "current" and "pre-2014" NSF formats?

A full listing of changes may be found on the JIRA issue page for this update. The most important thing for users to keep in mind is that sets of data are now delimited by commas instead of pipes in the current version. Also, when using aggregate function files, you will want to use specially labelled "pre-2014" properties files when working with those older data sets.
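
The delimiter change matters when splitting a multi-valued field by hand. A minimal sketch, assuming the format label is already known; the `DELIMITERS` mapping, the function name, and the sample values are hypothetical and only mirror the comma-versus-pipe distinction described above.

```python
# Assumed mapping from format label to the delimiter used inside a field.
DELIMITERS = {"current": ",", "pre-2014": "|"}

def split_investigators(cell, nsf_format):
    # Split a multi-valued cell using the delimiter for the given format.
    sep = DELIMITERS[nsf_format]
    return [part.strip() for part in cell.split(sep) if part.strip()]

current_names = split_investigators("Ann Lee, Bob Ray", "current")
older_names = split_investigators("Lee, Ann|Ray, Bob", "pre-2014")
```

In the pipe-delimited case, the comma inside each name is preserved, which is why the older format can hold "last name, first name" values in one field.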