...

After you have refined your search query in the WoS database, add the records you want to your Marked List. To do this, select individual articles using the check boxes next to each publication record and then click the "Add to Marked List" button; see Figure 4.2e.

 

Figure 4.2e: Locating the Marked List button on Web of Science

You may also add a range of publication records from a search result to the Marked List. To do this, click the "Add to Marked List" button on the search results page without checking any individual publication records. A box will pop up that lets you enter the range of publication records you are interested in; then select the "Add" button. See Figure 4.2f for a picture of this screen.

...

The publications you selected on the search page will now have orange checks next to them, and the Marked List tab in the site's navigation bar will be updated to show the total number of records in your list. After you have finished adding publication records to your Marked List, visit the Marked List page: click the "Marked List" link in the navigation bar, which has an orange count box next to it; see Figure 4.2g.


Figure 4.2g: Click the Marked List link in the navigation bar to access the Marked List page on Web of Science

The Marked List page lists the publications you selected and offers various means of exporting publication data, including limiting exports to specific databases that are part of WoS. To export the most data from WoS and to ensure the most replicable results, it is best to choose the Web of Science Core Collection export tool. The export tool outlines the steps you will need to take: selecting the records to export, selecting the content to export, and choosing the format of the exported data. See Figure 4.2h for a view of this export tool.

...

At the export screen, select 'Comma separated file, .csv' (e.g. Excel) and select the types of information that you will need. For our purposes, select 'All available information' from the drop-down menus and choose 'Export'.
Save the file as WattsStrogatz.scopus. The resulting file can be seen in Figure 4.2j.


Figure 4.2j: Saving records in Scopus and viewing WattsStrogatz.scopus
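Before loading an exported file into the Sci2 Tool, it can help to confirm that the export is complete. The following is a minimal Python sketch, not part of the Sci2 Tool, that reports the column names and record count of a comma-separated export. The function name `summarize_export` and the sample column names are hypothetical; the actual columns depend on the information you selected at the export screen.

```python
import csv
import io


def summarize_export(handle):
    """Return (column_names, record_count) for a CSV export read from a text handle."""
    reader = csv.DictReader(handle)
    record_count = sum(1 for _ in reader)  # counts data rows, excluding the header
    return reader.fieldnames or [], record_count


# Illustrative in-memory sample; these column names are assumptions, not the
# real export layout.
sample = io.StringIO("Authors,Title,Year\nA. Author,First paper,1998\nB. Author,Second paper,1999\n")
columns, count = summarize_export(sample)
print(columns, count)

# For a saved export, something like the following should work, assuming the
# file is UTF-8 encoded (possibly with a byte-order mark):
# with open("WattsStrogatz.scopus", newline="", encoding="utf-8-sig") as fh:
#     columns, count = summarize_export(fh)
```

If the record count differs from the number of records you selected for export, the export was probably truncated and should be repeated.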

...

Google Scholar data can be acquired using Publish or Perish (Harzing, 2008), which can be freely downloaded from http://www.harzing.com/pop.htm. A query for papers by Albert-László Barabási run on Sept. 21, 2008 returned 111 papers that have been cited 14,343 times; see Figure 4.2k.


Figure 4.2k: Publish or Perish search results for Albert-László Barabási and viewing barabasiPoP.csv

...

To download your Facebook data, open the Sci2 Tool and select the "Facebook" option in the file menu. In the next submenu, select the "Access Token" option. You may be asked to log into your account; afterwards, you will be redirected to a webpage that provides you with an access token. In your browser, right-click on the grey text, choose "Select All", and then copy this text to your clipboard. Figure 4.2l demonstrates this process.

...

After you have copied your access token, return to the Sci2 Tool and the file menu. In the "Facebook" submenu, select either "Facebook Friends Data" or "Mutual Friends". A window will pop up, as shown in Figure 4.2m. Paste your access token into the text field and click the "OK" button.

...

Funding data provided by the National Institutes of Health (NIH), along with associated publications and patents, can be retrieved via the NIH RePORTER site (http://projectreporter.nih.gov/reporter.cfm). The database draws from eRA, Medline, PubMed Central, NIH Intramural, and iEdison. You can search by location, PI name, category, etc.; see Figure 4.2o.


Figure 4.2o: NIH RePORTER search interface and search results page

...

Medline, U.S. patent, and funding data provided by the National Science Foundation and the National Institutes of Health can be downloaded from the Scholarly Database (SDB) at Indiana University. SDB supports keyword-based cross-search of the different data types, and data can be downloaded in bulk; see Figures 4.2q and 4.2r for interface snapshots.

Register to get a free account or use 'Email: nwb@indiana.edu' and 'Password: nwb' to try out functionality.
Search the four databases separately or in combination for 'Creators' (authors, inventors, investigators) or terms occurring in 'Title,' 'Abstract,' or 'All Text' for all or specific years. If multiple terms are entered in a field, they are automatically combined using the Boolean operator 'OR.' Entering 'breast cancer' will match any record with 'breast' or 'cancer' in that field. Using the Boolean operator AND (for example, 'breast AND cancer') would only match records that contain both terms. Double quotations can be used to match compound terms, e.g., "breast cancer" retrieves records with the phrase "breast cancer," but not records where 'breast' and 'cancer' are present in isolation. The importance of a particular term in a query can be increased by putting a ^ and a number after the term. For instance, 'breast cancer^10' would increase the importance of matching the term 'cancer' by ten compared to matching the term 'breast.'
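The matching rules described above can be illustrated with a small Python sketch. This is a toy model of the described behavior, not SDB's actual implementation: the function names `tokenize` and `matches` are hypothetical, and the sketch assumes that a `^` boost affects only ranking, so it is simply dropped when deciding whether a record matches.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(query, field_text):
    """Toy model: quoted phrase, explicit AND, otherwise implicit OR."""
    words = tokenize(field_text)
    # Assumption: a boost such as ^10 changes ranking only, so drop it here.
    query = re.sub(r"\^\d+", "", query).strip()
    if len(query) >= 2 and query[0] == '"' and query[-1] == '"':
        # Phrase query: the terms must appear consecutively and in order.
        phrase = tokenize(query[1:-1])
        n = len(phrase)
        return n > 0 and any(words[i:i + n] == phrase for i in range(len(words) - n + 1))
    if " AND " in query:
        # Explicit AND: every term must be present somewhere in the field.
        return all(t in words for part in query.split(" AND ") for t in tokenize(part))
    # Default: multiple terms are combined with OR.
    return any(t in words for t in tokenize(query))


print(matches("breast cancer", "Lung cancer trial"))           # OR: one term is enough
print(matches("breast AND cancer", "Lung cancer trial"))       # AND: both terms required
print(matches('"breast cancer"', "Cancer of the breast"))      # phrase: must be adjacent
print(matches('"breast cancer"', "Breast cancer screening"))
```

In this model the first and last calls match while the second and third do not, mirroring the distinction between 'breast cancer', 'breast AND cancer', and "breast cancer" described above.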

...