
AlessandroVespignani.isi

 

Time frame: 1990-2006

Region(s): Indiana University, University of Rome, Yale University, Leiden University, International Center for Theoretical Physics, University of Paris-Sud

Topical Area(s): Informatics, Complex Network Science and System Research, Physics, Statistics, Epidemics

Analysis Type(s): Burst Detection

...

The "Gamma" parameter is the value that state transition costs are proportional to. This parameter controls how easily the automaton can change states. The higher the "Gamma" value, the smaller the list of bursts generated.

...

The "Date Format" specifies how the date column will be interpreted as a date/time. See http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html for details.

The "Text Column" parameter is the name of the column whose values (tokens separated by a delimiter) are analyzed for bursts.

...

And the "End" field indicates when the burst stopped. An empty value in the "End" field indicates that the burst lasted until the last date present in the dataset. Where the "End" field is empty, manually add the last year present in the dataset; in this case, 2006.
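For larger result tables, filling the empty "End" values by hand can be tedious. The following is a minimal sketch of the same step in Python, not part of Sci2 itself. The function name `fill_empty_end`, the sample column names other than "End", and the sample values are all assumptions made for illustration.

```python
import csv
import io


def fill_empty_end(rows, last_year, end_field="End"):
    """Return copies of the rows with blank End values replaced by last_year."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not modified
        if not (row.get(end_field) or "").strip():
            row[end_field] = str(last_year)
        filled.append(row)
    return filled


# Hypothetical burst table: the words, weights, and years are invented for
# illustration and are not taken from the actual Sci2 output.
sample = io.StringIO(
    "Word,Weight,Start,End\n"
    "network,5.2,2001,2004\n"
    "epidem,3.1,2001,\n"
)
rows = fill_empty_end(list(csv.DictReader(sample)), last_year=2006)
```

To apply this to a real file, the rows would be read with `csv.DictReader` from the saved .csv file and written back with `csv.DictWriter` before reloading the file into Sci2.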

After you manually add this information, save this .csv file somewhere on your computer. Reload the .csv file into Sci2 using 'File > Load'. Select 'Standard csv format' in the pop-up window. A new table will appear in the Data Manager. To visualize the table that contains the results of the Burst Detection algorithm, select the table you just loaded in the Data Manager and run 'Visualization > Temporal > Temporal Bar Graph' with the following parameters:

...

Again, where the "End" field is empty, manually add the last year present in the dataset; in this case, 2006.

After you manually add this information, save this .csv file somewhere on your computer. Reload this .csv file into Sci2 using 'File > Load'. Select 'Standard csv format' in the pop-up window. A new table will appear in the Data Manager. To visualize the table that contains these new results for the Burst Detection algorithm, select the table you just loaded in the Data Manager and run 'Visualization > Temporal > Horizontal Bar Graph (not included version)' with the same parameters.

...

As expected, a larger number of bursts appear, and the new bursts have a smaller weight than those depicted in the first graph. These smaller, more numerous bursting terms permit a more detailed view of the dataset and allow the identification of trends. The "protein" burst starting in 2003, for example, indicates the year in which Alessandro Vespignani started to work with "protein-protein interaction networks," while the burst "epidem" - also from 2001 - is related to the application of complex networks to the analysis of epidemic phenomena in biological networks.

5.2.5.2 Updating the Vespignani Dataset

The original dataset for Alessandro Vespignani was created in 2006. If you wish to update the dataset to gain an understanding of how his research has changed and evolved since 2006, you can obtain a new dataset from Web of Science (see 4.2.1.3 ISI Web of Science). However, another way to obtain an individual researcher's publication information is to use their Google Scholar profile, if they have one. One of the biggest benefits of using a Google Scholar profile is that you will get publications not indexed in Web of Science, such as some book chapters. In this example, we will obtain the publication information for Alessandro Vespignani using Google Scholar:

...

If the author or investigator you have searched for has a Google Scholar profile, you will see a link to their profile at the top of the results page:

...

Keep in mind that not every author you search will necessarily have a Google Scholar profile, but for those that do, this is a very useful way to get their publication information. Click on the link to view Alessandro Vespignani's profile, and then select all publications and click the export button at the top of his publication list to export the citation information:

...

After you have specified the export format, you can save the CSV file to your desired location by clicking the "Export all articles by Alessandro Vespignani" button. Save the file to your desktop and then load it into Sci2 in the standard CSV format:

...

This will open the dataset in Excel (or your preferred spreadsheet editor). You will notice that the Lowercase, Tokenize, Stem, and Stopword Text algorithm has placed brackets around the years. You will need to remove these before you can run the Burst Detection algorithm. In Excel, hit 'Ctrl-F' on the keyboard. This will bring up the Find and Replace tool. Highlight the column of years and then perform a find and replace:


You will have to repeat this for the other bracket symbol. This removes the brackets around the years. Next, you will need to remove those publications for which there is no year information. Burst Detection will not run if there are empty values in the date column. You can search for the publications and find the proper date, but the year value could be empty because these are forthcoming publications. In this example, we will just remove all publications without a value in the year column:
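The two cleanup steps described above (removing the brackets and dropping records without a year) can also be sketched in Python as an alternative to editing the spreadsheet by hand. This sketch assumes the brackets are square brackets and that the columns are named "Year" and "Title". The function names and the sample records are hypothetical.

```python
def clean_year(value):
    """Strip whitespace and surrounding square brackets from a year cell."""
    # Assumption: the brackets added around the years are "[" and "]".
    return value.strip().strip("[]").strip()


def clean_records(rows, year_field="Year"):
    """Remove brackets from the year column and drop rows with no year."""
    cleaned = []
    for row in rows:
        year = clean_year(row.get(year_field) or "")
        if not year:
            continue  # Burst Detection will not run with empty dates
        cleaned.append({**row, year_field: year})
    return cleaned


# Hypothetical records, invented for illustration only.
sample_rows = [
    {"Year": "[2009]", "Title": "first example title"},
    {"Year": "", "Title": "forthcoming example title"},
    {"Year": "[ 2012 ]", "Title": "second example title"},
]
cleaned = clean_records(sample_rows)
```

Dropping the records is a simplification that matches this example. If the missing years can be looked up, filling them in keeps those publications in the analysis.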

...

This will result in a "Burst detection analysis (Year, Title): maximum burst level 1" file in the Data Manager. Right-click on this file to view the data:

...

You will need to edit the data before you can run the Temporal Bar Graph algorithm to visualize the results of the burst detection. First, you should make sure every record has an "End" date or else the Temporal Bar Graph will not run properly. We know that this dataset contains records that are labeled with the year of 2013, so that will be our end date for those bursts that are still continuing:

...

Before you can visualize the results with the Temporal Bar Graph, it is important to know that if you want to size bars based on weight, the weight value will be distributed across the length of the burst. In other words, the total area of the bar corresponds to the weight value. This means a bar with a high weight value can appear thinner than a bar with a lower weight value if the former burst occurs over a longer period than the latter. Finally, before you visualize this dataset, you can add some categories to allow you to color your bars. For example, you can sort the records from largest to smallest based on the "total weight" column and assign strong, medium, and weak categories to these records based on the "total weight" values:
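The relationship between weight, length, and bar thickness can be illustrated with a short Python sketch. It assumes that the length of a burst is simply the end year minus the start year, and it assigns the strong, medium, and weak categories by splitting the ranked weights into thirds. Both choices are assumptions made for illustration; the exact layout used by the Temporal Bar Graph and the category thresholds you pick may differ.

```python
def bar_thickness(weight, start, end):
    """Thickness of a bar whose area equals weight and whose length is end - start."""
    length = end - start
    if length <= 0:
        raise ValueError("end must be later than start")
    return weight / length


def assign_categories(weights):
    """Label each weight 'strong', 'medium', or 'weak' by rank thirds (assumed rule)."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    labels = [None] * len(weights)
    for rank, i in enumerate(order):
        third = rank * 3 // len(weights)  # 0, 1, or 2
        labels[i] = ("strong", "medium", "weak")[third]
    return labels


# Invented values: a heavier burst spread over ten years is thinner than a
# lighter burst spread over two years.
long_burst = bar_thickness(10.0, 2001, 2011)
short_burst = bar_thickness(6.0, 2005, 2007)
labels = assign_categories([2.0, 9.0, 5.0])
```

Here the burst with weight 10.0 has the larger area but is drawn thinner than the burst with weight 6.0, which is why a color category based on the weight helps readers compare bars.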

Now, save the file to your desktop and reload it into Sci2 in the standard CSV format and run 'Visualization > Temporal > Temporal Bar Graph', entering the following parameters:

Note that if you select the "Simplified Layout" option, no legend will be created for the map. This allows you to create your own legend that will be accurate for the new weight values. To learn how to create a legend for your visualization see 2.4 Saving Visualizations for Publication.

...

Save the PostScript file to your desktop, and if you have a version of the Adobe Creative Suite on your machine you can simply double-click the PostScript file to launch Adobe Distiller and automatically convert the PostScript file into a PDF for viewing. However, if you do not have a copy of the Adobe Creative Suite installed on your machine, you can use an online version of GhostScript to convert PostScript files to PDF files: http://ps2pdf.com/. The resulting visualization should look similar to the following:

 

Remember that the weight for the bars is equal to the total area, not simply the thickness. So, including the color categories will help users make more sense of the visualization. You will notice that this burst analysis for Alessandro Vespignani's publications looks similar to the one created in the previous section. However, this new burst analysis takes into consideration his more recent publications and interests in human mobility networks and epidemiology. This workflow can easily be repeated using any author who has a profile in Google Scholar. Give it a try for yourself!

 5.2.5.3 Visualizing Burst Detection in Excel

...