Description

This algorithm can be used to extract the weighted document-document network for an ISI database. This is an undirected, weighted network.

Menu Path

Extract -> From ISI Database -> Extract Weighted Document-Document Network

Input Parameters

Threshold

Any pair of documents whose calculated similarity is below this value will not be connected by an edge in the network output by this algorithm.

Comparison Algorithm

The comparison algorithm is used to calculate the similarity between documents. You may select one of the following algorithms:

Name | Link to More Information
JaccardCoefficient | Wikipedia
CosineSimilarity | Wikipedia
SørensenSimilarityIndex | Wikipedia
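
For orientation, the three measures can be sketched in their standard set-based forms. This is only an illustration, not the code used by this algorithm: the function names are hypothetical, and it assumes each document is represented as a set of terms.

```python
from math import sqrt


def jaccard_coefficient(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(a: set, b: set) -> float:
    """Set-based cosine: shared terms over the geometric mean of the set sizes."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def sorensen_similarity_index(a: set, b: set) -> float:
    """Twice the number of shared terms over the sum of the set sizes."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Two hypothetical documents sharing two of their three terms.
doc_a = {"x", "y", "z"}
doc_b = {"y", "z", "w"}
print(jaccard_coefficient(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_b))
print(sorensen_similarity_index(doc_a, doc_b))
```

All three return values between 0 and 1, so the same Threshold scale applies whichever measure is selected, although the measures generally give different values for the same pair of documents.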

Output

A network file in which each pair of documents is connected by a weighted edge, with the weight calculated by the comparison algorithm, provided that the weight of the edge is not below the threshold value.
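
A rough sketch of how such an edge list could be produced is shown below. It is not the actual implementation: the function names, the dictionary input, and the use of the Jaccard coefficient are assumptions made for illustration. Pairs exactly at the threshold are kept here, following the Threshold description, which excludes only values below it.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient, used here only as an example similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_edges(docs: dict, similarity, threshold: float) -> list:
    """Return (source, target, weight) for every document pair not below the threshold.

    docs maps a hypothetical document id to that document's set of terms.
    The network is undirected, so each unordered pair is visited once.
    """
    edges = []
    for id_a, id_b in combinations(sorted(docs), 2):
        weight = similarity(docs[id_a], docs[id_b])
        if weight >= threshold:
            edges.append((id_a, id_b, weight))
    return edges


# Hypothetical input: three documents, only the first two share a term.
docs = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"x"}}
for source, target, weight in weighted_edges(docs, jaccard, threshold=0.2):
    print(source, target, weight)
```

Comparing every pair takes time quadratic in the number of documents, so a sketch like this is only practical for modest inputs.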

Usage Hints

The Cosine Similarity comparison algorithm does not perform any real comparison between the terms. It only uses the existence or nonexistence of a term in the set for the calculation.
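
The practical effect of using only term existence can be illustrated with a small sketch. It assumes the standard set-based form of cosine similarity and uses hypothetical names and data; it is not taken from the algorithm's source. Because each term list is reduced to a set, repeating a term does not change the result.

```python
from math import sqrt


def binary_cosine(terms_a, terms_b) -> float:
    """Cosine similarity that only considers whether a term is present."""
    a, b = set(terms_a), set(terms_b)  # duplicates are discarded here
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical term lists: the second repeats "network" three times.
doc_once = ["network", "citation"]
doc_repeated = ["network", "network", "network", "citation"]
other = ["network", "graph"]

# Both comparisons give the same value, since term frequency is ignored.
print(binary_cosine(doc_once, other))
print(binary_cosine(doc_repeated, other))
```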

Implementation Details

The network file has the following node properties:

Label | Type | Source
label | String | A concatenation of PUBLICATION_YEAR, DOCUMENT_VOLUME, DIGITAL_OBJECT_IDENTIFIER, BEGINNING_PAGE from the DOCUMENT table, TWENTY_NINE_CHARACTER_SOURCE_TITLE_ABBREVIATION from the SOURCE table, and UNSPLIT_NAME from the PERSON table.

The network file has the following edge properties:

Label | Type | Source
weight | float | The comparison algorithm chosen.
DEFAULT_SOURCE_KEY | int | The node given to the edge.
DEFAULT_TARGET_KEY | int | The node given to the edge.

See Also