Match References to Papers

The words "Papers" and "References" in this algorithm's name are misnomers. Strictly speaking, it should be called Match Citations to Documents.

This algorithm attempts to match citations to documents (in an "is a" relationship). A citation is considered to match a document if and only if:

  • the citation Author, Page Number, Source, Volume, and Year are all provided and are valid;
  • the citation Author matches the first Author of the document;
  • the citation Page Number matches the document Beginning Page;
  • the citation Source and the document Source are exactly the same Source;
  • the citation Volume matches the document Volume;
  • the citation Year matches the document Year.
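
The conditions above can be read as a single predicate over one citation and one document. The following is a minimal sketch, not the tool's actual implementation. The class names, the field names, and the `source_id` key are hypothetical. It also assumes that "valid" means "present and non-empty" and that author names "match" when they are equal after trimming whitespace and ignoring case; the real algorithm may apply different rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    author: Optional[str]
    page_number: Optional[int]
    source_id: Optional[int]  # hypothetical key identifying the Source entity
    volume: Optional[int]
    year: Optional[int]


@dataclass(frozen=True)
class Document:
    first_author: Optional[str]
    beginning_page: Optional[int]
    source_id: Optional[int]  # hypothetical key identifying the Source entity
    volume: Optional[int]
    year: Optional[int]


def _norm(name: str) -> str:
    # Assumed normalization: trim whitespace and ignore case.
    return name.strip().casefold()


def citation_matches(c: Citation, d: Document) -> bool:
    """Return True only if every condition in the list above holds."""
    # All five citation fields must be provided (assumed meaning of "valid").
    fields = (c.author, c.page_number, c.source_id, c.volume, c.year)
    if any(f is None for f in fields) or not c.author.strip():
        return False
    if d.first_author is None:
        return False
    return (
        _norm(c.author) == _norm(d.first_author)
        and c.page_number == d.beginning_page
        and c.source_id == d.source_id  # same Source entity, not just a similar name
        and c.volume == d.volume
        and c.year == d.year
    )
```

Comparing Sources by an identifier rather than by name reflects the requirement that both sides refer to exactly the same Source, which is one reason merging journal variants before matching tends to increase the number of matches.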

The output of this algorithm is a copy of the input database, but with the Citations table updated to point to the Documents table (via the document_id field). When it is finished, it reports how many citations were matched to documents and how many citations were over-matched to documents. An over-matched citation is a citation that matches more than one document.
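
The linking and reporting step can be sketched as follows. This is an illustration under stated assumptions rather than the tool's code: rows are plain dictionaries, each row carries a precomputed `key` tuple (author, page, source, volume, year), with `None` for a citation missing any field, and `document_id` stands in for the linking field. The sketch also assumes that an over-matched citation is counted but left unlinked, which the text above does not specify.

```python
from collections import defaultdict


def link_citations(citations, documents):
    """Return (linked_citations, matched_count, over_matched_count).

    The input lists are not modified, mirroring the idea that the
    output is a copy of the input database.
    """
    # Index documents by their match key so each citation is one lookup.
    by_key = defaultdict(list)
    for doc in documents:
        by_key[doc["key"]].append(doc["id"])

    linked, matched, over_matched = [], 0, 0
    for cit in citations:
        row = dict(cit)  # copy the row; the input stays untouched
        key = cit["key"]
        candidates = by_key.get(key, []) if key is not None else []
        if len(candidates) == 1:
            row["document_id"] = candidates[0]
            matched += 1
        elif len(candidates) > 1:
            # Assumption: ambiguous citations are reported but not linked.
            over_matched += 1
        linked.append(row)
    return linked, matched, over_matched
```

Over-matches in this sketch arise only when two documents share the same key, so a high over-matched count may indicate duplicate documents in the database.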

Usage Hints

Load an ISI file into the tool, then create a database from it using the ISI database loader.

It is strongly recommended that the database be cleaned before matching citations to documents.

For a quick analysis of a small dataset you may wish to merge together author entities with identical names. For a scientifically sound analysis of a larger dataset, you can find author entity merging suggestions (or manually set your own merging orders from scratch) and perform the merge.

Then, you will probably want to merge together journal entities according to recognized variants.