Time-dependent Citation Counts
This method is an extension of the classical citation count ranking
method. The difference is that it takes into consideration the "age"
(publication date) of each citation by introducing a time decay factor.
The time decay parameter takes values between 0 and 1, where a
value of 0 means that we never "forget" any citation, and a
value of 1 means that we "forget" all the citations that were
acquired in previous years and only take into account the ones
from the present year.
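
The description above fixes only the two boundary cases, not the decay
formula itself. As a minimal sketch, assuming a geometric decay in which
a citation that is `age` years old contributes (1 - time_decay) ** age,
the boundary behaviour can be reproduced as follows. The function name
`citation_weight` and the formula are illustrative assumptions and may
differ from what the ranking code actually uses.

```python
def citation_weight(age, time_decay):
    """Weight of a citation that is `age` years old.

    Assumed formula (not taken from the source): geometric decay.
    """
    if not 0 <= time_decay <= 1:
        raise ValueError("time_decay must be between 0 and 1")
    if age < 0:
        raise ValueError("age must be non-negative")
    # Python defines 0 ** 0 as 1, so with time_decay == 1 a citation from
    # the present year (age 0) keeps full weight and older ones drop to 0.
    return (1 - time_decay) ** age
```

With this choice, `citation_weight(10, 0)` is 1 (nothing is forgotten),
while `citation_weight(3, 1)` is 0 and `citation_weight(0, 1)` is 1
(only citations from the present year count).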
Algorithm:
1. Read the citation data from the database or from a file that
has the following format: x[tab]y where the paper with recid x
cites the paper with recid y. The citation data is stored in
a key:value map, where the 'key' is a record id and the 'value'
is the list of publications that cite publication 'key'.
2. Read the publication dates for each paper from the database
or from a file that has the following format: x[tab]y,
where x is a recid and y is the publication year. There are
several possibilities for retrieving the dates from the database:
i) using the 260__c MARC tag that contains only the publication
year (this is the option that we are using);
ii) using the 269__c MARC tag that contains the complete
publication date.
For the papers that do not have a publication date, we use
the date of insertion into the database (961__x tag). If neither
the publication date nor the insertion date is available,
we use an average date (computed from the existing dates).
3. Read the time_decay factor from the configuration file.
4. Calculate the weight of each citation based on its age
and time_decay.
5. Calculate the final weight of each publication by summing up
the weights of its citations. In case of a tie, the publications
are ordered by publication date (newest papers first).
6. Write the ranks to the database and to a file. The name of
the file and the number of ranks to output can be set in the
configuration file. If the name of the file is not set, the
ranks are written only to the database.
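
Steps 1, 2, 4 and 5 can be sketched in a few lines for the file-based
input. This is an illustration under stated assumptions, not the actual
implementation: the helper names (`parse_pairs`,
`rank_by_time_dependent_citations`) are hypothetical, the age of a
citation is assumed to be the current year minus the publication year of
the citing paper, and the weight is assumed to be
(1 - time_decay) ** age. The insertion-date fallback (961__x) and the
database access are omitted; papers without a year get the rounded
average year.

```python
from collections import defaultdict


def parse_pairs(lines):
    """Parse 'x[tab]y' lines into (int, int) pairs, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if line:
            left, right = line.split("\t")
            pairs.append((int(left), int(right)))
    return pairs


def rank_by_time_dependent_citations(citation_lines, date_lines,
                                     time_decay, current_year):
    """Return (recid, weight) pairs, best first (hypothetical helper)."""
    # Step 1: map each cited recid to the list of recids that cite it.
    cited_by = defaultdict(list)
    for citing, cited in parse_pairs(citation_lines):
        cited_by[cited].append(citing)

    # Step 2: publication years; unknown years fall back to the average.
    years = dict(parse_pairs(date_lines))
    if years:
        average_year = round(sum(years.values()) / len(years))
    else:
        average_year = current_year

    def year_of(recid):
        return years.get(recid, average_year)

    # Steps 4-5: weight every citation by its age (assumed formula) and
    # sum the weights per cited publication.
    weights = {}
    for recid, citers in cited_by.items():
        weights[recid] = sum(
            (1 - time_decay) ** max(current_year - year_of(citer), 0)
            for citer in citers
        )

    # Ties are broken by publication year, newest papers first.
    return sorted(weights.items(),
                  key=lambda item: (-item[1], -year_of(item[0])))
```

A small usage example with made-up data:

```python
citations = ["1\t3", "2\t3", "2\t5", "1\t4"]
dates = ["1\t2010", "2\t2008", "3\t2006", "4\t2009"]
ranks = rank_by_time_dependent_citations(citations, dates, 0.5, 2010)
# ranks == [(3, 1.25), (4, 1.0), (5, 0.25)]
```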