ArticleAppraiser

This repo documents the code used to rank machine learning journal articles by novelty.

Notable packages used include textacy, spacy, and scholarly.

Defining Novelty

For a paper to be novel and useful, I posit that it has to be (1) new and (2) impactful.

In academia, the number of times a paper is cited is generally regarded as a strong indicator of its novelty. However, there are two limitations to using citation data to predict novelty here:

A paper only starts accumulating citations months after it has been published.
Citation data of journal articles was not provided for this data challenge.

Metrics

Given the data provided, two proxies can be used to measure novelty, as defined below:

Topic Score: A score that is based on the chronological order of a paper within a subject field (i.e. topic) and the number of papers in said field.
Author Score: A score that is based on the h-index of authors of the paper.

The final Novelty Score is calculated based on the two metrics above. More details can be found in the Jupyter Notebooks.
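
As a rough illustration of the two proxies, here is a minimal sketch. The `h_index` function follows the standard definition of the h-index (the largest h such that h papers each have at least h citations). The `topic_score` function is purely hypothetical: the exact formula lives in the notebooks, so this version simply assumes that earlier papers within a topic score higher, normalized by the number of papers in that topic. The function names, arguments, and the scoring formula are assumptions and are not taken from the repo.

```python
from datetime import date


def h_index(citations):
    """Standard h-index: largest h such that h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


def topic_score(pub_date, topic_dates):
    """Hypothetical topic score in (0, 1]; NOT the formula used in the notebooks.

    Assumes a paper's score depends on its chronological rank within its
    topic, normalized by the number of papers in that topic, so the earliest
    paper scores 1.0 and later papers score progressively lower.
    `pub_date` must be one of the dates in `topic_dates`.
    """
    ordered = sorted(topic_dates)
    rank = ordered.index(pub_date)  # 0 for the earliest paper in the topic
    return 1.0 - rank / len(ordered)


# Example usage with made-up numbers.
author_citations = [10, 8, 5, 4, 3]
dates_in_topic = [date(2015, 1, 1), date(2016, 1, 1), date(2017, 1, 1), date(2018, 1, 1)]
print(h_index(author_citations))
print(topic_score(date(2017, 1, 1), dates_in_topic))
```

How the two scores are combined into the final Novelty Score is not shown here, since the weighting is defined in the notebooks rather than in this README.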

Jupyter Notebooks

I have organized the work I did into five distinct notebooks. I recommend looking through them in the order below.

EDA.ipynb contains exploratory data analyses.
Topic Modelling.ipynb contains the pipeline to extract and assign topics to each journal article.
Scholars.ipynb parses the scraped scholar information.
Scores.ipynb outlines the metrics used to score the documents.
Model Training.ipynb contains code for model development.

