Identification of an Anonymous Reviewer of an Academic Paper

This project addresses the identification of anonymous reviewers of academic papers, using methods from Natural Language Processing (NLP) and Deep Learning (DL).
Academic papers and reviews are traditionally handled anonymously, but advanced Machine Learning tools may make it possible to identify the reviewer. If so, this could have a major influence on the way academic reviews are given, since de-anonymization may open the door to prejudice, competitive effects and more.
As far as we know, this problem has not been studied before, and it raises interesting research challenges: reviews are naturally short texts, and because they are anonymous, there is very little labeled data to work with. As a result, the main goal of the project is a cross-domain problem: training on the domain of academic papers and performing inference on the domain of academic reviews, with fairly small datasets.
As part of the project, we studied NLP and feature extraction in service of the project's main goal, which we approached in stages: testing the model on a “simple toy problem”, testing the model on a similar problem of training and inference on academic papers, and finally training and inference on academic reviews.
During the project we extracted features for the above problems, created datasets, performed preprocessing, and applied various machine learning algorithms in order to improve our model and our understanding of the problem.
We achieved good results in a research area that is inherently challenging.
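
The cross-domain setup described above (learning an author's style from papers, then attributing a review) can be illustrated with a minimal sketch. This is not the project's model: the abstract does not specify which features or algorithms were used, so the character n-gram profiles and cosine similarity here are assumptions chosen for simplicity, and the names `char_ngrams`, `build_profiles` and `rank_authors` are hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in whitespace-normalized, lowercased text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(papers_by_author, n=3):
    """Source domain: merge each author's papers into one n-gram profile."""
    profiles = {}
    for author, papers in papers_by_author.items():
        profile = Counter()
        for paper in papers:
            profile.update(char_ngrams(paper, n))
        profiles[author] = profile
    return profiles


def rank_authors(review, profiles, n=3):
    """Target domain: rank candidate authors by similarity to a review."""
    review_vec = char_ngrams(review, n)
    scores = {author: cosine(review_vec, p) for author, p in profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented purely for illustration.
papers_by_author = {
    "author_a": ["We propose a method which, we argue, is robust to noise."],
    "author_b": ["This paper presents results; the results show clear gains."],
}
profiles = build_profiles(papers_by_author)
ranking = rank_authors("The method, which, we argue, is fragile.", profiles)
```

Here `ranking` is a list of `(author, score)` pairs sorted from most to least similar. A real system would need far more text per author and a way to handle the stylistic gap between papers and reviews, which is exactly the difficulty the project targets.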
