Description
This is a set of over 63,000 Arabic book reviews, the largest Arabic sentiment analysis dataset to date. The reviews were downloaded from www.goodreads.com during March 2013. The package contains the cleaned-up reviews, together with a Python utility class that provides an easy interface for loading the standard training and test sets. More information is available in the references below and the README file.
Download
- LABR v2.0 [11.6 MB] or browse and download the code and data from GitHub.
  - Includes standard splits of the data into training, validation, and testing, as well as scripts to reproduce the basic experiments described in [2].
  - Contains splits into three sentiment polarities (positive, negative, and neutral) instead of just two classes as in version 1.
- LABR v1.0 [8.5 MB] or browse and download the code and data from GitHub.
  - Includes standard splits of the data into training, validation, and testing, as well as scripts to reproduce the basic experiments described in [1].
- This work was done jointly with Mahmoud Nabil and Amir Atiya.
- Work on LABR v2.0 and the experiments described in [2] were performed by Mahmoud Nabil.
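The three-way polarity split mentioned above can be illustrated with a minimal sketch. It does not use the utility class shipped with the package. It assumes, without confirmation from this page, that each review is stored as one tab-separated line of the form rating, review id, user id, book id, review text, and that ratings of 4-5 count as positive, 1-2 as negative, and 3 as neutral. The function names `rating_to_polarity` and `read_reviews` are hypothetical; check the README for the actual file layout and interface.

```python
import io


def rating_to_polarity(rating):
    """Map a 1-5 star rating to a polarity label.

    Assumed mapping (not confirmed by this page):
    4-5 -> positive, 1-2 -> negative, 3 -> neutral.
    """
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def read_reviews(handle):
    """Yield (rating, polarity, text) tuples from tab-separated lines.

    Assumed column order: rating, review id, user id, book id, review text.
    """
    for line in handle:
        # Split into at most 5 fields so tabs inside the review text survive.
        fields = line.rstrip("\n").split("\t", 4)
        if len(fields) < 5:
            continue  # skip malformed or empty lines
        rating = int(fields[0])
        yield rating, rating_to_polarity(rating), fields[4]


# Tiny in-memory sample standing in for the real reviews file.
sample = io.StringIO(
    "5\tr1\tu1\tb1\tكتاب رائع\n"
    "3\tr2\tu2\tb2\tكتاب عادي\n"
    "1\tr3\tu3\tb3\tكتاب سيء\n"
)
labels = [polarity for _, polarity, _ in read_reviews(sample)]
print(labels)
```

To run this against the downloaded data, replace the in-memory sample with a file opened using `encoding="utf-8"`, after confirming the column order in the README.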
References
- [1] Mohamed Aly and Amir Atiya. LABR: Large Scale Arabic Book Reviews Dataset. Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, August 2013. [pdf]
- [2] Mahmoud Nabil, Mohamed Aly, and Amir Atiya. LABR 2.0: Large Scale Arabic Sentiment Analysis Benchmark. arXiv e-print (arXiv:1411.6718), 2014. [pdf]
Mohamed Aly, Aug 3, 2013, 9:14 AM
Mohamed Aly, Mar 10, 2015, 11:09 PM