Classification of Indian English Poetry into Pre-Independence and Post-Independence Eras using Combination of Semantics, Topics and Style features

K. Praveenkumar, Venkata Naresh Mandhala, Debnath Bhattacharyya, Debrup Banerjee

Abstract

Automatically classifying poems by era is a challenging task. Indian English poetry is commonly divided into two eras, Pre-Independence and Post-Independence. Poetic style and themes change from one era to the next, reflecting the era to which each author belongs. This study therefore tests different feature selection methods and ensembles of features for identifying a poem's era automatically. Poetry can be classified using semantic, topic and style features. In this experiment, we used Latent Semantic Analysis (LSA) to extract semantic features and Latent Dirichlet Allocation (LDA) topic modeling to extract topic features, together with phonemics, syntactic elements and the structure of the poem as style features. The experiment was carried out on 760 poems written by 28 authors, of which 344 belong to the Pre-Independence era and 416 to the Post-Independence era. A classification accuracy of 91.20% was achieved using a Random Forest classifier with the combined LSA and LDA feature set. Adding the style features to the LSA and LDA features raised the classification accuracy to 92%. The study showed that poetry can be classified into eras with accuracy above 90% using a combination of topics, words and style as features.
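
The combination of LSA and LDA features with a Random Forest classifier described in the abstract could be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, not the authors' implementation: the function name `build_era_classifier`, the component counts, the forest size and the four toy texts are all hypothetical placeholders, and the style features (phonemics, syntax, structure) are left out.

```python
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import TruncatedSVD, LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier


def build_era_classifier(n_semantic=2, n_topics=2, seed=0):
    """Concatenate LSA and LDA features and pass them to a Random Forest.

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    # LSA: truncated SVD applied to a TF-IDF term-document matrix.
    lsa = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("svd", TruncatedSVD(n_components=n_semantic, random_state=seed)),
    ])
    # LDA: per-document topic proportions estimated from raw term counts.
    lda = Pipeline([
        ("counts", CountVectorizer()),
        ("lda", LatentDirichletAllocation(n_components=n_topics, random_state=seed)),
    ])
    # FeatureUnion places the two feature blocks side by side, column-wise.
    features = FeatureUnion([("lsa", lsa), ("lda", lda)])
    return Pipeline([
        ("features", features),
        ("forest", RandomForestClassifier(n_estimators=50, random_state=seed)),
    ])


# Toy placeholder texts, NOT taken from the paper's corpus of 760 poems.
poems = [
    "the river sings of ancient kings and temple bells",
    "lotus and moonlight on the sacred river bank",
    "the city train is crowded and the radio hums",
    "concrete streets and office queues in the city rain",
]
labels = ["pre", "pre", "post", "post"]

model = build_era_classifier()
model.fit(poems, labels)

# One row per poem; columns are the LSA components followed by the LDA topics.
feature_matrix = model.named_steps["features"].transform(poems)
predictions = model.predict(poems)
```

On a real corpus the accuracy would be estimated on held-out poems, for example with cross-validation, and the style features would be appended as a third block in the `FeatureUnion`.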
