A Comparative Analysis of Feature Selection Algorithms in Cross Domain Sentiment Classification

Lipika Goel; Sonam Gupta; Avdhesh Gupta; Neha Nandal; Siddhi Nath Rajan; Pradeep Gupta

doi:10.2174/0126662558276889240125062857

Abstract

Background: Cross-domain sentiment classification (CDSC) is a well-researched field in sentiment analysis. The biggest challenge in CDSC arises from differences between domains and their features, which degrade model performance when source-domain features are used to predict sentiment in the target domain. To address this challenge, feature selection methods can be employed to identify the most relevant features for training and testing in CDSC.

Methods: The primary objective of this study is a comparative analysis of different feature selection methods across CDSC tasks. Statistical test-based feature selection methods were implemented with 18 classifiers for the CDSC task. Their impact was compared on Amazon product reviews in the DVD, Electronics, Kitchen, and TV domains. For each feature selection method, a total of 12 × 18 experiments were conducted: 12 source-target domain pairs drawn from the Amazon product reviews dataset (four domains, each paired with the three others), each evaluated with 18 classifiers. Performance is measured by accuracy and F-score.
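The experimental grid described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the domain list comes from the abstract, while the classifier names (`clf_01` to `clf_18`) and the helpers `accuracy` and `f1_score` are hypothetical placeholders. The abstract does not say which F-score variant was used; binary F1 for the positive class is assumed here.

```python
from itertools import permutations

# Domains named in the abstract.
DOMAINS = ["DVD", "Electronics", "Kitchen", "TV"]

# Placeholder names: the abstract states only that 18 classifiers were used.
CLASSIFIERS = [f"clf_{i:02d}" for i in range(1, 19)]

# Ordered (source, target) pairs with source != target: 4 * 3 = 12.
DOMAIN_PAIRS = list(permutations(DOMAINS, 2))

# One run per (pair, classifier) for a given feature selection method.
RUNS = [(src, tgt, clf) for src, tgt in DOMAIN_PAIRS for clf in CLASSIFIERS]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` label (assumed variant of the F-score)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(len(DOMAIN_PAIRS), len(RUNS))
    # Toy labels, purely illustrative (not data from the study).
    y_true = [1, 0, 1, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 0]
    print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

Under these assumptions the grid has 12 domain pairs and 12 × 18 = 216 runs per feature selection method, each of which would be scored with both measures.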

Results: The experiments indicate that CDSC performance depends on several factors, from the choice of domains to the choice of feature selection method. Electronics was found to be the best training (source) domain, as it gave more precise results when testing on the other domains selected for this study.

Conclusion: Cross-domain sentiment analysis is a dynamic and interdisciplinary field that offers valuable insights for understanding how sentiment varies across different domains.




DOI: https://dx.doi.org/10.2174/0126662558276889240125062857
Print ISSN: 2666-2558
Online ISSN: 2666-2566
Publisher Name: Bentham Science Publisher

Recent Advances in Computer Science and Communications
