A Conceptual Paper: Text Mining Exploration for Early Identification of Poverty in Yogyakarta

  • SETYAWAN WIDYARTO Universiti Selangor
Keywords: identification, Naïve Bayes classifier, text mining, word cloud


Purpose: This study explores information about poverty-related communities on the social media platform Twitter to obtain an initial identification of poverty in the Yogyakarta region.

Background: Poverty in Indonesia remains a problem that needs more government attention if poverty rates are to decrease, especially in Yogyakarta. The government's current poverty estimation still relies on static variables, and collecting the data requires considerable effort.

Design/Methodology/Approach: The keywords used to retrieve data from Twitter are derived from word cloud processing of journals and papers about poverty. The data used in this research are Indonesian-language Twitter posts from a certain period. The text data are processed with a machine learning approach using training and testing data. The text mining algorithm used in this research is the Naïve Bayes classifier, which produces three sentiment classes: positive, negative and neutral. The classification accuracy in this study was 66% on a data set of 7000 items.
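
As an illustration of the classification step described above, the following is a minimal Python sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing over three sentiment classes. It is not the study's implementation: the whitespace tokenizer, the function names (`train`, `predict`, `accuracy`), and the small English example sentences are all assumptions made for illustration, whereas the study itself used Indonesian-language Twitter data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real tweets would
    # need more cleaning (URLs, mentions, Indonesian stop words).
    return text.lower().split()


def train(samples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior: share of training documents in this class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training are skipped
            # Add-one smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


# Hypothetical toy data, invented for this sketch only.
TRAIN = [
    ("food aid arrived and families are grateful", "positive"),
    ("new jobs program helps poor households", "positive"),
    ("rice prices rise and families cannot afford food", "negative"),
    ("many households lost jobs and income", "negative"),
    ("the village meeting is scheduled for monday", "neutral"),
    ("the office published the monthly report", "neutral"),
]
HELD_OUT = [
    ("families cannot afford rice", "negative"),
    ("the monthly report", "neutral"),
    ("grateful for food aid", "positive"),
]

model = train(TRAIN)
print("held-out accuracy:", accuracy(model, HELD_OUT))
```

The `accuracy` function corresponds to the kind of figure reported in the abstract: the share of test items whose predicted class matches the labeled class. Scores are summed in log space so that multiplying many small word probabilities does not underflow.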

Results/Findings: This result can serve as an initial identification for decision makers undertaking poverty alleviation efforts in an area, and can be used as a new dynamic variable for estimating poverty in Indonesia alongside the government's static variables.

Conclusion and Implications: The method is expected to identify poverty at a very early stage, serving as an early warning.


How to Cite
REDJEKI, S., & WIDYARTO, S. (2019). A Conceptual Paper: Text Mining Exploration for Early Identification of Poverty in Yogyakarta. Postgraduate Research Symposium - January 2020, 1(1). Retrieved from http://ojs.journals.unisel.edu.my/index.php/prsj20/article/view/0000-0002-6317-9875