Implementation of Data Mining Techniques Considering Number of Refugees


Abstract

Refugees are people fleeing conflict or persecution because of their race, religion, nationality, membership in a particular social group, or political opinion. Refugees are defined and protected in international law, and must not be expelled or returned to situations where their life and freedom are at risk. The final refugee numbers used in this study were collected based on this definition. In addition to these numbers, we added socio-economic variables that affect "being a refugee". The data set used in this study includes both socio-economic attributes of refugees from 215 countries between 2008 and 2013 and categorized refugee numbers. We used common data mining techniques, namely the Naive Bayes, Decision Tree, and K-Nearest Neighbors classification algorithms, in order to compare classification rates. In our study, the best classification rate was obtained with the K-Nearest Neighbors algorithm under k-fold cross-validation. To increase the classification success rate, Principal Component Analysis was also applied. As a result of this dimension reduction, better classification results were observed.
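The evaluation idea described in the abstract, a K-Nearest Neighbors classifier scored with k-fold cross-validation, can be illustrated with a minimal sketch. This is not the study's code or data: the function names (`knn_predict`, `k_fold_accuracy`, `make_toy_data`), the choice of 3 neighbours and 5 folds, and the synthetic two-class data set are all assumptions made purely for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to x.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, k_neighbors=3, n_folds=5, seed=0):
    """Mean k-NN classification accuracy over n_folds cross-validation folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into n_folds roughly equal, disjoint folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / n_folds


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic two-feature, two-class data (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("low", (0.0, 0.0)), ("high", (5.0, 5.0))):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"5-fold k-NN accuracy: {k_fold_accuracy(X, y):.3f}")
```

Each observation is used for testing exactly once, so the averaged accuracy is less dependent on a single train/test split than a one-off hold-out estimate. A dimension-reduction step such as Principal Component Analysis would be fitted on the training folds only and then applied to the held-out fold before classification.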

Publication
In Source Themes Conference
Ali Mertcan Köse
Ph.D. Candidate in Statistics

My research interests include latent variable modeling, supervised learning, and Bayesian statistics.