Determining the Intervening Effects of Exploratory Data Analysis and Feature Engineering in Telecoms Customer Churn Modelling









Abstract

The telecoms industry is a highly competitive sector that is constantly challenged by customer churn, or attrition. To stay competitive in the consumer business, companies need sophisticated churn management strategies that harness valuable data for business intelligence. Data mining and machine learning are tools that telecoms companies can use to monitor the churn behaviour of customers. This study applied exploratory data analysis and feature engineering to a public-domain telecoms dataset and trained seven (7) classification techniques, namely Naïve Bayes, Generalized Linear Model, Logistic Regression, Deep Learning, Decision Tree, Random Forest, and Gradient Boosted Trees. The results were evaluated using Accuracy, Classification error, Precision, Recall, F1-score, and AUC. The study also discusses how these results can help reduce customer churn and improve customer service. The experiment shows that Gradient Boosted Trees is the best classifier: it outperforms the other classifiers on almost all evaluation metrics. Further, all classifiers performed markedly better after oversampling was applied.
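The evaluation metrics and the oversampling step named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the study's implementation. The function names (`random_oversample`, `classification_metrics`, `auc`) are hypothetical, and random duplication of minority-class rows is an assumption, since the abstract does not say which oversampling method was used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class (assumed method)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def classification_metrics(y_true, y_pred, positive=1):
    """Compute the threshold-based metrics from confusion-matrix counts.
    `positive` marks the churn class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": accuracy,
        "classification_error": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive example scores
    higher than a random negative one; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and scores for illustration only; not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
    print(classification_metrics(y_true, y_pred))
    print("auc:", auc(y_true, scores))
```

In practice, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.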


Modules


Algorithms

Decision Tree


Software And Hardware

• Hardware:
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software:
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder, Flask, Hadoop
  Frontend: Python
  Backend: MySQL