BUSINESS INTELLIGENCE Employee Attrition 2

3 Dataset

The "IBMEmployeeDataset" was chosen to understand the significant reasons for employee attrition [3]. It is a fictional dataset created by IBM data scientists that describes employees in terms of various factors influencing attrition from the company. The aim is to predict the attrition of an employee based on these factors. First, the dataset is checked: it has 1058 rows (samples) and 35 columns (features). No target value is missing, and there is no missing data, as shown in Table 1 below.
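A check of this kind can be sketched in a few lines of Python. The snippet below is only an illustration under stated assumptions: `SAMPLE_CSV` is a made-up four-row stand-in (not rows from the actual dataset), and `summarize` is a hypothetical helper name. It counts rows, columns, and empty cells per column.

```python
import csv
import io

# Toy stand-in for the real CSV file; these rows are invented for illustration.
SAMPLE_CSV = """Age,Attrition,Department,MonthlyIncome
41,Yes,Sales,5993
49,No,Research & Development,5130
37,Yes,Research & Development,
33,No,Sales,2909
"""

def summarize(csv_text):
    """Return (n_rows, n_columns, missing_per_column) for CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    # A cell counts as missing when it is empty or absent.
    missing = {
        col: sum(1 for row in rows if row[col] in ("", None))
        for col in columns
    }
    return len(rows), len(columns), missing

n_rows, n_cols, missing = summarize(SAMPLE_CSV)
print(n_rows, n_cols)
print(missing)
```

On the real file, the same function would be expected to report 1058 rows, 35 columns, and zero missing cells in every column, matching the description above.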

Table 1: Properties of IBM Dataset

After that, the attrition rate is checked. The attrition ratio in this dataset is calculated as 83.0813%, as shown in the figure above.
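As a rough sketch, the class shares of the target variable can be computed as follows. The counts used here (879 "No" and 179 "Yes") are an assumption and are not stated in the article: they are simply one split of 1058 rows that reproduces the 83.0813% figure, which would then be the share of the larger class. The helper name `class_shares` is hypothetical.

```python
from collections import Counter

def class_shares(labels):
    """Return the percentage share of each distinct label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}

# Assumed split for illustration only: 879 + 179 = 1058 rows.
labels = ["No"] * 879 + ["Yes"] * 179
shares = class_shares(labels)
print(round(shares["No"], 4), round(shares["Yes"], 4))
```

Whatever the exact counts are, a split this uneven means the two classes are imbalanced, which is worth keeping in mind when a model is evaluated later.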

However, some features do not contribute to the analysis, so uninformative data is discarded. The first such feature is "EmployeeCount", because all of its values are the same, equal to 1. Other features such as "Over18", "StandardHours", and "EmployeeNumber" likewise carry no useful information. Also, the "HourlyRate", "DailyRate", and "MonthlyRate" variables do not add any information to the analysis. These features cause noise in the learning process.
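Part of this screening can be automated. The sketch below is an assumption about how such a check might look, not the procedure used in the article: it flags columns with a single distinct value (constants) and columns where every value is distinct (possible identifiers). The toy rows and the helper name `uninformative_columns` are invented for illustration. The rate columns would not be caught this way, since dropping them is a judgement call.

```python
def uninformative_columns(rows):
    """Return (constants, identifiers) among the column names of rows.

    constants:   columns with exactly one distinct value
    identifiers: columns where every row has a different value
    """
    constants, identifiers = [], []
    for col in rows[0]:
        distinct = {row[col] for row in rows}
        if len(distinct) == 1:
            constants.append(col)
        elif len(distinct) == len(rows):
            # Continuous numeric columns can also land here, so review by hand.
            identifiers.append(col)
    return constants, identifiers

# Invented toy rows, not taken from the actual dataset.
rows = [
    {"EmployeeCount": 1, "Over18": "Y", "StandardHours": 80, "EmployeeNumber": 1, "JobLevel": 2},
    {"EmployeeCount": 1, "Over18": "Y", "StandardHours": 80, "EmployeeNumber": 2, "JobLevel": 1},
    {"EmployeeCount": 1, "Over18": "Y", "StandardHours": 80, "EmployeeNumber": 4, "JobLevel": 2},
    {"EmployeeCount": 1, "Over18": "Y", "StandardHours": 80, "EmployeeNumber": 5, "JobLevel": 1},
]

constants, identifiers = uninformative_columns(rows)
to_drop = set(constants) | set(identifiers)
cleaned = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
print(constants, identifiers)
print(cleaned[0])
```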

All 7 of these features are deleted before the analysis. No further data reduction is needed. After that, the variable types are investigated. As shown in Figure 2 below, the dataset has "Numerical" and "Categorical" variables. The categorical variables are further divided into 3 types: "Boolean", "Ordinal", and "Nominal". Categorical attributes can assume a finite number of values. They usually represent a qualitative property of the entity to which they refer.

BOOLEANS: These are categorical attributes in which an attribute can be true or false (1/0).

NOMINALS: These are categorical variables, which do not have a natural ordering.

ORDINALS: These are categorical variables, which have a natural ordering with a meaning.
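The practical difference between the three categorical types shows up when they are turned into numbers. The following is a minimal sketch, assuming one common encoding per type: 0/1 for booleans, a rank for ordinals, and one-hot vectors for nominals. The column names, the level lists, and the sample record are assumptions made for illustration, since the article does not list which column belongs to which type.

```python
def encode_boolean(value, true_label="Yes"):
    """Map a two-valued attribute to 1 (true) or 0 (false)."""
    return 1 if value == true_label else 0

def encode_ordinal(value, ordered_levels):
    """Map a level to its rank; the order of the list carries the meaning."""
    return ordered_levels.index(value)

def encode_nominal(value, categories):
    """One-hot encode, so that no artificial ordering is implied."""
    return [1 if value == c else 0 for c in categories]

# Assumed level lists and record, for illustration only.
SATISFACTION_LEVELS = ["Low", "Medium", "High", "Very High"]
DEPARTMENTS = ["Human Resources", "Research & Development", "Sales"]
record = {"Attrition": "Yes", "JobSatisfaction": "High", "Department": "Sales"}

encoded = {
    "Attrition": encode_boolean(record["Attrition"]),
    "JobSatisfaction": encode_ordinal(record["JobSatisfaction"], SATISFACTION_LEVELS),
    "Department": encode_nominal(record["Department"], DEPARTMENTS),
}
print(encoded)
```

Encoding a nominal variable as a single integer would suggest that, for example, one department is "greater" than another, which is why the one-hot form is used for that type here.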

First Article : https://tunavatansever.com/2023/03/08/business-intelligence-employee-attrition/

Next page: https://tunavatansever.com/2023/04/06/analysis-business-intelligence-employee-attrition/
