Python Project 3: HR Analytics Case Study

Author: Chinh X. Mai, Date: August 24, 2022

Table of contents

  • 1 Case description
  • 2 Goal of the case study
  • 3 Data analysis pipeline
  • 4 Important libraries
  • 5 Importing and validating data
  • 6 Transforming data
    • 6.1 Processing working time
    • 6.2 Recoding values
    • 6.3 Joining tables
    • 6.4 Summary statistics
    • 6.5 Final table
  • 7 Exploratory data analysis
    • 7.1 Univariate analysis
    • 7.2 Bivariate analysis
  • 8 Regression analysis
    • 8.1 Preparing data for logistic regression
    • 8.2 Fitting explanatory models
  • 9 Validating forecasting power and model choice
  • 10 Interpreting results
  • 11 Conclusion & executive summary

1 Case description

A large company, XYZ, employs around 4000 people at any given time. However, around 15% of its employees leave every year and must be replaced from the talent pool available in the job market. Management believes that this level of attrition (employees leaving, either on their own or because they were fired) is bad for the company for the following reasons:

  • The former employees’ projects get delayed, which makes it difficult to meet timelines, resulting in a reputation loss among consumers and partners
  • A sizeable department has to be maintained, for the purposes of recruiting new talent
  • More often than not, the new employees have to be trained for the job and/or given time to acclimatise themselves to the company

Hence, management has contracted an HR analytics firm to identify the factors it should focus on in order to curb attrition. In other words, it wants to know what changes to make to the workplace so that most employees stay, and which of these factors is most important and needs to be addressed right away.

Since you are one of the star analysts at the firm, this project has been given to you.

2 Goal of the case study

You are required to model the probability of attrition using a logistic regression. The results thus obtained will be used by the management to understand what changes they should make to their workplace, in order to get most of their employees to stay.
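
As a minimal sketch of what modelling the probability of attrition with a logistic regression means: the model passes a linear combination of predictors through the logistic function to obtain a probability. The intercept, coefficients, and predictor values below are made up for illustration only; they are not estimates from this case study.

```python
import math

def attrition_probability(intercept, coefficients, features):
    """Return P(attrition = 1) for one employee under a logistic model."""
    # Linear predictor: intercept + sum of coefficient * feature value
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    # The logistic (sigmoid) function maps any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-linear))

def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

# Hypothetical intercept and coefficients for two hypothetical predictors
intercept = -1.5
coefficients = [0.8, -0.3]

p_zero = attrition_probability(intercept, coefficients, [0.0, 0.0])
p_one = attrition_probability(intercept, coefficients, [1.0, 0.0])

# Raising the first predictor by one unit multiplies the odds by exp(0.8)
odds_ratio = odds(p_one) / odds(p_zero)
print(round(p_zero, 3), round(p_one, 3), round(odds_ratio, 3))
```

The odds-ratio reading of a coefficient is what later allows each factor's effect on attrition to be compared.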

Reference: Kaggle HR Analytics Case Study

3 Data analysis pipeline

This analysis focuses on identifying the factors that significantly affect the attrition rate. The analysis pipeline is summarized in the following figure:

[Figure: data analysis pipeline (image.png)]

4 Important libraries

In [1]:
# Data and array manipulation
import pandas as pd
import numpy as np
import datetime as dt

# Regression analysis
import statsmodels.api as sm
from sklearn.model_selection import train_test_split
from sklearn import metrics

# Plotting and Visualization
import matplotlib.pyplot as plt
import plotly.express as px
import cufflinks as cf

# Interactive charts
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

# Options
init_notebook_mode(connected = True)
cf.go_offline()
pd.options.display.float_format = '{:,.3f}'.format
pd.set_option('display.max_columns', None)

# Default graphic settings
default_yaxis = dict(showgrid = False,
                     zeroline = False,
                     showline = False,
                     showticklabels = True)

5 Importing and validating data

In [2]:
# Importing metadata
df_data_dictionary = pd.read_excel('E:\\Data\\HR\\data_dictionary.xlsx').fillna('')

# Main table
df_general_data = pd.read_csv('E:\\Data\\HR\\general_data.csv')

# Employee & manager tables
df_employee_survey_data = pd.read_csv('E:\\Data\\HR\\employee_survey_data.csv')
df_manager_survey_data = pd.read_csv('E:\\Data\\HR\\manager_survey_data.csv')

# Check in & check out times
df_in_time = pd.read_csv('E:\\Data\\HR\\in_time.csv', index_col = 'Unnamed: 0').applymap(pd.Timestamp)
df_out_time = pd.read_csv('E:\\Data\\HR\\out_time.csv', index_col = 'Unnamed: 0').applymap(pd.Timestamp)
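
The `applymap(pd.Timestamp)` call above converts every cell to a timestamp one element at a time. `DataFrame.applymap` is deprecated in recent pandas releases (2.1 and later), so a possible alternative is to convert column by column with `pd.to_datetime`. The sketch below runs on a tiny made-up CSV string rather than the real files, and assumes the same layout: an unnamed first column holding the employee ID and one column per date.

```python
import io
import pandas as pd

# Made-up stand-in for in_time.csv: the first (unnamed) column is the employee ID,
# the remaining columns are dates, and empty cells mean no check-in that day
csv_text = """,2015-01-01,2015-01-02
1,,2015-01-02 09:43:45
2,,2015-01-02 10:15:44
"""

raw = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Convert each column to a datetime dtype; missing cells become NaT
parsed = raw.apply(pd.to_datetime)

print(parsed.dtypes)
print(parsed.isna().sum())
```
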
In [3]:
df_data_dictionary
Out[3]:
    Variable                  Meaning                                             Levels
 0  Age                       Age of the employee
 1  Attrition                 Whether the employee left in the previous year...
 2  BusinessTravel            How frequently the employees travelled for bus...
 3  Department                Department in company
 4  DistanceFromHome          Distance from home in kms
 5  Education                 Education Level                                     1 'Below College'
 6                                                                                2 'College'
 7                                                                                3 'Bachelor'
 8                                                                                4 'Master'
 9                                                                                5 'Doctor'
10  EducationField            Field of education
11  EmployeeCount             Employee count
12  EmployeeNumber            Employee number/id
13  EnvironmentSatisfaction   Work Environment Satisfaction Level                 1 'Low'
14                                                                                2 'Medium'
15                                                                                3 'High'
16                                                                                4 'Very High'
17  Gender                    Gender of employee
18  JobInvolvement            Job Involvement Level                               1 'Low'
19                                                                                2 'Medium'
20                                                                                3 'High'
21                                                                                4 'Very High'
22  JobLevel                  Job level at company on a scale of 1 to 5
23  JobRole                   Name of job role in company
24  JobSatisfaction           Job Satisfaction Level                              1 'Low'
25                                                                                2 'Medium'
26                                                                                3 'High'
27                                                                                4 'Very High'
28  MaritalStatus             Marital status of the employee
29  MonthlyIncome             Monthly income in rupees per month
30  NumCompaniesWorked        Total number of companies the employee has wor...
31  Over18                    Whether the employee is above 18 years of age ...
32  PercentSalaryHike         Percent salary hike for last year
33  PerformanceRating         Performance rating for last year                    1 'Low'
34                                                                                2 'Good'
35                                                                                3 'Excellent'
36                                                                                4 'Outstanding'
37  RelationshipSatisfaction  Relationship satisfaction level                     1 'Low'
38                                                                                2 'Medium'
39                                                                                3 'High'
40                                                                                4 'Very High'
41  StandardHours             Standard hours of work for the employee
42  StockOptionLevel          Stock option level of the employee
43  TotalWorkingYears         Total number of years the employee has worked ...
44  TrainingTimesLastYear     Number of times training was conducted for thi...
45  WorkLifeBalance           Work life balance level                             1 'Bad'
46                                                                                2 'Good'
47                                                                                3 'Better'
48                                                                                4 'Best'
49  YearsAtCompany            Total number of years spent at the company by ...
50  YearsSinceLastPromotion   Number of years since last promotion
51  YearsWithCurrManager      Number of years under current manager
In [4]:
df_data_dictionary.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52 entries, 0 to 51
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   Variable  52 non-null     object
 1   Meaning   52 non-null     object
 2   Levels    52 non-null     object
dtypes: object(3)
memory usage: 1.3+ KB
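
In the data dictionary, only the first level of each coded variable carries the variable name; the remaining level rows have a blank `Variable` cell. A rough sketch of how such a layout could be turned into a lookup for recoding is shown below. It runs on a made-up excerpt, and the assumption that each `Levels` cell looks like `1 'Below College'` is taken from the printed output, not from inspecting the Excel file itself.

```python
import pandas as pd

# Made-up excerpt mimicking the layout of the data dictionary shown above
dictionary = pd.DataFrame({
    "Variable": ["Education", "", "", "Gender"],
    "Meaning": ["Education Level", "", "", "Gender of employee"],
    "Levels": ["1 'Below College'", "2 'College'", "3 'Bachelor'", ""],
})

# Blank cells were filled with '' on import; mask them back to missing values
# so the variable name can be carried forward onto its level rows
filled = dictionary.copy()
filled["Variable"] = filled["Variable"].mask(filled["Variable"] == "").ffill()

# Keep only rows that define a level and split each cell into code and label
levels = filled[filled["Levels"] != ""].copy()
parts = levels["Levels"].str.extract(r"^\s*(\d+)\s+'(.*)'\s*$")
levels["code"] = parts[0].astype(int)
levels["label"] = parts[1]

# Nested mapping of the form {variable: {code: label}}
level_map = {
    var: dict(zip(grp["code"], grp["label"]))
    for var, grp in levels.groupby("Variable")
}
print(level_map)
```
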
In [5]:
df_general_data.head()
Out[5]:
Age Attrition BusinessTravel Department DistanceFromHome Education EducationField EmployeeCount EmployeeID Gender JobLevel JobRole MaritalStatus MonthlyIncome NumCompaniesWorked Over18 PercentSalaryHike StandardHours StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
0 51 No Travel_Rarely Sales 6 2 Life Sciences 1 1 Female 1 Healthcare Representative Married 131160 1.000 Y 11 8 0 1.000 6 1 0 0
1 31 Yes Travel_Frequently Research & Development 10 1 Life Sciences 1 2 Female 1 Research Scientist Single 41890 0.000 Y 23 8 1 6.000 3 5 1 4
2 32 No Travel_Frequently Research & Development 17 4 Other 1 3 Male 4 Sales Executive Married 193280 1.000 Y 15 8 3 5.000 2 5 0 3
3 38 No Non-Travel Research & Development 2 5 Life Sciences 1 4 Male 3 Human Resources Married 83210 3.000 Y 11 8 3 13.000 5 8 7 5
4 32 No Travel_Rarely Research & Development 10 1 Medical 1 5 Male 1 Sales Executive Single 23420 4.000 Y 12 8 2 9.000 2 6 0 4
In [6]:
df_general_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4410 entries, 0 to 4409
Data columns (total 24 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Age                      4410 non-null   int64  
 1   Attrition                4410 non-null   object 
 2   BusinessTravel           4410 non-null   object 
 3   Department               4410 non-null   object 
 4   DistanceFromHome         4410 non-null   int64  
 5   Education                4410 non-null   int64  
 6   EducationField           4410 non-null   object 
 7   EmployeeCount            4410 non-null   int64  
 8   EmployeeID               4410 non-null   int64  
 9   Gender                   4410 non-null   object 
 10  JobLevel                 4410 non-null   int64  
 11  JobRole                  4410 non-null   object 
 12  MaritalStatus            4410 non-null   object 
 13  MonthlyIncome            4410 non-null   int64  
 14  NumCompaniesWorked       4391 non-null   float64
 15  Over18                   4410 non-null   object 
 16  PercentSalaryHike        4410 non-null   int64  
 17  StandardHours            4410 non-null   int64  
 18  StockOptionLevel         4410 non-null   int64  
 19  TotalWorkingYears        4401 non-null   float64
 20  TrainingTimesLastYear    4410 non-null   int64  
 21  YearsAtCompany           4410 non-null   int64  
 22  YearsSinceLastPromotion  4410 non-null   int64  
 23  YearsWithCurrManager     4410 non-null   int64  
dtypes: float64(2), int64(14), object(8)
memory usage: 827.0+ KB
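
The `info()` summary shows that two columns are incomplete: `NumCompaniesWorked` has 4391 of 4410 values (19 missing) and `TotalWorkingYears` has 4401 (9 missing). One way to get the same picture directly is to count missing values per column. The frame below is made up so the snippet runs on its own; on the real data the same calls would be applied to `df_general_data`.

```python
import numpy as np
import pandas as pd

# Made-up frame with the same kind of gaps as the info() summary above
df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4],
    "NumCompaniesWorked": [1.0, np.nan, 3.0, 4.0],
    "TotalWorkingYears": [1.0, 6.0, np.nan, np.nan],
})

# Count and share of missing values per column, keeping only affected columns
missing = df.isna().sum()
report = pd.DataFrame({
    "n_missing": missing,
    "share_missing": missing / len(df),
})
report = report[report["n_missing"] > 0]
print(report)
```
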
In [7]:
df_employee_survey_data.head()
Out[7]:
   EmployeeID  EnvironmentSatisfaction  JobSatisfaction  WorkLifeBalance
0           1                    3.000            4.000            2.000
1           2                    3.000            2.000            4.000
2           3                    2.000            2.000            1.000
3           4                    4.000            4.000            3.000
4           5                    4.000            1.000            3.000
In [8]:
df_employee_survey_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4410 entries, 0 to 4409
Data columns (total 4 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   EmployeeID               4410 non-null   int64  
 1   EnvironmentSatisfaction  4385 non-null   float64
 2   JobSatisfaction          4390 non-null   float64
 3   WorkLifeBalance          4372 non-null   float64
dtypes: float64(3), int64(1)
memory usage: 137.9 KB
In [9]:
df_manager_survey_data.head()
Out[9]:
   EmployeeID  JobInvolvement  PerformanceRating
0           1               3                  3
1           2               2                  4
2           3               3                  3
3           4               2                  3
4           5               3                  3
In [10]:
df_manager_survey_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4410 entries, 0 to 4409
Data columns (total 3 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   EmployeeID         4410 non-null   int64
 1   JobInvolvement     4410 non-null   int64
 2   PerformanceRating  4410 non-null   int64
dtypes: int64(3)
memory usage: 103.5 KB
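
All three tables above have 4410 rows and an `EmployeeID` column, which suggests that `EmployeeID` can serve as the join key. Before relying on that, it may be worth checking that the key is unique in each table and that the tables cover the same employees. The helper `check_key` and the tiny frames below are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Made-up miniature versions of the three tables
general = pd.DataFrame({"EmployeeID": [1, 2, 3], "Age": [51, 31, 32]})
employee_survey = pd.DataFrame({"EmployeeID": [1, 2, 3], "JobSatisfaction": [4.0, 2.0, 2.0]})
manager_survey = pd.DataFrame({"EmployeeID": [1, 2, 3], "JobInvolvement": [3, 2, 3]})

def check_key(frames, key="EmployeeID"):
    """Return True if `key` is unique in every frame and all key sets are identical."""
    reference = set(frames[0][key])
    for frame in frames:
        if not frame[key].is_unique or set(frame[key]) != reference:
            return False
    return True

keys_ok = check_key([general, employee_survey, manager_survey])

# validate="one_to_one" makes merge raise if the key is duplicated on either side
merged = general.merge(employee_survey, on="EmployeeID", validate="one_to_one")
print(keys_ok, merged.shape)
```
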
In [11]:
df_in_time.head()
Out[11]:
2015-01-01 2015-01-02 2015-01-05 2015-01-06 2015-01-07 2015-01-08 2015-01-09 2015-01-12 2015-01-13 2015-01-14 2015-01-15 2015-01-16 2015-01-19 2015-01-20 2015-01-21 2015-01-22 2015-01-23 2015-01-26 2015-01-27 2015-01-28 2015-01-29 2015-01-30 2015-02-02 2015-02-03 2015-02-04 2015-02-05 2015-02-06 2015-02-09 2015-02-10 2015-02-11 2015-02-12 2015-02-13 2015-02-16 2015-02-17 2015-02-18 2015-02-19 2015-02-20 2015-02-23 2015-02-24 2015-02-25 2015-02-26 2015-02-27 2015-03-02 2015-03-03 2015-03-04 2015-03-05 2015-03-06 2015-03-09 2015-03-10 2015-03-11 2015-03-12 2015-03-13 2015-03-16 2015-03-17 2015-03-18 2015-03-19 2015-03-20 2015-03-23 2015-03-24 2015-03-25 2015-03-26 2015-03-27 2015-03-30 2015-03-31 2015-04-01 2015-04-02 2015-04-03 2015-04-06 2015-04-07 2015-04-08 2015-04-09 2015-04-10 2015-04-13 2015-04-14 2015-04-15 2015-04-16 2015-04-17 2015-04-20 2015-04-21 2015-04-22 2015-04-23 2015-04-24 2015-04-27 2015-04-28 2015-04-29 2015-04-30 2015-05-01 2015-05-04 2015-05-05 2015-05-06 2015-05-07 2015-05-08 2015-05-11 2015-05-12 2015-05-13 2015-05-14 2015-05-15 2015-05-18 2015-05-19 2015-05-20 2015-05-21 2015-05-22 2015-05-25 2015-05-26 2015-05-27 2015-05-28 2015-05-29 2015-06-01 2015-06-02 2015-06-03 2015-06-04 2015-06-05 2015-06-08 2015-06-09 2015-06-10 2015-06-11 2015-06-12 2015-06-15 2015-06-16 2015-06-17 2015-06-18 2015-06-19 2015-06-22 2015-06-23 2015-06-24 2015-06-25 2015-06-26 2015-06-29 2015-06-30 2015-07-01 2015-07-02 2015-07-03 2015-07-06 2015-07-07 2015-07-08 2015-07-09 2015-07-10 2015-07-13 2015-07-14 2015-07-15 2015-07-16 2015-07-17 2015-07-20 2015-07-21 2015-07-22 2015-07-23 2015-07-24 2015-07-27 2015-07-28 2015-07-29 2015-07-30 2015-07-31 2015-08-03 2015-08-04 2015-08-05 2015-08-06 2015-08-07 2015-08-10 2015-08-11 2015-08-12 2015-08-13 2015-08-14 2015-08-17 2015-08-18 2015-08-19 2015-08-20 2015-08-21 2015-08-24 2015-08-25 2015-08-26 2015-08-27 2015-08-28 2015-08-31 2015-09-01 2015-09-02 2015-09-03 2015-09-04 2015-09-07 2015-09-08 2015-09-09 2015-09-10 
2015-09-11 2015-09-14 2015-09-15 2015-09-16 2015-09-17 2015-09-18 2015-09-21 2015-09-22 2015-09-23 2015-09-24 2015-09-25 2015-09-28 2015-09-29 2015-09-30 2015-10-01 2015-10-02 2015-10-05 2015-10-06 2015-10-07 2015-10-08 2015-10-09 2015-10-12 2015-10-13 2015-10-14 2015-10-15 2015-10-16 2015-10-19 2015-10-20 2015-10-21 2015-10-22 2015-10-23 2015-10-26 2015-10-27 2015-10-28 2015-10-29 2015-10-30 2015-11-02 2015-11-03 2015-11-04 2015-11-05 2015-11-06 2015-11-09 2015-11-10 2015-11-11 2015-11-12 2015-11-13 2015-11-16 2015-11-17 2015-11-18 2015-11-19 2015-11-20 2015-11-23 2015-11-24 2015-11-25 2015-11-26 2015-11-27 2015-11-30 2015-12-01 2015-12-02 2015-12-03 2015-12-04 2015-12-07 2015-12-08 2015-12-09 2015-12-10 2015-12-11 2015-12-14 2015-12-15 2015-12-16 2015-12-17 2015-12-18 2015-12-21 2015-12-22 2015-12-23 2015-12-24 2015-12-25 2015-12-28 2015-12-29 2015-12-30 2015-12-31
1 NaT 2015-01-02 09:43:45 2015-01-05 10:08:48 2015-01-06 09:54:26 2015-01-07 09:34:31 2015-01-08 09:51:09 2015-01-09 10:09:25 2015-01-12 09:42:53 2015-01-13 10:13:06 NaT 2015-01-15 10:01:24 2015-01-16 10:19:08 NaT 2015-01-20 09:50:34 2015-01-21 09:49:42 2015-01-22 09:47:45 2015-01-23 09:23:44 NaT 2015-01-27 09:50:37 2015-01-28 09:56:13 2015-01-29 09:53:47 2015-01-30 10:09:26 2015-02-02 09:38:43 2015-02-03 10:23:38 2015-02-04 09:48:37 2015-02-05 09:52:36 2015-02-06 09:53:23 2015-02-09 10:01:45 2015-02-10 10:14:18 2015-02-11 10:18:44 2015-02-12 10:10:35 2015-02-13 09:13:07 2015-02-16 10:14:02 2015-02-17 10:03:08 2015-02-18 10:23:06 NaT 2015-02-20 09:54:56 2015-02-23 09:46:59 2015-02-24 10:11:18 2015-02-25 09:52:36 2015-02-26 09:56:33 2015-02-27 09:58:46 2015-03-02 10:19:43 2015-03-03 10:08:37 2015-03-04 10:05:38 NaT 2015-03-06 09:55:53 2015-03-09 10:28:34 2015-03-10 09:47:30 2015-03-11 09:48:59 2015-03-12 10:10:51 2015-03-13 10:07:18 NaT 2015-03-17 10:05:22 2015-03-18 10:28:21 2015-03-19 10:01:52 2015-03-20 10:37:49 2015-03-23 10:11:19 2015-03-24 10:33:19 2015-03-25 09:41:35 NaT 2015-03-27 09:47:30 2015-03-30 10:11:44 2015-03-31 10:08:13 2015-04-01 10:12:37 2015-04-02 09:45:27 2015-04-03 10:00:30 2015-04-06 09:40:19 2015-04-07 10:16:43 2015-04-08 10:04:34 2015-04-09 09:21:27 2015-04-10 09:46:25 2015-04-13 09:39:30 2015-04-14 09:58:48 2015-04-15 09:55:56 2015-04-16 09:46:07 2015-04-17 09:54:13 2015-04-20 10:03:25 2015-04-21 09:50:41 2015-04-22 09:56:16 2015-04-23 10:12:42 2015-04-24 10:05:40 2015-04-27 10:12:01 2015-04-28 09:57:24 2015-04-29 09:46:15 2015-04-30 09:48:30 NaT 2015-05-04 09:40:52 2015-05-05 09:54:59 2015-05-06 09:49:06 2015-05-07 10:06:47 2015-05-08 09:55:02 NaT 2015-05-12 09:50:17 2015-05-13 10:24:03 2015-05-14 09:50:43 2015-05-15 09:55:32 NaT 2015-05-19 09:49:06 2015-05-20 10:00:39 2015-05-21 09:53:15 2015-05-22 10:19:44 2015-05-25 10:15:25 2015-05-26 10:17:01 2015-05-27 09:41:25 2015-05-28 09:57:00 NaT NaT 2015-06-02 10:20:49 2015-06-03 09:47:19 
2015-06-04 10:23:40 NaT 2015-06-08 10:17:56 NaT 2015-06-10 09:54:33 2015-06-11 09:47:09 2015-06-12 10:13:55 2015-06-15 09:50:19 2015-06-16 09:59:04 2015-06-17 09:46:15 2015-06-18 10:03:15 2015-06-19 10:06:21 2015-06-22 09:48:40 2015-06-23 09:56:55 2015-06-24 09:59:40 2015-06-25 09:52:01 2015-06-26 10:07:58 2015-06-29 10:01:19 2015-06-30 10:27:40 2015-07-01 09:39:02 2015-07-02 10:07:33 2015-07-03 09:59:57 2015-07-06 09:37:06 2015-07-07 10:08:17 2015-07-08 09:56:30 2015-07-09 09:21:57 2015-07-10 09:43:31 2015-07-13 10:07:21 2015-07-14 10:06:39 2015-07-15 09:54:34 2015-07-16 10:22:47 NaT 2015-07-20 10:25:50 2015-07-21 09:48:05 2015-07-22 09:52:29 2015-07-23 10:04:42 2015-07-24 09:34:15 2015-07-27 09:56:24 2015-07-28 10:19:57 2015-07-29 09:48:10 2015-07-30 09:52:24 2015-07-31 10:11:14 2015-08-03 10:03:33 2015-08-04 09:59:34 2015-08-05 09:50:51 2015-08-06 09:56:52 2015-08-07 09:51:30 2015-08-10 10:12:23 2015-08-11 10:08:29 2015-08-12 09:39:08 2015-08-13 09:47:09 2015-08-14 10:00:21 2015-08-17 09:24:23 2015-08-18 10:25:27 2015-08-19 10:14:41 2015-08-20 09:57:56 2015-08-21 10:07:11 2015-08-24 10:07:49 2015-08-25 09:47:59 2015-08-26 09:53:49 2015-08-27 10:16:21 2015-08-28 09:53:47 2015-08-31 10:08:07 NaT 2015-09-02 10:37:59 2015-09-03 09:41:16 2015-09-04 10:11:00 2015-09-07 10:17:58 2015-09-08 10:31:57 2015-09-09 10:28:33 2015-09-10 09:54:28 2015-09-11 10:02:49 2015-09-14 10:15:40 2015-09-15 10:26:09 2015-09-16 10:11:03 NaT 2015-09-18 10:16:06 2015-09-21 09:48:01 2015-09-22 10:34:52 2015-09-23 10:03:15 2015-09-24 10:00:12 2015-09-25 10:00:00 2015-09-28 09:28:45 2015-09-29 09:51:58 2015-09-30 09:52:41 2015-10-01 10:05:28 NaT 2015-10-05 10:02:00 2015-10-06 09:48:24 2015-10-07 09:32:05 2015-10-08 09:39:01 2015-10-09 09:33:40 2015-10-12 10:14:36 NaT 2015-10-14 10:19:43 2015-10-15 10:19:51 2015-10-16 09:50:41 2015-10-19 09:55:08 2015-10-20 10:00:37 NaT NaT 2015-10-23 10:03:33 2015-10-26 09:51:07 2015-10-27 09:25:10 2015-10-28 09:50:30 2015-10-29 09:50:22 2015-10-30 09:51:07 
2015-11-02 10:01:58 2015-11-03 10:28:53 2015-11-04 09:47:00 2015-11-05 10:24:57 2015-11-06 10:15:27 NaT NaT NaT 2015-11-12 10:02:34 2015-11-13 09:29:49 2015-11-16 10:16:02 2015-11-17 09:57:23 2015-11-18 10:13:24 2015-11-19 09:46:48 2015-11-20 10:09:35 2015-11-23 09:54:17 2015-11-24 10:02:07 2015-11-25 10:01:51 2015-11-26 10:18:45 2015-11-27 10:24:49 2015-11-30 10:04:24 2015-12-01 09:31:27 2015-12-02 09:57:52 2015-12-03 10:05:55 2015-12-04 09:55:09 2015-12-07 09:28:15 2015-12-08 09:54:56 2015-12-09 09:51:52 2015-12-10 09:40:25 2015-12-11 09:46:49 2015-12-14 10:03:33 NaT 2015-12-16 10:21:19 NaT NaT 2015-12-21 09:55:29 2015-12-22 10:04:06 2015-12-23 10:14:27 2015-12-24 10:11:35 NaT 2015-12-28 10:13:41 2015-12-29 10:03:36 2015-12-30 09:54:12 2015-12-31 10:12:44
2 NaT 2015-01-02 10:15:44 2015-01-05 10:21:05 NaT 2015-01-07 09:45:17 2015-01-08 10:09:04 2015-01-09 09:43:26 2015-01-12 10:00:07 2015-01-13 10:43:29 NaT 2015-01-15 09:37:57 2015-01-16 09:57:18 2015-01-19 10:23:43 2015-01-20 09:29:03 2015-01-21 09:46:45 2015-01-22 10:03:51 2015-01-23 09:20:06 NaT 2015-01-27 10:07:48 2015-01-28 10:08:25 2015-01-29 09:52:04 2015-01-30 09:49:49 2015-02-02 10:07:26 2015-02-03 09:55:45 2015-02-04 10:25:41 2015-02-05 10:05:11 2015-02-06 09:12:39 NaT 2015-02-10 10:10:13 2015-02-11 09:40:20 2015-02-12 10:10:09 2015-02-13 09:35:25 2015-02-16 10:04:12 2015-02-17 09:45:05 2015-02-18 09:54:42 2015-02-19 10:11:49 2015-02-20 10:06:18 2015-02-23 09:41:58 2015-02-24 09:44:22 2015-02-25 09:58:39 2015-02-26 09:59:07 2015-02-27 10:31:41 2015-03-02 09:55:43 2015-03-03 10:31:20 2015-03-04 09:56:59 NaT 2015-03-06 10:01:10 2015-03-09 10:02:33 2015-03-10 09:32:40 2015-03-11 09:57:02 2015-03-12 09:59:32 2015-03-13 09:59:14 2015-03-16 09:56:19 2015-03-17 09:36:08 2015-03-18 10:00:32 2015-03-19 09:46:51 2015-03-20 10:06:06 2015-03-23 10:02:02 NaT 2015-03-25 10:03:04 2015-03-26 09:52:38 2015-03-27 10:02:55 2015-03-30 10:04:24 2015-03-31 10:30:25 2015-04-01 09:46:24 2015-04-02 09:53:55 2015-04-03 10:30:21 2015-04-06 10:18:14 2015-04-07 10:22:22 2015-04-08 09:46:42 2015-04-09 10:02:22 2015-04-10 09:47:56 2015-04-13 09:33:46 2015-04-14 09:50:19 2015-04-15 10:06:41 2015-04-16 10:07:24 2015-04-17 09:40:35 2015-04-20 10:04:05 2015-04-21 10:21:53 2015-04-22 09:40:33 2015-04-23 10:06:54 2015-04-24 10:10:19 2015-04-27 09:33:44 2015-04-28 09:52:51 NaT 2015-04-30 10:14:03 NaT 2015-05-04 09:53:54 2015-05-05 10:25:20 2015-05-06 09:46:04 2015-05-07 09:49:37 2015-05-08 09:56:40 2015-05-11 09:58:30 2015-05-12 09:44:55 2015-05-13 10:06:35 NaT 2015-05-15 09:43:03 NaT 2015-05-19 09:48:05 2015-05-20 10:18:00 2015-05-21 09:42:47 2015-05-22 10:37:50 2015-05-25 09:46:03 2015-05-26 09:42:16 2015-05-27 09:53:20 2015-05-28 09:54:58 2015-05-29 09:52:45 2015-06-01 09:54:45 2015-06-02 
09:57:56 2015-06-03 10:10:48 2015-06-04 09:31:48 2015-06-05 10:06:28 2015-06-08 09:45:19 2015-06-09 09:59:08 2015-06-10 09:39:05 2015-06-11 09:55:03 NaT 2015-06-15 09:45:02 2015-06-16 10:15:23 2015-06-17 09:52:23 2015-06-18 09:55:27 2015-06-19 10:13:33 2015-06-22 09:48:24 2015-06-23 09:59:12 2015-06-24 10:27:17 2015-06-25 10:12:33 2015-06-26 09:55:00 2015-06-29 10:02:27 2015-06-30 10:13:31 2015-07-01 09:45:31 2015-07-02 10:04:21 2015-07-03 10:00:01 2015-07-06 10:10:37 2015-07-07 10:29:29 2015-07-08 09:38:19 2015-07-09 10:21:59 2015-07-10 10:30:44 NaT 2015-07-14 10:35:52 2015-07-15 09:47:06 2015-07-16 09:59:13 NaT 2015-07-20 10:18:08 2015-07-21 10:23:11 2015-07-22 09:32:15 2015-07-23 09:40:12 2015-07-24 10:15:22 NaT 2015-07-28 10:16:14 2015-07-29 09:58:33 2015-07-30 09:57:10 2015-07-31 10:15:38 2015-08-03 09:58:28 2015-08-04 10:17:10 2015-08-05 09:38:46 2015-08-06 10:16:03 2015-08-07 09:52:22 2015-08-10 09:37:38 2015-08-11 09:54:35 2015-08-12 09:27:52 2015-08-13 09:50:57 2015-08-14 09:38:27 2015-08-17 09:52:16 2015-08-18 09:34:06 2015-08-19 09:43:02 2015-08-20 10:16:38 2015-08-21 10:07:18 2015-08-24 09:36:46 2015-08-25 09:42:22 2015-08-26 10:15:22 2015-08-27 10:11:04 2015-08-28 09:50:51 2015-08-31 10:23:10 2015-09-01 10:23:47 2015-09-02 09:57:01 2015-09-03 09:56:07 2015-09-04 10:33:03 2015-09-07 09:58:33 NaT 2015-09-09 10:17:43 2015-09-10 10:28:40 2015-09-11 09:58:20 2015-09-14 09:43:25 2015-09-15 09:25:53 2015-09-16 09:26:29 NaT 2015-09-18 10:27:07 2015-09-21 10:03:31 2015-09-22 09:35:23 2015-09-23 10:12:56 2015-09-24 10:02:52 2015-09-25 09:37:12 2015-09-28 10:11:06 2015-09-29 10:00:31 2015-09-30 10:22:56 2015-10-01 10:08:29 NaT 2015-10-05 10:02:26 2015-10-06 10:08:57 2015-10-07 10:10:19 2015-10-08 10:04:03 2015-10-09 09:40:41 2015-10-12 10:11:30 2015-10-13 10:14:39 2015-10-14 10:10:48 NaT 2015-10-16 10:20:19 2015-10-19 10:04:13 2015-10-20 09:39:10 2015-10-21 09:47:15 2015-10-22 09:56:54 2015-10-23 09:53:38 2015-10-26 09:49:43 2015-10-27 10:29:20 2015-10-28 
09:47:39 2015-10-29 09:52:02 2015-10-30 09:47:37 2015-11-02 10:04:45 2015-11-03 09:36:03 2015-11-04 09:58:34 2015-11-05 09:50:08 2015-11-06 10:02:38 NaT NaT NaT 2015-11-12 09:54:47 2015-11-13 09:26:10 2015-11-16 10:22:00 2015-11-17 09:26:04 2015-11-18 10:10:39 NaT 2015-11-20 10:01:09 2015-11-23 09:51:11 2015-11-24 09:19:04 2015-11-25 09:43:44 2015-11-26 10:34:30 2015-11-27 10:15:10 2015-11-30 09:46:46 2015-12-01 09:56:22 2015-12-02 09:55:43 2015-12-03 10:05:37 2015-12-04 10:02:05 2015-12-07 09:36:05 2015-12-08 10:01:21 2015-12-09 09:49:23 2015-12-10 10:06:05 2015-12-11 10:15:18 2015-12-14 10:00:01 2015-12-15 10:30:18 2015-12-16 10:01:18 2015-12-17 09:15:08 2015-12-18 10:37:17 2015-12-21 09:49:02 2015-12-22 10:33:51 2015-12-23 10:12:10 NaT NaT 2015-12-28 09:31:45 2015-12-29 09:55:49 2015-12-30 10:32:25 2015-12-31 09:27:20
3 NaT 2015-01-02 10:17:41 2015-01-05 09:50:50 2015-01-06 10:14:13 2015-01-07 09:47:27 2015-01-08 10:03:40 2015-01-09 10:05:49 2015-01-12 10:03:47 2015-01-13 10:21:26 NaT 2015-01-15 09:55:11 2015-01-16 10:05:36 2015-01-19 09:47:53 2015-01-20 09:57:09 2015-01-21 10:29:40 2015-01-22 09:59:11 2015-01-23 10:16:34 NaT 2015-01-27 10:07:30 2015-01-28 10:05:43 2015-01-29 10:06:48 2015-01-30 10:14:36 2015-02-02 09:54:45 2015-02-03 09:27:11 2015-02-04 10:04:03 2015-02-05 10:08:11 2015-02-06 10:23:51 2015-02-09 10:08:43 2015-02-10 10:26:18 2015-02-11 10:02:13 2015-02-12 09:55:10 2015-02-13 10:32:34 2015-02-16 10:18:43 2015-02-17 10:04:54 2015-02-18 10:16:19 2015-02-19 09:47:19 2015-02-20 10:20:51 2015-02-23 10:14:02 2015-02-24 10:23:28 2015-02-25 09:59:06 2015-02-26 10:01:22 2015-02-27 09:57:26 2015-03-02 10:02:38 2015-03-03 10:07:45 2015-03-04 10:15:26 NaT 2015-03-06 09:48:49 2015-03-09 10:17:45 2015-03-10 09:46:10 2015-03-11 10:26:39 2015-03-12 09:56:28 2015-03-13 09:36:23 2015-03-16 10:07:30 2015-03-17 10:00:43 2015-03-18 10:20:10 2015-03-19 09:54:40 NaT 2015-03-23 10:04:28 2015-03-24 10:18:38 2015-03-25 09:43:22 2015-03-26 09:37:02 NaT 2015-03-30 09:54:55 2015-03-31 10:00:57 2015-04-01 09:52:42 2015-04-02 10:24:02 2015-04-03 10:01:42 2015-04-06 09:33:15 2015-04-07 10:13:13 2015-04-08 10:38:41 2015-04-09 10:17:34 2015-04-10 10:28:08 2015-04-13 10:14:55 2015-04-14 09:29:50 2015-04-15 10:06:52 2015-04-16 09:46:13 2015-04-17 10:16:08 2015-04-20 09:36:16 2015-04-21 10:07:10 2015-04-22 09:56:42 NaT 2015-04-24 10:10:15 2015-04-27 09:29:29 2015-04-28 10:01:45 2015-04-29 09:54:28 2015-04-30 09:40:30 NaT 2015-05-04 09:55:15 2015-05-05 09:53:50 2015-05-06 10:20:45 2015-05-07 09:49:38 2015-05-08 10:21:01 2015-05-11 09:47:03 2015-05-12 09:39:48 2015-05-13 09:50:23 2015-05-14 09:45:22 2015-05-15 09:48:59 2015-05-18 10:03:24 2015-05-19 09:40:26 2015-05-20 09:40:20 2015-05-21 09:37:39 2015-05-22 10:04:08 2015-05-25 10:06:54 2015-05-26 10:28:05 2015-05-27 10:14:07 2015-05-28 10:13:25 
2015-05-29 10:08:23 2015-06-01 09:57:32 2015-06-02 09:58:49 2015-06-03 10:03:00 2015-06-04 09:32:13 2015-06-05 09:43:51 2015-06-08 09:31:20 2015-06-09 09:49:58 2015-06-10 10:14:49 2015-06-11 10:07:21 2015-06-12 10:10:53 2015-06-15 09:52:17 2015-06-16 09:33:01 2015-06-17 09:50:53 2015-06-18 10:01:04 2015-06-19 09:47:22 2015-06-22 09:52:10 2015-06-23 09:44:50 NaT 2015-06-25 10:10:41 2015-06-26 10:02:07 2015-06-29 09:33:13 2015-06-30 10:13:47 2015-07-01 10:07:56 2015-07-02 10:26:28 2015-07-03 09:40:28 2015-07-06 10:02:42 2015-07-07 10:50:49 2015-07-08 09:56:24 2015-07-09 10:08:14 2015-07-10 09:58:44 2015-07-13 09:59:18 2015-07-14 09:54:55 2015-07-15 09:42:13 2015-07-16 10:00:16 NaT 2015-07-20 10:12:37 2015-07-21 09:39:36 2015-07-22 09:50:19 2015-07-23 10:27:02 2015-07-24 09:58:29 2015-07-27 10:11:51 2015-07-28 09:44:58 2015-07-29 10:29:58 2015-07-30 10:27:08 2015-07-31 09:32:14 2015-08-03 09:49:21 2015-08-04 09:37:19 NaT 2015-08-06 10:00:11 2015-08-07 09:36:58 2015-08-10 09:51:11 2015-08-11 09:49:26 2015-08-12 10:26:18 2015-08-13 10:12:57 2015-08-14 09:51:22 2015-08-17 09:59:35 2015-08-18 09:57:40 2015-08-19 10:06:09 2015-08-20 10:09:06 2015-08-21 10:28:51 2015-08-24 10:04:52 2015-08-25 10:09:53 2015-08-26 10:26:55 2015-08-27 09:08:23 2015-08-28 10:07:20 2015-08-31 09:44:35 2015-09-01 09:51:52 2015-09-02 09:52:28 2015-09-03 10:05:51 2015-09-04 10:11:53 2015-09-07 09:28:00 2015-09-08 09:38:21 2015-09-09 09:45:09 2015-09-10 09:49:52 2015-09-11 09:57:16 2015-09-14 10:27:59 2015-09-15 10:11:20 2015-09-16 09:55:26 NaT 2015-09-18 09:54:29 2015-09-21 09:29:30 2015-09-22 09:35:39 2015-09-23 09:56:54 2015-09-24 10:31:35 2015-09-25 09:40:55 2015-09-28 09:55:28 2015-09-29 10:25:49 2015-09-30 10:11:02 2015-10-01 09:58:27 NaT 2015-10-05 10:10:21 2015-10-06 09:56:07 2015-10-07 10:09:43 2015-10-08 09:46:36 2015-10-09 10:00:44 2015-10-12 10:14:10 2015-10-13 10:22:14 2015-10-14 10:20:42 2015-10-15 10:12:05 2015-10-16 09:41:55 2015-10-19 10:03:40 2015-10-20 10:02:12 2015-10-21 10:52:27 
2015-10-22 10:10:44 2015-10-23 10:08:37 2015-10-26 10:29:26 2015-10-27 09:48:57 2015-10-28 09:54:52 2015-10-29 09:47:27 2015-10-30 09:51:47 2015-11-02 09:54:18 2015-11-03 09:52:40 2015-11-04 10:19:14 2015-11-05 10:23:27 2015-11-06 10:06:29 NaT NaT NaT 2015-11-12 10:14:47 2015-11-13 09:43:37 2015-11-16 10:02:56 2015-11-17 09:51:38 2015-11-18 10:12:13 2015-11-19 10:08:46 2015-11-20 10:06:39 2015-11-23 10:02:01 2015-11-24 09:53:52 2015-11-25 10:16:19 2015-11-26 10:07:02 NaT 2015-11-30 10:01:33 2015-12-01 10:02:52 2015-12-02 10:10:34 2015-12-03 10:18:25 2015-12-04 10:06:40 2015-12-07 10:21:09 2015-12-08 09:59:19 2015-12-09 10:17:28 NaT 2015-12-11 09:49:55 2015-12-14 09:42:31 2015-12-15 09:54:48 2015-12-16 09:48:34 2015-12-17 09:53:17 2015-12-18 10:15:14 2015-12-21 10:10:28 2015-12-22 09:44:44 2015-12-23 10:15:54 2015-12-24 10:07:26 NaT 2015-12-28 09:42:05 2015-12-29 09:43:36 2015-12-30 09:34:05 2015-12-31 10:28:39
4 NaT 2015-01-02 10:05:06 2015-01-05 09:56:32 2015-01-06 10:11:07 2015-01-07 09:37:30 2015-01-08 10:02:08 2015-01-09 10:08:12 2015-01-12 10:13:42 2015-01-13 09:53:22 NaT 2015-01-15 10:00:50 2015-01-16 09:58:06 2015-01-19 09:43:11 2015-01-20 10:29:06 2015-01-21 10:04:33 2015-01-22 09:51:07 2015-01-23 09:56:56 NaT 2015-01-27 09:48:31 2015-01-28 10:00:39 2015-01-29 09:49:28 2015-01-30 09:56:31 2015-02-02 10:14:08 2015-02-03 10:01:31 2015-02-04 10:31:44 2015-02-05 10:02:39 2015-02-06 09:24:21 2015-02-09 09:56:27 NaT 2015-02-11 10:00:32 2015-02-12 10:01:48 2015-02-13 10:14:10 2015-02-16 09:50:27 2015-02-17 09:52:42 2015-02-18 09:39:22 2015-02-19 09:43:55 2015-02-20 10:01:49 2015-02-23 09:33:30 2015-02-24 10:27:00 2015-02-25 09:45:59 2015-02-26 09:48:05 2015-02-27 09:43:15 NaT 2015-03-03 10:15:24 2015-03-04 09:59:04 NaT NaT 2015-03-09 10:05:52 2015-03-10 09:29:22 2015-03-11 09:57:24 2015-03-12 09:41:16 2015-03-13 09:52:58 2015-03-16 09:56:00 2015-03-17 09:52:20 2015-03-18 10:25:14 2015-03-19 10:26:39 2015-03-20 10:22:47 2015-03-23 10:21:28 2015-03-24 10:26:25 2015-03-25 10:00:08 2015-03-26 10:21:30 NaT 2015-03-30 10:11:13 2015-03-31 09:59:04 2015-04-01 09:48:31 2015-04-02 09:37:22 2015-04-03 09:55:13 2015-04-06 09:59:50 2015-04-07 10:35:40 2015-04-08 10:26:15 2015-04-09 10:12:10 2015-04-10 10:22:56 2015-04-13 09:31:58 NaT 2015-04-15 10:12:59 2015-04-16 09:45:58 2015-04-17 10:04:10 2015-04-20 09:51:49 2015-04-21 10:13:27 2015-04-22 09:50:42 2015-04-23 09:37:10 2015-04-24 09:58:19 2015-04-27 09:53:47 2015-04-28 09:24:22 2015-04-29 09:55:33 2015-04-30 10:02:10 NaT 2015-05-04 10:28:52 2015-05-05 10:00:44 2015-05-06 10:18:13 2015-05-07 10:03:39 2015-05-08 10:00:50 2015-05-11 10:23:52 2015-05-12 09:36:52 2015-05-13 10:13:15 2015-05-14 09:45:33 2015-05-15 09:27:48 2015-05-18 09:29:24 2015-05-19 09:52:54 2015-05-20 10:05:04 2015-05-21 09:33:56 NaT 2015-05-25 09:55:48 2015-05-26 10:10:44 2015-05-27 10:20:30 2015-05-28 09:45:38 2015-05-29 10:12:37 2015-06-01 10:14:37 2015-06-02 
09:54:11 2015-06-03 09:52:39 2015-06-04 10:06:30 2015-06-05 09:45:42 2015-06-08 09:49:32 2015-06-09 10:19:40 2015-06-10 10:09:14 2015-06-11 09:59:59 2015-06-12 10:17:33 2015-06-15 10:11:20 2015-06-16 09:57:08 2015-06-17 09:41:36 2015-06-18 09:39:27 2015-06-19 09:34:20 2015-06-22 10:10:20 2015-06-23 09:42:51 2015-06-24 10:08:58 2015-06-25 09:47:00 2015-06-26 09:53:39 2015-06-29 10:03:00 2015-06-30 09:42:37 2015-07-01 09:21:09 2015-07-02 09:43:19 2015-07-03 09:53:23 2015-07-06 09:50:21 2015-07-07 10:24:34 2015-07-08 10:13:37 2015-07-09 10:02:30 2015-07-10 10:20:48 2015-07-13 10:06:03 2015-07-14 10:13:16 2015-07-15 10:21:49 2015-07-16 09:50:24 NaT 2015-07-20 10:00:48 2015-07-21 09:50:21 NaT 2015-07-23 10:21:26 2015-07-24 09:27:01 2015-07-27 10:06:37 2015-07-28 09:24:12 2015-07-29 09:51:04 2015-07-30 10:00:50 2015-07-31 09:47:58 2015-08-03 10:11:15 2015-08-04 09:38:39 NaT 2015-08-06 09:57:51 2015-08-07 09:48:50 2015-08-10 10:15:50 2015-08-11 10:00:04 2015-08-12 09:35:06 2015-08-13 09:50:28 2015-08-14 09:37:57 2015-08-17 10:08:57 2015-08-18 10:08:19 2015-08-19 09:58:18 2015-08-20 09:58:16 2015-08-21 09:54:32 NaT 2015-08-25 09:51:55 2015-08-26 09:53:52 2015-08-27 10:06:05 NaT 2015-08-31 09:50:42 2015-09-01 09:40:25 2015-09-02 09:28:38 2015-09-03 09:56:30 2015-09-04 10:12:19 2015-09-07 10:24:08 2015-09-08 10:21:37 2015-09-09 10:18:14 2015-09-10 09:19:24 2015-09-11 10:50:44 2015-09-14 09:43:15 2015-09-15 09:51:29 2015-09-16 09:37:48 NaT 2015-09-18 10:16:55 2015-09-21 10:08:13 2015-09-22 09:58:12 2015-09-23 09:41:19 2015-09-24 10:14:13 NaT 2015-09-28 10:01:47 2015-09-29 10:05:44 2015-09-30 10:30:29 2015-10-01 09:53:38 NaT NaT 2015-10-06 09:53:14 2015-10-07 10:02:55 2015-10-08 09:30:44 2015-10-09 10:12:55 2015-10-12 10:17:48 2015-10-13 10:18:19 2015-10-14 09:54:16 2015-10-15 09:43:10 2015-10-16 09:56:29 2015-10-19 09:57:36 2015-10-20 09:12:03 2015-10-21 10:30:37 2015-10-22 09:49:39 2015-10-23 09:31:09 2015-10-26 09:59:23 2015-10-27 09:41:26 2015-10-28 09:38:11 2015-10-29 
09:53:39 NaT 2015-11-02 10:00:05 2015-11-03 10:29:56 2015-11-04 10:12:46 2015-11-05 10:08:09 NaT NaT NaT NaT 2015-11-12 10:08:26 2015-11-13 09:37:08 2015-11-16 09:42:23 2015-11-17 10:20:38 2015-11-18 10:02:33 2015-11-19 10:27:15 2015-11-20 10:01:14 2015-11-23 09:55:40 2015-11-24 10:00:46 2015-11-25 09:33:48 2015-11-26 10:24:11 2015-11-27 10:09:17 2015-11-30 09:50:21 2015-12-01 09:52:48 2015-12-02 10:09:28 2015-12-03 09:24:05 2015-12-04 09:52:37 2015-12-07 10:00:22 2015-12-08 10:04:10 2015-12-09 10:25:19 2015-12-10 09:39:15 2015-12-11 09:46:02 2015-12-14 10:20:36 2015-12-15 09:37:17 2015-12-16 10:22:09 2015-12-17 09:54:36 2015-12-18 10:17:38 2015-12-21 09:58:21 2015-12-22 10:04:25 2015-12-23 10:11:46 2015-12-24 09:43:15 NaT 2015-12-28 09:52:44 2015-12-29 09:33:16 2015-12-30 10:18:12 2015-12-31 10:01:15
5 NaT 2015-01-02 10:28:17 2015-01-05 09:49:58 2015-01-06 09:45:28 2015-01-07 09:49:37 2015-01-08 10:19:44 2015-01-09 10:00:50 2015-01-12 10:29:27 2015-01-13 09:59:32 NaT 2015-01-15 10:06:12 2015-01-16 10:03:50 NaT 2015-01-20 10:10:29 2015-01-21 10:28:15 2015-01-22 10:10:10 2015-01-23 09:34:34 NaT 2015-01-27 09:56:59 2015-01-28 09:45:03 2015-01-29 10:11:41 2015-01-30 10:24:50 2015-02-02 09:43:27 2015-02-03 09:55:59 2015-02-04 10:03:25 2015-02-05 10:02:17 2015-02-06 09:55:43 2015-02-09 10:15:16 2015-02-10 09:43:48 2015-02-11 10:03:36 2015-02-12 10:02:17 2015-02-13 09:44:55 2015-02-16 09:55:58 2015-02-17 10:09:22 2015-02-18 09:37:58 2015-02-19 09:38:21 2015-02-20 10:02:12 2015-02-23 10:25:42 2015-02-24 09:51:00 2015-02-25 09:55:22 2015-02-26 09:58:05 2015-02-27 10:07:47 2015-03-02 09:59:53 2015-03-03 09:42:11 2015-03-04 10:00:32 NaT 2015-03-06 09:53:21 2015-03-09 09:56:58 2015-03-10 10:02:48 2015-03-11 09:48:03 2015-03-12 09:34:03 2015-03-13 10:12:48 2015-03-16 09:59:26 2015-03-17 10:16:28 2015-03-18 09:49:21 2015-03-19 10:06:49 2015-03-20 09:46:16 2015-03-23 10:12:47 2015-03-24 09:37:38 2015-03-25 10:16:56 2015-03-26 10:06:01 2015-03-27 09:47:43 2015-03-30 09:59:29 2015-03-31 10:28:28 2015-04-01 09:46:04 2015-04-02 09:54:15 2015-04-03 09:51:44 2015-04-06 10:05:02 2015-04-07 09:49:12 2015-04-08 10:08:27 2015-04-09 09:57:47 2015-04-10 10:09:16 2015-04-13 09:59:22 2015-04-14 09:58:02 2015-04-15 09:58:30 2015-04-16 09:49:06 2015-04-17 10:02:28 2015-04-20 09:55:16 NaT 2015-04-22 10:09:26 2015-04-23 10:00:47 2015-04-24 10:18:18 2015-04-27 10:18:43 2015-04-28 09:41:17 2015-04-29 10:21:11 2015-04-30 10:20:45 NaT 2015-05-04 09:41:58 2015-05-05 10:09:45 2015-05-06 10:28:38 2015-05-07 10:05:24 2015-05-08 10:20:21 2015-05-11 09:43:11 2015-05-12 09:46:57 2015-05-13 10:09:57 2015-05-14 10:07:50 2015-05-15 09:27:13 2015-05-18 10:08:39 2015-05-19 09:41:20 2015-05-20 10:21:00 2015-05-21 09:45:53 2015-05-22 10:13:55 2015-05-25 10:26:55 2015-05-26 09:44:25 2015-05-27 09:56:39 
2015-05-28 09:51:58 2015-05-29 09:41:19 2015-06-01 10:22:26 2015-06-02 10:07:35 2015-06-03 10:05:21 2015-06-04 09:46:59 2015-06-05 09:47:52 2015-06-08 09:55:09 2015-06-09 10:36:58 2015-06-10 10:18:44 NaT 2015-06-12 10:25:57 2015-06-15 09:45:05 2015-06-16 10:26:43 2015-06-17 09:47:19 2015-06-18 10:13:49 2015-06-19 09:47:13 2015-06-22 10:04:08 2015-06-23 09:51:11 2015-06-24 10:12:56 2015-06-25 09:59:25 2015-06-26 09:29:47 2015-06-29 10:08:32 2015-06-30 10:03:44 2015-07-01 09:48:35 2015-07-02 09:24:58 2015-07-03 10:08:01 2015-07-06 09:41:01 2015-07-07 10:23:38 2015-07-08 09:46:59 2015-07-09 10:47:18 2015-07-10 10:06:36 2015-07-13 10:00:10 2015-07-14 09:36:55 2015-07-15 09:44:21 2015-07-16 09:34:59 NaT 2015-07-20 10:26:08 2015-07-21 09:42:08 2015-07-22 09:58:04 2015-07-23 10:09:40 2015-07-24 09:43:16 2015-07-27 09:59:05 2015-07-28 10:43:35 2015-07-29 10:54:55 2015-07-30 10:09:52 2015-07-31 09:49:40 2015-08-03 10:23:07 2015-08-04 09:48:50 2015-08-05 09:33:13 2015-08-06 09:56:03 2015-08-07 09:48:23 2015-08-10 10:12:45 2015-08-11 10:11:50 NaT 2015-08-13 10:06:11 2015-08-14 09:48:26 2015-08-17 10:19:25 2015-08-18 09:55:14 2015-08-19 09:45:22 2015-08-20 09:40:50 2015-08-21 09:52:20 2015-08-24 09:59:24 2015-08-25 10:24:46 2015-08-26 09:40:04 2015-08-27 10:00:36 2015-08-28 09:55:44 2015-08-31 10:09:20 2015-09-01 09:48:18 2015-09-02 10:16:14 2015-09-03 10:32:19 2015-09-04 10:04:26 2015-09-07 09:47:03 2015-09-08 10:14:53 2015-09-09 09:45:22 2015-09-10 09:42:04 2015-09-11 09:41:30 2015-09-14 09:48:38 2015-09-15 09:44:13 2015-09-16 09:55:17 NaT 2015-09-18 09:45:17 2015-09-21 09:47:48 2015-09-22 10:08:20 2015-09-23 09:44:15 2015-09-24 09:49:02 2015-09-25 09:46:15 2015-09-28 09:57:55 2015-09-29 09:50:24 2015-09-30 10:01:46 2015-10-01 09:58:38 NaT 2015-10-05 10:01:38 2015-10-06 09:53:54 2015-10-07 09:49:00 2015-10-08 09:52:18 2015-10-09 09:37:05 2015-10-12 09:47:44 2015-10-13 09:44:28 2015-10-14 10:05:50 2015-10-15 09:48:36 2015-10-16 09:59:24 2015-10-19 10:21:33 2015-10-20 09:54:18 
2015-10-21 09:40:43 2015-10-22 09:54:53 2015-10-23 10:20:46 2015-10-26 09:26:54 2015-10-27 09:52:14 2015-10-28 10:05:45 2015-10-29 10:47:48 2015-10-30 09:54:08 2015-11-02 10:21:24 2015-11-03 10:14:29 2015-11-04 10:05:50 2015-11-05 10:14:59 2015-11-06 09:56:52 NaT NaT NaT 2015-11-12 09:40:29 2015-11-13 09:55:12 2015-11-16 10:15:51 2015-11-17 09:31:09 2015-11-18 10:05:50 2015-11-19 10:07:44 2015-11-20 09:58:14 2015-11-23 10:12:11 2015-11-24 09:56:16 2015-11-25 10:02:15 2015-11-26 09:54:51 2015-11-27 09:50:48 2015-11-30 10:00:02 2015-12-01 10:13:08 2015-12-02 09:38:52 2015-12-03 09:48:57 2015-12-04 09:55:11 2015-12-07 10:01:34 2015-12-08 09:18:02 2015-12-09 10:02:29 2015-12-10 10:18:53 2015-12-11 10:26:26 2015-12-14 10:20:18 2015-12-15 09:55:55 2015-12-16 10:09:25 2015-12-17 09:46:35 2015-12-18 09:58:35 2015-12-21 10:03:41 2015-12-22 10:10:30 2015-12-23 10:13:36 2015-12-24 09:44:24 NaT 2015-12-28 10:05:15 2015-12-29 10:30:53 2015-12-30 09:18:21 2015-12-31 09:41:09
In [12]:
df_in_time.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4410 entries, 1 to 4410
Columns: 261 entries, 2015-01-01 to 2015-12-31
dtypes: datetime64[ns](261)
memory usage: 8.8 MB
In [13]:
df_out_time.head()
Out[13]:
2015-01-01 2015-01-02 2015-01-05 2015-01-06 2015-01-07 2015-01-08 2015-01-09 2015-01-12 2015-01-13 2015-01-14 2015-01-15 2015-01-16 2015-01-19 2015-01-20 2015-01-21 2015-01-22 2015-01-23 2015-01-26 2015-01-27 2015-01-28 2015-01-29 2015-01-30 2015-02-02 2015-02-03 2015-02-04 2015-02-05 2015-02-06 2015-02-09 2015-02-10 2015-02-11 2015-02-12 2015-02-13 2015-02-16 2015-02-17 2015-02-18 2015-02-19 2015-02-20 2015-02-23 2015-02-24 2015-02-25 2015-02-26 2015-02-27 2015-03-02 2015-03-03 2015-03-04 2015-03-05 2015-03-06 2015-03-09 2015-03-10 2015-03-11 2015-03-12 2015-03-13 2015-03-16 2015-03-17 2015-03-18 2015-03-19 2015-03-20 2015-03-23 2015-03-24 2015-03-25 2015-03-26 2015-03-27 2015-03-30 2015-03-31 2015-04-01 2015-04-02 2015-04-03 2015-04-06 2015-04-07 2015-04-08 2015-04-09 2015-04-10 2015-04-13 2015-04-14 2015-04-15 2015-04-16 2015-04-17 2015-04-20 2015-04-21 2015-04-22 2015-04-23 2015-04-24 2015-04-27 2015-04-28 2015-04-29 2015-04-30 2015-05-01 2015-05-04 2015-05-05 2015-05-06 2015-05-07 2015-05-08 2015-05-11 2015-05-12 2015-05-13 2015-05-14 2015-05-15 2015-05-18 2015-05-19 2015-05-20 2015-05-21 2015-05-22 2015-05-25 2015-05-26 2015-05-27 2015-05-28 2015-05-29 2015-06-01 2015-06-02 2015-06-03 2015-06-04 2015-06-05 2015-06-08 2015-06-09 2015-06-10 2015-06-11 2015-06-12 2015-06-15 2015-06-16 2015-06-17 2015-06-18 2015-06-19 2015-06-22 2015-06-23 2015-06-24 2015-06-25 2015-06-26 2015-06-29 2015-06-30 2015-07-01 2015-07-02 2015-07-03 2015-07-06 2015-07-07 2015-07-08 2015-07-09 2015-07-10 2015-07-13 2015-07-14 2015-07-15 2015-07-16 2015-07-17 2015-07-20 2015-07-21 2015-07-22 2015-07-23 2015-07-24 2015-07-27 2015-07-28 2015-07-29 2015-07-30 2015-07-31 2015-08-03 2015-08-04 2015-08-05 2015-08-06 2015-08-07 2015-08-10 2015-08-11 2015-08-12 2015-08-13 2015-08-14 2015-08-17 2015-08-18 2015-08-19 2015-08-20 2015-08-21 2015-08-24 2015-08-25 2015-08-26 2015-08-27 2015-08-28 2015-08-31 2015-09-01 2015-09-02 2015-09-03 2015-09-04 2015-09-07 2015-09-08 2015-09-09 2015-09-10 
2015-09-11 2015-09-14 2015-09-15 2015-09-16 2015-09-17 2015-09-18 2015-09-21 2015-09-22 2015-09-23 2015-09-24 2015-09-25 2015-09-28 2015-09-29 2015-09-30 2015-10-01 2015-10-02 2015-10-05 2015-10-06 2015-10-07 2015-10-08 2015-10-09 2015-10-12 2015-10-13 2015-10-14 2015-10-15 2015-10-16 2015-10-19 2015-10-20 2015-10-21 2015-10-22 2015-10-23 2015-10-26 2015-10-27 2015-10-28 2015-10-29 2015-10-30 2015-11-02 2015-11-03 2015-11-04 2015-11-05 2015-11-06 2015-11-09 2015-11-10 2015-11-11 2015-11-12 2015-11-13 2015-11-16 2015-11-17 2015-11-18 2015-11-19 2015-11-20 2015-11-23 2015-11-24 2015-11-25 2015-11-26 2015-11-27 2015-11-30 2015-12-01 2015-12-02 2015-12-03 2015-12-04 2015-12-07 2015-12-08 2015-12-09 2015-12-10 2015-12-11 2015-12-14 2015-12-15 2015-12-16 2015-12-17 2015-12-18 2015-12-21 2015-12-22 2015-12-23 2015-12-24 2015-12-25 2015-12-28 2015-12-29 2015-12-30 2015-12-31
1 NaT 2015-01-02 16:56:15 2015-01-05 17:20:11 2015-01-06 17:19:05 2015-01-07 16:34:55 2015-01-08 17:08:32 2015-01-09 17:38:29 2015-01-12 16:58:39 2015-01-13 18:02:58 NaT 2015-01-15 17:22:13 2015-01-16 17:35:11 NaT 2015-01-20 16:37:07 2015-01-21 16:55:24 2015-01-22 16:50:47 2015-01-23 17:00:01 NaT 2015-01-27 17:28:22 2015-01-28 17:03:21 2015-01-29 17:18:36 2015-01-30 17:00:25 2015-02-02 16:32:49 2015-02-03 17:35:49 2015-02-04 17:24:56 2015-02-05 17:26:31 2015-02-06 17:21:35 2015-02-09 17:37:50 2015-02-10 17:30:20 2015-02-11 17:30:19 2015-02-12 17:36:42 2015-02-13 16:25:26 2015-02-16 17:50:23 2015-02-17 17:28:06 2015-02-18 18:13:29 NaT 2015-02-20 17:44:52 2015-02-23 17:13:27 2015-02-24 17:24:48 2015-02-25 17:27:37 2015-02-26 17:08:36 2015-02-27 17:21:55 2015-03-02 17:29:08 2015-03-03 16:54:12 2015-03-04 17:50:18 NaT 2015-03-06 17:44:47 2015-03-09 17:53:05 2015-03-10 16:42:53 2015-03-11 16:58:39 2015-03-12 17:15:39 2015-03-13 17:25:55 NaT 2015-03-17 17:00:16 2015-03-18 17:40:13 2015-03-19 17:52:44 2015-03-20 17:47:34 2015-03-23 17:48:18 2015-03-24 17:48:26 2015-03-25 17:08:19 NaT 2015-03-27 17:20:51 2015-03-30 17:33:09 2015-03-31 18:00:07 2015-04-01 17:32:50 2015-04-02 17:24:58 2015-04-03 17:11:46 2015-04-06 16:42:46 2015-04-07 17:55:07 2015-04-08 17:30:12 2015-04-09 17:09:38 2015-04-10 16:53:22 2015-04-13 17:00:23 2015-04-14 17:07:36 2015-04-15 17:23:31 2015-04-16 17:31:32 2015-04-17 17:11:18 2015-04-20 17:45:08 2015-04-21 16:49:13 2015-04-22 17:27:46 2015-04-23 17:32:53 2015-04-24 17:39:25 2015-04-27 17:26:32 2015-04-28 17:50:45 2015-04-29 17:27:45 2015-04-30 17:18:59 NaT 2015-05-04 17:05:30 2015-05-05 17:13:30 2015-05-06 17:16:30 2015-05-07 17:10:33 2015-05-08 17:31:56 NaT 2015-05-12 17:16:58 2015-05-13 17:54:12 2015-05-14 17:44:49 2015-05-15 16:44:33 NaT 2015-05-19 17:09:05 2015-05-20 17:19:49 2015-05-21 17:21:10 2015-05-22 17:27:28 2015-05-25 17:49:48 2015-05-26 17:29:00 2015-05-27 17:34:05 2015-05-28 17:23:52 NaT NaT 2015-06-02 18:03:51 2015-06-03 17:18:10 
2015-06-04 17:26:11 NaT 2015-06-08 17:22:18 NaT 2015-06-10 16:59:04 2015-06-11 17:21:39 2015-06-12 17:39:12 2015-06-15 17:26:02 2015-06-16 17:31:36 2015-06-17 17:17:49 2015-06-18 17:34:32 2015-06-19 17:21:17 2015-06-22 16:59:22 2015-06-23 16:51:41 2015-06-24 17:06:26 2015-06-25 17:37:14 2015-06-26 17:11:06 2015-06-29 17:41:01 2015-06-30 17:45:51 2015-07-01 17:22:27 2015-07-02 17:44:39 2015-07-03 17:09:29 2015-07-06 17:33:57 2015-07-07 17:47:22 2015-07-08 17:26:03 2015-07-09 16:18:10 2015-07-10 17:05:38 2015-07-13 17:23:56 2015-07-14 17:14:19 2015-07-15 17:32:02 2015-07-16 17:39:13 NaT 2015-07-20 17:31:04 2015-07-21 17:03:19 2015-07-22 17:45:13 2015-07-23 16:45:36 2015-07-24 17:03:05 2015-07-27 17:24:48 2015-07-28 17:02:15 2015-07-29 16:57:05 2015-07-30 17:06:20 2015-07-31 17:34:04 2015-08-03 17:00:31 2015-08-04 17:10:18 2015-08-05 17:31:20 2015-08-06 17:27:12 2015-08-07 17:16:59 2015-08-10 17:48:37 2015-08-11 17:40:01 2015-08-12 17:15:47 2015-08-13 17:09:17 2015-08-14 17:08:51 2015-08-17 16:27:43 2015-08-18 18:03:09 2015-08-19 17:09:56 2015-08-20 17:11:05 2015-08-21 17:51:02 2015-08-24 17:37:08 2015-08-25 17:12:59 2015-08-26 17:36:02 2015-08-27 17:57:41 2015-08-28 17:27:04 2015-08-31 17:08:19 NaT 2015-09-02 17:49:13 2015-09-03 17:07:17 2015-09-04 17:42:46 2015-09-07 17:40:58 2015-09-08 17:52:36 2015-09-09 18:00:21 2015-09-10 17:17:02 2015-09-11 17:22:52 2015-09-14 17:07:46 2015-09-15 17:34:11 2015-09-16 17:31:03 NaT 2015-09-18 17:43:14 2015-09-21 17:13:51 2015-09-22 18:14:06 2015-09-23 17:05:53 2015-09-24 17:19:26 2015-09-25 17:43:56 2015-09-28 17:07:25 2015-09-29 17:27:30 2015-09-30 17:14:43 2015-10-01 17:52:44 NaT 2015-10-05 17:03:41 2015-10-06 17:29:15 2015-10-07 16:56:00 2015-10-08 17:08:18 2015-10-09 17:10:00 2015-10-12 17:24:14 NaT 2015-10-14 17:49:02 2015-10-15 17:54:13 2015-10-16 16:40:00 2015-10-19 17:27:58 2015-10-20 17:02:55 NaT NaT 2015-10-23 17:13:46 2015-10-26 17:35:53 2015-10-27 16:28:13 2015-10-28 17:32:42 2015-10-29 16:57:29 2015-10-30 17:09:54 
2015-11-02 16:53:43 2015-11-03 18:04:34 2015-11-04 16:43:40 2015-11-05 18:03:15 2015-11-06 17:15:12 NaT NaT NaT 2015-11-12 17:13:17 2015-11-13 16:49:27 2015-11-16 18:03:08 2015-11-17 17:03:06 2015-11-18 17:38:02 2015-11-19 16:46:27 2015-11-20 17:58:40 2015-11-23 17:13:22 2015-11-24 17:37:21 2015-11-25 17:26:51 2015-11-26 17:44:49 2015-11-27 17:30:02 2015-11-30 17:05:11 2015-12-01 16:53:21 2015-12-02 17:32:24 2015-12-03 17:40:59 2015-12-04 17:02:22 2015-12-07 16:21:37 2015-12-08 17:51:23 2015-12-09 17:48:46 2015-12-10 16:52:14 2015-12-11 17:25:56 2015-12-14 17:35:53 NaT 2015-12-16 17:54:26 NaT NaT 2015-12-21 17:15:50 2015-12-22 17:27:51 2015-12-23 16:44:44 2015-12-24 17:47:22 NaT 2015-12-28 18:00:07 2015-12-29 17:22:30 2015-12-30 17:40:56 2015-12-31 17:17:33
2 NaT 2015-01-02 18:22:17 2015-01-05 17:48:22 NaT 2015-01-07 17:09:06 2015-01-08 17:34:04 2015-01-09 16:52:29 2015-01-12 17:36:48 2015-01-13 18:00:13 NaT 2015-01-15 17:14:44 2015-01-16 17:40:57 2015-01-19 17:58:22 2015-01-20 17:05:13 2015-01-21 17:41:05 2015-01-22 17:26:26 2015-01-23 17:06:50 NaT 2015-01-27 17:35:50 2015-01-28 17:19:48 2015-01-29 17:07:38 2015-01-30 16:53:22 2015-02-02 17:45:18 2015-02-03 17:33:42 2015-02-04 18:04:20 2015-02-05 17:43:25 2015-02-06 17:06:42 NaT 2015-02-10 18:12:17 2015-02-11 17:18:36 2015-02-12 18:08:01 2015-02-13 17:19:25 2015-02-16 17:46:09 2015-02-17 17:09:30 2015-02-18 17:48:56 2015-02-19 18:06:35 2015-02-20 17:31:45 2015-02-23 17:57:18 2015-02-24 17:32:22 2015-02-25 17:43:47 2015-02-26 17:49:55 2015-02-27 18:38:54 2015-03-02 17:36:37 2015-03-03 18:12:57 2015-03-04 17:52:38 NaT 2015-03-06 17:21:33 2015-03-09 18:10:21 2015-03-10 17:14:06 2015-03-11 17:56:14 2015-03-12 17:30:53 2015-03-13 17:48:25 2015-03-16 17:08:38 2015-03-17 16:58:02 2015-03-18 17:16:35 2015-03-19 17:28:42 2015-03-20 17:31:55 2015-03-23 17:58:55 NaT 2015-03-25 17:22:19 2015-03-26 17:48:10 2015-03-27 17:49:22 2015-03-30 17:28:31 2015-03-31 18:28:47 2015-04-01 17:30:06 2015-04-02 17:39:09 2015-04-03 18:07:24 2015-04-06 17:40:31 2015-04-07 17:50:46 2015-04-08 17:09:41 2015-04-09 17:07:38 2015-04-10 17:43:12 2015-04-13 17:45:20 2015-04-14 18:00:46 2015-04-15 17:35:11 2015-04-16 17:21:42 2015-04-17 17:24:55 2015-04-20 17:51:08 2015-04-21 18:49:53 2015-04-22 17:40:46 2015-04-23 18:04:23 2015-04-24 17:42:16 2015-04-27 17:12:05 2015-04-28 17:29:15 NaT 2015-04-30 18:18:58 NaT 2015-05-04 17:41:29 2015-05-05 18:08:25 2015-05-06 17:37:33 2015-05-07 17:26:07 2015-05-08 17:07:25 2015-05-11 17:30:04 2015-05-12 17:39:57 2015-05-13 17:35:24 NaT 2015-05-15 17:06:54 NaT 2015-05-19 17:53:01 2015-05-20 17:30:15 2015-05-21 17:25:40 2015-05-22 18:39:12 2015-05-25 17:50:50 2015-05-26 17:49:25 2015-05-27 17:51:25 2015-05-28 17:11:38 2015-05-29 17:55:55 2015-06-01 18:01:11 2015-06-02 
17:34:55 2015-06-03 17:58:25 2015-06-04 16:27:03 2015-06-05 17:46:19 2015-06-08 17:20:20 2015-06-09 17:24:55 2015-06-10 17:44:36 2015-06-11 17:53:52 NaT 2015-06-15 17:48:03 2015-06-16 18:01:24 2015-06-17 18:05:55 2015-06-18 17:48:28 2015-06-19 17:55:56 2015-06-22 17:48:27 2015-06-23 17:46:50 2015-06-24 17:59:00 2015-06-25 18:09:18 2015-06-26 17:35:47 2015-06-29 18:00:33 2015-06-30 17:20:12 2015-07-01 17:27:32 2015-07-02 17:52:49 2015-07-03 18:08:39 2015-07-06 17:53:47 2015-07-07 18:20:24 2015-07-08 17:39:41 2015-07-09 17:28:26 2015-07-10 18:12:57 NaT 2015-07-14 18:06:32 2015-07-15 17:25:39 2015-07-16 18:13:48 NaT 2015-07-20 17:47:07 2015-07-21 18:01:49 2015-07-22 17:20:40 2015-07-23 17:36:14 2015-07-24 18:07:05 NaT 2015-07-28 18:08:31 2015-07-29 17:54:49 2015-07-30 17:19:50 2015-07-31 17:30:49 2015-08-03 17:44:58 2015-08-04 18:08:25 2015-08-05 17:27:32 2015-08-06 18:48:44 2015-08-07 17:47:53 2015-08-10 17:24:08 2015-08-11 17:09:55 2015-08-12 17:13:38 2015-08-13 17:54:56 2015-08-14 17:29:11 2015-08-17 18:04:06 2015-08-18 17:03:21 2015-08-19 17:36:39 2015-08-20 17:56:23 2015-08-21 17:38:55 2015-08-24 17:05:20 2015-08-25 17:11:02 2015-08-26 17:44:47 2015-08-27 18:04:39 2015-08-28 17:24:55 2015-08-31 18:13:22 2015-09-01 17:18:16 2015-09-02 17:26:57 2015-09-03 17:50:40 2015-09-04 18:21:48 2015-09-07 18:10:09 NaT 2015-09-09 17:52:26 2015-09-10 18:22:09 2015-09-11 17:27:25 2015-09-14 17:10:53 2015-09-15 16:50:45 2015-09-16 17:16:37 NaT 2015-09-18 17:51:23 2015-09-21 17:45:44 2015-09-22 16:51:17 2015-09-23 17:55:03 2015-09-24 18:07:54 2015-09-25 17:45:34 2015-09-28 17:52:52 2015-09-29 18:02:18 2015-09-30 18:15:14 2015-10-01 17:33:53 NaT 2015-10-05 18:08:24 2015-10-06 17:12:33 2015-10-07 18:07:16 2015-10-08 18:19:06 2015-10-09 17:18:33 2015-10-12 17:16:51 2015-10-13 18:32:48 2015-10-14 17:45:04 NaT 2015-10-16 17:58:11 2015-10-19 18:00:41 2015-10-20 17:13:28 2015-10-21 16:51:56 2015-10-22 18:30:26 2015-10-23 16:37:09 2015-10-26 17:58:41 2015-10-27 18:23:56 2015-10-28 
17:20:29 2015-10-29 17:44:08 2015-10-30 17:32:48 2015-11-02 17:29:17 2015-11-03 17:18:27 2015-11-04 17:54:32 2015-11-05 17:11:57 2015-11-06 17:28:38 NaT NaT NaT 2015-11-12 17:34:03 2015-11-13 17:38:25 2015-11-16 18:37:03 2015-11-17 17:26:09 2015-11-18 18:06:23 NaT 2015-11-20 17:52:48 2015-11-23 17:32:43 2015-11-24 17:01:00 2015-11-25 17:14:56 2015-11-26 18:28:44 2015-11-27 18:12:27 2015-11-30 17:18:40 2015-12-01 17:21:07 2015-12-02 18:02:46 2015-12-03 18:08:00 2015-12-04 18:01:03 2015-12-07 17:06:37 2015-12-08 17:44:18 2015-12-09 17:47:47 2015-12-10 17:55:23 2015-12-11 17:42:47 2015-12-14 17:32:11 2015-12-15 17:56:25 2015-12-16 18:16:37 2015-12-17 17:10:50 2015-12-18 18:31:28 2015-12-21 17:34:16 2015-12-22 18:16:35 2015-12-23 17:38:18 NaT NaT 2015-12-28 17:08:38 2015-12-29 17:54:46 2015-12-30 18:31:35 2015-12-31 17:40:58
3 NaT 2015-01-02 16:59:14 2015-01-05 17:06:46 2015-01-06 16:38:32 2015-01-07 16:33:21 2015-01-08 17:24:22 2015-01-09 16:57:30 2015-01-12 17:28:54 2015-01-13 17:21:25 NaT 2015-01-15 17:21:29 2015-01-16 17:18:13 2015-01-19 16:52:13 2015-01-20 16:52:23 2015-01-21 17:17:50 2015-01-22 17:27:54 2015-01-23 17:11:52 NaT 2015-01-27 17:22:27 2015-01-28 16:24:04 2015-01-29 17:20:07 2015-01-30 17:16:03 2015-02-02 16:59:01 2015-02-03 15:58:24 2015-02-04 17:25:54 2015-02-05 16:29:41 2015-02-06 17:18:15 2015-02-09 16:48:18 2015-02-10 17:15:49 2015-02-11 16:49:22 2015-02-12 16:22:31 2015-02-13 17:41:15 2015-02-16 17:20:05 2015-02-17 16:46:01 2015-02-18 17:19:08 2015-02-19 16:44:56 2015-02-20 17:14:02 2015-02-23 17:59:19 2015-02-24 17:08:34 2015-02-25 16:55:31 2015-02-26 17:34:33 2015-02-27 16:48:44 2015-03-02 17:32:13 2015-03-03 17:11:01 2015-03-04 17:26:45 NaT 2015-03-06 16:35:05 2015-03-09 17:08:34 2015-03-10 16:38:33 2015-03-11 17:31:26 2015-03-12 17:13:00 2015-03-13 16:25:40 2015-03-16 16:53:08 2015-03-17 17:07:58 2015-03-18 17:39:10 2015-03-19 16:27:24 NaT 2015-03-23 16:50:31 2015-03-24 17:12:39 2015-03-25 16:42:17 2015-03-26 17:02:08 NaT 2015-03-30 16:37:47 2015-03-31 17:29:39 2015-04-01 16:47:25 2015-04-02 17:14:02 2015-04-03 16:46:36 2015-04-06 16:27:34 2015-04-07 17:27:19 2015-04-08 17:36:18 2015-04-09 17:45:16 2015-04-10 17:13:27 2015-04-13 17:30:51 2015-04-14 16:43:49 2015-04-15 17:23:07 2015-04-16 16:49:51 2015-04-17 17:52:50 2015-04-20 17:03:14 2015-04-21 17:15:37 2015-04-22 17:23:28 NaT 2015-04-24 17:05:07 2015-04-27 16:29:39 2015-04-28 17:22:20 2015-04-29 16:52:17 2015-04-30 16:43:32 NaT 2015-05-04 16:42:30 2015-05-05 16:13:39 2015-05-06 17:27:27 2015-05-07 16:41:59 2015-05-08 17:13:18 2015-05-11 16:49:00 2015-05-12 16:45:55 2015-05-13 16:40:15 2015-05-14 17:01:56 2015-05-15 16:25:36 2015-05-18 17:17:40 2015-05-19 17:05:17 2015-05-20 17:06:32 2015-05-21 16:44:37 2015-05-22 16:54:06 2015-05-25 17:23:42 2015-05-26 16:44:02 2015-05-27 17:23:12 2015-05-28 16:31:51 
2015-05-29 16:53:00 2015-06-01 16:44:52 2015-06-02 17:09:28 2015-06-03 17:21:49 2015-06-04 16:21:03 2015-06-05 16:24:55 2015-06-08 17:09:46 2015-06-09 16:46:00 2015-06-10 17:35:00 2015-06-11 17:20:01 2015-06-12 17:18:04 2015-06-15 16:58:07 2015-06-16 16:27:56 2015-06-17 16:50:32 2015-06-18 17:16:36 2015-06-19 16:43:47 2015-06-22 16:29:10 2015-06-23 16:53:24 NaT 2015-06-25 16:44:56 2015-06-26 17:05:33 2015-06-29 16:19:27 2015-06-30 16:49:09 2015-07-01 16:50:55 2015-07-02 17:17:35 2015-07-03 16:36:11 2015-07-06 18:00:41 2015-07-07 17:36:36 2015-07-08 16:47:57 2015-07-09 18:07:51 2015-07-10 17:01:41 2015-07-13 16:20:14 2015-07-14 17:18:35 2015-07-15 16:54:35 2015-07-16 16:56:24 NaT 2015-07-20 17:07:52 2015-07-21 16:51:57 2015-07-22 16:28:59 2015-07-23 17:23:36 2015-07-24 16:49:58 2015-07-27 17:18:46 2015-07-28 17:28:39 2015-07-29 17:53:35 2015-07-30 17:25:52 2015-07-31 16:44:55 2015-08-03 16:52:05 2015-08-04 16:25:27 NaT 2015-08-06 16:53:15 2015-08-07 16:31:42 2015-08-10 17:06:02 2015-08-11 16:36:01 2015-08-12 17:28:54 2015-08-13 16:57:54 2015-08-14 16:54:16 2015-08-17 16:53:27 2015-08-18 17:15:15 2015-08-19 17:08:09 2015-08-20 16:53:58 2015-08-21 17:29:32 2015-08-24 17:01:36 2015-08-25 16:52:06 2015-08-26 17:06:17 2015-08-27 16:05:12 2015-08-28 16:46:21 2015-08-31 16:29:00 2015-09-01 17:12:34 2015-09-02 17:06:39 2015-09-03 17:14:37 2015-09-04 17:16:44 2015-09-07 16:17:32 2015-09-08 16:46:52 2015-09-09 16:33:08 2015-09-10 17:22:21 2015-09-11 16:32:47 2015-09-14 17:49:44 2015-09-15 16:58:26 2015-09-16 16:51:35 NaT 2015-09-18 16:53:29 2015-09-21 16:42:32 2015-09-22 16:28:40 2015-09-23 17:06:45 2015-09-24 17:33:41 2015-09-25 16:39:34 2015-09-28 17:44:11 2015-09-29 17:52:11 2015-09-30 16:56:37 2015-10-01 17:16:00 NaT 2015-10-05 17:11:32 2015-10-06 17:27:11 2015-10-07 17:19:27 2015-10-08 16:42:29 2015-10-09 16:55:42 2015-10-12 17:19:20 2015-10-13 17:03:46 2015-10-14 17:25:43 2015-10-15 16:59:34 2015-10-16 16:42:49 2015-10-19 16:36:56 2015-10-20 17:00:27 2015-10-21 17:52:36 
2015-10-22 17:12:08 2015-10-23 17:00:44 2015-10-26 17:44:21 2015-10-27 17:20:34 2015-10-28 16:22:30 2015-10-29 16:35:41 2015-10-30 16:29:34 2015-11-02 17:35:43 2015-11-03 16:55:12 2015-11-04 17:01:26 2015-11-05 17:04:42 2015-11-06 16:39:49 NaT NaT NaT 2015-11-12 17:36:44 2015-11-13 16:43:22 2015-11-16 16:51:03 2015-11-17 17:09:13 2015-11-18 17:28:22 2015-11-19 16:20:40 2015-11-20 17:40:18 2015-11-23 16:58:39 2015-11-24 16:33:43 2015-11-25 17:20:08 2015-11-26 17:34:28 NaT 2015-11-30 17:00:13 2015-12-01 16:59:05 2015-12-02 16:54:49 2015-12-03 17:04:26 2015-12-04 17:10:47 2015-12-07 17:09:04 2015-12-08 17:28:44 2015-12-09 17:49:34 NaT 2015-12-11 17:01:53 2015-12-14 16:48:18 2015-12-15 16:39:52 2015-12-16 16:42:44 2015-12-17 17:06:23 2015-12-18 17:02:23 2015-12-21 17:20:17 2015-12-22 16:32:50 2015-12-23 16:59:43 2015-12-24 16:58:25 NaT 2015-12-28 16:43:31 2015-12-29 17:09:56 2015-12-30 17:06:25 2015-12-31 17:15:50
4 NaT 2015-01-02 17:25:24 2015-01-05 17:14:03 2015-01-06 17:07:42 2015-01-07 16:32:40 2015-01-08 16:53:11 2015-01-09 17:19:47 2015-01-12 17:13:37 2015-01-13 17:11:45 NaT 2015-01-15 16:53:26 2015-01-16 16:52:34 2015-01-19 16:14:18 2015-01-20 17:39:50 2015-01-21 16:46:51 2015-01-22 16:51:48 2015-01-23 17:05:41 NaT 2015-01-27 17:03:48 2015-01-28 17:33:22 2015-01-29 16:44:01 2015-01-30 17:09:42 2015-02-02 17:18:12 2015-02-03 17:27:41 2015-02-04 17:30:22 2015-02-05 17:11:44 2015-02-06 16:33:04 2015-02-09 17:16:29 NaT 2015-02-11 17:04:11 2015-02-12 17:22:16 2015-02-13 17:22:54 2015-02-16 17:11:28 2015-02-17 16:59:20 2015-02-18 16:31:43 2015-02-19 17:15:20 2015-02-20 16:43:41 2015-02-23 16:44:17 2015-02-24 17:30:51 2015-02-25 16:02:49 2015-02-26 17:09:34 2015-02-27 16:59:05 NaT 2015-03-03 17:39:36 2015-03-04 16:26:53 NaT NaT 2015-03-09 17:15:18 2015-03-10 16:53:12 2015-03-11 17:27:00 2015-03-12 17:18:43 2015-03-13 16:42:00 2015-03-16 16:39:28 2015-03-17 17:04:50 2015-03-18 17:31:49 2015-03-19 17:27:02 2015-03-20 17:35:10 2015-03-23 17:18:20 2015-03-24 17:22:01 2015-03-25 17:20:54 2015-03-26 18:10:16 NaT 2015-03-30 17:13:45 2015-03-31 17:35:52 2015-04-01 17:19:06 2015-04-02 16:32:12 2015-04-03 16:59:23 2015-04-06 17:23:43 2015-04-07 17:39:00 2015-04-08 17:39:25 2015-04-09 17:15:13 2015-04-10 17:21:15 2015-04-13 16:22:01 NaT 2015-04-15 16:42:43 2015-04-16 16:50:43 2015-04-17 17:03:14 2015-04-20 17:10:10 2015-04-21 17:18:07 2015-04-22 17:06:08 2015-04-23 17:15:23 2015-04-24 17:24:56 2015-04-27 17:16:14 2015-04-28 16:19:59 2015-04-29 17:04:12 2015-04-30 17:09:25 NaT 2015-05-04 17:59:06 2015-05-05 16:44:22 2015-05-06 18:08:51 2015-05-07 17:13:02 2015-05-08 17:43:42 2015-05-11 17:31:57 2015-05-12 16:53:20 2015-05-13 17:23:44 2015-05-14 17:01:21 2015-05-15 16:36:35 2015-05-18 16:22:37 2015-05-19 17:23:25 2015-05-20 16:42:27 2015-05-21 16:46:33 NaT 2015-05-25 17:08:33 2015-05-26 17:33:07 2015-05-27 17:55:27 2015-05-28 17:25:42 2015-05-29 17:19:29 2015-06-01 17:12:33 2015-06-02 
17:42:17 2015-06-03 17:11:39 2015-06-04 17:33:35 2015-06-05 16:32:56 2015-06-08 17:06:31 2015-06-09 17:31:47 2015-06-10 17:24:30 2015-06-11 17:25:59 2015-06-12 17:24:49 2015-06-15 17:31:18 2015-06-16 16:51:27 2015-06-17 16:57:41 2015-06-18 17:24:34 2015-06-19 16:52:14 2015-06-22 17:44:28 2015-06-23 16:33:07 2015-06-24 17:36:56 2015-06-25 16:44:23 2015-06-26 17:01:52 2015-06-29 17:12:10 2015-06-30 16:54:35 2015-07-01 16:13:03 2015-07-02 17:05:36 2015-07-03 17:23:50 2015-07-06 17:12:57 2015-07-07 17:57:50 2015-07-08 17:10:27 2015-07-09 17:19:00 2015-07-10 17:57:38 2015-07-13 17:28:43 2015-07-14 17:58:30 2015-07-15 17:50:46 2015-07-16 17:05:11 NaT 2015-07-20 16:52:41 2015-07-21 16:39:59 NaT 2015-07-23 17:45:50 2015-07-24 16:50:37 2015-07-27 17:21:13 2015-07-28 16:24:54 2015-07-29 17:42:13 2015-07-30 17:14:45 2015-07-31 17:04:26 2015-08-03 17:50:27 2015-08-04 16:55:41 NaT 2015-08-06 17:00:50 2015-08-07 17:12:50 2015-08-10 17:42:07 2015-08-11 17:39:24 2015-08-12 16:45:38 2015-08-13 16:36:57 2015-08-14 17:02:53 2015-08-17 17:17:49 2015-08-18 17:48:39 2015-08-19 17:17:00 2015-08-20 17:14:47 2015-08-21 16:45:06 NaT 2015-08-25 17:02:23 2015-08-26 16:59:12 2015-08-27 16:46:28 NaT 2015-08-31 16:49:45 2015-09-01 17:08:51 2015-09-02 16:28:03 2015-09-03 17:14:09 2015-09-04 17:17:52 2015-09-07 17:26:41 2015-09-08 17:37:28 2015-09-09 17:39:07 2015-09-10 16:36:34 2015-09-11 18:06:15 2015-09-14 17:36:42 2015-09-15 16:39:57 2015-09-16 16:42:51 NaT 2015-09-18 17:47:54 2015-09-21 17:15:09 2015-09-22 16:57:27 2015-09-23 16:45:31 2015-09-24 17:49:36 NaT 2015-09-28 17:02:26 2015-09-29 17:00:39 2015-09-30 17:21:44 2015-10-01 16:47:16 NaT NaT 2015-10-06 17:03:32 2015-10-07 17:32:56 2015-10-08 17:00:37 2015-10-09 17:05:07 2015-10-12 17:34:50 2015-10-13 17:25:05 2015-10-14 16:49:13 2015-10-15 16:53:25 2015-10-16 16:44:32 2015-10-19 17:03:32 2015-10-20 16:41:21 2015-10-21 17:33:33 2015-10-22 16:37:16 2015-10-23 17:01:41 2015-10-26 17:25:23 2015-10-27 16:35:01 2015-10-28 16:42:00 2015-10-29 
16:57:12 NaT 2015-11-02 17:04:40 2015-11-03 17:44:36 2015-11-04 17:02:18 2015-11-05 17:38:56 NaT NaT NaT NaT 2015-11-12 17:44:45 2015-11-13 16:21:17 2015-11-16 16:43:22 2015-11-17 16:56:08 2015-11-18 17:19:37 2015-11-19 18:03:01 2015-11-20 17:33:33 2015-11-23 17:03:38 2015-11-24 17:49:18 2015-11-25 16:41:49 2015-11-26 17:18:08 2015-11-27 17:06:19 2015-11-30 17:08:57 2015-12-01 17:09:31 2015-12-02 17:13:35 2015-12-03 16:36:51 2015-12-04 16:47:21 2015-12-07 17:24:39 2015-12-08 17:21:05 2015-12-09 17:17:58 2015-12-10 17:10:02 2015-12-11 16:44:02 2015-12-14 17:23:57 2015-12-15 16:37:15 2015-12-16 17:40:56 2015-12-17 17:21:57 2015-12-18 17:55:23 2015-12-21 16:49:09 2015-12-22 17:24:00 2015-12-23 17:36:35 2015-12-24 16:48:21 NaT 2015-12-28 17:19:34 2015-12-29 16:58:16 2015-12-30 17:40:11 2015-12-31 17:09:14
5 NaT 2015-01-02 18:31:37 2015-01-05 17:49:15 2015-01-06 17:26:25 2015-01-07 17:37:59 2015-01-08 17:59:28 2015-01-09 17:44:08 2015-01-12 18:51:21 2015-01-13 18:14:58 NaT 2015-01-15 18:21:48 2015-01-16 18:28:03 NaT 2015-01-20 17:59:24 2015-01-21 18:41:38 2015-01-22 18:27:37 2015-01-23 16:53:11 NaT 2015-01-27 17:52:43 2015-01-28 17:40:46 2015-01-29 18:16:26 2015-01-30 18:26:33 2015-02-02 17:53:21 2015-02-03 17:57:43 2015-02-04 18:03:13 2015-02-05 18:07:43 2015-02-06 17:38:14 2015-02-09 18:26:28 2015-02-10 17:24:32 2015-02-11 18:01:25 2015-02-12 17:42:32 2015-02-13 18:28:09 2015-02-16 17:51:18 2015-02-17 18:03:31 2015-02-18 17:43:37 2015-02-19 17:29:07 2015-02-20 18:16:03 2015-02-23 18:13:52 2015-02-24 17:58:08 2015-02-25 17:57:18 2015-02-26 17:40:40 2015-02-27 18:05:49 2015-03-02 18:41:28 2015-03-03 17:39:12 2015-03-04 18:04:54 NaT 2015-03-06 18:04:19 2015-03-09 18:18:21 2015-03-10 18:06:41 2015-03-11 17:56:40 2015-03-12 17:39:04 2015-03-13 18:15:55 2015-03-16 17:46:49 2015-03-17 18:13:29 2015-03-18 17:55:43 2015-03-19 17:51:13 2015-03-20 17:32:26 2015-03-23 17:53:16 2015-03-24 17:18:09 2015-03-25 18:07:26 2015-03-26 17:53:06 2015-03-27 17:41:37 2015-03-30 17:41:26 2015-03-31 18:16:42 2015-04-01 17:48:19 2015-04-02 18:00:21 2015-04-03 17:41:24 2015-04-06 18:26:22 2015-04-07 17:51:27 2015-04-08 17:43:22 2015-04-09 18:06:31 2015-04-10 18:06:04 2015-04-13 17:51:39 2015-04-14 18:29:43 2015-04-15 18:02:54 2015-04-16 17:59:12 2015-04-17 17:55:12 2015-04-20 18:14:32 NaT 2015-04-22 17:38:59 2015-04-23 17:50:36 2015-04-24 18:08:50 2015-04-27 18:58:57 2015-04-28 17:10:52 2015-04-29 18:00:03 2015-04-30 18:07:51 NaT 2015-05-04 17:36:37 2015-05-05 18:29:57 2015-05-06 19:01:52 2015-05-07 17:27:09 2015-05-08 18:00:47 2015-05-11 17:30:39 2015-05-12 18:24:49 2015-05-13 17:38:22 2015-05-14 18:05:41 2015-05-15 17:15:55 2015-05-18 17:39:56 2015-05-19 18:00:22 2015-05-20 18:17:20 2015-05-21 17:37:11 2015-05-22 18:23:13 2015-05-25 19:05:36 2015-05-26 18:14:29 2015-05-27 18:03:27 
2015-05-28 17:36:50 2015-05-29 18:00:28 2015-06-01 18:26:09 2015-06-02 17:55:32 2015-06-03 18:16:53 2015-06-04 17:24:37 2015-06-05 18:01:10 2015-06-08 18:04:36 2015-06-09 18:25:23 2015-06-10 18:07:35 NaT 2015-06-12 18:43:31 2015-06-15 17:52:52 2015-06-16 18:31:37 2015-06-17 18:08:18 2015-06-18 18:42:59 2015-06-19 17:23:54 2015-06-22 18:46:28 2015-06-23 17:40:27 2015-06-24 17:45:55 2015-06-25 17:53:28 2015-06-26 17:44:16 2015-06-29 18:05:00 2015-06-30 18:10:45 2015-07-01 17:31:06 2015-07-02 17:23:40 2015-07-03 17:43:59 2015-07-06 17:11:03 2015-07-07 18:19:10 2015-07-08 17:36:00 2015-07-09 18:43:20 2015-07-10 18:21:59 2015-07-13 17:56:12 2015-07-14 17:26:34 2015-07-15 17:53:49 2015-07-16 17:22:12 NaT 2015-07-20 18:06:55 2015-07-21 18:01:34 2015-07-22 17:40:42 2015-07-23 17:55:48 2015-07-24 17:50:56 2015-07-27 18:23:29 2015-07-28 18:21:29 2015-07-29 18:35:09 2015-07-30 18:32:28 2015-07-31 17:50:13 2015-08-03 18:07:57 2015-08-04 18:06:41 2015-08-05 17:25:16 2015-08-06 17:32:20 2015-08-07 17:36:27 2015-08-10 18:05:13 2015-08-11 18:33:12 NaT 2015-08-13 18:29:03 2015-08-14 17:53:07 2015-08-17 18:31:46 2015-08-18 17:28:23 2015-08-19 18:01:59 2015-08-20 17:34:12 2015-08-21 17:55:17 2015-08-24 18:14:19 2015-08-25 18:50:22 2015-08-26 17:48:10 2015-08-27 17:47:07 2015-08-28 17:37:21 2015-08-31 17:38:06 2015-09-01 17:28:46 2015-09-02 18:37:08 2015-09-03 19:22:01 2015-09-04 18:11:10 2015-09-07 17:35:38 2015-09-08 18:10:45 2015-09-09 17:50:46 2015-09-10 18:06:16 2015-09-11 17:27:08 2015-09-14 17:57:45 2015-09-15 17:31:24 2015-09-16 17:42:22 NaT 2015-09-18 17:20:26 2015-09-21 17:38:34 2015-09-22 18:04:31 2015-09-23 18:02:25 2015-09-24 18:01:54 2015-09-25 18:19:49 2015-09-28 18:09:08 2015-09-29 18:07:01 2015-09-30 17:51:17 2015-10-01 17:37:41 NaT 2015-10-05 18:31:59 2015-10-06 17:42:05 2015-10-07 17:28:47 2015-10-08 17:51:33 2015-10-09 17:17:12 2015-10-12 18:27:44 2015-10-13 17:41:27 2015-10-14 17:58:58 2015-10-15 17:56:00 2015-10-16 17:47:56 2015-10-19 17:58:30 2015-10-20 18:24:21 
2015-10-21 17:59:41 2015-10-22 18:14:04 2015-10-23 18:26:48 2015-10-26 17:42:45 2015-10-27 18:11:11 2015-10-28 18:04:54 2015-10-29 18:52:40 2015-10-30 17:29:23 2015-11-02 18:45:59 2015-11-03 17:45:51 2015-11-04 18:12:50 2015-11-05 18:22:40 2015-11-06 18:09:24 NaT NaT NaT 2015-11-12 18:01:24 2015-11-13 17:47:07 2015-11-16 18:13:04 2015-11-17 17:25:29 2015-11-18 18:02:00 2015-11-19 17:43:56 2015-11-20 18:32:04 2015-11-23 18:38:54 2015-11-24 18:28:51 2015-11-25 17:55:41 2015-11-26 17:32:31 2015-11-27 17:58:08 2015-11-30 18:26:15 2015-12-01 18:07:48 2015-12-02 17:02:52 2015-12-03 17:28:55 2015-12-04 17:47:17 2015-12-07 18:07:48 2015-12-08 17:50:08 2015-12-09 17:53:44 2015-12-10 18:14:09 2015-12-11 19:08:11 2015-12-14 17:55:40 2015-12-15 18:07:30 2015-12-16 18:17:11 2015-12-17 18:05:47 2015-12-18 17:52:48 2015-12-21 17:43:35 2015-12-22 18:07:57 2015-12-23 18:00:49 2015-12-24 17:59:22 NaT 2015-12-28 17:44:59 2015-12-29 18:47:00 2015-12-30 17:15:33 2015-12-31 17:42:14
In [14]:
df_out_time.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4410 entries, 1 to 4410
Columns: 261 entries, 2015-01-01 to 2015-12-31
dtypes: datetime64[ns](261)
memory usage: 8.8 MB
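Before the two time tables are subtracted from each other, it may be worth confirming that they line up cell by cell. The sketch below is a minimal illustration on toy data, not part of the original notebook: the helper `check_alignment` and the frames `toy_in` and `toy_out` are hypothetical names, and the timestamps are made up. It checks that both tables share the same employees (index) and days (columns), and that a missing clock-in always coincides with a missing clock-out.

```python
import pandas as pd

def check_alignment(df_in, df_out):
    """Compare two time tables (hypothetical helper, for illustration only)."""
    return {
        # Same employees in the same order
        "same_index": df_in.index.equals(df_out.index),
        # Same working days in the same order
        "same_columns": df_in.columns.equals(df_out.columns),
        # NaT appears in exactly the same cells of both tables
        "same_missing": df_in.isna().equals(df_out.isna()),
    }

# Toy data with made-up timestamps: two employees, two days
toy_in = pd.DataFrame({
    "2015-01-01": pd.to_datetime([None, None]),
    "2015-01-02": pd.to_datetime(["2015-01-02 09:00:00", "2015-01-02 09:30:00"]),
}, index=[1, 2])
toy_out = pd.DataFrame({
    "2015-01-01": pd.to_datetime([None, None]),
    "2015-01-02": pd.to_datetime(["2015-01-02 17:00:00", "2015-01-02 18:00:00"]),
}, index=[1, 2])

alignment = check_alignment(toy_in, toy_out)
print(alignment)
```

On the real data the same call would be `check_alignment(df_in_time, df_out_time)`, assuming `df_in_time` was loaded in the same way as `df_out_time`.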

6 Transforming data

6.1 Processing working time

In [15]:
# Working duration per employee and day, in seconds: clock-out time minus clock-in time.
# Days without a record (NaT in either table) are counted as 0 seconds.
df_work_duration = (df_out_time - df_in_time).replace({np.nan: dt.timedelta(0)}).applymap(pd.Timedelta.total_seconds)
df_work_duration.head()
Out[15]:
2015-01-01 2015-01-02 2015-01-05 2015-01-06 2015-01-07 2015-01-08 2015-01-09 2015-01-12 2015-01-13 2015-01-14 2015-01-15 2015-01-16 2015-01-19 2015-01-20 2015-01-21 2015-01-22 2015-01-23 2015-01-26 2015-01-27 2015-01-28 2015-01-29 2015-01-30 2015-02-02 2015-02-03 2015-02-04 2015-02-05 2015-02-06 2015-02-09 2015-02-10 2015-02-11 2015-02-12 2015-02-13 2015-02-16 2015-02-17 2015-02-18 2015-02-19 2015-02-20 2015-02-23 2015-02-24 2015-02-25 2015-02-26 2015-02-27 2015-03-02 2015-03-03 2015-03-04 2015-03-05 2015-03-06 2015-03-09 2015-03-10 2015-03-11 2015-03-12 2015-03-13 2015-03-16 2015-03-17 2015-03-18 2015-03-19 2015-03-20 2015-03-23 2015-03-24 2015-03-25 2015-03-26 2015-03-27 2015-03-30 2015-03-31 2015-04-01 2015-04-02 2015-04-03 2015-04-06 2015-04-07 2015-04-08 2015-04-09 2015-04-10 2015-04-13 2015-04-14 2015-04-15 2015-04-16 2015-04-17 2015-04-20 2015-04-21 2015-04-22 2015-04-23 2015-04-24 2015-04-27 2015-04-28 2015-04-29 2015-04-30 2015-05-01 2015-05-04 2015-05-05 2015-05-06 2015-05-07 2015-05-08 2015-05-11 2015-05-12 2015-05-13 2015-05-14 2015-05-15 2015-05-18 2015-05-19 2015-05-20 2015-05-21 2015-05-22 2015-05-25 2015-05-26 2015-05-27 2015-05-28 2015-05-29 2015-06-01 2015-06-02 2015-06-03 2015-06-04 2015-06-05 2015-06-08 2015-06-09 2015-06-10 2015-06-11 2015-06-12 2015-06-15 2015-06-16 2015-06-17 2015-06-18 2015-06-19 2015-06-22 2015-06-23 2015-06-24 2015-06-25 2015-06-26 2015-06-29 2015-06-30 2015-07-01 2015-07-02 2015-07-03 2015-07-06 2015-07-07 2015-07-08 2015-07-09 2015-07-10 2015-07-13 2015-07-14 2015-07-15 2015-07-16 2015-07-17 2015-07-20 2015-07-21 2015-07-22 2015-07-23 2015-07-24 2015-07-27 2015-07-28 2015-07-29 2015-07-30 2015-07-31 2015-08-03 2015-08-04 2015-08-05 2015-08-06 2015-08-07 2015-08-10 2015-08-11 2015-08-12 2015-08-13 2015-08-14 2015-08-17 2015-08-18 2015-08-19 2015-08-20 2015-08-21 2015-08-24 2015-08-25 2015-08-26 2015-08-27 2015-08-28 2015-08-31 2015-09-01 2015-09-02 2015-09-03 2015-09-04 2015-09-07 2015-09-08 2015-09-09 2015-09-10 
2015-09-11 2015-09-14 2015-09-15 2015-09-16 2015-09-17 2015-09-18 2015-09-21 2015-09-22 2015-09-23 2015-09-24 2015-09-25 2015-09-28 2015-09-29 2015-09-30 2015-10-01 2015-10-02 2015-10-05 2015-10-06 2015-10-07 2015-10-08 2015-10-09 2015-10-12 2015-10-13 2015-10-14 2015-10-15 2015-10-16 2015-10-19 2015-10-20 2015-10-21 2015-10-22 2015-10-23 2015-10-26 2015-10-27 2015-10-28 2015-10-29 2015-10-30 2015-11-02 2015-11-03 2015-11-04 2015-11-05 2015-11-06 2015-11-09 2015-11-10 2015-11-11 2015-11-12 2015-11-13 2015-11-16 2015-11-17 2015-11-18 2015-11-19 2015-11-20 2015-11-23 2015-11-24 2015-11-25 2015-11-26 2015-11-27 2015-11-30 2015-12-01 2015-12-02 2015-12-03 2015-12-04 2015-12-07 2015-12-08 2015-12-09 2015-12-10 2015-12-11 2015-12-14 2015-12-15 2015-12-16 2015-12-17 2015-12-18 2015-12-21 2015-12-22 2015-12-23 2015-12-24 2015-12-25 2015-12-28 2015-12-29 2015-12-30 2015-12-31
1 0.000 25,950.000 25,883.000 26,679.000 25,224.000 26,243.000 26,944.000 26,146.000 28,192.000 0.000 26,449.000 26,163.000 0.000 24,393.000 25,542.000 25,382.000 27,377.000 0.000 27,465.000 25,628.000 26,689.000 24,659.000 24,846.000 25,931.000 27,379.000 27,235.000 26,892.000 27,365.000 26,162.000 25,895.000 26,767.000 25,939.000 27,381.000 26,698.000 28,223.000 0.000 28,196.000 26,788.000 26,010.000 27,301.000 25,923.000 26,589.000 25,765.000 24,335.000 27,880.000 0.000 28,134.000 26,671.000 24,923.000 25,780.000 25,488.000 26,317.000 0.000 24,894.000 25,912.000 28,252.000 25,785.000 27,419.000 26,107.000 26,804.000 0.000 27,201.000 26,485.000 28,314.000 26,413.000 27,571.000 25,876.000 25,347.000 27,504.000 26,738.000 28,091.000 25,617.000 26,453.000 25,728.000 26,855.000 27,925.000 26,225.000 27,703.000 25,112.000 27,090.000 26,411.000 27,225.000 26,071.000 28,401.000 27,690.000 27,029.000 0.000 26,678.000 26,311.000 26,844.000 25,426.000 27,414.000 0.000 26,801.000 27,009.000 28,446.000 24,541.000 0.000 26,399.000 26,350.000 26,875.000 25,664.000 27,263.000 25,919.000 28,360.000 26,812.000 0.000 0.000 27,782.000 27,051.000 25,351.000 0.000 25,462.000 0.000 25,471.000 27,270.000 26,717.000 27,343.000 27,152.000 27,094.000 27,077.000 26,096.000 25,842.000 24,886.000 25,606.000 27,913.000 25,388.000 27,582.000 26,291.000 27,805.000 27,426.000 25,772.000 28,611.000 27,545.000 26,973.000 24,973.000 26,527.000 26,195.000 25,660.000 27,448.000 26,186.000 0.000 25,514.000 26,114.000 28,364.000 24,054.000 26,930.000 26,904.000 24,138.000 25,735.000 26,036.000 26,570.000 25,018.000 25,844.000 27,629.000 27,020.000 26,729.000 27,374.000 27,092.000 27,399.000 26,528.000 25,710.000 25,400.000 27,462.000 24,915.000 25,989.000 27,831.000 26,959.000 26,700.000 27,733.000 27,680.000 27,197.000 25,212.000 0.000 25,874.000 26,761.000 27,106.000 26,580.000 26,439.000 27,108.000 26,554.000 26,403.000 24,726.000 25,682.000 26,400.000 0.000 26,828.000 26,750.000 27,554.000 
25,358.000 26,354.000 27,836.000 27,520.000 27,332.000 26,522.000 28,036.000 0.000 25,301.000 27,651.000 26,635.000 26,957.000 27,380.000 25,778.000 0.000 26,959.000 27,262.000 24,559.000 27,170.000 25,338.000 0.000 0.000 25,813.000 27,886.000 25,383.000 27,732.000 25,627.000 26,327.000 24,705.000 27,341.000 25,000.000 27,498.000 25,185.000 0.000 0.000 0.000 25,843.000 26,378.000 28,026.000 25,543.000 26,678.000 25,179.000 28,145.000 26,345.000 27,314.000 26,700.000 26,764.000 25,513.000 25,247.000 26,514.000 27,272.000 27,304.000 25,633.000 24,802.000 28,587.000 28,614.000 25,909.000 27,547.000 27,140.000 0.000 27,187.000 0.000 0.000 26,421.000 26,625.000 23,417.000 27,347.000 0.000 27,986.000 26,334.000 28,004.000 25,489.000
2 0.000 29,193.000 26,837.000 0.000 26,629.000 26,700.000 25,743.000 27,401.000 26,204.000 0.000 27,407.000 27,819.000 27,279.000 27,370.000 28,460.000 26,555.000 28,004.000 0.000 26,882.000 25,883.000 26,134.000 25,413.000 27,472.000 27,477.000 27,519.000 27,494.000 28,443.000 0.000 28,924.000 27,496.000 28,672.000 27,840.000 27,717.000 26,665.000 28,454.000 28,486.000 26,727.000 29,720.000 28,080.000 27,908.000 28,248.000 29,233.000 27,654.000 27,697.000 28,539.000 0.000 26,423.000 29,268.000 27,686.000 28,752.000 27,081.000 28,151.000 25,939.000 26,514.000 26,163.000 27,711.000 26,749.000 28,613.000 0.000 26,355.000 28,532.000 27,987.000 26,647.000 28,702.000 27,822.000 27,914.000 27,423.000 26,537.000 26,904.000 26,579.000 25,516.000 28,516.000 29,494.000 29,427.000 26,910.000 26,058.000 27,860.000 28,023.000 30,480.000 28,813.000 28,649.000 27,117.000 27,501.000 27,384.000 0.000 29,095.000 0.000 28,055.000 27,785.000 28,289.000 27,390.000 25,845.000 27,094.000 28,502.000 26,929.000 0.000 26,631.000 0.000 29,096.000 25,935.000 27,773.000 28,882.000 29,087.000 29,229.000 28,685.000 26,200.000 28,990.000 29,186.000 27,419.000 28,057.000 24,915.000 27,591.000 27,301.000 26,747.000 29,131.000 28,729.000 0.000 28,981.000 27,961.000 29,612.000 28,381.000 27,743.000 28,803.000 28,058.000 27,103.000 28,605.000 27,647.000 28,686.000 25,601.000 27,721.000 28,108.000 29,318.000 27,790.000 28,255.000 28,882.000 25,587.000 27,733.000 0.000 27,040.000 27,513.000 29,675.000 0.000 26,939.000 27,518.000 28,105.000 28,562.000 28,303.000 0.000 28,337.000 28,576.000 26,560.000 26,111.000 27,990.000 28,275.000 28,126.000 30,761.000 28,531.000 27,990.000 26,120.000 27,946.000 29,039.000 28,244.000 29,510.000 26,955.000 28,417.000 27,585.000 27,097.000 26,914.000 26,920.000 26,965.000 28,415.000 27,244.000 28,212.000 24,869.000 26,996.000 28,473.000 28,125.000 29,496.000 0.000 27,283.000 28,409.000 26,945.000 26,848.000 26,692.000 28,208.000 0.000 26,656.000 27,733.000 26,154.000 
27,727.000 29,102.000 29,302.000 27,706.000 28,907.000 28,338.000 26,724.000 0.000 29,158.000 25,416.000 28,617.000 29,703.000 27,472.000 25,521.000 29,889.000 27,256.000 0.000 27,472.000 28,588.000 27,258.000 25,481.000 30,812.000 24,211.000 29,338.000 28,476.000 27,170.000 28,326.000 27,911.000 26,672.000 27,744.000 28,558.000 26,509.000 26,760.000 0.000 0.000 0.000 27,556.000 29,535.000 29,703.000 28,805.000 28,544.000 0.000 28,299.000 27,692.000 27,716.000 27,072.000 28,454.000 28,637.000 27,114.000 26,685.000 29,223.000 28,943.000 28,738.000 27,032.000 27,777.000 28,704.000 28,158.000 26,849.000 27,130.000 26,767.000 29,719.000 28,542.000 28,451.000 27,914.000 27,764.000 26,768.000 0.000 0.000 27,413.000 28,737.000 28,750.000 29,618.000
3 0.000 24,093.000 26,156.000 23,059.000 24,354.000 26,442.000 24,701.000 26,707.000 25,199.000 0.000 26,778.000 25,957.000 25,460.000 24,914.000 24,490.000 26,923.000 24,918.000 0.000 26,097.000 22,701.000 25,999.000 25,287.000 25,456.000 23,473.000 26,511.000 22,890.000 24,864.000 23,975.000 24,571.000 24,429.000 23,241.000 25,721.000 25,282.000 24,067.000 25,369.000 25,057.000 24,791.000 27,917.000 24,306.000 24,985.000 27,191.000 24,678.000 26,975.000 25,396.000 25,879.000 0.000 24,376.000 24,649.000 24,743.000 25,487.000 26,192.000 24,557.000 24,338.000 25,635.000 26,340.000 23,564.000 0.000 24,363.000 24,841.000 25,135.000 26,706.000 0.000 24,172.000 26,922.000 24,883.000 24,600.000 24,294.000 24,859.000 26,046.000 25,057.000 26,862.000 24,319.000 26,156.000 26,039.000 26,175.000 25,418.000 27,402.000 26,818.000 25,707.000 26,806.000 0.000 24,892.000 25,210.000 26,435.000 25,069.000 25,382.000 0.000 24,435.000 22,789.000 25,602.000 24,741.000 24,737.000 25,317.000 25,567.000 24,592.000 26,194.000 23,797.000 26,056.000 26,691.000 26,772.000 25,618.000 24,598.000 26,208.000 22,557.000 25,745.000 22,706.000 24,277.000 24,440.000 25,839.000 26,329.000 24,530.000 24,064.000 27,506.000 24,962.000 26,411.000 25,960.000 25,631.000 25,550.000 24,895.000 25,179.000 26,132.000 24,985.000 23,820.000 25,714.000 0.000 23,655.000 25,406.000 24,374.000 23,722.000 24,179.000 24,667.000 24,943.000 28,679.000 24,347.000 24,693.000 28,777.000 25,377.000 22,856.000 26,620.000 25,942.000 24,968.000 0.000 24,915.000 25,941.000 23,920.000 24,994.000 24,689.000 25,615.000 27,821.000 26,617.000 25,124.000 25,961.000 25,364.000 24,488.000 0.000 24,784.000 24,884.000 26,091.000 24,395.000 25,356.000 24,297.000 25,374.000 24,832.000 26,255.000 25,320.000 24,292.000 25,241.000 25,004.000 24,133.000 23,962.000 25,009.000 23,941.000 24,265.000 26,442.000 26,051.000 25,726.000 25,491.000 24,572.000 25,711.000 24,479.000 27,149.000 23,731.000 26,505.000 24,426.000 24,969.000 0.000 25,140.000 
25,982.000 24,781.000 25,791.000 25,326.000 25,119.000 28,123.000 26,782.000 24,335.000 26,253.000 0.000 25,271.000 27,064.000 25,784.000 24,953.000 24,898.000 25,510.000 24,092.000 25,501.000 24,449.000 25,254.000 23,596.000 25,095.000 25,209.000 25,284.000 24,727.000 26,095.000 27,097.000 23,258.000 24,494.000 23,867.000 27,685.000 25,352.000 24,132.000 24,075.000 23,600.000 0.000 0.000 0.000 26,517.000 25,185.000 24,487.000 26,255.000 26,169.000 22,314.000 27,219.000 24,998.000 23,991.000 25,429.000 26,846.000 0.000 25,120.000 24,973.000 24,255.000 24,361.000 25,447.000 24,475.000 26,965.000 27,126.000 0.000 25,918.000 25,547.000 24,304.000 24,850.000 25,986.000 24,429.000 25,789.000 24,486.000 24,229.000 24,659.000 0.000 25,286.000 26,780.000 27,140.000 24,431.000
4 0.000 26,418.000 26,251.000 24,995.000 24,910.000 24,663.000 25,895.000 25,195.000 26,303.000 0.000 24,756.000 24,868.000 23,467.000 25,844.000 24,138.000 25,241.000 25,725.000 0.000 26,117.000 27,163.000 24,873.000 25,991.000 25,444.000 26,770.000 25,118.000 25,745.000 25,723.000 26,402.000 0.000 25,419.000 26,428.000 25,724.000 26,461.000 25,598.000 24,741.000 27,085.000 24,112.000 25,847.000 25,431.000 22,610.000 26,489.000 26,150.000 0.000 26,652.000 23,269.000 0.000 0.000 25,766.000 26,630.000 26,976.000 27,447.000 24,542.000 24,208.000 25,950.000 25,595.000 25,223.000 25,943.000 25,012.000 24,936.000 26,446.000 28,126.000 0.000 25,352.000 27,408.000 27,035.000 24,890.000 25,450.000 26,633.000 25,400.000 25,990.000 25,383.000 25,099.000 24,603.000 0.000 23,384.000 25,485.000 25,144.000 26,301.000 25,480.000 26,126.000 27,493.000 26,797.000 26,547.000 24,937.000 25,719.000 25,635.000 0.000 27,014.000 24,218.000 28,238.000 25,763.000 27,772.000 25,685.000 26,188.000 25,829.000 26,148.000 25,727.000 24,793.000 27,031.000 23,843.000 25,957.000 0.000 25,965.000 26,543.000 27,297.000 27,604.000 25,612.000 25,076.000 28,086.000 26,340.000 26,825.000 24,434.000 26,219.000 25,927.000 26,116.000 26,760.000 25,636.000 26,398.000 24,859.000 26,165.000 27,907.000 26,274.000 27,248.000 24,616.000 26,878.000 25,043.000 25,693.000 25,750.000 25,918.000 24,714.000 26,537.000 27,027.000 26,556.000 27,196.000 25,010.000 26,190.000 27,410.000 26,560.000 27,914.000 26,937.000 26,087.000 0.000 24,713.000 24,578.000 0.000 26,664.000 26,616.000 26,076.000 25,242.000 28,269.000 26,035.000 26,188.000 27,552.000 26,222.000 0.000 25,379.000 26,640.000 26,777.000 27,560.000 25,832.000 24,389.000 26,696.000 25,732.000 27,620.000 26,322.000 26,191.000 24,634.000 0.000 25,828.000 25,520.000 24,023.000 0.000 25,143.000 26,906.000 25,165.000 26,259.000 25,533.000 25,353.000 26,151.000 26,453.000 26,230.000 26,131.000 28,407.000 24,508.000 25,503.000 0.000 27,059.000 25,616.000 25,155.000 
25,452.000 27,323.000 0.000 25,239.000 24,895.000 24,675.000 24,818.000 0.000 0.000 25,818.000 27,001.000 26,993.000 24,732.000 26,222.000 25,606.000 24,897.000 25,815.000 24,483.000 25,556.000 26,958.000 25,376.000 24,457.000 27,032.000 26,760.000 24,815.000 25,429.000 25,413.000 0.000 25,475.000 26,080.000 24,572.000 27,047.000 0.000 0.000 0.000 0.000 27,379.000 24,249.000 25,259.000 23,730.000 26,224.000 27,346.000 27,139.000 25,678.000 28,112.000 25,681.000 24,837.000 25,022.000 26,316.000 26,203.000 25,447.000 25,966.000 24,884.000 26,657.000 26,215.000 24,759.000 27,047.000 25,080.000 25,401.000 25,198.000 26,327.000 26,841.000 27,465.000 24,648.000 26,375.000 26,689.000 25,506.000 0.000 26,810.000 26,700.000 26,519.000 25,679.000
5 0.000 29,000.000 28,757.000 27,657.000 28,102.000 27,584.000 27,798.000 30,114.000 29,726.000 0.000 29,736.000 30,253.000 0.000 28,135.000 29,603.000 29,847.000 26,317.000 0.000 28,544.000 28,543.000 29,085.000 28,903.000 29,394.000 28,904.000 28,788.000 29,126.000 27,751.000 29,472.000 27,644.000 28,669.000 27,615.000 31,394.000 28,520.000 28,449.000 29,139.000 28,246.000 29,631.000 28,090.000 29,228.000 28,916.000 27,755.000 28,682.000 31,295.000 28,621.000 29,062.000 0.000 29,458.000 30,083.000 29,033.000 29,317.000 29,101.000 28,987.000 28,043.000 28,621.000 29,182.000 27,864.000 27,970.000 27,629.000 27,631.000 28,230.000 28,025.000 28,434.000 27,717.000 28,094.000 28,935.000 29,166.000 28,180.000 30,080.000 28,935.000 27,295.000 29,324.000 28,608.000 28,337.000 30,701.000 29,064.000 29,406.000 28,364.000 29,956.000 0.000 26,973.000 28,189.000 28,232.000 31,214.000 26,975.000 27,532.000 28,026.000 0.000 28,479.000 30,012.000 30,794.000 26,505.000 27,626.000 28,048.000 31,072.000 26,905.000 28,671.000 28,122.000 27,077.000 29,942.000 28,580.000 28,278.000 29,358.000 31,121.000 30,604.000 29,208.000 27,892.000 29,949.000 29,023.000 28,077.000 29,492.000 27,458.000 29,598.000 29,367.000 28,105.000 28,131.000 0.000 29,854.000 29,267.000 29,094.000 30,059.000 30,550.000 27,401.000 31,340.000 28,156.000 27,179.000 28,443.000 29,669.000 28,588.000 29,221.000 27,751.000 28,722.000 27,358.000 27,002.000 28,532.000 28,141.000 28,562.000 29,723.000 28,562.000 28,179.000 29,368.000 28,033.000 0.000 27,647.000 29,966.000 27,758.000 27,968.000 29,260.000 30,264.000 27,474.000 27,614.000 30,156.000 28,833.000 27,890.000 29,871.000 28,323.000 27,377.000 28,084.000 28,348.000 30,082.000 0.000 30,172.000 29,081.000 29,541.000 27,189.000 29,797.000 28,402.000 28,977.000 29,695.000 30,336.000 29,286.000 27,991.000 27,697.000 26,926.000 27,628.000 30,054.000 31,782.000 29,204.000 28,115.000 28,552.000 29,124.000 30,252.000 27,938.000 29,347.000 28,031.000 28,025.000 0.000 
27,309.000 28,246.000 28,571.000 29,890.000 29,572.000 30,814.000 29,473.000 29,797.000 28,171.000 27,543.000 0.000 30,621.000 28,091.000 27,587.000 28,755.000 27,607.000 31,200.000 28,619.000 28,388.000 29,244.000 28,112.000 27,417.000 30,603.000 29,938.000 29,951.000 29,162.000 29,751.000 29,937.000 28,749.000 29,092.000 27,315.000 30,275.000 27,082.000 29,220.000 29,261.000 29,552.000 0.000 0.000 0.000 30,055.000 28,315.000 28,633.000 28,460.000 28,570.000 27,372.000 30,830.000 30,403.000 30,755.000 28,406.000 27,460.000 29,240.000 30,373.000 28,480.000 26,640.000 27,598.000 28,326.000 29,174.000 30,726.000 28,275.000 28,516.000 31,305.000 27,322.000 29,495.000 29,266.000 29,952.000 28,453.000 27,594.000 28,647.000 28,033.000 29,698.000 0.000 27,584.000 29,767.000 28,632.000 28,865.000
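The duration calculation above can be traced on a small example. The sketch below uses made-up timestamps and hypothetical names (`toy_in`, `toy_out`, `toy_seconds`); it is an alternative formulation, not the notebook's own code. It converts each column with `.dt.total_seconds()` and fills missing days with 0 afterwards, which avoids `DataFrame.applymap` (deprecated since pandas 2.1 in favour of `DataFrame.map`).

```python
import pandas as pd

# Toy clock-in and clock-out tables (made-up values): rows are employees,
# columns are working days, NaT marks a day without a record.
toy_in = pd.DataFrame({
    "2015-01-01": pd.to_datetime([None, None]),
    "2015-01-02": pd.to_datetime(["2015-01-02 09:00:00", "2015-01-02 09:30:00"]),
    "2015-01-05": pd.to_datetime(["2015-01-05 10:00:00", None]),
}, index=[1, 2])
toy_out = pd.DataFrame({
    "2015-01-01": pd.to_datetime([None, None]),
    "2015-01-02": pd.to_datetime(["2015-01-02 17:00:00", "2015-01-02 18:00:00"]),
    "2015-01-05": pd.to_datetime(["2015-01-05 17:30:00", None]),
}, index=[1, 2])

# Element-wise difference gives timedeltas; NaT minus NaT stays NaT
toy_delta = toy_out - toy_in

# Convert each column to seconds, then treat missing days as 0 seconds worked
toy_seconds = toy_delta.apply(lambda col: col.dt.total_seconds()).fillna(0.0)
print(toy_seconds)
```

Filling with 0 treats holidays and leave days as days with no working time, which is the same choice the notebook makes. If those days should instead be excluded from later averages, the `fillna` step would need to be dropped.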
In [16]:
# Moving the employee ID from the index into an explicit EmployeeID column
df_work_duration.reset_index(inplace=True)
df_work_duration.rename(columns={'index': 'EmployeeID'}, inplace=True)
df_work_duration
Out[16]:
EmployeeID 2015-01-01 2015-01-02 2015-01-05 2015-01-06 2015-01-07 2015-01-08 2015-01-09 2015-01-12 2015-01-13 2015-01-14 2015-01-15 2015-01-16 2015-01-19 2015-01-20 2015-01-21 2015-01-22 2015-01-23 2015-01-26 2015-01-27 2015-01-28 2015-01-29 2015-01-30 2015-02-02 2015-02-03 2015-02-04 2015-02-05 2015-02-06 2015-02-09 2015-02-10 2015-02-11 2015-02-12 2015-02-13 2015-02-16 2015-02-17 2015-02-18 2015-02-19 2015-02-20 2015-02-23 2015-02-24 2015-02-25 2015-02-26 2015-02-27 2015-03-02 2015-03-03 2015-03-04 2015-03-05 2015-03-06 2015-03-09 2015-03-10 2015-03-11 2015-03-12 2015-03-13 2015-03-16 2015-03-17 2015-03-18 2015-03-19 2015-03-20 2015-03-23 2015-03-24 2015-03-25 2015-03-26 2015-03-27 2015-03-30 2015-03-31 2015-04-01 2015-04-02 2015-04-03 2015-04-06 2015-04-07 2015-04-08 2015-04-09 2015-04-10 2015-04-13 2015-04-14 2015-04-15 2015-04-16 2015-04-17 2015-04-20 2015-04-21 2015-04-22 2015-04-23 2015-04-24 2015-04-27 2015-04-28 2015-04-29 2015-04-30 2015-05-01 2015-05-04 2015-05-05 2015-05-06 2015-05-07 2015-05-08 2015-05-11 2015-05-12 2015-05-13 2015-05-14 2015-05-15 2015-05-18 2015-05-19 2015-05-20 2015-05-21 2015-05-22 2015-05-25 2015-05-26 2015-05-27 2015-05-28 2015-05-29 2015-06-01 2015-06-02 2015-06-03 2015-06-04 2015-06-05 2015-06-08 2015-06-09 2015-06-10 2015-06-11 2015-06-12 2015-06-15 2015-06-16 2015-06-17 2015-06-18 2015-06-19 2015-06-22 2015-06-23 2015-06-24 2015-06-25 2015-06-26 2015-06-29 2015-06-30 2015-07-01 2015-07-02 2015-07-03 2015-07-06 2015-07-07 2015-07-08 2015-07-09 2015-07-10 2015-07-13 2015-07-14 2015-07-15 2015-07-16 2015-07-17 2015-07-20 2015-07-21 2015-07-22 2015-07-23 2015-07-24 2015-07-27 2015-07-28 2015-07-29 2015-07-30 2015-07-31 2015-08-03 2015-08-04 2015-08-05 2015-08-06 2015-08-07 2015-08-10 2015-08-11 2015-08-12 2015-08-13 2015-08-14 2015-08-17 2015-08-18 2015-08-19 2015-08-20 2015-08-21 2015-08-24 2015-08-25 2015-08-26 2015-08-27 2015-08-28 2015-08-31 2015-09-01 2015-09-02 2015-09-03 2015-09-04 2015-09-07 2015-09-08 2015-09-09 
2015-09-10 2015-09-11 2015-09-14 2015-09-15 2015-09-16 2015-09-17 2015-09-18 2015-09-21 2015-09-22 2015-09-23 2015-09-24 2015-09-25 2015-09-28 2015-09-29 2015-09-30 2015-10-01 2015-10-02 2015-10-05 2015-10-06 2015-10-07 2015-10-08 2015-10-09 2015-10-12 2015-10-13 2015-10-14 2015-10-15 2015-10-16 2015-10-19 2015-10-20 2015-10-21 2015-10-22 2015-10-23 2015-10-26 2015-10-27 2015-10-28 2015-10-29 2015-10-30 2015-11-02 2015-11-03 2015-11-04 2015-11-05 2015-11-06 2015-11-09 2015-11-10 2015-11-11 2015-11-12 2015-11-13 2015-11-16 2015-11-17 2015-11-18 2015-11-19 2015-11-20 2015-11-23 2015-11-24 2015-11-25 2015-11-26 2015-11-27 2015-11-30 2015-12-01 2015-12-02 2015-12-03 2015-12-04 2015-12-07 2015-12-08 2015-12-09 2015-12-10 2015-12-11 2015-12-14 2015-12-15 2015-12-16 2015-12-17 2015-12-18 2015-12-21 2015-12-22 2015-12-23 2015-12-24 2015-12-25 2015-12-28 2015-12-29 2015-12-30 2015-12-31
0 1 0.000 25,950.000 25,883.000 26,679.000 25,224.000 26,243.000 26,944.000 26,146.000 28,192.000 0.000 26,449.000 26,163.000 0.000 24,393.000 25,542.000 25,382.000 27,377.000 0.000 27,465.000 25,628.000 26,689.000 24,659.000 24,846.000 25,931.000 27,379.000 27,235.000 26,892.000 27,365.000 26,162.000 25,895.000 26,767.000 25,939.000 27,381.000 26,698.000 28,223.000 0.000 28,196.000 26,788.000 26,010.000 27,301.000 25,923.000 26,589.000 25,765.000 24,335.000 27,880.000 0.000 28,134.000 26,671.000 24,923.000 25,780.000 25,488.000 26,317.000 0.000 24,894.000 25,912.000 28,252.000 25,785.000 27,419.000 26,107.000 26,804.000 0.000 27,201.000 26,485.000 28,314.000 26,413.000 27,571.000 25,876.000 25,347.000 27,504.000 26,738.000 28,091.000 25,617.000 26,453.000 25,728.000 26,855.000 27,925.000 26,225.000 27,703.000 25,112.000 27,090.000 26,411.000 27,225.000 26,071.000 28,401.000 27,690.000 27,029.000 0.000 26,678.000 26,311.000 26,844.000 25,426.000 27,414.000 0.000 26,801.000 27,009.000 28,446.000 24,541.000 0.000 26,399.000 26,350.000 26,875.000 25,664.000 27,263.000 25,919.000 28,360.000 26,812.000 0.000 0.000 27,782.000 27,051.000 25,351.000 0.000 25,462.000 0.000 25,471.000 27,270.000 26,717.000 27,343.000 27,152.000 27,094.000 27,077.000 26,096.000 25,842.000 24,886.000 25,606.000 27,913.000 25,388.000 27,582.000 26,291.000 27,805.000 27,426.000 25,772.000 28,611.000 27,545.000 26,973.000 24,973.000 26,527.000 26,195.000 25,660.000 27,448.000 26,186.000 0.000 25,514.000 26,114.000 28,364.000 24,054.000 26,930.000 26,904.000 24,138.000 25,735.000 26,036.000 26,570.000 25,018.000 25,844.000 27,629.000 27,020.000 26,729.000 27,374.000 27,092.000 27,399.000 26,528.000 25,710.000 25,400.000 27,462.000 24,915.000 25,989.000 27,831.000 26,959.000 26,700.000 27,733.000 27,680.000 27,197.000 25,212.000 0.000 25,874.000 26,761.000 27,106.000 26,580.000 26,439.000 27,108.000 26,554.000 26,403.000 24,726.000 25,682.000 26,400.000 0.000 26,828.000 26,750.000 27,554.000 
25,358.000 26,354.000 27,836.000 27,520.000 27,332.000 26,522.000 28,036.000 0.000 25,301.000 27,651.000 26,635.000 26,957.000 27,380.000 25,778.000 0.000 26,959.000 27,262.000 24,559.000 27,170.000 25,338.000 0.000 0.000 25,813.000 27,886.000 25,383.000 27,732.000 25,627.000 26,327.000 24,705.000 27,341.000 25,000.000 27,498.000 25,185.000 0.000 0.000 0.000 25,843.000 26,378.000 28,026.000 25,543.000 26,678.000 25,179.000 28,145.000 26,345.000 27,314.000 26,700.000 26,764.000 25,513.000 25,247.000 26,514.000 27,272.000 27,304.000 25,633.000 24,802.000 28,587.000 28,614.000 25,909.000 27,547.000 27,140.000 0.000 27,187.000 0.000 0.000 26,421.000 26,625.000 23,417.000 27,347.000 0.000 27,986.000 26,334.000 28,004.000 25,489.000
1 2 0.000 29,193.000 26,837.000 0.000 26,629.000 26,700.000 25,743.000 27,401.000 26,204.000 0.000 27,407.000 27,819.000 27,279.000 27,370.000 28,460.000 26,555.000 28,004.000 0.000 26,882.000 25,883.000 26,134.000 25,413.000 27,472.000 27,477.000 27,519.000 27,494.000 28,443.000 0.000 28,924.000 27,496.000 28,672.000 27,840.000 27,717.000 26,665.000 28,454.000 28,486.000 26,727.000 29,720.000 28,080.000 27,908.000 28,248.000 29,233.000 27,654.000 27,697.000 28,539.000 0.000 26,423.000 29,268.000 27,686.000 28,752.000 27,081.000 28,151.000 25,939.000 26,514.000 26,163.000 27,711.000 26,749.000 28,613.000 0.000 26,355.000 28,532.000 27,987.000 26,647.000 28,702.000 27,822.000 27,914.000 27,423.000 26,537.000 26,904.000 26,579.000 25,516.000 28,516.000 29,494.000 29,427.000 26,910.000 26,058.000 27,860.000 28,023.000 30,480.000 28,813.000 28,649.000 27,117.000 27,501.000 27,384.000 0.000 29,095.000 0.000 28,055.000 27,785.000 28,289.000 27,390.000 25,845.000 27,094.000 28,502.000 26,929.000 0.000 26,631.000 0.000 29,096.000 25,935.000 27,773.000 28,882.000 29,087.000 29,229.000 28,685.000 26,200.000 28,990.000 29,186.000 27,419.000 28,057.000 24,915.000 27,591.000 27,301.000 26,747.000 29,131.000 28,729.000 0.000 28,981.000 27,961.000 29,612.000 28,381.000 27,743.000 28,803.000 28,058.000 27,103.000 28,605.000 27,647.000 28,686.000 25,601.000 27,721.000 28,108.000 29,318.000 27,790.000 28,255.000 28,882.000 25,587.000 27,733.000 0.000 27,040.000 27,513.000 29,675.000 0.000 26,939.000 27,518.000 28,105.000 28,562.000 28,303.000 0.000 28,337.000 28,576.000 26,560.000 26,111.000 27,990.000 28,275.000 28,126.000 30,761.000 28,531.000 27,990.000 26,120.000 27,946.000 29,039.000 28,244.000 29,510.000 26,955.000 28,417.000 27,585.000 27,097.000 26,914.000 26,920.000 26,965.000 28,415.000 27,244.000 28,212.000 24,869.000 26,996.000 28,473.000 28,125.000 29,496.000 0.000 27,283.000 28,409.000 26,945.000 26,848.000 26,692.000 28,208.000 0.000 26,656.000 27,733.000 26,154.000 
27,727.000 29,102.000 29,302.000 27,706.000 28,907.000 28,338.000 26,724.000 0.000 29,158.000 25,416.000 28,617.000 29,703.000 27,472.000 25,521.000 29,889.000 27,256.000 0.000 27,472.000 28,588.000 27,258.000 25,481.000 30,812.000 24,211.000 29,338.000 28,476.000 27,170.000 28,326.000 27,911.000 26,672.000 27,744.000 28,558.000 26,509.000 26,760.000 0.000 0.000 0.000 27,556.000 29,535.000 29,703.000 28,805.000 28,544.000 0.000 28,299.000 27,692.000 27,716.000 27,072.000 28,454.000 28,637.000 27,114.000 26,685.000 29,223.000 28,943.000 28,738.000 27,032.000 27,777.000 28,704.000 28,158.000 26,849.000 27,130.000 26,767.000 29,719.000 28,542.000 28,451.000 27,914.000 27,764.000 26,768.000 0.000 0.000 27,413.000 28,737.000 28,750.000 29,618.000
2 3 0.000 24,093.000 26,156.000 23,059.000 24,354.000 26,442.000 24,701.000 26,707.000 25,199.000 0.000 26,778.000 25,957.000 25,460.000 24,914.000 24,490.000 26,923.000 24,918.000 0.000 26,097.000 22,701.000 25,999.000 25,287.000 25,456.000 23,473.000 26,511.000 22,890.000 24,864.000 23,975.000 24,571.000 24,429.000 23,241.000 25,721.000 25,282.000 24,067.000 25,369.000 25,057.000 24,791.000 27,917.000 24,306.000 24,985.000 27,191.000 24,678.000 26,975.000 25,396.000 25,879.000 0.000 24,376.000 24,649.000 24,743.000 25,487.000 26,192.000 24,557.000 24,338.000 25,635.000 26,340.000 23,564.000 0.000 24,363.000 24,841.000 25,135.000 26,706.000 0.000 24,172.000 26,922.000 24,883.000 24,600.000 24,294.000 24,859.000 26,046.000 25,057.000 26,862.000 24,319.000 26,156.000 26,039.000 26,175.000 25,418.000 27,402.000 26,818.000 25,707.000 26,806.000 0.000 24,892.000 25,210.000 26,435.000 25,069.000 25,382.000 0.000 24,435.000 22,789.000 25,602.000 24,741.000 24,737.000 25,317.000 25,567.000 24,592.000 26,194.000 23,797.000 26,056.000 26,691.000 26,772.000 25,618.000 24,598.000 26,208.000 22,557.000 25,745.000 22,706.000 24,277.000 24,440.000 25,839.000 26,329.000 24,530.000 24,064.000 27,506.000 24,962.000 26,411.000 25,960.000 25,631.000 25,550.000 24,895.000 25,179.000 26,132.000 24,985.000 23,820.000 25,714.000 0.000 23,655.000 25,406.000 24,374.000 23,722.000 24,179.000 24,667.000 24,943.000 28,679.000 24,347.000 24,693.000 28,777.000 25,377.000 22,856.000 26,620.000 25,942.000 24,968.000 0.000 24,915.000 25,941.000 23,920.000 24,994.000 24,689.000 25,615.000 27,821.000 26,617.000 25,124.000 25,961.000 25,364.000 24,488.000 0.000 24,784.000 24,884.000 26,091.000 24,395.000 25,356.000 24,297.000 25,374.000 24,832.000 26,255.000 25,320.000 24,292.000 25,241.000 25,004.000 24,133.000 23,962.000 25,009.000 23,941.000 24,265.000 26,442.000 26,051.000 25,726.000 25,491.000 24,572.000 25,711.000 24,479.000 27,149.000 23,731.000 26,505.000 24,426.000 24,969.000 0.000 
25,140.000 25,982.000 24,781.000 25,791.000 25,326.000 25,119.000 28,123.000 26,782.000 24,335.000 26,253.000 0.000 25,271.000 27,064.000 25,784.000 24,953.000 24,898.000 25,510.000 24,092.000 25,501.000 24,449.000 25,254.000 23,596.000 25,095.000 25,209.000 25,284.000 24,727.000 26,095.000 27,097.000 23,258.000 24,494.000 23,867.000 27,685.000 25,352.000 24,132.000 24,075.000 23,600.000 0.000 0.000 0.000 26,517.000 25,185.000 24,487.000 26,255.000 26,169.000 22,314.000 27,219.000 24,998.000 23,991.000 25,429.000 26,846.000 0.000 25,120.000 24,973.000 24,255.000 24,361.000 25,447.000 24,475.000 26,965.000 27,126.000 0.000 25,918.000 25,547.000 24,304.000 24,850.000 25,986.000 24,429.000 25,789.000 24,486.000 24,229.000 24,659.000 0.000 25,286.000 26,780.000 27,140.000 24,431.000
3 4 0.000 26,418.000 26,251.000 24,995.000 24,910.000 24,663.000 25,895.000 25,195.000 26,303.000 0.000 24,756.000 24,868.000 23,467.000 25,844.000 24,138.000 25,241.000 25,725.000 0.000 26,117.000 27,163.000 24,873.000 25,991.000 25,444.000 26,770.000 25,118.000 25,745.000 25,723.000 26,402.000 0.000 25,419.000 26,428.000 25,724.000 26,461.000 25,598.000 24,741.000 27,085.000 24,112.000 25,847.000 25,431.000 22,610.000 26,489.000 26,150.000 0.000 26,652.000 23,269.000 0.000 0.000 25,766.000 26,630.000 26,976.000 27,447.000 24,542.000 24,208.000 25,950.000 25,595.000 25,223.000 25,943.000 25,012.000 24,936.000 26,446.000 28,126.000 0.000 25,352.000 27,408.000 27,035.000 24,890.000 25,450.000 26,633.000 25,400.000 25,990.000 25,383.000 25,099.000 24,603.000 0.000 23,384.000 25,485.000 25,144.000 26,301.000 25,480.000 26,126.000 27,493.000 26,797.000 26,547.000 24,937.000 25,719.000 25,635.000 0.000 27,014.000 24,218.000 28,238.000 25,763.000 27,772.000 25,685.000 26,188.000 25,829.000 26,148.000 25,727.000 24,793.000 27,031.000 23,843.000 25,957.000 0.000 25,965.000 26,543.000 27,297.000 27,604.000 25,612.000 25,076.000 28,086.000 26,340.000 26,825.000 24,434.000 26,219.000 25,927.000 26,116.000 26,760.000 25,636.000 26,398.000 24,859.000 26,165.000 27,907.000 26,274.000 27,248.000 24,616.000 26,878.000 25,043.000 25,693.000 25,750.000 25,918.000 24,714.000 26,537.000 27,027.000 26,556.000 27,196.000 25,010.000 26,190.000 27,410.000 26,560.000 27,914.000 26,937.000 26,087.000 0.000 24,713.000 24,578.000 0.000 26,664.000 26,616.000 26,076.000 25,242.000 28,269.000 26,035.000 26,188.000 27,552.000 26,222.000 0.000 25,379.000 26,640.000 26,777.000 27,560.000 25,832.000 24,389.000 26,696.000 25,732.000 27,620.000 26,322.000 26,191.000 24,634.000 0.000 25,828.000 25,520.000 24,023.000 0.000 25,143.000 26,906.000 25,165.000 26,259.000 25,533.000 25,353.000 26,151.000 26,453.000 26,230.000 26,131.000 28,407.000 24,508.000 25,503.000 0.000 27,059.000 25,616.000 25,155.000 
25,452.000 27,323.000 0.000 25,239.000 24,895.000 24,675.000 24,818.000 0.000 0.000 25,818.000 27,001.000 26,993.000 24,732.000 26,222.000 25,606.000 24,897.000 25,815.000 24,483.000 25,556.000 26,958.000 25,376.000 24,457.000 27,032.000 26,760.000 24,815.000 25,429.000 25,413.000 0.000 25,475.000 26,080.000 24,572.000 27,047.000 0.000 0.000 0.000 0.000 27,379.000 24,249.000 25,259.000 23,730.000 26,224.000 27,346.000 27,139.000 25,678.000 28,112.000 25,681.000 24,837.000 25,022.000 26,316.000 26,203.000 25,447.000 25,966.000 24,884.000 26,657.000 26,215.000 24,759.000 27,047.000 25,080.000 25,401.000 25,198.000 26,327.000 26,841.000 27,465.000 24,648.000 26,375.000 26,689.000 25,506.000 0.000 26,810.000 26,700.000 26,519.000 25,679.000
4 5 0.000 29,000.000 28,757.000 27,657.000 28,102.000 27,584.000 27,798.000 30,114.000 29,726.000 0.000 29,736.000 30,253.000 0.000 28,135.000 29,603.000 29,847.000 26,317.000 0.000 28,544.000 28,543.000 29,085.000 28,903.000 29,394.000 28,904.000 28,788.000 29,126.000 27,751.000 29,472.000 27,644.000 28,669.000 27,615.000 31,394.000 28,520.000 28,449.000 29,139.000 28,246.000 29,631.000 28,090.000 29,228.000 28,916.000 27,755.000 28,682.000 31,295.000 28,621.000 29,062.000 0.000 29,458.000 30,083.000 29,033.000 29,317.000 29,101.000 28,987.000 28,043.000 28,621.000 29,182.000 27,864.000 27,970.000 27,629.000 27,631.000 28,230.000 28,025.000 28,434.000 27,717.000 28,094.000 28,935.000 29,166.000 28,180.000 30,080.000 28,935.000 27,295.000 29,324.000 28,608.000 28,337.000 30,701.000 29,064.000 29,406.000 28,364.000 29,956.000 0.000 26,973.000 28,189.000 28,232.000 31,214.000 26,975.000 27,532.000 28,026.000 0.000 28,479.000 30,012.000 30,794.000 26,505.000 27,626.000 28,048.000 31,072.000 26,905.000 28,671.000 28,122.000 27,077.000 29,942.000 28,580.000 28,278.000 29,358.000 31,121.000 30,604.000 29,208.000 27,892.000 29,949.000 29,023.000 28,077.000 29,492.000 27,458.000 29,598.000 29,367.000 28,105.000 28,131.000 0.000 29,854.000 29,267.000 29,094.000 30,059.000 30,550.000 27,401.000 31,340.000 28,156.000 27,179.000 28,443.000 29,669.000 28,588.000 29,221.000 27,751.000 28,722.000 27,358.000 27,002.000 28,532.000 28,141.000 28,562.000 29,723.000 28,562.000 28,179.000 29,368.000 28,033.000 0.000 27,647.000 29,966.000 27,758.000 27,968.000 29,260.000 30,264.000 27,474.000 27,614.000 30,156.000 28,833.000 27,890.000 29,871.000 28,323.000 27,377.000 28,084.000 28,348.000 30,082.000 0.000 30,172.000 29,081.000 29,541.000 27,189.000 29,797.000 28,402.000 28,977.000 29,695.000 30,336.000 29,286.000 27,991.000 27,697.000 26,926.000 27,628.000 30,054.000 31,782.000 29,204.000 28,115.000 28,552.000 29,124.000 30,252.000 27,938.000 29,347.000 28,031.000 28,025.000 0.000 
27,309.000 28,246.000 28,571.000 29,890.000 29,572.000 30,814.000 29,473.000 29,797.000 28,171.000 27,543.000 0.000 30,621.000 28,091.000 27,587.000 28,755.000 27,607.000 31,200.000 28,619.000 28,388.000 29,244.000 28,112.000 27,417.000 30,603.000 29,938.000 29,951.000 29,162.000 29,751.000 29,937.000 28,749.000 29,092.000 27,315.000 30,275.000 27,082.000 29,220.000 29,261.000 29,552.000 0.000 0.000 0.000 30,055.000 28,315.000 28,633.000 28,460.000 28,570.000 27,372.000 30,830.000 30,403.000 30,755.000 28,406.000 27,460.000 29,240.000 30,373.000 28,480.000 26,640.000 27,598.000 28,326.000 29,174.000 30,726.000 28,275.000 28,516.000 31,305.000 27,322.000 29,495.000 29,266.000 29,952.000 28,453.000 27,594.000 28,647.000 28,033.000 29,698.000 0.000 27,584.000 29,767.000 28,632.000 28,865.000
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4405 4406 0.000 29,225.000 31,827.000 30,238.000 31,842.000 29,605.000 29,811.000 30,098.000 31,533.000 0.000 30,080.000 32,427.000 32,608.000 30,989.000 30,883.000 31,355.000 28,849.000 0.000 31,754.000 31,744.000 29,399.000 29,905.000 29,846.000 29,690.000 29,814.000 30,777.000 28,306.000 30,474.000 30,337.000 29,972.000 31,772.000 32,186.000 33,306.000 31,217.000 30,516.000 30,207.000 29,009.000 31,269.000 32,335.000 27,945.000 30,556.000 30,747.000 30,285.000 31,358.000 31,194.000 0.000 29,803.000 31,539.000 30,646.000 32,137.000 30,981.000 30,054.000 32,121.000 30,357.000 31,227.000 31,854.000 32,451.000 30,899.000 30,258.000 30,815.000 30,495.000 30,096.000 31,025.000 30,455.000 30,599.000 32,050.000 30,010.000 29,728.000 32,451.000 30,170.000 28,746.000 29,654.000 30,314.000 27,991.000 30,872.000 31,760.000 30,723.000 30,859.000 29,947.000 30,852.000 32,025.000 31,943.000 30,551.000 32,070.000 31,179.000 32,076.000 0.000 29,359.000 32,573.000 0.000 29,447.000 0.000 29,411.000 29,281.000 31,228.000 30,163.000 32,516.000 31,506.000 31,197.000 30,013.000 30,214.000 30,172.000 0.000 30,177.000 31,703.000 31,544.000 31,264.000 31,730.000 33,537.000 28,630.000 31,525.000 29,042.000 32,401.000 30,606.000 31,829.000 31,311.000 30,293.000 29,895.000 31,373.000 30,067.000 29,812.000 31,003.000 29,303.000 29,966.000 31,923.000 32,426.000 31,179.000 30,876.000 29,462.000 29,776.000 30,412.000 31,855.000 30,557.000 30,489.000 31,168.000 31,648.000 30,346.000 29,069.000 30,407.000 31,792.000 30,593.000 0.000 30,968.000 30,568.000 29,198.000 29,778.000 31,505.000 31,476.000 32,078.000 30,226.000 31,432.000 29,069.000 33,890.000 29,541.000 30,961.000 29,909.000 31,255.000 29,640.000 31,272.000 29,908.000 29,475.000 32,800.000 30,611.000 30,845.000 31,776.000 30,889.000 30,149.000 30,193.000 28,983.000 29,618.000 30,504.000 29,698.000 28,954.000 29,883.000 31,078.000 32,523.000 0.000 28,994.000 30,997.000 29,985.000 28,823.000 31,124.000 30,137.000 30,341.000 30,874.000 
0.000 31,523.000 31,229.000 30,695.000 30,565.000 30,609.000 31,108.000 29,752.000 33,183.000 30,650.000 28,282.000 0.000 33,086.000 30,009.000 30,881.000 30,078.000 31,734.000 30,369.000 30,159.000 30,074.000 0.000 29,304.000 31,007.000 30,643.000 31,324.000 29,638.000 31,404.000 31,327.000 32,361.000 31,251.000 32,555.000 32,162.000 32,418.000 29,466.000 30,078.000 31,104.000 30,021.000 0.000 0.000 0.000 29,857.000 30,902.000 29,902.000 29,335.000 29,640.000 31,179.000 0.000 30,394.000 29,495.000 32,187.000 30,866.000 31,791.000 29,094.000 31,892.000 29,613.000 29,192.000 30,656.000 30,543.000 31,156.000 31,262.000 29,263.000 29,928.000 30,129.000 29,075.000 29,912.000 30,617.000 29,099.000 29,381.000 29,853.000 30,976.000 30,949.000 0.000 30,536.000 32,669.000 32,197.000 30,629.000
4406 4407 0.000 22,520.000 0.000 19,417.000 20,620.000 22,296.000 20,572.000 21,790.000 22,652.000 0.000 21,649.000 20,133.000 20,237.000 23,461.000 21,930.000 0.000 20,852.000 0.000 24,295.000 22,284.000 22,641.000 23,244.000 0.000 21,254.000 21,616.000 21,768.000 20,147.000 22,540.000 21,578.000 21,617.000 22,650.000 20,718.000 21,925.000 0.000 21,964.000 21,631.000 23,694.000 24,216.000 22,333.000 22,376.000 20,657.000 18,801.000 22,899.000 21,404.000 22,752.000 0.000 22,706.000 22,165.000 22,005.000 22,146.000 23,443.000 22,486.000 21,912.000 22,525.000 21,935.000 22,550.000 21,225.000 21,066.000 23,678.000 20,803.000 22,956.000 23,824.000 20,162.000 22,106.000 23,847.000 21,914.000 22,314.000 22,525.000 21,892.000 20,797.000 22,435.000 21,872.000 22,537.000 20,527.000 23,014.000 22,209.000 20,217.000 23,419.000 21,475.000 20,738.000 22,614.000 22,265.000 21,124.000 22,634.000 21,447.000 20,363.000 0.000 21,163.000 23,257.000 23,085.000 21,352.000 23,375.000 21,760.000 21,862.000 23,035.000 22,589.000 22,365.000 22,088.000 20,557.000 24,533.000 21,761.000 21,574.000 23,188.000 22,435.000 22,208.000 21,830.000 21,911.000 21,628.000 19,906.000 20,229.000 21,439.000 20,468.000 20,683.000 23,096.000 22,438.000 21,787.000 22,922.000 22,970.000 23,260.000 23,175.000 20,969.000 22,613.000 20,780.000 22,273.000 21,284.000 21,459.000 21,133.000 20,746.000 21,643.000 23,239.000 22,394.000 22,463.000 21,816.000 20,129.000 21,361.000 21,370.000 20,998.000 22,411.000 21,534.000 21,926.000 21,395.000 0.000 0.000 21,246.000 21,735.000 21,831.000 21,846.000 21,772.000 21,609.000 22,519.000 23,284.000 20,206.000 22,729.000 22,260.000 22,871.000 19,253.000 22,984.000 22,189.000 21,983.000 21,435.000 22,250.000 23,321.000 22,080.000 22,586.000 22,890.000 21,836.000 22,062.000 22,366.000 22,436.000 21,320.000 21,799.000 22,649.000 22,319.000 22,655.000 0.000 21,330.000 20,885.000 22,136.000 23,583.000 23,314.000 23,145.000 21,723.000 21,764.000 21,972.000 23,142.000 0.000 
23,402.000 21,100.000 22,038.000 21,280.000 20,491.000 21,899.000 21,169.000 22,215.000 22,667.000 19,874.000 0.000 21,545.000 22,084.000 21,238.000 0.000 23,808.000 21,564.000 22,292.000 22,374.000 0.000 21,273.000 22,511.000 21,080.000 21,838.000 21,953.000 22,476.000 21,699.000 21,647.000 24,410.000 22,215.000 21,880.000 18,849.000 22,012.000 21,073.000 19,613.000 22,133.000 0.000 0.000 0.000 22,428.000 22,204.000 21,129.000 21,362.000 23,603.000 21,708.000 23,261.000 20,859.000 21,460.000 22,066.000 22,521.000 21,116.000 21,842.000 21,650.000 22,913.000 22,270.000 20,017.000 21,990.000 22,346.000 21,511.000 23,325.000 21,660.000 22,499.000 23,394.000 21,784.000 21,541.000 21,330.000 20,990.000 21,329.000 21,998.000 21,574.000 0.000 20,411.000 23,610.000 20,528.000 22,131.000
4407 4408 0.000 26,194.000 27,307.000 27,509.000 28,750.000 26,267.000 25,314.000 27,993.000 26,922.000 0.000 29,242.000 27,717.000 28,229.000 26,429.000 29,573.000 25,887.000 26,676.000 0.000 26,549.000 0.000 26,548.000 0.000 25,254.000 29,001.000 28,156.000 26,948.000 26,206.000 27,990.000 29,174.000 28,000.000 25,554.000 0.000 27,691.000 29,617.000 27,662.000 26,840.000 28,792.000 28,816.000 28,235.000 26,802.000 30,359.000 28,098.000 26,914.000 28,758.000 30,514.000 0.000 0.000 28,119.000 28,833.000 26,868.000 28,799.000 26,836.000 28,401.000 30,119.000 26,925.000 28,657.000 27,009.000 27,156.000 28,378.000 28,994.000 27,780.000 25,974.000 0.000 27,000.000 27,117.000 26,871.000 28,834.000 27,156.000 26,741.000 29,776.000 0.000 29,466.000 26,665.000 28,029.000 28,081.000 27,950.000 25,114.000 27,466.000 28,462.000 29,446.000 26,924.000 27,131.000 0.000 27,852.000 28,773.000 26,717.000 0.000 28,064.000 27,737.000 28,716.000 26,668.000 28,994.000 28,125.000 28,118.000 27,060.000 0.000 27,772.000 28,548.000 28,541.000 0.000 0.000 27,751.000 29,796.000 28,642.000 29,404.000 27,114.000 28,124.000 27,583.000 28,189.000 27,511.000 27,942.000 27,210.000 28,531.000 27,189.000 29,183.000 29,218.000 25,107.000 25,589.000 26,507.000 27,822.000 26,878.000 0.000 26,668.000 28,231.000 28,880.000 26,967.000 27,741.000 26,919.000 26,712.000 25,916.000 28,551.000 27,153.000 29,696.000 27,921.000 27,511.000 28,317.000 28,203.000 28,891.000 27,550.000 28,524.000 30,217.000 0.000 28,657.000 26,918.000 28,600.000 25,665.000 0.000 28,783.000 26,542.000 26,275.000 0.000 27,922.000 27,555.000 26,618.000 27,171.000 30,005.000 28,154.000 27,127.000 27,780.000 24,918.000 29,393.000 0.000 28,236.000 27,737.000 0.000 28,389.000 27,514.000 28,313.000 27,381.000 27,350.000 28,906.000 29,022.000 29,114.000 27,111.000 26,802.000 28,293.000 26,969.000 27,294.000 27,093.000 30,059.000 27,449.000 27,054.000 27,780.000 28,915.000 26,159.000 0.000 27,330.000 27,021.000 27,454.000 28,447.000 
27,798.000 26,043.000 27,369.000 26,660.000 24,718.000 27,174.000 0.000 26,653.000 27,589.000 28,431.000 26,533.000 28,064.000 29,272.000 26,176.000 27,944.000 28,525.000 28,352.000 29,129.000 29,321.000 28,556.000 26,358.000 27,596.000 29,542.000 29,060.000 28,326.000 26,741.000 27,560.000 26,826.000 26,687.000 26,904.000 26,419.000 27,727.000 0.000 0.000 0.000 29,880.000 29,467.000 27,691.000 26,651.000 28,389.000 27,090.000 28,073.000 27,168.000 29,582.000 0.000 28,532.000 25,294.000 27,633.000 25,696.000 29,042.000 28,136.000 28,000.000 27,258.000 0.000 27,224.000 28,690.000 27,389.000 27,607.000 26,860.000 27,660.000 27,667.000 28,028.000 28,318.000 27,913.000 27,908.000 26,647.000 0.000 27,844.000 0.000 27,938.000 29,125.000
4408 4409 0.000 34,292.000 34,513.000 35,238.000 33,808.000 33,136.000 0.000 33,459.000 34,559.000 0.000 34,144.000 33,621.000 33,457.000 35,222.000 34,902.000 34,432.000 33,498.000 0.000 34,568.000 32,317.000 35,427.000 33,344.000 33,175.000 33,821.000 0.000 34,350.000 33,238.000 34,301.000 34,319.000 34,063.000 34,308.000 34,295.000 34,272.000 35,627.000 34,233.000 33,854.000 32,543.000 34,484.000 34,778.000 36,521.000 35,868.000 34,624.000 0.000 32,694.000 34,280.000 0.000 34,146.000 36,930.000 33,489.000 32,640.000 35,661.000 31,374.000 33,252.000 33,395.000 33,947.000 33,703.000 33,020.000 32,381.000 34,354.000 36,694.000 34,546.000 33,817.000 33,972.000 35,888.000 32,917.000 34,578.000 33,453.000 35,393.000 32,952.000 34,498.000 32,877.000 34,343.000 35,047.000 33,626.000 34,430.000 35,893.000 35,024.000 32,566.000 34,488.000 34,082.000 35,689.000 35,188.000 35,023.000 33,871.000 35,515.000 35,031.000 0.000 36,262.000 36,181.000 32,677.000 34,249.000 34,837.000 34,045.000 33,077.000 32,930.000 34,032.000 37,083.000 36,233.000 33,711.000 33,530.000 35,766.000 33,341.000 31,968.000 35,986.000 33,860.000 34,350.000 34,349.000 36,157.000 32,132.000 32,235.000 31,619.000 35,509.000 35,190.000 33,862.000 34,119.000 35,217.000 34,321.000 35,515.000 35,700.000 34,801.000 0.000 35,247.000 32,367.000 33,804.000 35,280.000 33,424.000 32,946.000 36,169.000 33,785.000 32,590.000 33,732.000 33,671.000 32,318.000 33,894.000 36,265.000 34,039.000 33,964.000 35,357.000 34,690.000 32,280.000 34,320.000 0.000 34,307.000 34,920.000 35,510.000 34,323.000 35,054.000 33,920.000 33,662.000 33,024.000 35,092.000 33,787.000 33,061.000 34,287.000 33,270.000 36,425.000 35,859.000 32,251.000 34,080.000 35,367.000 34,425.000 34,339.000 34,989.000 34,327.000 35,919.000 33,420.000 33,558.000 35,531.000 32,834.000 36,049.000 33,008.000 35,795.000 0.000 32,844.000 31,949.000 31,064.000 32,989.000 33,955.000 34,317.000 33,860.000 34,159.000 33,167.000 35,054.000 0.000 33,500.000 0.000 
35,939.000 32,858.000 32,975.000 33,963.000 32,622.000 34,397.000 36,170.000 35,058.000 31,642.000 34,231.000 0.000 34,723.000 32,292.000 34,851.000 34,295.000 0.000 33,697.000 34,096.000 34,110.000 31,697.000 33,497.000 33,781.000 34,789.000 34,075.000 33,913.000 34,862.000 33,795.000 35,676.000 33,811.000 32,689.000 35,906.000 33,839.000 36,443.000 34,295.000 35,693.000 33,835.000 0.000 0.000 0.000 33,929.000 35,264.000 33,635.000 34,227.000 34,271.000 33,868.000 34,398.000 34,429.000 31,701.000 35,007.000 34,737.000 32,391.000 0.000 34,988.000 34,267.000 33,076.000 33,229.000 32,620.000 32,636.000 34,059.000 34,862.000 33,332.000 37,047.000 34,944.000 33,118.000 33,318.000 35,891.000 33,627.000 33,741.000 33,270.000 34,912.000 0.000 36,191.000 32,444.000 33,698.000 34,618.000
4409 4410 0.000 24,610.000 26,208.000 24,220.000 24,632.000 26,736.000 25,099.000 24,706.000 26,968.000 0.000 25,532.000 24,682.000 23,255.000 24,929.000 25,229.000 0.000 0.000 0.000 25,666.000 25,686.000 24,227.000 24,405.000 24,190.000 25,404.000 24,985.000 26,148.000 25,776.000 25,377.000 23,688.000 24,534.000 24,728.000 24,709.000 23,804.000 25,085.000 26,052.000 24,326.000 23,242.000 25,188.000 23,373.000 25,184.000 24,330.000 26,359.000 26,421.000 24,624.000 23,374.000 0.000 26,397.000 25,490.000 25,192.000 0.000 25,317.000 24,938.000 24,880.000 0.000 24,233.000 24,128.000 24,165.000 0.000 24,751.000 0.000 25,285.000 26,458.000 26,834.000 23,961.000 25,325.000 27,166.000 25,928.000 24,997.000 23,940.000 25,500.000 0.000 26,309.000 24,106.000 23,952.000 26,064.000 25,235.000 24,102.000 23,112.000 25,658.000 0.000 24,743.000 24,084.000 27,218.000 25,476.000 24,842.000 25,981.000 0.000 25,206.000 24,023.000 0.000 23,701.000 24,573.000 26,302.000 25,683.000 25,829.000 25,593.000 25,355.000 27,099.000 26,244.000 24,718.000 24,217.000 25,766.000 25,273.000 0.000 25,894.000 26,643.000 24,571.000 24,168.000 24,256.000 24,594.000 24,755.000 24,908.000 23,843.000 27,591.000 26,366.000 26,476.000 22,805.000 26,773.000 23,933.000 23,760.000 24,800.000 24,228.000 24,390.000 27,306.000 25,033.000 27,899.000 25,946.000 24,723.000 28,298.000 25,531.000 25,367.000 24,553.000 25,436.000 24,813.000 24,900.000 25,029.000 26,530.000 25,374.000 23,098.000 26,228.000 23,252.000 0.000 26,166.000 0.000 25,629.000 23,449.000 24,622.000 25,029.000 23,448.000 26,764.000 0.000 24,402.000 24,131.000 23,557.000 24,801.000 25,179.000 24,986.000 24,374.000 25,660.000 0.000 25,413.000 25,987.000 25,940.000 26,689.000 26,900.000 25,128.000 21,654.000 24,466.000 25,580.000 22,929.000 25,509.000 25,977.000 25,099.000 25,378.000 25,228.000 25,098.000 23,692.000 23,635.000 27,281.000 25,215.000 23,434.000 25,159.000 26,729.000 24,443.000 26,914.000 0.000 26,206.000 0.000 24,784.000 25,093.000 
25,815.000 23,999.000 26,090.000 23,164.000 25,534.000 25,862.000 0.000 26,000.000 26,075.000 25,834.000 26,163.000 26,251.000 25,841.000 25,398.000 23,269.000 25,197.000 27,055.000 27,311.000 25,149.000 25,555.000 26,179.000 24,848.000 26,902.000 23,311.000 26,377.000 25,935.000 24,972.000 24,023.000 27,186.000 24,881.000 22,919.000 26,037.000 0.000 0.000 0.000 25,116.000 23,605.000 0.000 24,744.000 24,190.000 25,059.000 25,270.000 27,159.000 25,556.000 24,991.000 23,396.000 25,591.000 25,814.000 26,513.000 25,364.000 27,078.000 27,547.000 24,853.000 24,516.000 0.000 24,866.000 24,993.000 23,303.000 25,440.000 23,578.000 25,643.000 25,389.000 25,207.000 0.000 24,192.000 22,704.000 0.000 27,213.000 25,864.000 25,026.000 23,390.000

4410 rows × 262 columns

In [17]:
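# Calculating average working duration in seconds for each employee
# Note: mean(axis = 1) averages every numeric column in the row,
# so the EmployeeID value and the 0.000 entries are part of this average
df_work_duration['Avg_duration_sec'] = df_work_duration.mean(axis = 1)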
df_avg_work_duration = df_work_duration[['EmployeeID', 'Avg_duration_sec']]
df_avg_work_duration.head()
Out[17]:
EmployeeID Avg_duration_sec
0 1 23,505.626
1 2 25,030.679
2 3 23,320.374
3 4 23,228.458
4 5 26,952.103
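
The row-wise mean above is taken over all columns of df_work_duration. The following is a minimal sketch, on toy data, of how the average could be restricted to the daily columns, optionally skip 0 entries, and be expressed in hours. The column names day_1 to day_3, the toy values, and the reading of 0 as "no recorded working time" are assumptions made for illustration; they are not taken from the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df_work_duration: one ID column plus three daily durations in seconds
df_demo = pd.DataFrame({
    'EmployeeID': [1, 2],
    'day_1': [26000.0, 0.0],       # assumption: 0.0 marks a day without recorded working time
    'day_2': [27000.0, 28000.0],
    'day_3': [0.0, 29000.0],
})

# Every column except the identifier
day_cols = df_demo.columns.drop('EmployeeID')

# Mean over the daily columns only, zeros included
mean_with_zeros = df_demo[day_cols].mean(axis=1)

# Mean over days with a recorded duration: zeros become NaN, which mean() skips
mean_worked_days = df_demo[day_cols].replace(0, np.nan).mean(axis=1)

# 3,600 seconds per hour
hours_worked_days = mean_worked_days / 3600

print(pd.DataFrame({
    'EmployeeID': df_demo['EmployeeID'],
    'mean_with_zeros': mean_with_zeros,
    'mean_worked_days': mean_worked_days,
    'hours_worked_days': hours_worked_days,
}).round(2))
```

Whether absences should count towards the average is an analytical choice; the two variants answer different questions (time at work per calendar day versus length of a typical working day).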

6.2 Recoding values

In [18]:
# General data
df_general_data.Education = df_general_data.Education.replace({1: 'Below College', 2: 'College', 3: 'Bachelor', 4: 'Master', 5: 'Doctor'})

# Employee survey data
df_employee_survey_data.EnvironmentSatisfaction = df_employee_survey_data.EnvironmentSatisfaction.replace({1: 'Low', 2: 'Medium', 3: 'High', 4: 'Very High'})
df_employee_survey_data.JobSatisfaction = df_employee_survey_data.JobSatisfaction.replace({1: 'Low', 2: 'Medium', 3: 'High', 4: 'Very High'})
df_employee_survey_data.WorkLifeBalance = df_employee_survey_data.WorkLifeBalance.replace({1: 'Bad', 2: 'Good', 3: 'Better', 4: 'Best'})

# Manager survey data
df_manager_survey_data.JobInvolvement = df_manager_survey_data.JobInvolvement.replace({1: 'Low', 2: 'Medium', 3: 'High', 4: 'Very High'})
df_manager_survey_data.PerformanceRating = df_manager_survey_data.PerformanceRating.replace({1: 'Low', 2: 'Good', 3: 'Excellent', 4: 'Outstanding'})

df_general_data.head()
Out[18]:
Age Attrition BusinessTravel Department DistanceFromHome Education EducationField EmployeeCount EmployeeID Gender JobLevel JobRole MaritalStatus MonthlyIncome NumCompaniesWorked Over18 PercentSalaryHike StandardHours StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager
0 51 No Travel_Rarely Sales 6 College Life Sciences 1 1 Female 1 Healthcare Representative Married 131160 1.000 Y 11 8 0 1.000 6 1 0 0
1 31 Yes Travel_Frequently Research & Development 10 Below College Life Sciences 1 2 Female 1 Research Scientist Single 41890 0.000 Y 23 8 1 6.000 3 5 1 4
2 32 No Travel_Frequently Research & Development 17 Master Other 1 3 Male 4 Sales Executive Married 193280 1.000 Y 15 8 3 5.000 2 5 0 3
3 38 No Non-Travel Research & Development 2 Doctor Life Sciences 1 4 Male 3 Human Resources Married 83210 3.000 Y 11 8 3 13.000 5 8 7 5
4 32 No Travel_Rarely Research & Development 10 Below College Medical 1 5 Male 1 Sales Executive Single 23420 4.000 Y 12 8 2 9.000 2 6 0 4
In [19]:
df_employee_survey_data.head()
Out[19]:
   EmployeeID EnvironmentSatisfaction JobSatisfaction WorkLifeBalance
0           1                    High       Very High            Good
1           2                    High          Medium            Best
2           3                  Medium          Medium             Bad
3           4               Very High       Very High          Better
4           5               Very High             Low          Better
In [20]:
df_manager_survey_data.head()
Out[20]:
EmployeeID JobInvolvement PerformanceRating
0 1 High Excellent
1 2 Medium Outstanding
2 3 High Excellent
3 4 Medium Excellent
4 5 High Excellent
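
The recoded labels are plain strings, so pandas no longer knows that 'Low' comes before 'Medium'. The following is a minimal sketch, on a toy column, of one way to keep that order with an ordered categorical dtype. The toy values and the names levels, ordered_dtype and satisfaction_cat are illustrative assumptions; the notebook itself uses replace() and keeps strings.

```python
import pandas as pd

# Toy survey column holding the original numeric codes
satisfaction = pd.Series([3, 4, 1, 2, 4], name='JobSatisfaction')

# Same code-to-label mapping as in the recoding cell above
levels = {1: 'Low', 2: 'Medium', 3: 'High', 4: 'Very High'}

# The ordered dtype records Low < Medium < High < Very High
ordered_dtype = pd.CategoricalDtype(categories=list(levels.values()), ordered=True)
satisfaction_cat = satisfaction.map(levels).astype(ordered_dtype)

# sort_index() lists the counts in category order instead of by frequency
counts = satisfaction_cat.value_counts().sort_index()
print(counts)

# Ordered categories also allow comparisons against a level
n_high_or_better = int((satisfaction_cat >= 'High').sum())
print(n_high_or_better)
```

With an ordered dtype, frequency tables and plots can follow the natural scale of the survey answers rather than the order in which the labels happen to be counted.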

6.3 Joining tables

In [21]:
# Joining the four tables on EmployeeID (merge performs an inner join by default)
df_master_data = df_general_data.merge(df_employee_survey_data, on = 'EmployeeID')
df_master_data = df_master_data.merge(df_manager_survey_data, on = 'EmployeeID')
df_master_data = df_master_data.merge(df_avg_work_duration, on = 'EmployeeID')

df_master_data.head()
Out[21]:
Age Attrition BusinessTravel Department DistanceFromHome Education EducationField EmployeeCount EmployeeID Gender JobLevel JobRole MaritalStatus MonthlyIncome NumCompaniesWorked Over18 PercentSalaryHike StandardHours StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement PerformanceRating Avg_duration_sec
0 51 No Travel_Rarely Sales 6 College Life Sciences 1 1 Female 1 Healthcare Representative Married 131160 1.000 Y 11 8 0 1.000 6 1 0 0 High Very High Good High Excellent 23,505.626
1 31 Yes Travel_Frequently Research & Development 10 Below College Life Sciences 1 2 Female 1 Research Scientist Single 41890 0.000 Y 23 8 1 6.000 3 5 1 4 High Medium Best Medium Outstanding 25,030.679
2 32 No Travel_Frequently Research & Development 17 Master Other 1 3 Male 4 Sales Executive Married 193280 1.000 Y 15 8 3 5.000 2 5 0 3 Medium Medium Bad High Excellent 23,320.374
3 38 No Non-Travel Research & Development 2 Doctor Life Sciences 1 4 Male 3 Human Resources Married 83210 3.000 Y 11 8 3 13.000 5 8 7 5 Very High Very High Better Medium Excellent 23,228.458
4 32 No Travel_Rarely Research & Development 10 Below College Medical 1 5 Male 1 Sales Executive Single 23420 4.000 Y 12 8 2 9.000 2 6 0 4 Very High Low Better High Excellent 26,952.103
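
Because merge uses an inner join by default, employees missing from one table would silently disappear from the result. The following is a minimal sketch, on toy tables, of two checks that could accompany the join: validate guards against duplicated keys, and an outer merge with indicator shows which IDs appear on only one side. The toy frames left and right and their values are assumptions for illustration.

```python
import pandas as pd

# Toy stand-ins for two of the tables; ID 3 and ID 4 each exist in only one of them
left = pd.DataFrame({'EmployeeID': [1, 2, 3], 'Age': [51, 31, 32]})
right = pd.DataFrame({'EmployeeID': [1, 2, 4],
                      'JobSatisfaction': ['Very High', 'Medium', 'Low']})

# validate raises pandas.errors.MergeError if EmployeeID is duplicated on either side
merged = left.merge(right, on='EmployeeID', how='inner', validate='one_to_one')

# An outer merge with indicator=True adds a _merge column naming the source of each row
check = left.merge(right, on='EmployeeID', how='outer', indicator=True)
unmatched = sorted(check.loc[check['_merge'] != 'both', 'EmployeeID'].tolist())

print(len(left), len(right), len(merged))
print(unmatched)
```

In the notebook the joined table still has 4,410 rows (see the count row of the summary statistics), which suggests that no employee was lost in the joins.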

6.4 Summary statistics

In [22]:
df_master_data.describe().round(decimals = 2)
Out[22]:
Age DistanceFromHome EmployeeCount EmployeeID JobLevel MonthlyIncome NumCompaniesWorked PercentSalaryHike StandardHours StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager Avg_duration_sec
count 4,410.000 4,410.000 4,410.000 4,410.000 4,410.000 4,410.000 4,391.000 4,410.000 4,410.000 4,410.000 4,401.000 4,410.000 4,410.000 4,410.000 4,410.000 4,410.000
mean 36.920 9.190 1.000 2,205.500 2.060 65,029.310 2.690 15.210 8.000 0.790 11.280 2.800 7.010 2.190 4.120 25,033.590
std 9.130 8.110 0.000 1,273.200 1.110 47,068.890 2.500 3.660 0.000 0.850 7.780 1.290 6.130 3.220 3.570 4,553.010
min 18.000 1.000 1.000 1.000 1.000 10,090.000 0.000 11.000 8.000 0.000 0.000 0.000 0.000 0.000 0.000 18,541.080
25% 30.000 2.000 1.000 1,103.250 1.000 29,110.000 1.000 12.000 8.000 0.000 6.000 2.000 3.000 0.000 2.000 21,491.850
50% 36.000 7.000 1.000 2,205.500 2.000 49,190.000 2.000 14.000 8.000 1.000 10.000 3.000 5.000 1.000 3.000 23,999.790
75% 43.000 14.000 1.000 3,307.750 3.000 83,800.000 4.000 18.000 8.000 1.000 15.000 3.000 9.000 3.000 7.000 27,040.800
max 60.000 29.000 1.000 4,410.000 5.000 199,990.000 9.000 25.000 8.000 3.000 40.000 6.000 40.000 15.000 17.000 37,428.560
In [23]:
df_master_data.isnull().sum()
Out[23]:
Age                         0
Attrition                   0
BusinessTravel              0
Department                  0
DistanceFromHome            0
Education                   0
EducationField              0
EmployeeCount               0
EmployeeID                  0
Gender                      0
JobLevel                    0
JobRole                     0
MaritalStatus               0
MonthlyIncome               0
NumCompaniesWorked         19
Over18                      0
PercentSalaryHike           0
StandardHours               0
StockOptionLevel            0
TotalWorkingYears           9
TrainingTimesLastYear       0
YearsAtCompany              0
YearsSinceLastPromotion     0
YearsWithCurrManager        0
EnvironmentSatisfaction    25
JobSatisfaction            20
WorkLifeBalance            38
JobInvolvement              0
PerformanceRating           0
Avg_duration_sec            0
dtype: int64
In [24]:
# Dropping every row that contains at least one missing value
df_master_data.dropna(inplace = True)
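
Dropping rows is simple, but it is worth recording how much data is lost. The following is a minimal sketch, on a toy frame, of counting the affected rows before and after dropna. The frame demo and its values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame with missing values in two different rows
demo = pd.DataFrame({
    'NumCompaniesWorked': [1.0, np.nan, 3.0, 4.0],
    'WorkLifeBalance': ['Good', 'Best', None, 'Better'],
    'Age': [51, 31, 32, 38],
})

rows_before = len(demo)
# Rows with at least one missing value, which is what dropna() removes by default
rows_with_na = int(demo.isna().any(axis=1).sum())
# Without inplace=True, dropna() returns a new frame and leaves demo unchanged
demo_clean = demo.dropna()
rows_after = len(demo_clean)

print(f'{rows_before} rows -> {rows_after} rows '
      f'({rows_with_na} dropped, {rows_with_na / rows_before:.1%})')
```

In the notebook the table shrinks from 4,410 rows to the 4,300 rows reported by df_final.shape, a loss of about 2.5% of the employees.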

6.5 Final table

In [25]:
df_final = df_master_data.drop(['EmployeeID', 'Over18', 'StandardHours', 'EmployeeCount'], axis = 1)
df_final.head()
Out[25]:
Age Attrition BusinessTravel Department DistanceFromHome Education EducationField Gender JobLevel JobRole MaritalStatus MonthlyIncome NumCompaniesWorked PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement PerformanceRating Avg_duration_sec
0 51 No Travel_Rarely Sales 6 College Life Sciences Female 1 Healthcare Representative Married 131160 1.000 11 0 1.000 6 1 0 0 High Very High Good High Excellent 23,505.626
1 31 Yes Travel_Frequently Research & Development 10 Below College Life Sciences Female 1 Research Scientist Single 41890 0.000 23 1 6.000 3 5 1 4 High Medium Best Medium Outstanding 25,030.679
2 32 No Travel_Frequently Research & Development 17 Master Other Male 4 Sales Executive Married 193280 1.000 15 3 5.000 2 5 0 3 Medium Medium Bad High Excellent 23,320.374
3 38 No Non-Travel Research & Development 2 Doctor Life Sciences Male 3 Human Resources Married 83210 3.000 11 3 13.000 5 8 7 5 Very High Very High Better Medium Excellent 23,228.458
4 32 No Travel_Rarely Research & Development 10 Below College Medical Male 1 Sales Executive Single 23420 4.000 12 2 9.000 2 6 0 4 Very High Low Better High Excellent 26,952.103
In [26]:
df_final.shape
Out[26]:
(4300, 26)
In [27]:
# Names of the categorical columns (a list of column names, not a DataFrame)
df_categorical = ['Attrition', 'BusinessTravel', 'Department', 
                  'Education', 'EducationField', 'Gender', 
                  'JobRole', 'MaritalStatus', 'EnvironmentSatisfaction', 
                  'JobSatisfaction', 'WorkLifeBalance', 'JobInvolvement', 'PerformanceRating']
In [28]:
# All remaining columns are treated as numerical
df_numerical = df_final.columns[~df_final.columns.isin(df_categorical)]
df_numerical
Out[28]:
Index(['Age', 'DistanceFromHome', 'JobLevel', 'MonthlyIncome',
       'NumCompaniesWorked', 'PercentSalaryHike', 'StockOptionLevel',
       'TotalWorkingYears', 'TrainingTimesLastYear', 'YearsAtCompany',
       'YearsSinceLastPromotion', 'YearsWithCurrManager', 'Avg_duration_sec'],
      dtype='object')
In [29]:
# Categorical variables
df_final[df_categorical].head()
Out[29]:
Attrition BusinessTravel Department Education EducationField Gender JobRole MaritalStatus EnvironmentSatisfaction JobSatisfaction WorkLifeBalance JobInvolvement PerformanceRating
0 No Travel_Rarely Sales College Life Sciences Female Healthcare Representative Married High Very High Good High Excellent
1 Yes Travel_Frequently Research & Development Below College Life Sciences Female Research Scientist Single High Medium Best Medium Outstanding
2 No Travel_Frequently Research & Development Master Other Male Sales Executive Married Medium Medium Bad High Excellent
3 No Non-Travel Research & Development Doctor Life Sciences Male Human Resources Married Very High Very High Better Medium Excellent
4 No Travel_Rarely Research & Development Below College Medical Male Sales Executive Single Very High Low Better High Excellent
In [30]:
# Numerical variables
df_final[df_numerical].head()
Out[30]:
Age DistanceFromHome JobLevel MonthlyIncome NumCompaniesWorked PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager Avg_duration_sec
0 51 6 1 131160 1.000 11 0 1.000 6 1 0 0 23,505.626
1 31 10 1 41890 0.000 23 1 6.000 3 5 1 4 25,030.679
2 32 17 4 193280 1.000 15 3 5.000 2 5 0 3 23,320.374
3 38 2 3 83210 3.000 11 3 13.000 5 8 7 5 23,228.458
4 32 10 1 23420 4.000 12 2 9.000 2 6 0 4 26,952.103
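
The two column groups above are built from a hand-written list. The following is a minimal sketch, on a toy frame, of an alternative that derives them from the column dtypes with select_dtypes. The frame demo and the names numerical_cols and categorical_cols are assumptions for illustration. This approach only matches the manual split because the survey columns were recoded to text labels in section 6.2; numeric codes such as JobLevel would still be classified as numerical.

```python
import pandas as pd

# Toy frame mixing numeric and text columns
demo = pd.DataFrame({
    'Age': [51, 31],
    'Attrition': ['No', 'Yes'],
    'MonthlyIncome': [131160, 41890],
    'Gender': ['Female', 'Female'],
})

# 'number' selects every integer and float column
numerical_cols = demo.select_dtypes(include='number').columns.tolist()
# Everything that is not numeric is treated as categorical here
categorical_cols = demo.select_dtypes(exclude='number').columns.tolist()

print(numerical_cols)
print(categorical_cols)
```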

7 Exploratory data analysis

Exploratory data analysis examines the characteristics of the individual variables and the relationships among them, usually through summary statistics and visualizations. This section presents univariate and bivariate analyses.

7.1 Univariate analysis

Univariate analysis is the simplest form of data analysis. "Uni" means "one", so each variable is examined on its own. Unlike regression, it does not deal with causes or relationships; its main purpose is to describe: it summarizes the data and reveals patterns in it.
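
The cells in this section compute the same relative frequencies for one categorical variable after another. The following is a minimal sketch, on a toy frame, of how that step could be wrapped in a small helper and run over a list of columns. The function proportion_table and the frame demo are hypothetical names introduced for illustration; they are not part of the notebook.

```python
import pandas as pd

# Toy frame with two categorical columns
demo = pd.DataFrame({
    'Attrition': ['No', 'No', 'Yes', 'No'],
    'Gender': ['Male', 'Female', 'Male', 'Male'],
})

def proportion_table(df, columns):
    """Return a dict mapping each column name to its relative frequencies."""
    return {col: df[col].value_counts(normalize=True) for col in columns}

tables = proportion_table(demo, ['Attrition', 'Gender'])

# One frequency table per variable, most frequent level first
for name, shares in tables.items():
    print(name)
    print(shares.round(3))
```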

In [31]:
df_final.describe()
Out[31]:
Age DistanceFromHome JobLevel MonthlyIncome NumCompaniesWorked PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager Avg_duration_sec
count 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000 4,300.000
mean 36.927 9.198 2.067 65,059.844 2.690 15.211 0.795 11.285 2.796 7.026 2.190 4.133 25,040.507
std 9.147 8.097 1.107 47,045.399 2.496 3.663 0.854 7.790 1.290 6.148 3.231 3.566 4,557.704
min 18.000 1.000 1.000 10,090.000 0.000 11.000 0.000 0.000 0.000 0.000 0.000 0.000 18,541.076
25% 30.000 2.000 1.000 29,260.000 1.000 12.000 0.000 6.000 2.000 3.000 0.000 2.000 21,488.723
50% 36.000 7.000 2.000 49,360.000 2.000 14.000 1.000 10.000 3.000 5.000 1.000 3.000 24,010.996
75% 43.000 14.000 3.000 83,802.500 4.000 18.000 1.000 15.000 3.000 9.250 3.000 7.000 27,080.194
max 60.000 29.000 5.000 199,990.000 9.000 25.000 3.000 40.000 6.000 40.000 15.000 17.000 37,428.557
In [32]:
# Attrition
df_final.Attrition.value_counts(normalize = True)
Out[32]:
No    0.838
Yes   0.162
Name: Attrition, dtype: float64
In [33]:
# Attrition
fig = px.histogram(df_final, x = 'Attrition', title = 'Attrition Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [34]:
# BusinessTravel
df_final.BusinessTravel.value_counts(normalize = True)
Out[34]:
Travel_Rarely       0.710
Travel_Frequently   0.188
Non-Travel          0.102
Name: BusinessTravel, dtype: float64
In [35]:
# BusinessTravel
fig = px.histogram(df_final, x = 'BusinessTravel', title = 'Business Travel Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [36]:
# Department
df_final.Department.value_counts(normalize = True)
Out[36]:
Research & Development   0.653
Sales                    0.304
Human Resources          0.043
Name: Department, dtype: float64
In [37]:
# Department
fig = px.histogram(df_final, x = 'Department', title = 'Department Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [38]:
# Education
df_final.Education.value_counts(normalize = True)
Out[38]:
Bachelor        0.388
Master          0.272
College         0.191
Below College   0.116
Doctor          0.033
Name: Education, dtype: float64
In [39]:
# Education
fig = px.histogram(df_final, x = 'Education', title = 'Education Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [40]:
# EducationField
df_final.EducationField.value_counts(normalize = True)
Out[40]:
Life Sciences      0.411
Medical            0.317
Marketing          0.109
Technical Degree   0.089
Other              0.055
Human Resources    0.019
Name: EducationField, dtype: float64
In [41]:
# EducationField
fig = px.histogram(df_final, y = 'EducationField', title = 'Education Field Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), 
                  xaxis = dict(showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [42]:
# Gender
df_final.Gender.value_counts(normalize = True)
Out[42]:
Male     0.598
Female   0.402
Name: Gender, dtype: float64
In [43]:
# Gender
fig = px.histogram(df_final, x = 'Gender', title = 'Gender Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [44]:
# JobRole
df_final.JobRole.value_counts(normalize = True)
Out[44]:
Sales Executive             0.222
Research Scientist          0.200
Laboratory Technician       0.176
Manufacturing Director      0.098
Healthcare Representative   0.088
Manager                     0.070
Sales Representative        0.056
Research Director           0.055
Human Resources             0.036
Name: JobRole, dtype: float64
In [45]:
# JobRole
fig = px.histogram(df_final, y = 'JobRole', title = 'Job Role Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), 
                  xaxis = dict(showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [46]:
# MaritalStatus
df_final.MaritalStatus.value_counts(normalize = True)
Out[46]:
Married    0.458
Single     0.321
Divorced   0.221
Name: MaritalStatus, dtype: float64
In [47]:
# MaritalStatus
fig = px.histogram(df_final, x = 'MaritalStatus', title = 'Marital Status Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [48]:
# EnvironmentSatisfaction
df_final.EnvironmentSatisfaction.value_counts(normalize = True)
Out[48]:
High        0.307
Very High   0.305
Medium      0.195
Low         0.193
Name: EnvironmentSatisfaction, dtype: float64
In [49]:
# EnvironmentSatisfaction
fig = px.histogram(df_final, x = 'EnvironmentSatisfaction', title = 'Environment Satisfaction Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [50]:
# JobSatisfaction
df_final.JobSatisfaction.value_counts(normalize = True)
Out[50]:
Very High   0.310
High        0.301
Low         0.197
Medium      0.191
Name: JobSatisfaction, dtype: float64
In [51]:
# JobSatisfaction
fig = px.histogram(df_final, x = 'JobSatisfaction', title = 'Job Satisfaction Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [52]:
# WorkLifeBalance
df_final.WorkLifeBalance.value_counts(normalize = True)
Out[52]:
Better   0.607
Good     0.234
Best     0.105
Bad      0.055
Name: WorkLifeBalance, dtype: float64
In [53]:
# WorkLifeBalance
fig = px.histogram(df_final, x = 'WorkLifeBalance', title = 'Work Life Balance Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [54]:
# JobInvolvement
df_final.JobInvolvement.value_counts(normalize = True)
Out[54]:
High        0.590
Medium      0.257
Very High   0.098
Low         0.056
Name: JobInvolvement, dtype: float64
In [55]:
# JobInvolvement
fig = px.histogram(df_final, x = 'JobInvolvement', title = 'Job Involvement Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [56]:
# PerformanceRating
df_final.PerformanceRating.value_counts(normalize = True)
Out[56]:
Excellent     0.846
Outstanding   0.154
Name: PerformanceRating, dtype: float64
In [57]:
# PerformanceRating
fig = px.histogram(df_final, x = 'PerformanceRating', title = 'Performance Rating Distribution', text_auto = True, width = 400, height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()

7.2 Bivariate analysis

Bivariate analysis examines two variables together to see how one changes as the other varies. In this case, the relationships between attrition and each of the other factors are examined.

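Choosing the right visualization depends on the types of the variables being visualized. Because attrition is a binary label, it is first transformed into attrition odds (the number of employees who left relative to the number who stayed within each group), which makes it more convenient to compare attrition across the levels of another factor.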
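
The odds transformation used in the cells that follow can be illustrated on a small made-up table. This is only a minimal sketch: `df_toy` and the helper `attrition_odds` are hypothetical names introduced for illustration and are not part of the case-study code.

```python
import pandas as pd

# Toy data (hypothetical, not the case-study data)
df_toy = pd.DataFrame({
    "BusinessTravel": ["Rarely", "Rarely", "Rarely", "Rarely",
                       "Frequently", "Frequently", "Frequently"],
    "Attrition": ["No", "No", "No", "Yes", "No", "Yes", "Yes"],
})

def attrition_odds(df, factor, target="Attrition", normalize=False):
    """Odds of leaving (count of 'Yes' / count of 'No') within each level of `factor`."""
    tab = pd.crosstab(df[factor], df[target], normalize=normalize)
    return tab["Yes"] / tab["No"]

# Odds computed from raw counts
odds = attrition_odds(df_toy, "BusinessTravel")

# Odds computed from a table normalized over the grand total
odds_normalized = attrition_odds(df_toy, "BusinessTravel", normalize=True)

print(odds)
print(odds_normalized)
```

Normalizing the table over the grand total divides both the `Yes` and the `No` cell of a row by the same number, so the ratio is the same whether counts or proportions are used.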

In [58]:
# Categorical variables
df_categorical
Out[58]:
['Attrition',
 'BusinessTravel',
 'Department',
 'Education',
 'EducationField',
 'Gender',
 'JobRole',
 'MaritalStatus',
 'EnvironmentSatisfaction',
 'JobSatisfaction',
 'WorkLifeBalance',
 'JobInvolvement',
 'PerformanceRating']
In [59]:
# Attrition vs BusinessTravel
df_tab_BT = pd.crosstab(df_final.BusinessTravel, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_BT, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [60]:
# Attrition odds for BusinessTravel
df_odd_BT = df_tab_BT['Yes'] / df_tab_BT['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_BT, 
             x = df_odd_BT.index,
             text_auto = '.2f',
             title = 'Attrition odds by Business Travel Status',
             labels = {'y': 'Attrition odds', 'x': 'Business Travel Frequency'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [61]:
# Attrition vs Education
df_tab_E = pd.crosstab(df_final.Education, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_E, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [62]:
# Attrition odds for Education
df_odd_E = df_tab_E['Yes'] / df_tab_E['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_E, 
             x = df_odd_E.index,
             text_auto = '.2f',
             title = 'Attrition odds by Education Level',
             labels = {'y': 'Attrition odds', 'x': 'Education Level'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [63]:
# Attrition vs EducationField
df_tab_EF = pd.crosstab(df_final.EducationField, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_EF, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [64]:
# Attrition odds for EducationField
df_odd_EF = df_tab_EF['Yes'] / df_tab_EF['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_EF, 
             x = df_odd_EF.index,
             text_auto = '.2f',
             title = 'Attrition odds by Education Field',
             labels = {'y': 'Attrition odds', 'x': 'Education Field'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [65]:
# Attrition vs Gender
df_tab_G = pd.crosstab(df_final.Gender, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_G, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [66]:
# Attrition odds for Gender
df_odd_G = df_tab_G['Yes'] / df_tab_G['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_G, 
             x = df_odd_G.index,
             text_auto = '.2f',
             title = 'Attrition odds by Gender',
             labels = {'y': 'Attrition odds', 'x': 'Gender'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [67]:
# Attrition vs JobRole
df_tab_JR = pd.crosstab(df_final.JobRole, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_JR, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [68]:
# Attrition odds for JobRole
df_odd_JR = df_tab_JR['Yes'] / df_tab_JR['No']

# Visualizing Attrition odds
fig = px.bar(x = df_odd_JR, 
             y = df_odd_JR.index,
             text_auto = '.2f',
             title = 'Attrition odds by Job Role',
             labels = {'x': 'Attrition odds', 'y': 'Job Role'},
             width = 600,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), 
                  xaxis = dict(showticklabels = False), 
                  autosize = True, 
                  plot_bgcolor = 'white')

# Showing figure
fig.show()
In [69]:
# Attrition vs MaritalStatus
df_tab_MS = pd.crosstab(df_final.MaritalStatus, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_MS, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [70]:
# Attrition odds for MaritalStatus
df_odd_MS = df_tab_MS['Yes'] / df_tab_MS['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_MS, 
             x = df_odd_MS.index,
             text_auto = '.2f',
             title = 'Attrition odds by Marital Status',
             labels = {'y': 'Attrition odds', 'x': 'Marital Status'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [71]:
# Attrition vs EnvironmentSatisfaction
df_tab_ES = pd.crosstab(df_final.EnvironmentSatisfaction, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_ES, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [72]:
# Attrition odds for EnvironmentSatisfaction
df_odd_ES = df_tab_ES['Yes'] / df_tab_ES['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_ES, 
             x = df_odd_ES.index,
             text_auto = '.2f',
             title = 'Attrition odds by Environment Satisfaction',
             labels = {'y': 'Attrition odds', 'x': 'Environment Satisfaction'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [73]:
# Attrition vs JobSatisfaction
df_tab_JS = pd.crosstab(df_final.JobSatisfaction, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_JS, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [74]:
# Attrition odds for JobSatisfaction
df_odd_JS = df_tab_JS['Yes'] / df_tab_JS['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_JS, 
             x = df_odd_JS.index,
             text_auto = '.2f',
             title = 'Attrition odds by Job Satisfaction',
             labels = {'y': 'Attrition odds', 'x': 'Job Satisfaction'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [75]:
# Attrition vs WorkLifeBalance
df_tab_WLB = pd.crosstab(df_final.WorkLifeBalance, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_WLB, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [76]:
# Attrition odds for WorkLifeBalance
df_odd_WLB = df_tab_WLB['Yes'] / df_tab_WLB['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_WLB, 
             x = df_odd_WLB.index,
             text_auto = '.2f',
             title = 'Attrition odds by Work Life Balance',
             labels = {'y': 'Attrition odds', 'x': 'Work Life Balance'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [77]:
# Attrition vs JobInvolvement
df_tab_JI = pd.crosstab(df_final.JobInvolvement, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_JI, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [78]:
# Attrition odds for JobInvolvement
df_odd_JI = df_tab_JI['Yes'] / df_tab_JI['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_JI, 
             x = df_odd_JI.index,
             text_auto = '.2f',
             title = 'Attrition odds by Job Involvement',
             labels = {'y': 'Attrition odds', 'x': 'Job Involvement'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [79]:
# Attrition vs PerformanceRating
df_tab_PR = pd.crosstab(df_final.PerformanceRating, df_final.Attrition, normalize = True)

# heatmap
fig = px.imshow(df_tab_PR, text_auto = '.2f', width = 400, height = 400)
fig.show()
In [80]:
# Attrition odds for PerformanceRating
df_odd_PR = df_tab_PR['Yes'] / df_tab_PR['No']

# Visualizing Attrition odds
fig = px.bar(y = df_odd_PR, 
             x = df_odd_PR.index,
             text_auto = '.2f',
             title = 'Attrition odds by Performance Rating',
             labels = {'y': 'Attrition odds', 'x': 'Performance Rating'},
             width = 400,
             height = 400)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = False), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [81]:
# Numerical variables
df_numerical
Out[81]:
Index(['Age', 'DistanceFromHome', 'JobLevel', 'MonthlyIncome',
       'NumCompaniesWorked', 'PercentSalaryHike', 'StockOptionLevel',
       'TotalWorkingYears', 'TrainingTimesLastYear', 'YearsAtCompany',
       'YearsSinceLastPromotion', 'YearsWithCurrManager', 'Avg_duration_sec'],
      dtype='object')
In [82]:
# Attrition odd by Ages
df_tab_A = pd.crosstab(df_final.Age, df_final.Attrition, normalize = True)
df_odd_A = df_tab_A['Yes'] / df_tab_A['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_A, 
              x = df_odd_A.index,
              title = 'Attrition odds by Age',
              labels = {'y': 'Attrition odds', 'x': 'Age'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [83]:
# Attrition odd by DistanceFromHome
df_tab_DFH = pd.crosstab(df_final.DistanceFromHome, df_final.Attrition, normalize = True)
df_odd_DFH = df_tab_DFH['Yes'] / df_tab_DFH['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_DFH, 
              x = df_odd_DFH.index,
              title = 'Attrition odds by Distance From Home',
              labels = {'y': 'Attrition odds', 'x': 'Distance From Home'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [84]:
# Attrition odd by JobLevel
df_tab_JL = pd.crosstab(df_final.JobLevel, df_final.Attrition, normalize = True)
df_odd_JL = df_tab_JL['Yes'] / df_tab_JL['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_JL, 
              x = df_odd_JL.index,
              title = 'Attrition odds by Job Level',
              labels = {'y': 'Attrition odds', 'x': 'Job Level'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [85]:
# Attrition odd by MonthlyIncome
df_tab_MI = pd.crosstab(pd.cut(df_final.MonthlyIncome, bins = 10), df_final.Attrition, normalize = True)
df_odd_MI = df_tab_MI['Yes'] / df_tab_MI['No']
df_odd_MI.index = ['(9900.1, 29080.0]', '(29080.0, 48070.0]', '(48070.0, 67060.0]', '(67060.0, 86050.0]', '(86050.0, 105040.0]',
                  '(105040.0, 124030.0]', '(124030.0, 143020.0]', '(143020.0, 162010.0]', '(162010.0, 181000.0]', '(181000.0, 199990.0]']

# Visualizing Attrition odds
fig = px.line(y = df_odd_MI, 
              x = df_odd_MI.index,
              title = 'Attrition odds by Monthly Income bins',
              labels = {'y': 'Attrition odds', 'x': 'Monthly Income'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [86]:
# Attrition odd by NumCompaniesWorked
df_tab_NCW = pd.crosstab(df_final.NumCompaniesWorked, df_final.Attrition, normalize = True)
df_odd_NCW = df_tab_NCW['Yes'] / df_tab_NCW['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_NCW, 
              x = df_odd_NCW.index,
              title = 'Attrition odds by Number of Companies Worked',
              labels = {'y': 'Attrition odds', 'x': 'Num Companies Worked'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [87]:
# Attrition odd by PercentSalaryHike
df_tab_PSH = pd.crosstab(df_final.PercentSalaryHike, df_final.Attrition, normalize = True)
df_odd_PSH = df_tab_PSH['Yes'] / df_tab_PSH['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_PSH, 
              x = df_odd_PSH.index,
              title = 'Attrition odds by Percent Salary Hike',
              labels = {'y': 'Attrition odds', 'x': 'Percent Salary Hike'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [88]:
# Attrition odd by StockOptionLevel
df_tab_SOL = pd.crosstab(df_final.StockOptionLevel, df_final.Attrition, normalize = True)
df_odd_SOL = df_tab_SOL['Yes'] / df_tab_SOL['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_SOL, 
              x = df_odd_SOL.index,
              title = 'Attrition odds by Stock Option Level',
              labels = {'y': 'Attrition odds', 'x': 'Stock Option Level'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [89]:
# Attrition odd by TotalWorkingYears
df_tab_TWY = pd.crosstab(df_final.TotalWorkingYears, df_final.Attrition, normalize = True)
df_odd_TWY = df_tab_TWY['Yes'] / df_tab_TWY['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_TWY, 
              x = df_odd_TWY.index,
              title = 'Attrition odds by Total Working Years',
              labels = {'y': 'Attrition odds', 'x': 'Total Working Years'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [90]:
# Attrition odd by TrainingTimesLastYear
df_tab_TTLY = pd.crosstab(df_final.TrainingTimesLastYear, df_final.Attrition, normalize = True)
df_odd_TTLY = df_tab_TTLY['Yes'] / df_tab_TTLY['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_TTLY, 
              x = df_odd_TTLY.index,
              title = 'Attrition odds by Training Times Last Year',
              labels = {'y': 'Attrition odds', 'x': 'Training Times Last Year'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [91]:
# Attrition odd by YearsAtCompany
df_tab_YAC = pd.crosstab(df_final.YearsAtCompany, df_final.Attrition, normalize = True)
df_odd_YAC = df_tab_YAC['Yes'] / df_tab_YAC['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_YAC, 
              x = df_odd_YAC.index,
              title = 'Attrition odds by Years At Company',
              labels = {'y': 'Attrition odds', 'x': 'Years At Company'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [92]:
# Attrition odd by YearsSinceLastPromotion
df_tab_YSLP = pd.crosstab(df_final.YearsSinceLastPromotion, df_final.Attrition, normalize = True)
df_odd_YSLP = df_tab_YSLP['Yes'] / df_tab_YSLP['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_YSLP, 
              x = df_odd_YSLP.index,
              title = 'Attrition odds by Years Since Last Promotion',
              labels = {'y': 'Attrition odds', 'x': 'Years Since Last Promotion'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [93]:
# Attrition odd by YearsWithCurrManager
df_tab_YWCM = pd.crosstab(df_final.YearsWithCurrManager, df_final.Attrition, normalize = True)
df_odd_YWCM = df_tab_YWCM['Yes'] / df_tab_YWCM['No']

# Visualizing Attrition odds
fig = px.line(y = df_odd_YWCM, 
              x = df_odd_YWCM.index,
              title = 'Attrition odds by Years With Current Manager',
              labels = {'y': 'Attrition odds', 'x': 'Years With Current Manager'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()
In [94]:
# Attrition odd by Avg_duration_sec
df_tab_AWD = pd.crosstab(pd.cut(df_final.Avg_duration_sec, bins = 10), df_final.Attrition, normalize = True)
df_odd_AWD = df_tab_AWD['Yes'] / df_tab_AWD['No']
df_odd_AWD.index = ['(18522.189, 20429.824]', '(20429.824, 22318.573]', '(22318.573, 24207.321]', '(24207.321, 26096.069]', '(26096.069, 27984.817]',
                  '(27984.817, 29873.565]', '(29873.565, 31762.313]', '(31762.313, 33651.061]', '(33651.061, 35539.809]', '(35539.809, 37428.557]']

# Visualizing Attrition odds
fig = px.line(y = df_odd_AWD, 
              x = df_odd_AWD.index,
              title = 'Attrition odds by Average working duration (sec)',
              labels = {'y': 'Attrition odds', 'x': 'Average working duration (sec)'},
              width = 500,
              height = 500)

# Updating layout to reduce clutter
fig.update_layout(yaxis = dict(showgrid = False, zeroline = False, showline = False, showticklabels = True), autosize = True, plot_bgcolor = 'white')

# Showing figure
fig.show()

8 Regression analysis

The idea of an explanatory model is to explain the variation in the dependent variable using the variation in the independent variables. In this analysis, a logistic regression model is employed to explain attrition using the other factors. This model form is chosen because the dependent variable is binary: attrition takes only the two values $0$ and $1$. The model therefore does not aim to explain changes in attrition itself, but changes in the probability of attrition as the independent factors vary.

The difficulty with a probability is that it is bounded between $0$ and $1$, so an ordinary linear regression is not appropriate for modelling such a relationship. The logistic model does not assume a direct linear relationship between the probability of attrition and the other factors; instead, it assumes a linear relationship between the log-odds of attrition and the other factors. The odds of a binary variable $X$ are defined as the ratio

$$\text{Odds of } X = \frac{P(X = 1)}{1 - P(X = 1)}$$

The log-odds are the logarithm of this ratio. Let $y$ be a binary dependent variable with observations $y_i, i = 1,...,k$, and let $x_1, x_2, ..., x_n$ be $n$ different explanatory variables with observations $x_{1i}, ... , x_{ni}, i = 1, ..., k$. Denoting the effect of the factor $x_j$ on $y$ as $\beta_j$, the logistic regression model has the following form

$$\log \frac{P(y_i = 1|x_{1i}, x_{2i}, ..., x_{ni})}{1 - P(y_i = 1|x_{1i}, x_{2i}, ..., x_{ni})} = \beta_0 + \sum^n_{j = 1}\beta_j x_{ji}, \text{ for all observations } i = 1, ..., k$$
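
The link between a probability, its odds and its log-odds can be made concrete with a few lines of Python. This is a minimal sketch using only the standard library; the function names `odds`, `log_odds` and `inv_logit` are chosen here for illustration, and the value $0.15$ is taken from the yearly attrition rate of around 15% mentioned in the case description.

```python
import math

def odds(p):
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    return p / (1.0 - p)

def log_odds(p):
    """Log-odds (logit) of a probability p strictly between 0 and 1."""
    return math.log(odds(p))

def inv_logit(z):
    """Map a log-odds value z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.15                    # roughly the yearly attrition rate in the case description
z = log_odds(p)             # unbounded value that the linear predictor models
p_back = inv_logit(z)       # converting back recovers the probability

print(odds(p), z, p_back)
```

Because `inv_logit` always returns a value between $0$ and $1$, any real-valued linear predictor on the right-hand side of the model corresponds to a valid probability.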

Potential problems when fitting such a model to the available data include:

  • Omitted variable bias occurs when a statistical model leaves out one or more relevant variables
  • Multicollinearity refers to a situation in which two or more explanatory variables in a multiple regression model are highly linearly related

Omitted variable bias is hard to resolve because there is not enough information about which factors really affect the log-odds of attrition, so all available variables are considered first and irrelevant variables are then excluded from the model. Multicollinearity arises naturally when categorical variables are dummy-coded: the dummies of one variable always sum to one and are therefore perfectly collinear with the constant, so the easy treatment is to exclude one category of each variable from the final model, which then serves as the baseline. Furthermore, it is often useful to check the correlations between the explanatory variables using a correlation heatmap. There are some variables that might not be of direct interest, but it is still useful to include them in the model to control for their impact on the dependent variable.
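
The collinearity introduced by a full set of dummies can be checked on a small made-up example. This is only a sketch: `df_toy`, `full` and `reduced` are hypothetical names, and the rank check with `np.linalg.matrix_rank` is used here purely for illustration rather than being part of the analysis itself.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical, not the case-study data)
df_toy = pd.DataFrame({
    "MaritalStatus": ["Single", "Married", "Divorced", "Married", "Single", "Divorced"]
})

# Full dummy coding plus a constant: the three dummies sum to the constant column
full = pd.get_dummies(df_toy, columns=["MaritalStatus"]).astype(float)
full.insert(0, "const", 1.0)

# Dropping one category (the baseline) removes the exact linear dependence
reduced = full.drop(columns=["MaritalStatus_Single"])

rank_full = np.linalg.matrix_rank(full.to_numpy())
rank_reduced = np.linalg.matrix_rank(reduced.to_numpy())

print(full.shape[1], rank_full)        # more columns than the rank: collinear
print(reduced.shape[1], rank_reduced)  # number of columns equals the rank
```

When the rank is smaller than the number of columns, the coefficients are not uniquely determined, which is why one category per variable is left out.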

First, categorical variables are transformed with dummy coding, and some numerical variables are rescaled so that their magnitudes are comparable to those of the other variables. The rescaling is necessary to avoid numerical issues in the estimation.
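
As a short worked step on why such a rescaling is harmless (a sketch in the notation of the model above, where $c$ is a hypothetical scaling constant, for example $1000$ for monthly income): dividing a regressor by $c$ only multiplies its coefficient by $c$ and leaves the fitted log-odds unchanged, since

$$\beta_j x_j = (c\,\beta_j)\,\frac{x_j}{c}$$

After the rescaling, the coefficient of monthly income would therefore be read per $1000$ units of income rather than per single unit.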

8.1 Preparing data for logistic regression

In [95]:
# generate dummies from categorical variables
df_logit_data = pd.get_dummies(df_final, columns = df_categorical)
df_logit_data.head()
Out[95]:
Age DistanceFromHome JobLevel MonthlyIncome NumCompaniesWorked PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager Avg_duration_sec Attrition_No Attrition_Yes BusinessTravel_Non-Travel BusinessTravel_Travel_Frequently BusinessTravel_Travel_Rarely Department_Human Resources Department_Research & Development Department_Sales Education_Bachelor Education_Below College Education_College Education_Doctor Education_Master EducationField_Human Resources EducationField_Life Sciences EducationField_Marketing EducationField_Medical EducationField_Other EducationField_Technical Degree Gender_Female Gender_Male JobRole_Healthcare Representative JobRole_Human Resources JobRole_Laboratory Technician JobRole_Manager JobRole_Manufacturing Director JobRole_Research Director JobRole_Research Scientist JobRole_Sales Executive JobRole_Sales Representative MaritalStatus_Divorced MaritalStatus_Married MaritalStatus_Single EnvironmentSatisfaction_High EnvironmentSatisfaction_Low EnvironmentSatisfaction_Medium EnvironmentSatisfaction_Very High JobSatisfaction_High JobSatisfaction_Low JobSatisfaction_Medium JobSatisfaction_Very High WorkLifeBalance_Bad WorkLifeBalance_Best WorkLifeBalance_Better WorkLifeBalance_Good JobInvolvement_High JobInvolvement_Low JobInvolvement_Medium JobInvolvement_Very High PerformanceRating_Excellent PerformanceRating_Outstanding
0 51 6 1 131160 1.000 11 0 1.000 6 1 0 0 23,505.626 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 0
1 31 10 1 41890 0.000 23 1 6.000 3 5 1 4 25,030.679 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 1
2 32 17 4 193280 1.000 15 3 5.000 2 5 0 3 23,320.374 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0
3 38 2 3 83210 3.000 11 3 13.000 5 8 7 5 23,228.458 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 1 0 1 0
4 32 10 1 23420 4.000 12 2 9.000 2 6 0 4 26,952.103 1 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 1 0 0 0 1 0

The baseline is then defined as:

  • BusinessTravel = 'BusinessTravel_Non-Travel'
  • Department = 'Department_Human Resources'
  • Education = 'Education_Below College'
  • EducationField = 'EducationField_Human Resources'
  • Gender = 'Gender_Female'
  • JobRole = 'JobRole_Human Resources'
  • MaritalStatus = 'MaritalStatus_Single'
  • EnvironmentSatisfaction = 'EnvironmentSatisfaction_Low'
  • JobSatisfaction = 'JobSatisfaction_Low'
  • WorkLifeBalance = 'WorkLifeBalance_Bad'
  • JobInvolvement = 'JobInvolvement_Low'
  • PerformanceRating = 'PerformanceRating_Excellent'
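
The dummy column names in the list above follow the pattern `<variable>_<level>`, so they could also be built from a mapping instead of being typed by hand. This is a minimal sketch with only three of the variables; the names `baseline_levels` and `baseline_columns` are hypothetical and not part of the case-study code.

```python
# Mapping from categorical variable to its baseline level (subset, for illustration)
baseline_levels = {
    "BusinessTravel": "Non-Travel",
    "Gender": "Female",
    "MaritalStatus": "Single",
}

# Dummy columns created by dummy coding are named '<variable>_<level>'
baseline_columns = [f"{variable}_{level}" for variable, level in baseline_levels.items()]

print(baseline_columns)
```

Keeping the baseline in one mapping makes it easier to see which level each coefficient is compared against when the results are interpreted.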
In [96]:
# dropping columns belonging to the baseline to avoid multicollinearity!
df_baseline = ['Attrition_No', 
               'BusinessTravel_Non-Travel', 
               'Department_Human Resources', 
               'Education_Below College', 
               'EducationField_Human Resources',
               'Gender_Female', 
               'JobRole_Human Resources', 
               'MaritalStatus_Single', 
               'EnvironmentSatisfaction_Low', 
               'JobSatisfaction_Low', 
               'WorkLifeBalance_Bad',
               'JobInvolvement_Low', 
               'PerformanceRating_Excellent']
df_logit_data.drop(df_baseline, axis = 1, inplace = True)
df_logit_data.head()
Out[96]:
Age DistanceFromHome JobLevel MonthlyIncome NumCompaniesWorked PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsSinceLastPromotion YearsWithCurrManager Avg_duration_sec Attrition_Yes BusinessTravel_Travel_Frequently BusinessTravel_Travel_Rarely Department_Research & Development Department_Sales Education_Bachelor Education_College Education_Doctor Education_Master EducationField_Life Sciences EducationField_Marketing EducationField_Medical EducationField_Other EducationField_Technical Degree Gender_Male JobRole_Healthcare Representative JobRole_Laboratory Technician JobRole_Manager JobRole_Manufacturing Director JobRole_Research Director JobRole_Research Scientist JobRole_Sales Executive JobRole_Sales Representative MaritalStatus_Divorced MaritalStatus_Married EnvironmentSatisfaction_High EnvironmentSatisfaction_Medium EnvironmentSatisfaction_Very High JobSatisfaction_High JobSatisfaction_Medium JobSatisfaction_Very High WorkLifeBalance_Best WorkLifeBalance_Better WorkLifeBalance_Good JobInvolvement_High JobInvolvement_Medium JobInvolvement_Very High PerformanceRating_Outstanding
0 51 6 1 131160 1.000 11 0 1.000 6 1 0 0 23,505.626 0 0 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 1 0 0 0
1 31 10 1 41890 0.000 23 1 6.000 3 5 1 4 25,030.679 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 1
2 32 17 4 193280 1.000 15 3 5.000 2 5 0 3 23,320.374 0 1 0 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0
3 38 2 3 83210 3.000 11 3 13.000 5 8 7 5 23,228.458 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0 1 0 0
4 32 10 1 23420 4.000 12 2 9.000 2 6 0 4 26,952.103 0 0 1 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0
In [97]:
# Rescaling variables
df_logit_data['MonthlyIncome'] = df_logit_data.MonthlyIncome / 1000
df_logit_data['Avg_duration_sec'] = df_logit_data.Avg_duration_sec / 3600
df_logit_data.rename(columns = {'Avg_duration_sec': 'Avg_duration_hr'}, inplace = True)

8.2 Fitting explanatory models

First, a model that includes all the factors considered is fitted. In Python, this can be done as follows

In [98]:
# Prepare data for the full model
df_x = df_logit_data.drop(['Attrition_Yes'], axis = 1)
df_x = sm.add_constant(df_x)
df_y = df_logit_data['Attrition_Yes']

# Train Test Split
df_x_train, df_x_valid, df_y_train, df_y_valid = train_test_split(df_x, df_y, train_size = 0.8, random_state = 2)

# Checking correlations
fig = px.imshow(df_x.drop('const', axis = 1).corr(), text_auto = '.2f', width = 1500, height = 1500)
fig.show()
In [99]:
# fitting the full model
model_logit_full = sm.GLM(df_y_train, df_x_train, family = sm.families.Binomial())
result_logit_full = model_logit_full.fit(method = 'newton')
result_logit_full.summary()
Out[99]:
Generalized Linear Model Regression Results
Dep. Variable: Attrition_Yes No. Observations: 3440
Model: GLM Df Residuals: 3389
Model Family: Binomial Df Model: 50
Link Function: Logit Scale: 1.0000
Method: newton Log-Likelihood: -1153.0
Date: Mon, 29 Aug 2022 Deviance: 2306.1
Time: 00:11:35 Pearson chi2: 5.40e+03
No. Iterations: 4 Pseudo R-squ. (CS): 0.1867
Covariance Type: nonrobust
coef std err z P>|z| [0.025 0.975]
const 0.8988 0.806 1.115 0.265 -0.681 2.479
Age -0.0328 0.009 -3.835 0.000 -0.050 -0.016
DistanceFromHome -0.0071 0.007 -1.029 0.304 -0.021 0.006
JobLevel -0.1080 0.051 -2.105 0.035 -0.209 -0.007
MonthlyIncome -0.0023 0.001 -1.885 0.059 -0.005 9.04e-05
NumCompaniesWorked 0.1509 0.023 6.456 0.000 0.105 0.197
PercentSalaryHike 0.0082 0.024 0.348 0.728 -0.038 0.054
StockOptionLevel -0.0846 0.065 -1.303 0.193 -0.212 0.043
TotalWorkingYears -0.0870 0.015 -5.658 0.000 -0.117 -0.057
TrainingTimesLastYear -0.1558 0.043 -3.602 0.000 -0.241 -0.071
YearsAtCompany 0.0431 0.022 1.973 0.048 0.000 0.086
YearsSinceLastPromotion 0.1789 0.025 7.216 0.000 0.130 0.228
YearsWithCurrManager -0.1793 0.028 -6.400 0.000 -0.234 -0.124
Avg_duration_hr 0.4684 0.041 11.474 0.000 0.388 0.548
BusinessTravel_Travel_Frequently 1.5829 0.245 6.468 0.000 1.103 2.063
BusinessTravel_Travel_Rarely 0.8201 0.229 3.586 0.000 0.372 1.268
Department_Research & Development -0.6684 0.322 -2.075 0.038 -1.300 -0.037
Department_Sales -0.8224 0.339 -2.426 0.015 -1.487 -0.158
Education_Bachelor -0.1246 0.183 -0.681 0.496 -0.483 0.234
Education_College 0.0169 0.199 0.085 0.933 -0.374 0.408
Education_Doctor -0.5862 0.374 -1.567 0.117 -1.319 0.147
Education_Master -0.1102 0.191 -0.576 0.565 -0.485 0.265
EducationField_Life Sciences -0.6559 0.434 -1.510 0.131 -1.507 0.195
EducationField_Marketing -0.8862 0.479 -1.850 0.064 -1.825 0.053
EducationField_Medical -0.8116 0.435 -1.866 0.062 -1.664 0.041
EducationField_Other -0.9081 0.485 -1.871 0.061 -1.860 0.043
EducationField_Technical Degree -0.9362 0.467 -2.007 0.045 -1.851 -0.022
Gender_Male 0.0336 0.113 0.298 0.766 -0.187 0.254
JobRole_Healthcare Representative 0.0453 0.359 0.126 0.899 -0.658 0.748
JobRole_Laboratory Technician 0.3221 0.328 0.982 0.326 -0.321 0.965
JobRole_Manager -0.1275 0.372 -0.343 0.732 -0.856 0.601
JobRole_Manufacturing Director -0.4609 0.361 -1.276 0.202 -1.169 0.247
JobRole_Research Director 0.8607 0.361 2.385 0.017 0.153 1.568
JobRole_Research Scientist 0.3895 0.323 1.206 0.228 -0.243 1.022
JobRole_Sales Executive 0.4848 0.321 1.510 0.131 -0.144 1.114
JobRole_Sales Representative -0.0980 0.378 -0.259 0.796 -0.840 0.643
MaritalStatus_Divorced -1.0988 0.161 -6.842 0.000 -1.414 -0.784
MaritalStatus_Married -0.8681 0.122 -7.103 0.000 -1.108 -0.629
EnvironmentSatisfaction_High -0.8254 0.149 -5.522 0.000 -1.118 -0.532
EnvironmentSatisfaction_Medium -0.7021 0.166 -4.222 0.000 -1.028 -0.376
EnvironmentSatisfaction_Very High -1.0805 0.155 -6.983 0.000 -1.384 -0.777
JobSatisfaction_High -0.5521 0.149 -3.713 0.000 -0.844 -0.261
JobSatisfaction_Medium -0.5458 0.167 -3.274 0.001 -0.872 -0.219
JobSatisfaction_Very High -1.1144 0.158 -7.075 0.000 -1.423 -0.806
WorkLifeBalance_Best -1.2556 0.254 -4.945 0.000 -1.753 -0.758
WorkLifeBalance_Better -1.3119 0.205 -6.407 0.000 -1.713 -0.911
WorkLifeBalance_Good -0.9940 0.220 -4.520 0.000 -1.425 -0.563
JobInvolvement_High -0.4064 0.211 -1.924 0.054 -0.820 0.008
JobInvolvement_Medium -0.1043 0.225 -0.464 0.643 -0.545 0.336
JobInvolvement_Very High -0.0610 0.261 -0.234 0.815 -0.572 0.450
PerformanceRating_Outstanding 0.0567 0.236 0.241 0.810 -0.405 0.519
In [100]:
# Calculating in sample accuracy for the full model
tab_valid_full = pd.DataFrame({'EmployeeID': df_y_train.index,'Attrition': df_y_train})
tab_valid_full['Forecast Attrition Probability'] = result_logit_full.predict(df_x_train)
tab_valid_full['Forecast Attrition'] = tab_valid_full['Forecast Attrition Probability'].map(lambda x: 1 if x > 0.5 else 0)
metrics.accuracy_score(tab_valid_full.Attrition, tab_valid_full['Forecast Attrition'])
Out[100]:
0.8625

Unlike ordinary linear regression, logistic regression does not offer the usual $R^2$ as a statistic for how well the model explains the variation in the dependent variable, because the dependent variable is binary (the summary above only reports a pseudo $R^2$). In-sample accuracy can be used as a substitute diagnostic. The intuition is that the model is used to forecast attrition for the same sample that was used to train it, and the accuracy of this forecast serves as the metric for how well the model performs technically. In this case, the in-sample accuracy is approximately 86 percent, which is fairly high and suggests that the model does a good job of explaining Attrition using the other factors. Because only around 15 percent of employees leave each year, this figure should be read against the accuracy that would be obtained by always forecasting no attrition.
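
As a small illustration of how accuracy is obtained from forecast probabilities, the sketch below applies the same 0.5 cutoff to a handful of made-up observations and compares the result with a benchmark that always forecasts the most frequent class. The labels and probabilities are hypothetical and are not taken from the case data.

```python
# Hypothetical labels and forecast probabilities (not taken from the case data)
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probability = [0.10, 0.20, 0.05, 0.60, 0.30, 0.15, 0.25, 0.40, 0.70, 0.35]

# Classify as attrition when the forecast probability exceeds the 0.5 cutoff
forecast = [1 if p > 0.5 else 0 for p in probability]

# Accuracy: share of observations whose forecast matches the actual label
accuracy = sum(a == f for a, f in zip(actual, forecast)) / len(actual)

# Majority-class benchmark: always forecast the most frequent label
majority_label = max(set(actual), key=actual.count)
benchmark = sum(a == majority_label for a in actual) / len(actual)

print(f"model accuracy: {accuracy:.2f}, benchmark accuracy: {benchmark:.2f}")
```

In this toy sample the model and the benchmark reach the same accuracy, which is why accuracy on an imbalanced outcome is best read together with the class shares.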

However, this is only a technical metric for diagnosing how well the model performs; it does not show whether the results actually make sense. To understand the results, we should look at the actual effects of the factors on the dependent variable and whether they are statistically significant.

In [101]:
# Extracting statistically insignificant variables
table_coef_full = result_logit_full.summary().tables[1]
table_coef_full = pd.read_html(table_coef_full.as_html(), header=0, index_col=0)[0]
table_coef_full[table_coef_full['P>|z|'] > 0.05]
Out[101]:
coef std err z P>|z| [0.025 0.975]
const 0.899 0.806 1.115 0.265 -0.681 2.479
DistanceFromHome -0.007 0.007 -1.029 0.304 -0.021 0.006
MonthlyIncome -0.002 0.001 -1.885 0.059 -0.005 0.000
PercentSalaryHike 0.008 0.024 0.348 0.728 -0.038 0.054
StockOptionLevel -0.085 0.065 -1.303 0.193 -0.212 0.043
Education_Bachelor -0.125 0.183 -0.681 0.496 -0.483 0.234
Education_College 0.017 0.199 0.085 0.933 -0.374 0.408
Education_Doctor -0.586 0.374 -1.567 0.117 -1.319 0.147
Education_Master -0.110 0.191 -0.576 0.565 -0.485 0.265
EducationField_Life Sciences -0.656 0.434 -1.510 0.131 -1.507 0.195
EducationField_Marketing -0.886 0.479 -1.850 0.064 -1.825 0.053
EducationField_Medical -0.812 0.435 -1.866 0.062 -1.664 0.041
EducationField_Other -0.908 0.485 -1.871 0.061 -1.860 0.043
Gender_Male 0.034 0.113 0.298 0.766 -0.187 0.254
JobRole_Healthcare Representative 0.045 0.359 0.126 0.899 -0.658 0.748
JobRole_Laboratory Technician 0.322 0.328 0.982 0.326 -0.321 0.965
JobRole_Manager -0.128 0.372 -0.343 0.732 -0.856 0.601
JobRole_Manufacturing Director -0.461 0.361 -1.276 0.202 -1.169 0.247
JobRole_Research Scientist 0.390 0.323 1.206 0.228 -0.243 1.022
JobRole_Sales Executive 0.485 0.321 1.510 0.131 -0.144 1.114
JobRole_Sales Representative -0.098 0.378 -0.259 0.796 -0.840 0.643
JobInvolvement_High -0.406 0.211 -1.924 0.054 -0.820 0.008
JobInvolvement_Medium -0.104 0.225 -0.464 0.643 -0.545 0.336
JobInvolvement_Very High -0.061 0.261 -0.234 0.815 -0.572 0.450
PerformanceRating_Outstanding 0.057 0.236 0.241 0.810 -0.405 0.519

From the p-values, it can be seen that the following factors (or, for categorical variables, most of their categories) do not have significant effects on the attrition odds at the 5-percent level

  • DistanceFromHome
  • MonthlyIncome
  • PercentSalaryHike
  • StockOptionLevel
  • Education
  • EducationField
  • Gender
  • JobRole
  • JobInvolvement
  • PerformanceRating

From the correlation heatmap, it can be seen that DistanceFromHome, MonthlyIncome, PercentSalaryHike, and StockOptionLevel are not considerably correlated with other factors (the absolute values of their correlations are much lower than 0.1), so they do not need to be kept in the model as controlling factors. Some categories of the other categorical variables (Education, EducationField, Gender, JobRole, JobInvolvement, and PerformanceRating) are correlated with one another, but not with other variables, so they can also be excluded from the model.

From this observation, a reduced model can be constructed

In [102]:
# Extracting insignificant variables
df_insignificant = table_coef_full[table_coef_full['P>|z|'] > 0.05].index
df_insignificant
Out[102]:
Index(['const', 'DistanceFromHome', 'MonthlyIncome', 'PercentSalaryHike',
       'StockOptionLevel', 'Education_Bachelor', 'Education_College',
       'Education_Doctor', 'Education_Master', 'EducationField_Life Sciences',
       'EducationField_Marketing', 'EducationField_Medical',
       'EducationField_Other', 'Gender_Male',
       'JobRole_Healthcare Representative', 'JobRole_Laboratory Technician',
       'JobRole_Manager', 'JobRole_Manufacturing Director',
       'JobRole_Research Scientist', 'JobRole_Sales Executive',
       'JobRole_Sales Representative', 'JobInvolvement_High',
       'JobInvolvement_Medium', 'JobInvolvement_Very High',
       'PerformanceRating_Outstanding'],
      dtype='object')
In [103]:
# Extracting data for the reduced model
df_logit_reduced_data = df_logit_data.loc[:, ~df_logit_data.columns.isin(df_insignificant)]

# Prepare data for the reduced model
df_x_reduced = df_logit_reduced_data.drop('Attrition_Yes', axis = 1)
df_x_reduced = sm.add_constant(df_x_reduced)
df_y_reduced = df_logit_reduced_data['Attrition_Yes']

# Train Test Split
df_x_train_reduced, df_x_valid_reduced, df_y_train_reduced, df_y_valid_reduced = train_test_split(df_x_reduced, df_y_reduced, train_size = 0.8, random_state = 2)

# Checking correlations
fig = px.imshow(df_x_reduced.drop('const', axis = 1).corr(), text_auto = '.2f', width = 1500, height = 1300)
fig.show()
In [104]:
# Fitting reduced model
model_logit_reduced = sm.GLM(df_y_train_reduced, df_x_train_reduced, family = sm.families.Binomial())
result_logit_reduced = model_logit_reduced.fit(method = 'Newton')
result_logit_reduced.summary()
Out[104]:
Generalized Linear Model Regression Results
Dep. Variable: Attrition_Yes No. Observations: 3440
Model: GLM Df Residuals: 3413
Model Family: Binomial Df Model: 26
Link Function: Logit Scale: 1.0000
Method: Newton Log-Likelihood: -1180.1
Date: Mon, 29 Aug 2022 Deviance: 2360.3
Time: 00:11:35 Pearson chi2: 4.92e+03
No. Iterations: 4 Pseudo R-squ. (CS): 0.1738
Covariance Type: nonrobust
coef std err z P>|z| [0.025 0.975]
const -0.0732 0.531 -0.138 0.890 -1.113 0.967
Age -0.0305 0.008 -3.700 0.000 -0.047 -0.014
JobLevel -0.1301 0.050 -2.591 0.010 -0.229 -0.032
NumCompaniesWorked 0.1455 0.023 6.449 0.000 0.101 0.190
TotalWorkingYears -0.0822 0.015 -5.469 0.000 -0.112 -0.053
TrainingTimesLastYear -0.1381 0.042 -3.323 0.001 -0.220 -0.057
YearsAtCompany 0.0367 0.022 1.691 0.091 -0.006 0.079
YearsSinceLastPromotion 0.1630 0.024 6.698 0.000 0.115 0.211
YearsWithCurrManager -0.1647 0.027 -6.026 0.000 -0.218 -0.111
Avg_duration_hr 0.4725 0.040 11.874 0.000 0.395 0.551
BusinessTravel_Travel_Frequently 1.6007 0.244 6.552 0.000 1.122 2.080
BusinessTravel_Travel_Rarely 0.8996 0.229 3.929 0.000 0.451 1.348
Department_Research & Development -1.0192 0.215 -4.749 0.000 -1.440 -0.599
Department_Sales -1.1537 0.227 -5.083 0.000 -1.599 -0.709
EducationField_Technical Degree -0.1943 0.200 -0.974 0.330 -0.585 0.197
JobRole_Research Director 0.6024 0.206 2.924 0.003 0.199 1.006
MaritalStatus_Divorced -1.1440 0.155 -7.387 0.000 -1.448 -0.840
MaritalStatus_Married -0.8944 0.118 -7.563 0.000 -1.126 -0.663
EnvironmentSatisfaction_High -0.8299 0.147 -5.663 0.000 -1.117 -0.543
EnvironmentSatisfaction_Medium -0.6798 0.161 -4.227 0.000 -0.995 -0.365
EnvironmentSatisfaction_Very High -1.0508 0.151 -6.956 0.000 -1.347 -0.755
JobSatisfaction_High -0.5181 0.144 -3.594 0.000 -0.801 -0.236
JobSatisfaction_Medium -0.4779 0.162 -2.952 0.003 -0.795 -0.161
JobSatisfaction_Very High -1.0723 0.154 -6.985 0.000 -1.373 -0.771
WorkLifeBalance_Best -1.0702 0.246 -4.358 0.000 -1.552 -0.589
WorkLifeBalance_Better -1.2202 0.198 -6.156 0.000 -1.609 -0.832
WorkLifeBalance_Good -0.8960 0.212 -4.229 0.000 -1.311 -0.481
In [105]:
# Calculating in sample accuracy for the reduced model
tab_valid_reduced = pd.DataFrame({'EmployeeID': df_y_train_reduced.index,'Attrition': df_y_train_reduced})
tab_valid_reduced['Forecast Attrition Probability'] = result_logit_reduced.predict(df_x_train_reduced)
tab_valid_reduced['Forecast Attrition'] = tab_valid_reduced['Forecast Attrition Probability'].map(lambda x: 1 if x > 0.5 else 0)
metrics.accuracy_score(tab_valid_reduced.Attrition, tab_valid_reduced['Forecast Attrition'])
Out[105]:
0.8616279069767442

From the in-sample accuracy, it can be seen that excluding many explanatory factors from the model barely changes its performance (0.8616 compared with 0.8625). This suggests that the excluded factors do not contribute much to explaining the attrition odds. The fitting results of the reduced model show that the effect of EducationField_Technical Degree is not statistically significant in this model. Because this factor is not strongly correlated with other factors either, it can be eliminated from the model to create a minimum model.

In [106]:
# Extract data for the min model
df_logit_min_data = df_logit_reduced_data.loc[:, ~df_logit_reduced_data.columns.isin(['EducationField_Technical Degree'])]

# Prepare data for the min model
df_x_min = df_logit_min_data.drop('Attrition_Yes', axis = 1)
df_x_min = sm.add_constant(df_x_min)
df_y_min = df_logit_min_data['Attrition_Yes']

# Train Test Split
df_x_train_min, df_x_valid_min, df_y_train_min, df_y_valid_min = train_test_split(df_x_min, df_y_min, train_size = 0.8, random_state = 2)

# Checking correlations
fig = px.imshow(df_x_min.drop('const', axis = 1).corr(), text_auto = '.2f', width = 1500, height = 1300)
fig.show()
In [107]:
# Fitting min model
model_logit_min = sm.GLM(df_y_train_min, df_x_train_min, family = sm.families.Binomial())
result_logit_min = model_logit_min.fit(method = 'Newton')
result_logit_min.summary()
Out[107]:
Generalized Linear Model Regression Results
Dep. Variable: Attrition_Yes No. Observations: 3440
Model: GLM Df Residuals: 3414
Model Family: Binomial Df Model: 25
Link Function: Logit Scale: 1.0000
Method: Newton Log-Likelihood: -1180.6
Date: Mon, 29 Aug 2022 Deviance: 2361.2
Time: 00:11:35 Pearson chi2: 4.96e+03
No. Iterations: 4 Pseudo R-squ. (CS): 0.1735
Covariance Type: nonrobust
coef std err z P>|z| [0.025 0.975]
const -0.0608 0.531 -0.115 0.909 -1.101 0.979
Age -0.0311 0.008 -3.784 0.000 -0.047 -0.015
JobLevel -0.1289 0.050 -2.568 0.010 -0.227 -0.031
NumCompaniesWorked 0.1463 0.023 6.485 0.000 0.102 0.191
TotalWorkingYears -0.0826 0.015 -5.494 0.000 -0.112 -0.053
TrainingTimesLastYear -0.1372 0.042 -3.303 0.001 -0.219 -0.056
YearsAtCompany 0.0372 0.022 1.717 0.086 -0.005 0.080
YearsSinceLastPromotion 0.1616 0.024 6.661 0.000 0.114 0.209
YearsWithCurrManager -0.1645 0.027 -6.009 0.000 -0.218 -0.111
Avg_duration_hr 0.4726 0.040 11.872 0.000 0.395 0.551
BusinessTravel_Travel_Frequently 1.6000 0.244 6.554 0.000 1.122 2.078
BusinessTravel_Travel_Rarely 0.8959 0.229 3.917 0.000 0.448 1.344
Department_Research & Development -1.0234 0.214 -4.773 0.000 -1.444 -0.603
Department_Sales -1.1593 0.227 -5.112 0.000 -1.604 -0.715
JobRole_Research Director 0.6024 0.206 2.921 0.003 0.198 1.007
MaritalStatus_Divorced -1.1430 0.155 -7.381 0.000 -1.446 -0.839
MaritalStatus_Married -0.8905 0.118 -7.543 0.000 -1.122 -0.659
EnvironmentSatisfaction_High -0.8337 0.146 -5.692 0.000 -1.121 -0.547
EnvironmentSatisfaction_Medium -0.6825 0.161 -4.244 0.000 -0.998 -0.367
EnvironmentSatisfaction_Very High -1.0509 0.151 -6.958 0.000 -1.347 -0.755
JobSatisfaction_High -0.5223 0.144 -3.627 0.000 -0.804 -0.240
JobSatisfaction_Medium -0.4803 0.162 -2.968 0.003 -0.797 -0.163
JobSatisfaction_Very High -1.0753 0.153 -7.012 0.000 -1.376 -0.775
WorkLifeBalance_Best -1.0730 0.246 -4.370 0.000 -1.554 -0.592
WorkLifeBalance_Better -1.2206 0.198 -6.161 0.000 -1.609 -0.832
WorkLifeBalance_Good -0.9032 0.212 -4.267 0.000 -1.318 -0.488
In [108]:
# Calculating in sample accuracy for the min model
tab_valid_min = pd.DataFrame({'EmployeeID': df_y_train_min.index,'Attrition': df_y_train_min})
tab_valid_min['Forecast Attrition Probability'] = result_logit_min.predict(df_x_train_min)
tab_valid_min['Forecast Attrition'] = tab_valid_min['Forecast Attrition Probability'].map(lambda x: 1 if x > 0.5 else 0)
metrics.accuracy_score(tab_valid_min.Attrition, tab_valid_min['Forecast Attrition'])
Out[108]:
0.861046511627907

Again, excluding EducationField_Technical Degree barely changes the in-sample accuracy of the model (0.8610 compared with 0.8616). Even though YearsAtCompany is not statistically significant at the 5-percent level, the variable is strongly correlated with many other variables, so it should be kept in the model as a controlling factor. The constant is not statistically significant either, which means that the attrition odds of the baseline are not significantly different from 1, other factors being equal.
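
The statement about the constant can be checked on the odds scale. The sketch below, which is an illustration rather than part of the original notebook, exponentiates the constant of the minimum model and its 95-percent interval as printed in the summary above, and checks whether the resulting interval contains 1.

```python
import math

# Constant of the minimum model and its 95% interval, copied from the summary above
coef, lower, upper = -0.0608, -1.101, 0.979

# Exponentiating moves the estimate and its interval from log odds to odds
odds = math.exp(coef)
odds_interval = (math.exp(lower), math.exp(upper))

# An interval that contains 1 is consistent with the large p-value of the constant
contains_one = odds_interval[0] < 1 < odds_interval[1]
print(round(odds, 3), tuple(round(v, 3) for v in odds_interval), contains_one)
```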

As the model cannot be reduced further, the minimum model is the best model in terms of explanatory power that can be achieved by tuning the input variables, but it might not be the best in forecasting. A statistical model can be used either to explain or to forecast. In this case, the HR department might also want to forecast the attrition of current employees in order to make strategic decisions. Hence, it is essential to validate all the discussed models in terms of their forecasting power.

9 Validating forecasting power and model choice

Forecasting is different from explaining. To evaluate the forecasting performance of a model, out-of-sample data is necessary. That is why the original data is divided into a training set and a validation set. The model is fitted using the training set, and the validation set is used to check how the model performs on new data. This is similar to calculating in-sample accuracy; the only difference is that out-of-sample data is used instead of in-sample data. Out-of-sample accuracy for the full model is calculated as follows.

In [109]:
# generate forecast table for the full model
tab_valid_full = pd.DataFrame({'EmployeeID': df_y_valid.index,'Attrition': df_y_valid})
tab_valid_full['Forecast Attrition Probability'] = result_logit_full.predict(df_x_valid)
tab_valid_full['Forecast Attrition'] = tab_valid_full['Forecast Attrition Probability'].map(lambda x: 1 if x > 0.5 else 0)
tab_valid_full.head()
Out[109]:
EmployeeID Attrition Forecast Attrition Probability Forecast Attrition
1808 1808 0 0.195 0
21 21 0 0.014 0
2214 2214 0 0.277 0
1057 1057 0 0.062 0
3291 3291 0 0.080 0
In [110]:
# confusion matrix for the full model
mat_confusion_full = metrics.confusion_matrix(tab_valid_full.Attrition, tab_valid_full['Forecast Attrition'])
mat_confusion_full
Out[110]:
array([[693,  20],
       [107,  40]], dtype=int64)
In [111]:
# overall accuracy of the full model
metrics.accuracy_score(tab_valid_full.Attrition, tab_valid_full['Forecast Attrition'])
Out[111]:
0.8523255813953489

Out-of-sample accuracy for the reduced model

In [112]:
# generate forecast table for the reduced model
tab_valid_reduced = pd.DataFrame({'EmployeeID': df_y_valid_reduced.index,'Attrition': df_y_valid_reduced})
tab_valid_reduced['Forecast Attrition Probability'] = result_logit_reduced.predict(df_x_valid_reduced)
tab_valid_reduced['Forecast Attrition'] = tab_valid_reduced['Forecast Attrition Probability'].map(lambda x: 1 if x > 0.5 else 0)
tab_valid_reduced.head()
Out[112]:
EmployeeID Attrition Forecast Attrition Probability Forecast Attrition
1808 1808 0 0.198 0
21 21 0 0.010 0
2214 2214 0 0.310 0
1057 1057 0 0.063 0
3291 3291 0 0.083 0
In [113]:
# confusion matrix for the reduced model
mat_confusion_reduced = metrics.confusion_matrix(tab_valid_reduced.Attrition, tab_valid_reduced['Forecast Attrition'])
mat_confusion_reduced
Out[113]:
array([[699,  14],
       [106,  41]], dtype=int64)
In [114]:
# overall accuracy of the reduced model
metrics.accuracy_score(tab_valid_reduced.Attrition, tab_valid_reduced['Forecast Attrition'])
Out[114]:
0.8604651162790697

Out-of-sample accuracy for the min model

In [115]:
# generate forecast table for the min model
tab_valid_min = pd.DataFrame({'EmployeeID': df_y_valid_min.index,'Attrition': df_y_valid_min})
tab_valid_min['Forecast Attrition Probability'] = result_logit_min.predict(df_x_valid_min)
tab_valid_min['Forecast Attrition'] = tab_valid_min['Forecast Attrition Probability'].map(lambda x: 1 if x > 0.5 else 0)
tab_valid_min.head()
Out[115]:
EmployeeID Attrition Forecast Attrition Probability Forecast Attrition
1808 1808 0 0.195 0
21 21 0 0.010 0
2214 2214 0 0.306 0
1057 1057 0 0.075 0
3291 3291 0 0.082 0
In [116]:
# confusion matrix for the min model
mat_confusion_min = metrics.confusion_matrix(tab_valid_min.Attrition, tab_valid_min['Forecast Attrition'])
mat_confusion_min
Out[116]:
array([[698,  15],
       [107,  40]], dtype=int64)
In [117]:
# overall accuracy of the min model
metrics.accuracy_score(tab_valid_min.Attrition, tab_valid_min['Forecast Attrition'])
Out[117]:
0.858139534883721

It can be seen that the out-of-sample performance of all three models is quite similar. It is surprising that the reduced model performs slightly better than the full model in terms of out-of-sample accuracy (0.860 compared with 0.852). This shows that including more variables in the model does not necessarily improve out-of-sample performance.
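
Overall accuracy does not show how the errors are split between the two classes. The sketch below recomputes accuracy from the three confusion matrices printed above and adds recall and precision for the attrition class. These two measures are not reported in the original analysis and are included here only as a possible complement. The layout `[[TN, FP], [FN, TP]]` follows the row and column order of `metrics.confusion_matrix` with labels 0 and 1.

```python
# Confusion matrices copied from the outputs above: [[TN, FP], [FN, TP]]
matrices = {
    "full": [[693, 20], [107, 40]],
    "reduced": [[699, 14], [106, 41]],
    "min": [[698, 15], [107, 40]],
}

def summarize(matrix):
    """Return accuracy, plus recall and precision for the attrition class."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "recall": tp / (tp + fn),      # share of actual leavers that are caught
        "precision": tp / (tp + fp),   # share of forecast leavers that do leave
    }

for name, matrix in matrices.items():
    stats = summarize(matrix)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```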

The choice of model depends on the purpose of the analysis. The minimum model is a jack of all trades that can serve both explaining and forecasting purposes. If a difference of about 0.2 percentage points in accuracy is important, one can turn to the reduced model for its slightly higher forecasting performance. When absolute performance matters, the model choice can be summarized as follows

  • The reduced model is good for forecasting the attrition of current employees
  • The minimum model is good for understanding the factors contributing to attrition in general

Detailed explanation of the fitting results is given in the following section.

10 Interpreting results

To understand the fitting results of a logistic model with categorical variables, it is important to consider two important and related concepts:

  • A baseline can be understood as the reference group of a regression involving categorical variables. To avoid multicollinearity, one category is excluded from each categorical variable, and together these excluded categories form the baseline. In a logistic regression, the fitting results only show how different the other categories are from the baseline, not the absolute effects of these factors on the dependent variable
  • A realized effect is the actual impact of the explanatory factor on the dependent variable in terms of magnitude, which might be different from the fitted effects due to the transformation during the calculation
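
The idea of a baseline can be sketched with a single categorical variable. The example below uses a few hypothetical observations and plain Python, rather than the pandas-based recoding used earlier in this notebook, to show that the baseline category receives no indicator column and is therefore represented by all indicators being zero.

```python
# Hypothetical observations of one categorical variable
marital_status = ["Single", "Married", "Divorced", "Married"]

# The baseline category gets no column of its own
baseline = "Single"
categories = sorted(set(marital_status) - {baseline})

# One 0/1 indicator column per remaining category
dummies = [
    {f"MaritalStatus_{category}": int(value == category) for category in categories}
    for value in marital_status
]

for value, row in zip(marital_status, dummies):
    print(value, row)
```

A coefficient on `MaritalStatus_Married` then measures the difference between married and single employees, not the effect of being married in absolute terms.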

Since the minimum model is chosen as the best explanatory model, the interpretation will focus mostly on the results of this model. For the minimum model, the baseline is defined as:

  • BusinessTravel = 'BusinessTravel_Non-Travel'
  • Department = 'Department_Human Resources'
  • JobRole = 'JobRole_Human Resources'
  • MaritalStatus = 'MaritalStatus_Single'
  • EnvironmentSatisfaction = 'EnvironmentSatisfaction_Low'
  • JobSatisfaction = 'JobSatisfaction_Low'
  • WorkLifeBalance = 'WorkLifeBalance_Bad'

The minimum model only includes the following factors:

  • Age
  • JobLevel
  • NumCompaniesWorked
  • TotalWorkingYears
  • TrainingTimesLastYear
  • YearsAtCompany
  • YearsSinceLastPromotion
  • YearsWithCurrManager
  • Avg_duration_hr
  • BusinessTravel
  • Department (Research & Development, Sales)
  • JobRole (Research Director)
  • MaritalStatus
  • EnvironmentSatisfaction
  • JobSatisfaction
  • WorkLifeBalance

With the exception of YearsAtCompany, which is kept as a controlling factor, these are the factors with statistically significant effects on the attrition odds. Since the logit transformation assumes a linear relationship between the log odds of attrition and the other factors, the realized effect of each factor on the attrition odds is the exponential of its coefficient.

$$\log\frac{P(Attrition = 1|x_1, ..., x_n)}{1 - P(Attrition = 1|x_1, ..., x_n)} = \beta_0 + \sum^n_{i=1} \beta_i x_i \Leftrightarrow \frac{P(Attrition = 1|x_1, ..., x_n)}{1 - P(Attrition = 1|x_1, ..., x_n)} = e^{\beta_0 + \sum^n_{i=1} \beta_i x_i}$$
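
To make the multiplicative reading concrete, a short derivation may help. Writing $odds(x_1, ..., x_n)$ for the right-hand side above and picking an arbitrary factor $x_k$ (the index $k$ is introduced here only for illustration), a one-unit increase in $x_k$ with all other factors held fixed gives

$$\frac{odds(x_1, ..., x_k + 1, ..., x_n)}{odds(x_1, ..., x_k, ..., x_n)} = \frac{e^{\beta_0 + \sum_{i \neq k} \beta_i x_i + \beta_k (x_k + 1)}}{e^{\beta_0 + \sum_{i \neq k} \beta_i x_i + \beta_k x_k}} = e^{\beta_k}$$

so each additional unit of $x_k$ multiplies the odds by $e^{\beta_k}$, whatever the values of the other factors.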

Hence, the realized effects of these factors on the attrition odds are multiplicative and the effect sizes are equal to the exponentials of their coefficients. The realized effects are calculated as follows.

In [118]:
table_coef_min = result_logit_min.summary().tables[1]
table_coef_min = pd.read_html(table_coef_min.as_html(), header=0, index_col=0)[0]
table_coef_min['Realized effect'] = np.exp(table_coef_min.coef)
table_coef_min[['coef', 'Realized effect']]
Out[118]:
coef Realized effect
const -0.061 0.941
Age -0.031 0.969
JobLevel -0.129 0.879
NumCompaniesWorked 0.146 1.158
TotalWorkingYears -0.083 0.921
TrainingTimesLastYear -0.137 0.872
YearsAtCompany 0.037 1.038
YearsSinceLastPromotion 0.162 1.175
YearsWithCurrManager -0.165 0.848
Avg_duration_hr 0.473 1.604
BusinessTravel_Travel_Frequently 1.600 4.953
BusinessTravel_Travel_Rarely 0.896 2.450
Department_Research & Development -1.023 0.359
Department_Sales -1.159 0.314
JobRole_Research Director 0.602 1.826
MaritalStatus_Divorced -1.143 0.319
MaritalStatus_Married -0.890 0.410
EnvironmentSatisfaction_High -0.834 0.434
EnvironmentSatisfaction_Medium -0.682 0.505
EnvironmentSatisfaction_Very High -1.051 0.350
JobSatisfaction_High -0.522 0.593
JobSatisfaction_Medium -0.480 0.619
JobSatisfaction_Very High -1.075 0.341
WorkLifeBalance_Best -1.073 0.342
WorkLifeBalance_Better -1.221 0.295
WorkLifeBalance_Good -0.903 0.405

The realized effect of each factor is interpreted as its multiplicative effect on the baseline. Suppose that the baseline attrition odds are $1.1$ and the realized effect of a factor A is $1.2$; the resulting odds are then $1.1 \times 1.2$. In this case, the baseline attrition odds are represented by the realized effect of the constant. It should be noted that the model includes both numerical and categorical variables, so the baseline is defined as an observation with all numerical features equal to zero and categorical features equal to the baseline groups, which means

  • Age = 0
  • JobLevel = 0
  • NumCompaniesWorked = 0
  • TotalWorkingYears = 0
  • TrainingTimesLastYear = 0
  • YearsAtCompany = 0
  • YearsSinceLastPromotion = 0
  • YearsWithCurrManager = 0
  • Avg_duration_hr = 0
  • BusinessTravel = 'BusinessTravel_Non-Travel'
  • Department = 'Department_Human Resources'
  • JobRole = 'JobRole_Human Resources'
  • MaritalStatus = 'MaritalStatus_Single'
  • EnvironmentSatisfaction = 'EnvironmentSatisfaction_Low'
  • JobSatisfaction = 'JobSatisfaction_Low'
  • WorkLifeBalance = 'WorkLifeBalance_Bad'

Such an observation is unrealistic in this case, so a statistically insignificant constant makes sense. As the constant is not statistically significant, the baseline attrition odds are treated as one. With this in mind, a factor increases the attrition probability when its realized effect is greater than one and decreases it when its realized effect is lower than one. The interpretation also depends on the type of variable:

  • For numerical variables, the realized effect is the multiplicative effect of each additional unit on the baseline
  • For categorical variables, the realized effect is the multiplicative effect of the category on the baseline
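
The two rules above can be sketched in a few lines. The numbers below are hypothetical and serve only to show the mechanics: a numerical effect is applied once per unit, a categorical effect is applied once, and the resulting odds can be converted back to a probability with $p = odds / (1 + odds)$.

```python
def odds_to_probability(odds):
    """Convert odds p / (1 - p) back to the probability p."""
    return odds / (1 + odds)

# Hypothetical baseline odds and realized effects, for illustration only
baseline_odds = 1.0
effect_per_unit = 0.9   # numerical factor: applied once per unit
effect_category = 2.0   # categorical factor: applied once if the category holds

units = 3
odds = baseline_odds * effect_per_unit ** units * effect_category
print(round(odds, 3), round(odds_to_probability(odds), 3))
```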

For example, take the baseline attrition odds to be 1, consider only the two factors Age and BusinessTravel, and suppose the observed employee is 18 years old. If the person doesn't travel, the expected attrition odds are calculated as $1 \times 0.969^{18}$, where $0.969$ is the realized effect of Age in the model.

In [119]:
# attrition odd for Age = 18 and BusinessTravel = Non-Travel
1*(0.969**18)
Out[119]:
0.567319859821517

This means the probability of attrition is much lower than that of non-attrition. After working for a year, the person now has to travel rarely, and the attrition odds then change to $1 \times 0.969^{19} \times 2.450$

In [120]:
# attrition odds for Age = 19 and BusinessTravel = Travel_Rarely
round(1*(0.969**19)*2.450, 4)
Out[120]:
1.3468

Where $2.450$ is the approximate realized effect of Travel_Rarely. This number is the attrition odds, which means the probability of attrition is approximately 35% higher than that of non-attrition. With this logic, factors raising the attrition odds are

In [121]:
table_coef_min.loc[table_coef_min['Realized effect'] > 1, ['coef', 'Realized effect']]
Out[121]:
coef Realized effect
NumCompaniesWorked 0.146 1.158
YearsAtCompany 0.037 1.038
YearsSinceLastPromotion 0.162 1.175
Avg_duration_hr 0.473 1.604
BusinessTravel_Travel_Frequently 1.600 4.953
BusinessTravel_Travel_Rarely 0.896 2.450
JobRole_Research Director 0.602 1.826

It can be seen that Avg_duration_hr has the strongest realized effect on the attrition odds among the numerical variables (each additional hour of Avg_duration_hr multiplies the odds by about 1.6), and BusinessTravel_Travel_Frequently has a very strong realized effect among the categorical variables (approximately 5 times the odds of the non-travel baseline). Similarly, factors reducing the attrition odds are

In [122]:
table_coef_min.loc[table_coef_min['Realized effect'] < 1, ['coef', 'Realized effect']]
Out[122]:
coef Realized effect
const -0.061 0.941
Age -0.031 0.969
JobLevel -0.129 0.879
TotalWorkingYears -0.083 0.921
TrainingTimesLastYear -0.137 0.872
YearsWithCurrManager -0.165 0.848
Department_Research & Development -1.023 0.359
Department_Sales -1.159 0.314
MaritalStatus_Divorced -1.143 0.319
MaritalStatus_Married -0.890 0.410
EnvironmentSatisfaction_High -0.834 0.434
EnvironmentSatisfaction_Medium -0.682 0.505
EnvironmentSatisfaction_Very High -1.051 0.350
JobSatisfaction_High -0.522 0.593
JobSatisfaction_Medium -0.480 0.619
JobSatisfaction_Very High -1.075 0.341
WorkLifeBalance_Best -1.073 0.342
WorkLifeBalance_Better -1.221 0.295
WorkLifeBalance_Good -0.903 0.405

It can be seen that WorkLifeBalance, JobSatisfaction, and EnvironmentSatisfaction are very effective in reducing the attrition odds: every category above the baseline level has a realized effect between roughly 0.3 and 0.6.

11 Conclusion & executive summary

In this analysis, a logistic model is used to analyze the impacts of many factors on attrition in order to understand what makes employees quit their jobs. The analysis employs numpy, pandas, statsmodels, and plotly as the main packages, and the main steps include importing and wrangling data, logistic regression, model validation, and interpretation of the results. The data includes 6 different tables including information about employees and managers as well as their working time, and the final table consists of around 4,400 observations and 26 variables. The regression analysis shows that

  • Avg_duration_hr, BusinessTravel, and JobRole_Research Director are the main factors that increase the attrition probability
  • WorkLifeBalance, JobSatisfaction, and EnvironmentSatisfaction are the main factors that help reduce the attrition probability

Furthermore, this model is also good for predicting attrition, with an out-of-sample accuracy of approximately 86 percent. In conclusion, attrition is well explained and predicted by the employed model.