This article was published as a part of the Data Science Blogathon
“IF YOU ARE NOT TAKING CARE OF YOUR CUSTOMERS, YOUR COMPETITOR WILL” – Bob Hooey
Customer Lifetime Value is the profit that a business will make from a specific customer over the period of their association with the business. Every industry has its own set of metrics that are tracked and measured to help businesses target the right customer and forecast their customer base for the future. The CLV enables various departments in marketing, sales, etc to plan their strategies and offer specific products or customized services to the most valuable customers. It also provides insights to customer service teams with a framework for knowing the effort that needs to be put into nurturing and retaining customers.
The CLV is most effective and adds immense value when it is applied along with other tools such as customer segmentation, pricing & marketing strategy meaning it tells us who are our most profitable customers but it doesn’t tell us which product needs to be sold at what price and quantity. Hence, CLV should be applied judiciously and shouldn’t be the sole criteria for making business decisions. The CLV may change depending on the business model and its objective which means the definition and calculation need to be revisited regularly.
Insurance: The marketing team would be interested in knowing which customers are most likely to pay high premiums and not claim which in turn helps them acquire new customers and also grows their business.
Telecom: The predicted CLV is used to understand the probability of current customers’ loyalty and the likelihood of them continuing with plans or subscriptions.
We will be using a retail dataset(CSV) from UCI machine learning repository to build the python app. The attribute description can be found in the above URL.
Let us load our dataset and take a look at the data.
data = pd.read_excel("./data/Online_Retail.xlsx") data.head(10)
data.info()
Data columns (total 8 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 InvoiceNo 40 non-null int64 1 StockCode 40 non-null object 2 Description 40 non-null object 3 Quantity 40 non-null int64 4 InvoiceDate 40 non-null datetime64[ns] 5 UnitPrice 40 non-null float64 6 CustomerID 40 non-null int64 7 Country 40 non-null object dtypes: datetime64[ns](1), float64(1), int64(3), object(3)
Let us clean the data and also create new features that we will need in later stages for calculating the CLV.
The data would look as below after completing the above steps. We will process this further as we go along.
Complete code can be accessed from pre-processing.py
We’ll develop our app with Plotly Dash, which is a python framework for building data applications. Let us create a file by name app.py and start with loading libraries.
import pandas as pd import matplotlib.pyplot as plt import seaborn as sns import datetime as dt import numpy as np import dash import dash_table import dash_core_components as dcc import dash_html_components as html from dash.dependencies import Input, Output, State
Cards: All the 4 KPI’s that we are tracking will be at the top of the page. The aesthetics like font size and color are defined. Also, a unique ID for each of the card which will be used later to populate the values.
html.H2('Total Customers', style={ 'font-weight': 'normal'}), html.H2(id='id_total_customer', style = {'color': 'DarkSlateGray'}), ], className='box_emissions'), html.Div([ html.H2('Total Transactions', style={ 'font-weight': 'normal'}), html.H2(id='id_total_transactions', style = {'color': 'DarkSlateGray'}), ], className='box_emissions'), html.Div([ html.H2('Total Sales($)', style={ 'font-weight': 'normal'}), html.H2(id='id_total_sales', style = {'color': 'DarkSlateGray'}), ], className='box_emissions'), html.Div([ html.H2('Avg Order Value($)', style={ 'font-weight': 'normal'}), html.H2(id='id_order_value', style = {'color': 'DarkSlateGray'}), ], className='box_emissions'),
Plots: We have 2 plots, one bar chart showcasing the top-selling products and the second with purchase trends for countries.
df_plot_bar = pp.filtered_data.groupby('Description').agg({'TotalPurchase':'sum'}).sort_values(by = 'TotalPurchase', ascending=False).reset_index().head(5) df_plot_bar['percent'] = round((df_plot_bar['TotalPurchase'] / df_plot_bar['TotalPurchase'].sum()) * 100,2).apply(lambda x : "{:,}".format(x)) fir_plotbar = px.bar(df_plot_bar, y='percent', x='Description', title='TOP SELLING PRODUCTS', text='percent', color='percent',) fir_plotbar.update_traces(texttemplate='%{text:.2s}', textposition='inside') fir_plotbar.update_layout(uniformtext_minsize=8, uniformtext_mode='hide', showlegend=False)
df_plot = df.groupby(['Country','Description','UnitPrice','Quantity']).agg({'TotalPurchase': 'sum'},{'Quantity':'sum'}).reset_index() fig_UnitPriceVsQuantity = px.scatter(df_plot[:25000], x="UnitPrice", y="Quantity", color = 'Country', size='TotalPurchase', size_max=20, log_y= True, log_x= True, title= "PURCHASE TREND ACROSS COUNTRIES")
Note: Similar to the cards and the plots, the rest of the UI components like the sidebar, the table for displaying the results are designed. Please access the completed layout.py code from Github
We define a function update_output_All() which takes the values from the controls as inputs, executes the logic, meaning generated the visuals and the data table, which will be populated on the UI. The interactivity is denied at 2 levels –
def update_output_All(country_selected): try: if (country_selected != 'All' and country_selected != None): df_selectedCountry = pp.filtered_data.loc[pp.filtered_data['Country'] == country_selected] df_selectedCountry_p = pp.filtered_data_group.loc[pp.filtered_data_group['Country'] == country_selected] cnt_transactions = df_selectedCountry.Country.shape[0] cnt_customers = len(df_selectedCountry.CustomerID.unique()) cnt_sales = round(df_selectedCountry.groupby('Country').agg({'TotalPurchase':'sum'})['TotalPurchase'].sum(),2) ......... return [cnt_customers, cnt_transactions, cnt_sales, cnt_avgsales, df_selectedCountry_p.drop(['num_days','num_units'], axis = 1).to_dict('records'), fig_UnitPriceVsQuantity_country, fir_plotbar] else: cnt_transactions = pp.filtered_data.shape[0] cnt_customers = len(pp.filtered_data.CustomerID.unique()) cnt_sales = round(pp.filtered_data.groupby('Country').agg({'TotalPurchase':'sum'})['TotalPurchase'].sum(),2) cnt_avgsales = round(pp.filtered_data_group.groupby('Country').agg({'avg_order_value': 'mean'})['avg_order_value'].mean()) ........ return [cnt_customers, cnt_transactions, cnt_sales,cnt_avgsales, pp.filtered_data_group.drop(['num_days','num_units'], axis = 1).to_dict('records'), pp.fig_UnitPriceVsQuantity, fir_plotbar] except Exception as e: logging.exception('Something went wrong with interaction logic:', e)
Complete code can be accessed from app.py
Hope you liked the blog. Happy Learnings !!!!
You can connect with me – Linkedin
You can find the code for reference – Github
https://unsplash.com/