“Understand your customers better, with data!”
Did you know it costs five times more to acquire a new customer than to retain an existing one? And did you know that existing customers are 50% more likely to try a new product of yours and spend 31% more than new customers? Whether or not you currently have a loyalty program that encourages your customers to return and do more business with you, these numbers clearly show the importance and impact of a successful customer loyalty program.
In this blog, we will build an application in Python using Dash to segment our customers into various categories, helping businesses make data-driven decisions to improve customer loyalty.
Here is the snapshot of the app we will build by the end of this blog.
We will be using the data set from Kaggle E-Commerce Data and here is the snapshot of the data.
“Your loyal customers are the ones who are going to buy your products even if they aren’t on sale and recommend your offerings to friends”
We will do behavioral segmentation using Recency, Frequency, and Monetary value, which we calculate later in this blog. Let us look at the definitions before proceeding further.
Recency: Number of days since the last purchase
Frequency: Number of transactions made over a given period
Monetary: Amount spent over a given period of time
The code chunk below creates a TotalSum variable and then computes Recency, Frequency, and MonetaryValue for each customer:
from datetime import timedelta

# Line total for each invoice row
df['TotalSum'] = df['Quantity'] * df['UnitPrice']

# Create snapshot date: one day after the most recent invoice
snapshot_date = df['InvoiceDate'].max() + timedelta(days=1)
# print(snapshot_date)

# Grouping by CustomerID
data_process = df.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})

# Rename the columns
data_process.rename(columns={'InvoiceDate': 'Recency',
                             'InvoiceNo': 'Frequency',
                             'TotalSum': 'MonetaryValue'}, inplace=True)
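To see what this aggregation produces, here is a minimal sketch on a tiny made-up data set. The three invoices and their values are invented for illustration and are not taken from the Kaggle data; only the column names follow the snippet above. It uses pandas named aggregation, which yields the renamed columns in one step.

```python
from datetime import timedelta

import pandas as pd

# Made-up invoices for two hypothetical customers (not from the Kaggle data).
df = pd.DataFrame({
    "CustomerID": [1, 1, 2],
    "InvoiceNo": ["A1", "A2", "B1"],
    "InvoiceDate": pd.to_datetime(["2011-12-01", "2011-12-08", "2011-11-30"]),
    "Quantity": [2, 1, 5],
    "UnitPrice": [10.0, 4.0, 3.0],
})
df["TotalSum"] = df["Quantity"] * df["UnitPrice"]

# One day after the latest invoice, so the most recent buyer gets Recency 1.
snapshot_date = df["InvoiceDate"].max() + timedelta(days=1)

# Named aggregation: output column = (input column, aggregation).
rfm = df.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda x: (snapshot_date - x.max()).days),
    Frequency=("InvoiceNo", "count"),
    MonetaryValue=("TotalSum", "sum"),
)
print(rfm)
```

Note that 'count' counts rows. If the data holds one row per line item rather than one row per invoice, 'nunique' on InvoiceNo would count distinct invoices instead.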
To keep this blog from getting too long, some of the steps (exploring the data distribution, cleaning the data, and data wrangling) are not detailed here; the code can be accessed from Github. At the end of the data processing, we apply thresholds to the RFM score to categorize customers and generate the dataframe.
# Function name is assumed; the original snippet omits the def line.
def rfm_level(df):
    if df['RFM_Score'] >= 9:
        return 'Cant Loose Them'
    elif ((df['RFM_Score'] >= 8) and (df['RFM_Score'] < 9)):
        return 'Champions'
    elif ((df['RFM_Score'] >= 7) and (df['RFM_Score'] < 8)):
        return 'Loyal'
    elif ((df['RFM_Score'] >= 6) and (df['RFM_Score'] < 7)):
        return 'Potential'
    elif ((df['RFM_Score'] >= 5) and (df['RFM_Score'] < 6)):
        return 'Promising'
    elif ((df['RFM_Score'] >= 4) and (df['RFM_Score'] < 5)):
        return 'Needs Attention'
    else:
        return 'Require Activation'

# Create a new variable RFM_Level
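The blog does not show how RFM_Score itself is computed. A common approach, assumed here and possibly different from the code on Github, is to rank each metric into quartiles with pandas.qcut and add the three ranks. The sketch below uses invented values for eight hypothetical customers and then applies the same thresholds row by row; the helper names are illustrative.

```python
import pandas as pd

# Hypothetical RFM table; all values are invented for illustration.
rfm = pd.DataFrame(
    {
        "Recency": [5, 40, 90, 200, 12, 300, 60, 150],
        "Frequency": [50, 12, 9, 1, 30, 2, 6, 3],
        "MonetaryValue": [900.0, 60.0, 120.0, 20.0, 700.0, 35.0, 250.0, 300.0],
    },
    index=pd.Index(range(1, 9), name="CustomerID"),
)

# Assumed scoring: quartile rank 1-4 per metric.
# A low Recency is good, so its labels run in reverse order.
rfm["R"] = pd.qcut(rfm["Recency"], q=4, labels=[4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["Frequency"], q=4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["MonetaryValue"], q=4, labels=[1, 2, 3, 4]).astype(int)
rfm["RFM_Score"] = rfm[["R", "F", "M"]].sum(axis=1)


def rfm_level(row):
    """Map one row's RFM_Score to a segment using the thresholds from the blog."""
    score = row["RFM_Score"]
    if score >= 9:
        return "Cant Loose Them"
    elif score >= 8:
        return "Champions"
    elif score >= 7:
        return "Loyal"
    elif score >= 6:
        return "Potential"
    elif score >= 5:
        return "Promising"
    elif score >= 4:
        return "Needs Attention"
    return "Require Activation"


# axis=1 passes each row to the function, producing one label per customer.
rfm["RFM_Level"] = rfm.apply(rfm_level, axis=1)
print(rfm[["RFM_Score", "RFM_Level"]])
```

With quartile ranks the score ranges from 3 to 12, so thresholds like the ones above split that range into seven bands.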
Now that the data is processed and in the expected format, we will move on to building the app in the next section.
We will build our app with Dash, an open-source Python framework for analytic applications. It is built on top of Flask, Plotly.js, and React.js. If you use Python for data exploration, analysis, visualization, model building, or reporting, you will find it extremely useful for building highly interactive analytic web applications with minimal code. We will explore some key features, including DCC and DAQ components and Plotly Express for visuals, and build an app for a customer loyalty program in Python.
Let us build an interactive pie chart that updates based on the selections the user makes in the UI. For now, we will add a placeholder and later connect it to data with callback functions.
html.Div(
    children=[
        html.H6("CUSTOMER SEGMENTATION"),
        dcc.Graph(id="heatmap"),
    ],
)
Similarly, let us add a placeholder for the Recency and Frequency distribution chart; this visual will also update based on the country the user selects from the dropdown.
html.Div(
    children=[
        html.H6("RECENCY & FREQUENCY DISTRIBUTION"),
        dcc.Graph(id="dist"),
    ],
)
It will be very insightful to have scatter plots of Recency vs Frequency, Recency vs Monetary, and Frequency vs Monetary. Here is the code snippet for one of them; the other combinations follow the same pattern, with a unique id for every plot.
html.Div(
    children=[
        html.H4("RECENCY VS MONETARY"),
        dcc.Graph(id="fig_mr"),
    ],
)
We will need a dropdown with a list of countries, populated dynamically from the dataset.
html.H6("Select Country"),
dcc.Dropdown(
    id="country-dropdown",
    options=[
        {"label": i, "value": i} for i in tmp_df[0]['Country'].unique()
    ],
    value='Select...'
),
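In the dropdown above, the default value 'Select...' is not itself one of the generated options, so it cannot be picked again once a country has been chosen. One possible variation, sketched here with a hypothetical helper build_country_options and made-up sample rows, builds the options list in a plain function and adds the placeholder as an explicit first entry. The returned list has the same label/value shape that dcc.Dropdown accepts for its options argument.

```python
import pandas as pd


def build_country_options(df, placeholder="Select..."):
    """Return dropdown options: the placeholder first, then sorted countries.

    Hypothetical helper; missing country values are dropped.
    """
    countries = sorted(df["Country"].dropna().unique())
    options = [{"label": placeholder, "value": placeholder}]
    options += [{"label": c, "value": c} for c in countries]
    return options


# Made-up sample rows, including a duplicate and a missing value.
customers = pd.DataFrame(
    {"Country": ["France", "United Kingdom", "France", None, "Germany"]}
)
options = build_country_options(customers)
print(options)
```

Sorting the countries is a design choice for readability; the original keeps them in the order they first appear in the data.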
Other controls, such as radio buttons and the DAQ toggle switch, can be added similarly. To see how to add these fields, please refer to the complete code here.
In the last section, we designed the front end with widgets and placeholders. Now these two need to interact with each other every time the user changes an input, and this can be achieved using callbacks. Callbacks are Python functions that Dash calls automatically whenever an input component’s property changes.
The code chunk below generates the pie chart that we saw in the Introduction section. More information on callbacks can be found here.
@app.callback(
    Output("heatmap", 'figure'),
    [
        Input("category-type", "value"),
        Input("country-dropdown", "value")
    ]
)
def update_pieChart(category, country):
    """Build the customer segmentation pie chart.

    Args:
        category: value selected by the user from the radio buttons
        country: country selected by the user from the dropdown

    Returns:
        figure: the fig object, which is a pie chart
    """
    try:
        if country == "Select...":
            # No country chosen yet: plot all countries together
            fig = px.pie(df_pieChart, values='RFM_Level_cnt', names='RFM_Level', template="ggplot2")
            fig.update_layout(margin=dict(t=0, b=0, l=0, r=0))
            fig.update_layout(legend=dict(
                orientation="h",
                # yanchor="bottom",
                # y=1.02,
                # xanchor="left",
                # x=1
            ))
            logging.debug('Piechart generated successful')
            return fig
        else:
            # Keep only the rows for the selected country
            filtered = df_pieChart[df_pieChart['Country'] == country]
            fig = px.pie(filtered, values='RFM_Level_cnt', names='RFM_Level')
            fig.update_layout(margin=dict(t=0, b=0, l=0, r=0))
            fig.update_layout(legend=dict(
                orientation="h",
                # yanchor="bottom",
                # y=1.02,
                # xanchor="left",
                # x=1
            ))
            return fig
    except Exception:
        # The original snippet ends before its except clause; this handler is an assumption.
        logging.exception('Piechart generation failed')
        raise
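The two branches of this callback differ mainly in which rows reach px.pie. One possible refactor, sketched below, moves the row selection into a plain function so it can be checked without starting a Dash server. The function name segment_counts and the sample rows are hypothetical; the column names come from the callback above.

```python
import pandas as pd

ALL_COUNTRIES = "Select..."  # the dropdown's default value in the blog


def segment_counts(df, country=ALL_COUNTRIES):
    """Return one row per RFM_Level with summed counts.

    Rows are restricted to `country` unless it is the placeholder or None
    (None is what a cleared dropdown reports).
    """
    if country not in (None, ALL_COUNTRIES):
        df = df[df["Country"] == country]
    return df.groupby("RFM_Level", as_index=False)["RFM_Level_cnt"].sum()


# Made-up counts in the same shape as df_pieChart.
df_pieChart = pd.DataFrame({
    "Country": ["France", "France", "Germany", "Germany"],
    "RFM_Level": ["Loyal", "Champions", "Loyal", "Potential"],
    "RFM_Level_cnt": [3, 1, 2, 4],
})

all_rows = segment_counts(df_pieChart)
france = segment_counts(df_pieChart, "France")
print(all_rows)
print(france)
```

Inside the callback, the result could then be passed straight to px.pie with values='RFM_Level_cnt' and names='RFM_Level', leaving a single code path for building the figure.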
Similar callback functions connect the other visuals to the data generated by their respective logic. We can also have tables on the UI, with callbacks that populate data at runtime; the tables grow and shrink depending on data volume. You can find the complete code here.
Finally, you can run app.py and access the application on your localhost at http://127.0.0.1:8050/
The app helps businesses identify, segment, and understand customers better. Companies can offer their frequent customers free merchandise, rewards, coupons, or even early access to new products to encourage loyalty. They can also identify potential and promising customers and design suitable engagement programs to improve loyalty and long-term business.
The objective of the blog was to showcase the features of Python-Dash and how easy it is to build a bare-bones UI without HTML/CSS/JavaScript hassles. Add quality, consistent data to the mix and you have an app that supports better business decisions.
The app can be extended to make the threshold configurable from the UI, connect it to a live data feed to make decisions on the fly, and possibly have more visuals to showcase product sales across regions, customer segments, etc.
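As a rough sketch of what configurable thresholds could look like, the cut points can be passed in as data instead of being hard-coded in if/elif branches. The helper name label_segments is hypothetical; the default edges and labels below mirror the thresholds used earlier in this blog, and pandas.cut does the binning.

```python
import pandas as pd


def label_segments(scores, bin_edges, labels):
    """Assign a segment label to each score.

    With right=False each bin is [lower, upper), matching the
    'score >= lower and score < upper' checks used in the blog.
    """
    return pd.cut(scores, bins=bin_edges, labels=labels, right=False)


# These could come from UI controls; here they mirror the blog's thresholds.
edges = [float("-inf"), 4, 5, 6, 7, 8, 9, float("inf")]
labels = [
    "Require Activation", "Needs Attention", "Promising",
    "Potential", "Loyal", "Champions", "Cant Loose Them",
]

scores = pd.Series([3, 4, 6.5, 8, 12])
levels = label_segments(scores, edges, labels)
print(levels.tolist())
```

Changing a threshold then means changing one number in the edges list, which a callback could read from a slider or input field.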
Keep learning!
You can connect with me – LinkedIn
You can find the code for reference – GitHub
https://dash.plotly.com/
https://dash.plotly.com/dash-daq
The media shown in this article are not owned by Analytics Vidhya and is used at the Author’s discretion.