The DataHour Synopsis: Traversing Journey of an Analytics Problem

ankita184 Last Updated : 14 Jun, 2022


Overview on Analytics Problem

Analytics Vidhya has long been at the forefront of imparting data science knowledge to its community. To make learning data science more engaging for the community, we began a new initiative: “DataHour”.

DataHour is a series of webinars by top industry experts where they teach and democratize data science knowledge. On 30th May 2022, we were joined by Amitayu Roy for a DataHour session on “Traversing the Journey of an Analytics Problem.”

Amitayu Ray is a Data Science leader with 16 years of experience in analytics consulting, AI solution design, client management and business development across industry practice areas. He helps telecom and media giants around the world realize value from large-scale AI/ML implementations and empowers organizations with out-of-the-box AI solutions to solve modern, constantly evolving problems.

Amitayu is currently working as a Senior Manager in the field of Applied Intelligence, Strategy and Consulting at Accenture. He leads their analytics consulting, data practice, and AI and machine learning enablement for the North America geography.


Are you excited to dive deeper into the world of Data Science and Machine Learning? We’ve got you covered. Let’s get started with the major highlights of this session: Traversing the Journey of an Analytics Problem.

Introduction

With this session, you’ll learn:

What is the objective of this session? This session is a data science journey told as a story. It takes a problem from its business form to an industrialized, value-based solution that you deliver to the business. That is the focus.

We focus on the kind of value we realize from an analytics problem. The session covers the processes associated with a typical analytics problem and its solution, and the evolution of data science over the years and where we are heading. It shows how to translate a business problem into an analytical form, and it outlines the key responsibility areas of the different roles that exist today and are still evolving.

What does this session not cover? It is not a deep dive into AI/ML algorithms, and it is not a technical session on data engineering or big data cloud platforms. It is a process-oriented session to help you understand the end-to-end journey and where exactly your work fits in. We will not discuss model performance parameters, feature engineering, etc.; we stay at a high level. We will also not cover the agile approach to delivering AI/ML projects.

Some foundational Updates from Data Science Industry

This is important because, to understand a journey, you need some foundational context. For example, this session had a group of 250 individuals. Plotted on a maturity curve, some are already advanced in analytics and some are just starting out. So the session was kept as generic as possible, and the presenter narrated the journey in a way that is clear to everyone.

How Has Data Science Evolved over the Years?

So, let’s have a quick roundup of how data science has evolved over the years.

First Stage

In the 90s, Analytics 1.0 emerged. This is when BI emerged for the first time. Data started becoming a very important asset, but it lived mostly in Excel, VBA, etc. From a business perspective, consumption meant looking at manual reports: somebody opened an Excel file, built some pivots, and so on.

Second Stage

In the early 2000s, on the data side we saw data warehouses and ETL systems. On the reporting side we saw dashboards, Tableau and Power BI. The dominant systems at that point were SAS and SPSS. The statistical models in use were regression, time series and clustering. Analytics was consumed through reports and dashboards, which were semi-automated. Models were run manually, and feature engineering was manual too.

Third Stage

Here people started investing in building stronger data warehouses. As a result, Analytics 3.0 emerged in the 2010s. By this time people had realized that data is an important asset. Big data had started to emerge and Hadoop had come into the picture. Dashboards such as Power BI were mostly automated by now.

ML models came into the picture for the first time in the late 2000s and eventually became a stronger alternative to SAS. Building high-end ML models started here, but the models themselves were not fully automated. You had to run models manually, check results, ROC curves and performance metrics, do model validation, and so on. It was a fully manual check.

Fourth/Present Stage

Now we are in Analytics 4.0, or actually between 4.0 and 5.0. This stage began around 2015. The advancement of big data transformed the way we manage and compute data. All the big players, such as Amazon, Google and Microsoft, started investing in integrated cloud platforms. They realized the tremendous profit potential of a cloud platform service that can do end-to-end data management, reporting, visualization, modeling, model implementation, etc.

Web-based and automated reports, deep learning, AI, NLP, TensorFlow, simulation models and AutoML became very common. Everybody started thinking about productionizing analytics models. The MLOps world came into existence around 2018.

Future Stage

Analytics 5.0 is where we are heading. Around 2025 we are possibly going to reach a very different dimension of analytics. Quantum computing and big data on cloud will become normal for most organizations, and visualizations will mostly become AR/VR visualizations, although Tableau and Power BI will continue to exist alongside them. We’ll have production-ready AI implementations, and all new models will become end-to-end industrialized, i.e. automated end to end.

The “Must Have” AI knowledge Assets for tomorrow

The must-have and good-to-have skills are:

The ability to solve problems
SQL and the fundamentals of querying your data
Fundamentals of mathematics and statistical deduction
The basic principles of computational algorithms

On top of these foundational skills, since this is a fast-evolving industry, the skills for the next three to five years are:

Changing big data engineering frameworks: This is going to be a critical skill that will probably play a big role.
Knowledge of cloud platform architecture: With AWS, Google Cloud and Azure, almost an entire suite of analytics products is available on the cloud.
Industry expertise: Build your capability in one domain.
Explainable AI and ML methodologies: For almost a decade we have bypassed questions from businesses by saying that AI/ML models are black boxes. That answer is no longer going to hold, so we need approaches by which the model methodologies can be explained.
Implementation of analytics and MLOps principles

One important takeaway: upskill and evolve consistently to stay relevant in the market.

Analytics Problem Journey

Typical industry problem we encounter

Let’s start from the problem in its very raw form. What does a problem look like?

Industry problems are extremely vague. Many proposals and business meetings (with CTOs, CIOs and CMOs) give you a high-level problem statement that might not make any sense.
When you ask questions about that problem, you realize that too many of them are unanswered.
Nobody is defining a clear outcome, so it’s your job to define that outcome.
In many cases businesses do not give clarity around that issue.

So we as analytics consultants/data scientists need to have that clarity in our mind to be able to answer and address those questions.

Four Major Strategic Priorities

Businesses have four major strategic priorities. This is where they earn their bread and how they generate money.

Enabling more revenue growth by employing different revenue-oriented strategies.
Reducing or optimizing cost: every business has operational and capital costs, and the focus here is to optimize them.
Improving or re-engineering processes: a process may have many inefficiencies, and reducing them requires re-engineering the process.
Improving customer experience

A Customer journey from Industrial Point of View and Role of Data

Imagine yourself as a customer. How does industry view you as a customer, and what does it do? This is the journey of a customer.

Prospect Assessment: Businesses try to identify who the right customer is for them. Example: Zomato has a very robust prospect assessment engine. It helps decide whom to onboard, i.e. Zomato sends very specific, niche messages to the people it wants as customers to get them on board.
Acquisition: How you bring that customer on board.
Onboarding and Engagement: How to engage more with the customer. Example: you see a campaign that Amazon offers and the next moment you join Amazon. Amazon then makes sure you use its services by sending you niche messages (e.g. asking your opinion about a product). That is where engagement comes in.
Growth Marketing: Once they are engaging with customers, businesses try to cross-sell or upsell something.
Loyalty and Operations: On loyalty, if they figure out that you are also going to competitors such as Flipkart, they use that data to create differentiated services. On operations, they ensure that customer service is on its toes, for example when handling a return request.
Churn and Retention: When you are not engaging enough with Amazon, or have stopped using it, they sense a risk of customer churn and figure out new ways to retain you.
Feedback and Social Listening: The feedback you provide directly, or that businesses gather from online media (e.g. you post on Twitter that you are not happy with Amazon’s service). They try to improve on the basis of this.
Personalization: They want to give you a niche, personalized service.

Analytics is applicable everywhere, from prospect assessment to acquisition, onboarding, growth, loyalty, etc.

Importance of Outcome and Value/Impact driven by the Solution

Source: Amitayu’s Presentation

Churn is industry agnostic: it applies to any service-providing industry (banking, telecom, retail, consumer goods, e-commerce). Loyalty management and churn and retention are the two major functions associated with churn.

There are five major questions for any churn problem anywhere in the world.

Why are they leaving?
Who is most likely to leave?
Whom does the business want to retain?
What kind of actions should the business take to retain them?
Once the business has decided on an action, how should it target customers?

What we need to address here is: what is the impact or value that the business is trying to achieve by addressing this particular problem? The business approaches us, as data scientists, and asks for a befitting solution. The main reason businesses care about churn is the cost of acquisition. Organizations typically do this kind of churn analysis because acquiring a customer is very expensive. So they need to retain customers, and they expect us, data scientists, to sort out:

what is the potential reason for churn
how to retain customers
what could be done to increase profit
how to increase revenue with continuity

What is Probable Solution Here?

A. Prevent Revenue Loss

Identify customers who are likely to churn
Compute net value generated by a high-risk customer
Retain high value – high risk customers with suitable retention offers
Prevent revenue loss through retention.

B. Optimize Campaign Cost

Compute cost of retention campaign
Estimate the total budget of retention campaign (Cost*No. of leads)
Identify the right customer to whom this campaign needs to be sent
Calculate the ROI of the campaign by comparing net revenue saved vs. net cost incurred
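
The campaign arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's method: the function name, the input numbers and the specific ROI formula, (revenue saved - budget) / budget, are assumptions chosen for the example.

```python
def campaign_roi(cost_per_lead, leads, revenue_saved):
    """Return (budget, roi) for a retention campaign.

    budget = cost per lead * number of leads
    roi    = (net revenue saved - budget) / budget   # assumed ROI definition
    """
    if cost_per_lead <= 0 or leads <= 0:
        raise ValueError("cost_per_lead and leads must be positive")
    budget = cost_per_lead * leads
    roi = (revenue_saved - budget) / budget
    return budget, roi


# Hypothetical numbers, for illustration only.
budget, roi = campaign_roi(cost_per_lead=5.0, leads=1_000, revenue_saved=20_000.0)
print(f"budget={budget:.0f}, ROI={roi:.1f}x")
```

An ROI above zero means the revenue saved through retention exceeded what the campaign cost.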

Hypothesis Driven Approach-The Most Effective Way of Problem Formulation

The hypothesis-driven approach is a proven approach that consulting and analytics firms have adopted for many years.

A hypothesis seeks to explain why something has happened, or what might happen, under certain conditions. Hypotheses are often written as if-then statements. Any hypothesis-driven approach asks a few key questions to solve the problem:

What is my end goal?
How do I reach there?
What is the journey towards that end goal?
What kind of input information will be required through the journey?
What are the key milestones that need to be delivered?
I may be building complex AI models today, but is the business able to consume them?

Now, we’ll look at how to answer all these questions and build the analytics.

Stages of an Analytical Problem Journey-the Analytics Solution Hierarchy

The stages of an analytics solution are:

Problem Formulation: Build an issue tree, i.e. break the problem down into simpler blocks that are easy to understand. Make sure you follow a MECE (mutually exclusive, collectively exhaustive) approach, even though it is hard to be 100% compliant.
Solution Deliveries: Build hypotheses from the hypothesis chart, define the analysis outcome expected from each, and validate the hypotheses on data.
Data Requirement: Define the key attributes/features required for validation and testing. Identify the root variables and the sources they are collected from. Lastly, assess data availability and accessibility.
Analytical Approach: Perform hypothesis testing and approve, reject or iterate on each hypothesis. Generate helpful insights from the testing. Combine these hypotheses and use them with AI/ML models, then test the accuracy and stability of those models. Lastly, explain the model output, with proof.
Implementation Roadmap: Scaling and mapping the solution is the end goal. Make sure that data and model pipelines have been built as part of MLOps. Integrate with the client’s cloud system and do SIT (system integration testing). Keep monitoring after implementation, and provide end-to-end enablement training to run operations.

Example: How to Build an Issue Tree?

How to make an issue tree.

Hypothesis Framework – Build on these Issue Trees

Source: Amitayu’s Presentation

For every problem, we build an issue tree, and on top of it we build a hypothesis framework.

Key Roles Involved Across the Stages

Source: Amitayu’s Presentation

The kinds of roles involved through the journey are:

In the early stages, it is mostly business analysts and domain experts, with a small representation of data analysts and data scientists. All of them sit together and brainstorm how to translate the business problem into an analytical problem.
The business analyst, or a typical consultant, plays a big role here because they look at industry practices and benchmarks and come up with the hypotheses accordingly.

Data engineering is, as expected, the bulk of the work. It is led by the data engineer, while the data scientist and data analyst also play a small role in that space.

The analytical approach is where model building, insights and visualization happen. The data scientist has a big role to play, as does the data analyst. The data engineer also has a significant role because they are the ones building up the data.

The ML engineer’s role is to implement these models. This is a new role coming up in the industry: people who implement models into an existing framework. It requires an understanding of models as well as of technical architecture.

Therefore, this particular role plays a big part in solution implementation. Along with the ML engineer, the business analyst also plays a big part, because they are the ones who bring all the entities together to make the solution a success.

Data Engineering – The Power-house of Data Science Solutions

Purpose of Existence

AI Data Foundation: Data engineering creates the foundation for all AI by bringing together data from multiple sources (structured or unstructured) and ensuring the data is correlated and ready for consumption. Creating the foundation also includes legacy modernization, i.e. if you have an old platform like DB2, the data engineer’s job is to ensure that this legacy platform is modernized into a modern data architecture.
Data Lake Mainstreaming: A data lake is where you store all your information. It is a consolidated view of all your data sources on a unified platform. This covers the design of a data lake followed by a data warehouse (Customer 360). A lot of compliance-related work, such as data security and governance, also comes under data lake mainstreaming.
Data on Cloud: The speaker expects that in the next 10 years essentially all data will move to the cloud and on-premise systems will disappear. Even smaller organizations are moving to the cloud. So you need foundational knowledge of AWS, GCP, etc., and of how they integrate with other applications.
Data and Analytics Consumption: A data engineer’s role is evolving to cover the consumption of data, analytics, dashboards, etc. They now ensure that data is in a state where it can be pulled automatically from the source system and fed into a data lake, and that reports can be created and published by pulling data from the data lake. Consumption means that the data engineering team enables this functionality end to end.

Why Do We Call Data Engineering the Power-house, and What Are Its Key Trends?

Data engineering is the powerhouse, or the mitochondria, of today’s data-driven world. We have been data scientists for many years, and the data engineers are the ones who hold everything together. The data engineering team ensures that data flows with no gaps in the solution, which is what enables business self-service: the business does not have to do anything, because everything is enabled by the data engineers and data scientists.

Its key trends are:

Cloud Deployments: The majority of organizations are leveraging cloud solutions to rapidly stand up analytics or operational environments.
Governed Data Lakes: They are evolving to become central sources of data for the enterprise, with data catalogues to search and shop for data capabilities.
Rapid Insights Discovery: Investing in data exploration capabilities to identify patterns, trends and unknown opportunities.
Business-Self Service: Greater use of search, SQL, NLP and self-service tools for intelligent data preparation, operational intelligence and visualization.
Modern-Hybrid Architecture: Companies are leveraging several technology components for accelerating data movement.
Smart-Data Management: Evolution of intelligent solutions to data management challenges, automation and learning based solutions to integration and data quality.

Key-Pillars to Data Engineering Engagement

Assessment of Existing Data Architecture: To address a churn problem, the first step is to assess the architecture of the business:

Gain end-to-end knowledge of their data.
Do a basic data discovery: for each hypothesis, what kind of data sources do I need?
What kind of technology stack do they have?
Is the data accessible, and how do I access it?

These are the initial steps of assessing the enterprise data architecture. Those who are a little advanced will know what a sandbox is. Setting up a sandbox and virtual workbench, and connecting the data sources and applications within the sandbox to your data warehouses and data lakes, is also part of the data engineer’s job.

Development of Analytical Data Record: For a churn problem we need to build an analytical data record, a customer-level data set that will help build models. The data engineer and the data scientists build this data together. The tasks include data acquisition, data quality assessment, feature creation, integration of the lakes (merging with the data warehouse), data dictionaries and metadata management.

Deployment of Analytical Solution Framework: They create the data pipelines that automate the end-to-end solution (MLOps). They build scaling and automation into the data, ensure that the code is configurable and deployable, and use parameter-driven batch code. They also perform feature engineering based on use cases and apply metadata management.

What is a Customer360?

This is the output that the data engineer produces for modeling: an analytical data record. A Customer 360 is a view of the customer in which all possible attributes of a customer are brought together on one platform, such as financial usage, service products, channel, etc.
These exhaustive customer records have thousands of features that cater to many different kinds of use cases, not just churn. Churn is just one of them.
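
As a rough sketch of the idea, the snippet below merges rows from several source systems into one record per customer. The source names (`billing`, `usage`, `service`), the columns and the values are all hypothetical; a real Customer 360 would be built in a data lake or warehouse rather than in Python dictionaries.

```python
from collections import defaultdict

# Hypothetical source extracts, each keyed by customer_id.
billing = [{"customer_id": 1, "monthly_revenue": 40.0},
           {"customer_id": 2, "monthly_revenue": 15.0}]
usage = [{"customer_id": 1, "data_gb": 120.5},
         {"customer_id": 2, "data_gb": 8.0}]
service = [{"customer_id": 1, "outages_90d": 4}]  # customer 2 raised no tickets


def build_customer_360(**sources):
    """Merge per-source rows into one record per customer.

    Column names are prefixed with the source name so attributes
    coming from different systems cannot collide.
    """
    records = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            customer_id = row["customer_id"]
            for column, value in row.items():
                if column != "customer_id":
                    records[customer_id][f"{source_name}_{column}"] = value
    return dict(records)


customer_360 = build_customer_360(billing=billing, usage=usage, service=service)
print(customer_360[1])
```

Customers missing from a source simply lack those attributes, which is one of the data quality gaps the analytical data record has to handle.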

From Diagnostics to AI problem- How Does the Problem Evolve?

EDA (Exploratory Data Analysis) – Know as much about the data at your disposal (Churn Problem)

We have already touched on what churn is. Generally, if somebody stops their service, it is called churn. But a churn could be a:

Product churn or relationship churn: If you are a telecom customer, you may have many different products. If you stop your postpaid service, that is a product churn. But if you stop using all the services (TV, broadband, etc.), that is a relationship churn.
Inactivity, or hard churn versus soft churn: Somebody who is inactive for a long time is often confused with a churner, because that person is not doing anything. We might predict that this person is not going to use the service any longer, but then the person suddenly comes back. You need to understand clearly how to differentiate between the two.
Returning customers: This is the same point. If a customer who has been inactive for a long time suddenly comes back, do you want to call that customer a churn?
Voluntary versus involuntary churn: Suppose you are a telecom provider and you decide to remove someone yourself because they have been a very bad customer. That is not a churn.
Frauds and delinquents: Fraudsters carry out fraudulent activities. If they leave, do you really want to call them churn?

So all of these questions need to be answered. On this basis, the definition of churn must satisfy some major criteria, including:

It has to be clearly oriented to an outcome.
It has to be clearly quantified.

Assume that we have defined who is a churn in our data and are now trying to answer each of these questions.
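
One way to make such a definition explicit and quantified is to write it down as a rule. The sketch below is only an illustration: the field names, the 90-day inactivity window and the four labels are assumptions, and each business would choose its own.

```python
from datetime import date


def churn_label(customer, as_of, inactivity_days=90):
    """Return 'churn', 'inactive', 'active' or 'excluded'.

    Rules (all assumptions, for illustration):
    - fraud or provider-initiated (involuntary) exits are excluded;
    - cancelling every product is a relationship churn;
    - long inactivity without cancellation is only flagged 'inactive',
      because the customer may still come back.
    """
    if customer["is_fraud"] or customer["terminated_by_provider"]:
        return "excluded"
    if customer["active_products"] == 0:
        return "churn"
    if (as_of - customer["last_activity"]).days > inactivity_days:
        return "inactive"
    return "active"


# Hypothetical customer who still holds products but has gone quiet.
customer = {"is_fraud": False, "terminated_by_provider": False,
            "active_products": 2, "last_activity": date(2022, 1, 10)}
label = churn_label(customer, as_of=date(2022, 5, 30))
print(label)
```

Keeping "inactive" separate from "churn" is a design choice that avoids mislabeling returning customers.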

Testing Your Hypothesis

Once we know who is a churn, we address the question: why are customers leaving or unwilling to continue?

We’ll follow six major steps:

Define a hypothesis: Build a hypothesis following the issue tree criteria.
Attribute extraction: After generating the hypothesis, ask the data engineer where you can get the data from. This is called attribute extraction.
Feature creation: Create a variable, or a derived variable, according to your business problem or to test your hypothesis.
Causal forensics: Understand the root cause behind a certain event.
Statistical significance test: When you identify a root cause, it has to hold statistically before you can say that the hypothesis is genuinely true.
Hypothesis verdict: Example: customers who have a lower volume of broadband incidents, or face fewer broadband outages, are less likely to churn. That is a hypothesis. How do you test it? Take all customers, look at the churners, then look at the volume of outages they experienced. If churn is high within the group where outages are high, and the difference is statistically significant, you can say that the data supports your hypothesis.
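
The outage example can be checked with a simple significance test. The sketch below uses a two-proportion z-test, which is one reasonable choice and not necessarily what the presenter used; the customer counts are hypothetical and the 0.05 threshold is a common convention, not a rule from the session.

```python
import math


def two_proportion_z_test(churned_a, total_a, churned_b, total_b):
    """Two-sided z-test for a difference between two churn rates."""
    p_a = churned_a / total_a
    p_b = churned_b / total_b
    pooled = (churned_a + churned_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("churn rate is 0% or 100% in both groups; test undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: group A faced many outages, group B faced few.
z, p_value = two_proportion_z_test(churned_a=120, total_a=1000,
                                   churned_b=60, total_b=1000)
verdict = "supported" if (p_value < 0.05 and z > 0) else "not supported"
print(f"z={z:.2f}, p={p_value:.2g}, hypothesis {verdict}")
```

A significant result shows association between outages and churn; establishing the root cause still needs the causal forensics step.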

Value Segmentation – Who has the Most Potential?

You might assume that some customers who are leaving are not important to us, but that should not be assumed up front. We determine it using an approach called value segmentation.

Here we categorize customers by the value they generate for the business, called the net present value, and the value they will generate in future, called the lifetime value. We build a variety of models here, from basic segmentation and clustering models to very advanced AI models. Typically the result looks like this:

We have very high value customers. If you look at the data in more detail, these are very high revenue players, and their tenure is already more than 60 months.

Similarly there are high value, mid value and low value customers. The big chunk of customers in a typical customer base, regardless of industry, always falls in the low value category, and whether you want to retain them is a decision to take on an individual basis. But this is where retention becomes important: there are customers you definitely want to retain.
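
A very simple version of value segmentation can be sketched as follows. The lifetime value formula (discounted expected monthly margin), the 36-month horizon, the 10% discount rate and the tier cut-offs are all assumptions for illustration; the session does not specify them, and real projects often use clustering or other models instead.

```python
def lifetime_value(monthly_margin, monthly_retention, months=36, annual_discount=0.10):
    """Discounted expected margin over a fixed horizon (assumed formula)."""
    monthly_discount = (1 + annual_discount) ** (1 / 12) - 1
    return sum(
        monthly_margin * monthly_retention ** t / (1 + monthly_discount) ** t
        for t in range(1, months + 1)
    )


def value_segment(ltv, cutoffs=((1500, "very high"), (800, "high"), (300, "mid"))):
    """Map a lifetime value to a tier using hypothetical cut-offs."""
    for threshold, label in cutoffs:
        if ltv >= threshold:
            return label
    return "low"


# Hypothetical customer: 50 units of margin per month, 98% monthly retention.
ltv = lifetime_value(monthly_margin=50.0, monthly_retention=0.98)
print(round(ltv), value_segment(ltv))
```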


Source: Amitayu’s Presentation

AI-ML Powered Modelling Approach

By using this model engine we are able to answer two churn related questions:

Which of the business’s customers are most likely to churn next?
What action should the business take to retain them?

There are three major steps in this model engine

First, develop a model
Second, validate a model and
Last, ensure that your model is good enough to be deployed

How will you do that? Whatever model is in development or deployment, you need to do your feature engineering and hypothesis testing, select among ensemble algorithms, use specialized algorithms for specialized use cases, etc. Most importantly, iterate: keep iterating this process until your model looks good.

How do you know your model looks good? Through validation: you can check model accuracy, stability and robustness.

Finally, value creation: model explainability and automated ML frameworks are an important value for businesses today. Businesses also want an agile approach. What is an agile approach? Agile works in sprints, so we have to create deliverables in sprints, and a typical AI/ML project is a good example of how you can deliver different kinds of solutions to the business sprint by sprint.

Churn Prediction and Recommendation Model

We are going to build three models for the purpose of this solution:

Churn Prediction Model: This is a propensity model that predicts which customers are likely to churn. On the basis of data, you tell the business which customers have a high likelihood of leaving.

Value Segmentation Model: Overlay this on top of your churn prediction model. It tells you that these customers are likely to churn and also what their value is. If a particular customer is both high risk and high value, you definitely want to retain them.
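
The overlay of churn risk and value can be expressed as a small decision rule. This is a sketch under assumptions: the 0.6 risk threshold, the segment names and the action labels are hypothetical, and the churn probability would come from whatever propensity model the team has built.

```python
def retention_priority(churn_probability, value_segment, risk_threshold=0.6):
    """Combine churn risk and customer value into a retention action."""
    high_risk = churn_probability >= risk_threshold
    high_value = value_segment in {"very high", "high"}
    if high_risk and high_value:
        return "retain first"
    if high_risk:
        return "review case by case"
    if high_value:
        return "monitor"
    return "no action"


# Hypothetical scored customers: (churn probability, value segment).
scored = {"cust_a": (0.82, "very high"), "cust_b": (0.75, "low"),
          "cust_c": (0.10, "high"), "cust_d": (0.05, "low")}
actions = {cid: retention_priority(p, seg) for cid, (p, seg) in scored.items()}
print(actions)
```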

How to Retain Them?

Offer Recommendation Model: For every segment, or even for every single customer, we create an offer that you can take to the customer in order to retain them.

Algorithms used in the models: A churn prediction model is a classification, regression or simulation problem, with churn-specific feature engineering.

Segmentation model: Use clustering, an unsupervised approach (hierarchical or non-hierarchical clustering), or a supervised segmentation technique such as KNN or some of the advanced AI techniques.

In an offer recommendation model you typically use collaborative filtering or a rule engine. Collaborative filtering is the family of algorithms behind many recommendation systems around the world, be it Amazon, Netflix, YouTube or Google. You also need a test-control mechanism, which means that every time you send out a recommendation, its effectiveness needs to be tested.
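
To make collaborative filtering concrete, here is a minimal user-based sketch using cosine similarity. The customers, offers and acceptance data are hypothetical, and production recommenders are far more elaborate; this only illustrates the principle that customers with similar histories are likely to accept similar offers.

```python
import math

# Hypothetical offer history: 1 = accepted, 0 = not accepted or not offered.
history = {
    "cust_a": {"discount": 1, "data_pack": 1, "device_upgrade": 0},
    "cust_b": {"discount": 1, "data_pack": 1, "device_upgrade": 1},
    "cust_c": {"discount": 0, "data_pack": 0, "device_upgrade": 1},
}


def cosine(u, v):
    """Cosine similarity between two sparse offer vectors."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, history):
    """Return the offer with the highest similarity-weighted support, or None."""
    scores = {}
    for other, their_offers in history.items():
        if other == target:
            continue
        similarity = cosine(history[target], their_offers)
        for offer, accepted in their_offers.items():
            if accepted and not history[target].get(offer, 0):
                scores[offer] = scores.get(offer, 0.0) + similarity
    return max(scores, key=scores.get) if scores else None


print(recommend("cust_a", history))
```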

So if you send an offer to a customer, whether or not the customer accepts it tells you how good your recommendation was. This feedback loop is what gives rise to reinforcement learning, which improves the performance of a model with the help of this feedback.
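
A test-control comparison can be as simple as the sketch below: customers who received the offer (test) are compared with similar customers who did not (control). The group sizes and retention counts are hypothetical, and in practice the difference should also be checked for statistical significance.

```python
def retention_uplift(test_retained, test_size, control_retained, control_size):
    """Difference in retention rate between the test and control groups."""
    test_rate = test_retained / test_size
    control_rate = control_retained / control_size
    return test_rate - control_rate


# Hypothetical campaign: 1,000 customers got the offer, 1,000 did not.
uplift = retention_uplift(test_retained=850, test_size=1000,
                          control_retained=780, control_size=1000)
incremental_customers = uplift * 1000  # extra customers retained in the test group
print(f"uplift={uplift:.1%}, incremental customers={incremental_customers:.0f}")
```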

AI-ML Powered personalized Framework

Pivotal element of a client digital transformation across industries.

Source: Amitayu’s Presentation

The end goal of many churn problems could be a personalization outcome, where every customer who is likely to churn gets a very specialized offer. For every single customer you decide, depending on the customer’s value, which offer to generate, through which channel to target the customer, and what message to send.

Example: Zomato sends you emails with your name on top. The moment you see that, you feel they are sending you a very personalized email. These days they go one level deeper and send subject lines based on your other email topics. So Zomato has a subject line which says “you have been browsing or you have been looking into amazon”. Immediately you wonder how they know that, and you click on the email. A subject line written to draw this click is called clickbait.

Generating Analytics Value through Implementation

AI generates value at different levels: intelligent products, intelligent automation, enhanced interaction, enhanced judgment and enhanced trust. Each of these is an individual value lever that multiplies the outcome you are trying to get from an analytics use case. The first of these is the intelligent product, and for intelligent automation you need to implement your AI and analytics solutions.

Source: Amitayu’s Presentation

Key Building Blocks for Value Realization

There are five steps you have to understand:

Source: Amitayu’s Presentation

Know your domain: unless you walk in the shoes of the business, you will never be able to solve a problem.

Start simple: you cannot solve everything on day one. Start by delivering smaller values, and only then can you reach the goal.

Have a digital mindset: incorporate as much data as you can get, because only that will transform businesses.

The value framework is a must when you are thinking about building these AI applications.

There are no silver bullets and no shortcuts.

If you are to build something credible, you have to follow the steps. It is not a one-day job; it takes months to build that credibility, but that is what you need to do.

Machine Learning Deployment Life Cycle

Source: Amitayu’s Presentation

Conclusion

I hope you enjoyed the session and understood it well. The major takeaways from the session are:

What the session covers and what it does not
How Data Science has evolved over the years
The must have skills for tomorrow.
Analytics Problem Journey
Data Engineering – The power-house of Data Science Solutions.
From Diagnostics to AI problem- How does the problem evolve.
AI-ML Powered Modelling Approach.
Churn Prediction and Recommendation model
AI-ML Powered personalized Framework
Generating Analytics Value through Implementation
