In the last article, I shared a framework to help you answer the question, “Should I become a data scientist (or business analyst)?” For those who clear the cut-offs, the next obvious question is “How do I become a data scientist?” In this article, I’ll share what I would do if I were starting my journey toward a career in data science today.
Starting a data science career without proper guidance and planning can be confusing. Expert curators at Analytics Vidhya have compiled a free, clear-cut roadmap for building a career in data science.
I started my career as an analyst without any knowledge of the tools I was going to work with. All I knew was how to create basic models in Excel. I had not heard of pivot tables and didn’t know that something like conditional formatting even existed in Excel!
Thankfully, Capital One hired me for my logical thinking and not for my knowledge of the tools I would need to use. Over the following years, by working with several employers, freelancing, and doing a few pet projects, I learned several tools and techniques, including SAS, SPSS, R, and Python!
Having said that, if I were starting my career today, would I choose the same path? The answer is NO. I would take a very different path from the one I took. This path would not only cut out the period of confusion I went through but also take advantage of some of the dramatic shifts that have happened in the analytics industry over the past few years.
So I thought I would share how I would chart my career path to become a data scientist if I had to start today. Here are the steps, in chronological order:
Step 1: Graduate from a top-tier university in a quantitative discipline
Thankfully, this is the one step I would not change much. Education makes a huge difference to your prospects of starting out in this industry. Most companies that hire freshers pick people directly from the best colleges. So, by getting into a top-tier university, you give yourself a very strong chance of entering the data science world.
Ideally, I would take up Computer Science as my subject of study. If I didn’t get a seat in the Computer Science batch, I would take up a subject with close ties to computation, e.g. computational neuroscience or computational fluid dynamics.
Step 2: Take up courses on the subject – but do them one at a time
This is probably the biggest change I would make to the journey if I were graduating now. If you spend even a year studying the subject through open online courses, you will be in far better shape than other people vying to enter the industry. It took me 5+ years of work experience to appreciate the power that R and Python bring to the table. Today, you can get there by taking up various courses.
One word of caution: be selective about the courses you choose. I would focus on learning one stack, R or Python. I would recommend Python over R today, but that is a personal choice.
Take up a comprehensive course – A comprehensive course is one that aims to make you a top-notch data scientist by the time you complete it. It should cover all the skills and tools needed to become a full-stack data scientist and include dozens of real-life projects and mentorship support. Analytics Vidhya’s Blackbelt+ program offers all of this with expert trainers.
A few benefits of this course –
Mastery in 15+ Tools
Expertise in Data Science, Machine Learning & Deep Learning Subjects
Ability to solve real-world industry problems
1:1 Mentorships with Industry Practitioners
Comprehensive & Personalised Learning Path
Dedicated Interview Preparation & Support
Take up a few free courses – Free courses are a great way to build your knowledge in the initial phase of your journey. They offer a great introduction to data science concepts. But beware: these courses are for beginners, and if you have already mastered a few subjects, I would recommend moving on to specialized courses. Here are a few important free courses –
Introduction to AI and ML – The perfect course for understanding and navigating the artificial intelligence and machine learning industry. It covers the skills, tools, and career path needed to become an AI and ML professional.
Python for Data Science – Python is one of the most powerful and most widely used languages to build machine learning models. This course is great for Python beginners and also provides free certification!
Introduction to Natural Language Processing – If you are an NLP enthusiast, this is the perfect course for you. You will get to learn the basics of Natural Language Processing, Regular Expressions & text sentiment analysis using machine learning in this course.
Getting Started with Neural Networks – Deep learning has picked up over the last decade, and many enthusiasts are interested in learning about neural networks. The course answers questions like – What is a neural network? How does it work? What does a neural network do?
Step 3: Take a couple of internships/freelancing jobs
This gives you some real-world experience before you actually venture out. It should also give you an understanding of the work that happens in the real world. You would get a lot of exposure to real-world challenges in data collection and cleaning here.
Step 4: Participate in data science competitions
You should aim for at least a top 10% finish on Kaggle before you leave university. This should quickly bring you to the attention of recruiters and give you a strong launchpad. Beware: this sounds a lot easier than it actually is. It can take multiple competitions for even the smartest people to make it to the top 10% on Kaggle.
Here is an additional tip to amplify the results from your efforts: share your work on GitHub. You never know which employer might find you through your work!
Step 5: Take up the right job which provides an awesome experience
I would take up a job at a start-up that is doing awesome work in analytics/machine learning. The amount of learning you can gain for the slight risk can be amazing. There are start-ups working on deep learning and reinforcement learning; choose the one that fits you best (taking culture into account).
If you are not the start-up kind, join an analytics consultancy that works on tools and problems across the spectrum. Ask for projects in different domains, work on different algorithms, and try out new approaches. If you can’t find a role in a consultancy, take up a role in a captive unit, but seek a role change every 12–18 months. Again, this is a general guideline; adapt it depending on how much you are learning in the role.
Finally, a few bonus tips:
The role of a mentor is priceless! Try to find professionals who have navigated the industry and take their advice. The AI and ML Blackbelt+ program offers mentorship sessions and a personalized learning roadmap customized by expert mentors.
Try learning new tools once you are comfortable with the ones you are already using. Different tools are good for different types of problem-solving. For example, learning Vowpal Wabbit can significantly complement your Python coding.
You can take a shot at creating a few web apps. This adds significant knowledge about data flow on the web, and I personally enjoy satisfying the hacker in me at times!
A few modifications to these tips, in case you are already out of college or have work experience:
In case you can still go back to college, consider getting a Master’s or a Ph.D. Nothing improves your probability of getting the right job as much as a good program from a top-notch university.
In case full-time education is not possible, take up a part-time program from a good institute or university. But be prepared to put in extra effort outside these certifications/programs.
If you are already in a job and your company has an advanced analytics setup, try to get an internal shift by demonstrating your learning.
I have kept the focus on R or Python because they are open source. They are becoming the mainstream technology stack for the industry.
What do you think about this path towards a career in data science? Do you have additional tips that can help people make their career choices? Please feel free to post these tips below for the benefit of a larger audience.
Kunal Jain is the Founder and CEO of Analytics Vidhya, one of the world's leading communities of AI professionals. With over 17 years of experience in the field, Kunal has been instrumental in shaping the global AI landscape. His expertise spans diverse markets, from developed economies like the UK to emerging ones like India, where he has successfully led and delivered complex data-driven solutions. As a recognized thought leader, Kunal has empowered countless individuals to realize their AI ambitions through his visionary approach to AI education and community building. Before founding Analytics Vidhya, Kunal earned both his undergraduate and postgraduate degrees from IIT Bombay and held key roles at Capital One and Aviva Life Insurance across multiple geographies. His passion lies at the intersection of analytics, AI, and fostering a thriving community of data science professionals.
Awesome write up kunal !!!
Thanks Pradeep
Hi Kunal,
I have 7 year of IT exp in development. I gone through all your post and found that very great resource of Knowledge
I have some query and confusion in my mind. Could you please help me on that.
1- I went through you last post (“Should I become a data scientist (or business analyst") and judge my self and score "54". What should i do?
2- As i have 7 years of exp in IT development. Is changing career a good idea.?
3- Which field is better Big data Or Data Science Or Business Analyst ?
4- I found course on jigsaw for (Big data and Data science) are they good for starting or is there any other better way courses in that area.
5- Are you providing any type of training from your end. If yes please update me i want to join them.
Thanks
Dinesh
Dinesh,
Here are answers to your queries:
1. Which areas were the ones that required the most improvement?
2. It depends on how you feel about your current area. I usually advise against making a late career shift unless you are dead sure about making the shift. You can read more details here.
3. Given your background in IT, Big Data might be the best bet. But it depends on your exact experience.
4. Courses from Jigsaw are good. You can also take up a course on Big Data on Udacity, but it is a basic course.
5. We have a basic training running for college students, focusing on Excel. Apart from that, we are not running any other trainings.
Regards,
Kunal
Great article Kunal. I'm in the final year of my MS Statistics course. I am comfortable with using R. Should I also do another course on SAS? Will it help? And if you could tell me what exactly the top analytics companies look for in a candidate when they hire for the role of data scientist. Lastly, have you taken the hadoop course of udacity yourself? Is the free version of it good enough?
Kingshuk,
I would keep the focus on R. SAS is easier to learn and can be picked up quickly, more so if you are from a stats background. The interviews for analytics roles typically take the form of business case studies and guess estimates, along with tests of technical skills. You can read more about these interviews here. A few companies have also started arranging hackathons to solve their hiring problems out of college. As for the course on Udacity, it is a basic course. I did it myself some time back. If you are more interested in big data, you can also try out bigdatauniversity.com.
Regards,
Kunal