As part of DataFest 2017, we launched a new initiative: DataHack Hour. It was inspired by the many queries we receive about learning data science. Questions like “How to learn analytics?” or “How to become a data scientist?” reach us multiple times every day.
While we had written several articles on this subject on Analytics Vidhya, we needed something more definitive to answer these queries. So we decided to create an experience that shows people how to learn data science. This version of DataHack Hour is our answer to the questions above, and to many others that come to us.
DataHack Hour is completely free for the Analytics Vidhya community and was created to help more and more people learn data science. This article describes the journey that DataHack Hour participants are going through. If you are one of the people struggling to learn data science, join DataHack Hour today and become part of this awesome experience.
DataHack Hour is based on a very simple concept – “Daily small improvements or learning in small steps can make a huge difference over time.” It is the same principle which Jeff Olson describes in his book “The Slight Edge”. Let me explain this in a bit more detail.
Most of the queries we receive on Analytics Vidhya about challenges in learning fall into one of the following categories:
We believe that DataHack Hour is the solution to the first 3 problems mentioned here. Going over one chapter a day, with help from volunteers and mentors in the community, is the most powerful way to learn data science. You learn by solving hands-on problems, the content has been curated by the Analytics Vidhya team, and there are mentors to help you out on a daily basis. Honestly, I can’t think of a better way to learn!
We launched DataHack Hour on 16th April 2017 as part of DataFest 2017. We got an outstanding response from community members and from people who really want to learn the subject.
We came across users like EspyM, who said he does not have access to such resources in his country. Within 5 days we have seen him devote time to building his first model and submitting a solution to the DataHack platform! I am pretty sure that by the end of this DataHack Hour, we will have multiple people like EspyM who will go on to enable learning in their own communities.
To raise awareness about DataHack Hour further, we are releasing the content of the first 5 days on our blog. The idea is to put the content out to a wider audience and invite people who have missed out on 5 awesome days. You can still join today by working through the content below. You can register for DataHack here.
If you registered for DataHack Hour and missed a particular day, you can go through the content below and get back on track.
We kicked off DataHack Hour with this awesome session by Tavish Srivastava. The agenda of the webinar was “How to convert a business problem to an analytics problem, and the importance of hypothesis generation”. This is the best place to start your journey of learning analytics. It also touches on a point which gets ignored in a lot of tool-focussed courses today.
Here is the webinar recording from the session:
Hopefully you are all geared up for the hands-on exercises to come!
From Day 2 onwards we started our 1-hour challenges. The agenda for Day 2 included the following:
Let us cover them one by one. You can download all Day 1 resources here after signing up and logging in. By the end of the day you will have installed Anaconda, become comfortable with the Jupyter notebook interface and written a few simple programs in Python and Pandas. We also cover the different data structures in Python, iterative and conditional statements, and ways to load and access data.
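To give a flavour of these basics, here is a minimal sketch rather than the actual session material. The sample data and every variable name are made up for illustration, and it uses only the Python standard library so that it runs without any extra installs. In Pandas, the loading step would typically be done with `pandas.read_csv`.

```python
import csv
import io

# Hypothetical sample data; in practice you would open a CSV file from disk.
RAW_CSV = """name,age,city
Asha,31,Pune
Ben,24,Delhi
Chen,45,Pune
"""

# Load the data: csv.DictReader yields one dict per row, keyed by the header.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# The core data structures: list, tuple, set and dict.
ages = [int(row["age"]) for row in rows]                      # list
first_person = (rows[0]["name"], ages[0])                     # tuple
cities = {row["city"] for row in rows}                        # set (unique values)
age_by_name = {row["name"]: int(row["age"]) for row in rows}  # dict

# An iterative statement (for) combined with a conditional statement (if/else).
labels = []
for name, age in age_by_name.items():
    if age < 30:
        labels.append(f"{name}: under 30")
    else:
        labels.append(f"{name}: 30 or over")

print(ages)
print(sorted(cities))
print(labels)
```

You can paste this into a single Jupyter notebook cell and change the sample rows to see how each structure responds.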
Our mentor for the day was none other than me 🙂
This session focussed on some of the practical challenges people face while doing exploratory analysis. No matter how good your data is, you will come across missing values and outliers. This session aimed to help people deal with both. Again, you can access the content here after logging in and registering for DataHack Hour.
Topics covered in Missing Value
Outlier detection
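The two topics above can be sketched in a few lines of Pandas. This is an illustrative example and not the session's own notebook: the data is invented, and both median imputation and the 1.5 × IQR rule are our assumptions here. They are common defaults, not the only options.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing income and one suspiciously large value.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [30.0, 42.0, np.nan, 58.0, 45.0, 400.0],
})

# 1. Missing values: count them, then impute with the column median.
#    The median is less sensitive to extreme values than the mean.
missing_count = int(df["income"].isna().sum())
median_income = df["income"].median()
df["income_filled"] = df["income"].fillna(median_income)

# 2. Outliers: flag values outside 1.5 * IQR beyond the quartiles.
q1 = df["income_filled"].quantile(0.25)
q3 = df["income_filled"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = ~df["income_filled"].between(lower, upper)

print(f"missing values imputed: {missing_count}")
print(df)
```

Whether a flagged value should be dropped, capped or kept depends on the business problem, so treat the flag as a prompt to investigate, not as an automatic delete.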
On Day 5, people start building simple predictive models. The session starts by explaining what a predictive model is, and by the end of it you will be able to build both a simple and a multivariate regression model. You can download the resources here.
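As a rough illustration of the difference between a simple and a multivariate regression model, here is a minimal sketch using ordinary least squares in NumPy. The data, the helper names `fit_linear` and `predict`, and the choice of NumPy are all our assumptions for this example; the session resources may use a different library.

```python
import numpy as np

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2, with no noise.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2


def _as_2d(features):
    """Return features as a 2-D float array (rows = samples)."""
    features = np.asarray(features, dtype=float)
    return features.reshape(-1, 1) if features.ndim == 1 else features


def fit_linear(features, target):
    """Ordinary least squares fit; returns [intercept, coef_1, coef_2, ...]."""
    features = _as_2d(features)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(features)), features])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs


def predict(coefs, features):
    """Predict targets; pass a 2-D array when there are several features."""
    return coefs[0] + _as_2d(features) @ coefs[1:]


# Simple regression: one predictor.
simple = fit_linear(x1, y)
mse_simple = float(np.mean((predict(simple, x1) - y) ** 2))

# Multivariate regression: two predictors.
X = np.column_stack([x1, x2])
multi = fit_linear(X, y)
mse_multi = float(np.mean((predict(multi, X) - y) ** 2))

print("simple model coefficients:", simple, "MSE:", mse_simple)
print("multivariate model coefficients:", multi, "MSE:", mse_multi)
```

Because the target here depends on both predictors, the one-variable model leaves error that the two-variable model removes. On real, noisy data the gap is rarely this clean, and the error should be measured on data the model has not seen.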
Here is the agenda for the coming days. If you have got stuck while learning analytics and data science in the past, come and join us in these DataHack Hour sessions. By the end of this DataHack Hour, you will be able to work on data science problems independently, and you will have interacted with 10+ mentors and a few hundred peers. All of this is available for free, and you can absorb it as long as you are motivated!
Day 6: Feature engineering and transformation to improve model performance
Day 7: Validating and measuring model performance
Day 8: Building logistic regression model
Day 9: Building naive bayes model
Day 10: Building a decision tree model
Day 11: Building a k-NN model
Day 12: Ensembles, methods to combine model outcomes
Day 13: Apply your learnings in a 6-hour hackathon
Are any of the slides/videos posted anywhere online for people who missed them?
The material has not been released for now as DataHack Hour may recur at a later time.