Machine Learning at Scale using SparkML for Big Data
Wondering how to apply machine learning at large scale? Join this curated workshop by our data science experts, who will introduce Apache Spark for processing large amounts of data and building machine learning models with MLlib. The workshop goes beyond building your expertise in applying different machine learning algorithms to huge datasets using MLlib: it focuses on how these algorithms can be applied at scale, on petabytes of data, to generate models used for predictions.
Prerequisites for the workshop
- Python programming experience
- Basic knowledge of machine learning
- Familiarity with pandas is useful
STRUCTURE OF THE WORKSHOP
This is an 8-hour workshop and includes the following modules:
- Introduction to Spark
- Installing and setting up Spark
- Spark APIs in Scala, Python and R
- Basic syntax of Spark
- Read, process, aggregate, write data
- Modeling framework using MLlib
- Different ML algorithms
- Feature engineering
- Evaluation metrics
- Mini-hack
- AMA
INSTRUCTORS
Rohan Rao
Rohan Rao (a.k.a. ‘vopani’) currently works as a Senior Data Scientist at Paytm, building machine learning solutions for the organization. Prior to this, he worked with multiple startups in the machine learning space across various industries, platforms and products. He is a regular participant in hackathons and competitions and has won on several occasions, including the AV DataFest in April 2017, which ranks him #2 on the AV Leaderboard. He is a Kaggle Grandmaster and ranks among the top 100 Kagglers in the world. Apart from data science, he is an 11-time National Champion in Sudoku and Puzzles, having represented India at the World Championships for the last 9 years.
Make sure you don’t miss this advanced workshop on SparkML for Big Data!