Airbnb and Lyft have transformed their respective industries in recent years using data science as their guiding light. In episode 9 of our DataHack Radio series, Dr. Alok Gupta gave us some very interesting insights into how Airbnb and Lyft use data science. For instance, did you know that Spark is Airbnb’s machine learning tool of choice?
Dr. Alok is currently working as the Director of Data Science and Head of Growth Science at Lyft. He has a deep passion for mathematics and has used it throughout his career, including his four-year stint at Airbnb. You will learn a lot in this podcast about how a data science leader thinks about challenging problems, and how leading tech start-ups scale their operations from the ground up.
This article summarizes the key points Dr. Alok discussed during this podcast. This is another valuable addition to the DataHack Radio podcast series, and I highly recommend listening to it as soon as possible!
Dr. Alok completed his undergraduate degree in mathematics at Cambridge University and went on to do his Masters in Finance and Mathematics at Imperial College, London. During his time there, he developed an interest in stochastic finance and statistics and decided to pursue it for his Ph.D. at Oxford University, which he completed in 2010.
During his Ph.D. years, the infamous recession struck and created chaos in the industry, so he wasn’t sure which industry to apply to. He ended up in financial trading at Deutsche Bank, where he had the opportunity to design and build algorithms with profit and loss objectives.
As part of his role at Deutsche Bank, he moved from London to New York, where he worked for around a year and a half. He discovered the role of a data scientist while in New York and realized how similar it was to his own role as a quant trader in finance. This led him to apply to a number of companies, and he finally got his break in 2014 at Airbnb as a data scientist. The rest, as he said, is history.
The overlaps between a data scientist and a quant trader were plenty, including understanding the problem and framing it in a way that made business sense. There were other intersections, like opportunity sizing, detective analysis, impact estimation, etc. Of course, one of the most interesting commonalities was actually solving the problem: deciding which mathematical or statistical techniques to apply, what the objective function is, and how to get to the optimal solution, among others.
But there were a couple of crucial differences between these two roles as well, as Alok discovered in his initial days at Airbnb. The metric that you’re trying to optimize in finance is taken as given (for example, optimizing PnL is a concrete objective). In the technology space, by contrast, the metric was vague and needed to be understood at a far more granular level before performing any data science task.
Experimentation is another tricky and challenging aspect in technology (there are a number of assignment units, different methodologies for measurements, etc.), whereas in finance you run an algorithm, see how much money it makes, and you’re done!
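To make the point about assignment units a little more concrete, here is a minimal sketch in Python. It is not taken from the podcast and is not how Airbnb or Lyft run their experiments; the function name `assign_variant`, the experiment name, and the sample events are all hypothetical. It simply shows one common way to split traffic, hashing an ID into a bucket, and how the choice of unit (user versus session) changes who sees what.

```python
import hashlib


def assign_variant(unit_id, experiment, variants=("control", "treatment")):
    """Deterministically map an assignment unit to a variant.

    Hypothetical helper for illustration: hashes the experiment name
    together with the unit ID and uses the hash to pick a bucket.
    """
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Made-up event log: one user can have several sessions.
events = [
    {"user_id": "u1", "session_id": "s1"},
    {"user_id": "u1", "session_id": "s2"},
    {"user_id": "u2", "session_id": "s3"},
]

for event in events:
    # Assigning by user keeps one person's experience consistent.
    by_user = assign_variant(event["user_id"], "new_checkout")
    # Assigning by session can show the same person different variants.
    by_session = assign_variant(event["session_id"], "new_checkout")
    print(event["user_id"], event["session_id"], by_user, by_session)
```

With user-level assignment, both sessions of `u1` always land in the same variant. With session-level assignment they may not, which is one reason the choice of unit affects how results have to be measured.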
When Alok joined Airbnb in 2014, the entire company was some 1,000 employees strong, with the data science team consisting of just 10 people (when he left earlier this year, the team had grown to around 110!). He started as the Data Scientist on their Risk and Safety fraud prediction team, where he built models for both online and offline fraud detection.
One year into his role, Alok started to build his own data science team in the Customer Support optimization space. Airbnb has some 10,000 customer support employees globally who use channels like phone, chat, email, SMS, etc. to help customers resolve issues. This, as you can see, was a challenge ripe for machine learning. In the podcast, Alok explains how his team framed this as an optimization problem and the different features they considered for the final model. A very fascinating section, this.
In his last 2 years at Airbnb he switched focus completely to work on acquisition of new guests. This included sourcing different marketing channels, working on search engine optimization, recommendation systems, etc.
Alok has described the acquisition process in a lot of depth which will benefit anyone who works in data science, regardless of the industry. The way he and his team approached the problem and worked their way through it can serve as a roadmap for all aspiring data scientists.
Most of the data scientists at Airbnb used tools and services like Amazon Web Services (AWS), Hive, etc. to pull or extract the data they needed. Python and R were used for local analysis, and Alok saw an increasing number of data scientists moving to Python as it’s easier to productionize Python scripts.
For building models and solutions when confronted with large datasets, Airbnb’s machine learning tool of choice was Spark. Airbnb has also invested in building its own centralized machine learning platform that enables non-data scientists and non-engineers to spin up their own ML models without needing a lot of programming experience.
Alok led the way in pioneering a knowledge sharing tool within the organization, which was shared between data scientists and non-data scientists. The idea behind it was to get everyone on the same page regarding the happenings internally, and posts were almost always written in Python or R as a Markdown document. This also helped them get peer reviews on technical work, and the quality of analysis was raised to unprecedented levels.
At Lyft, all the folks working in the analytics and data science domain are grouped under the umbrella of scientists. Acquisition, engagement, and retention (of passengers and drivers) are some of the problems they are currently working on simultaneously. In his role as the head of Growth Science, Alok has been exposed to the supply side of things, a new and exciting challenge for him.
The data science team under Alok currently consists of 40 people (at the time of recording this podcast). Quite a few of the challenges he faces in his current role, which he started just four months ago, are ones he has already seen at Airbnb, so he feels at home in that respect.
The details covered in this podcast about Airbnb’s data science operations are exemplary and exhaustive. I found out far more than I had initially anticipated about how a leading tech start-up operates and thinks, the structures involved, etc. Anyone involved in data science will benefit from listening to Dr. Alok.
If you have any suggestions for us on who you would like to see as a guest in the future, or any feedback on the nine episodes we have released so far, use the comments section below and let us know!