“I think the main challenge for a lot of beginner Kagglers is to overcome their fear of failure and criticism.” - Kostiantyn Isaienkov
This statement rings true across the data science community. A little criticism is often enough for fear to take hold and stop us from learning from the mistakes we make on our Kaggle journey.
To help you push past that, we are joined in the 21st edition of the Kaggle Grandmaster Series by Kostiantyn Isaienkov.
Kostiantyn is a Kaggle Notebooks Grandmaster. He ranks 8th in this category and has 17 gold medals to his name. Kostiantyn is also an Expert in the Kaggle Competitions category.
Kostiantyn has a Master’s degree in Computer Science from Donetsk National University. He currently works as a Data Science Engineer at Quantum_Inc.
You can go through the previous Kaggle Grandmaster Series Interviews here.
So, without further ado, let’s begin.
Kostiantyn Isaienkov (KI): In my experience, a Master’s degree is not mandatory for a career in Data Science. I know a lot of people who have a Bachelor’s degree, a Master’s degree, or a Ph.D. in Computer Science and related fields. I also know people who don’t have a degree in this field.
Despite this, I think that a university degree is quite important for a future career. While studying, we usually have a great opportunity to develop discipline, communication skills, the ability to work to deadlines and solve complex problems, and, most importantly, the ability to search for information and practice self-education.
As for the contribution of my degree to my career, it is definitely positive. During an exchange program at Czech Technical University, I built a strong knowledge base in Machine Learning and autonomous robotics, which helped me a lot in getting my first Data Science job.
As for the contribution to Kaggle, I think these are two parallel tracks. Regardless of a specialist’s title and the number of medals they hold, Kaggle is a platform where you can always find something new or participate in a competition in a field that is new to you. Thus, Kaggle is an additional source of knowledge, and a university degree has no effect on your success on the platform.
KI: In both companies, the main responsibility is solving data science problems for various clients. The main difference is the type of tasks being solved. At Akvelon, the majority of the tasks involved classical machine learning on tabular data. I also worked a little as a data engineer.
At Quantum, I have the chance to work in different areas such as classical machine learning, computer vision, and even NLP. While working here, I have had the opportunity to implement ML algorithms in Java, integrate my solutions on an iPhone, and participate in the company’s product development. In short, the product is an automated machine learning solution for a wide range of tasks. For more information, you can check maister.io.
I also lead the internship program in the company. Quantum is serious about preparing young specialists and has already successfully completed several internship programs in Data Science.
Another area of work in the company is research on processing satellite images. One of the most recent results is a published paper on change detection in the forests of Ukraine, in which I took an active part.
KI: I created my first kernel while participating in my first competition about 5 years ago. I think the main challenge for a lot of beginner Kagglers is to overcome their fear of failure and criticism. When I worked on my first notebook, I thought that I would publish it and get a lot of negative comments under the code. This problem went away by itself as my experience and activity on Kaggle grew.
The second problem I faced was creating a clear and readable kernel. You can create an awesome kernel with unique content, but it will not be popular if your code is poor and you can’t write a good description of your work. You should always remember that a Data Science solution is not only a model training process.
And of course, a common problem for beginners in Kaggle notebooks is the lack of votes on their kernels. It can often be a reason for beginners to stop creating notebooks, because people usually want rapid progress. Unfortunately, to solve this problem you need to spend time creating good notebooks and developing your own style for them. Sometimes it takes more time than you expect.
KI: I still use Kaggle mainly for self-education, so I try to select competitions or datasets based on the problem domain. I pick the ones that are most interesting to me or those where I have little experience. Sometimes this helps a lot at work. For example, a new project comes up in a field where no one in the company has experience, but you have worked on a similar problem on Kaggle and have at least some knowledge of the domain. It’s really cool and useful.
To be honest, I spend so much free time on Kaggle that I don’t have to think about other competition platforms. Nevertheless, I have participated in several competitions on zindi.africa and Analytics Vidhya. I can’t say that I spent a lot of time there, but it was an interesting experience for me.
KI: Of course, I have a standard plan on which I base my kernels. Moreover, I even have a template that helps me build my notebooks as quickly as possible, so I don’t need to spend a lot of time rewriting code that is the same for all datasets, such as data reading, some visualizations, and basic analysis.
I can describe my plan in a couple of words, but I think it is fairly standard for general Data Science tasks.
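To make the idea of a reusable notebook template more concrete, here is a minimal sketch of the "data reading and basic analysis" part. It is not Kostiantyn's actual template, which the interview does not show. The function names (`read_rows`, `summarize`) and the sample data are hypothetical, and it uses only the Python standard library so that it runs anywhere; a real Kaggle notebook would more likely rely on pandas and a plotting library for the visualization step, which is left out here.

```python
import csv
import io
from collections import Counter
from statistics import mean, median


def read_rows(source):
    """Read CSV text from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def is_number(value):
    """Return True if the raw CSV value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def summarize(rows):
    """Build a per-column summary: missing count plus basic statistics."""
    summary = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v not in ("", None)]
        info = {"missing": len(values) - len(present)}
        if present and all(is_number(v) for v in present):
            nums = [float(v) for v in present]
            info.update(kind="numeric", min=min(nums), max=max(nums),
                        mean=mean(nums), median=median(nums))
        else:
            # Treat everything else as categorical and keep the top values.
            info.update(kind="categorical",
                        top=Counter(present).most_common(3))
        summary[col] = info
    return summary


# Made-up sample data standing in for a competition CSV file.
SAMPLE = """age,city,income
34,Kyiv,1200
,Lviv,900
51,Kyiv,
"""

rows = read_rows(io.StringIO(SAMPLE))
report = summarize(rows)
for col, info in report.items():
    print(col, info)
```

Because the same two functions work on any CSV file, swapping `io.StringIO(SAMPLE)` for an opened dataset file is the only change needed per competition, which is the time saving the answer above describes.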
KI: I really like to take part in Kaggle competitions, but because my free time is limited I seldom do it seriously from the very beginning to the end of a competition. I don’t think I have a favorite one; each of my past competitions brought me some new knowledge and sometimes new connections with other data scientists.
The main challenge for me is that people are usually fighting over hundredths or thousandths in the competition metric. And usually, the difference between a gold medal and no medal at all is really small. Every time, I try to tackle this in different ways, from building powerful ensembles to finding insights in the datasets.
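As a small illustration of why ensembles help when the margins are this thin, here is a sketch of one of the simplest ensembling techniques, weighted averaging of predictions. This is only an assumed example, not the approach Kostiantyn used in any particular competition. The function names (`blend`, `rmse`) and all numbers are made up for illustration.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def blend(predictions, weights=None):
    """Weighted average of several models' predictions, row by row."""
    n_models = len(predictions)
    if weights is None:
        weights = [1.0 / n_models] * n_models  # equal weights by default
    if len(weights) != n_models:
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*predictions)
    ]


# Toy targets and predictions from two hypothetical models.
y_true = [3.0, 5.0, 2.0, 7.0]
model_a = [2.5, 5.5, 2.5, 6.0]
model_b = [3.5, 4.5, 1.5, 7.5]

blended = blend([model_a, model_b])

print("model A RMSE:", rmse(y_true, model_a))
print("model B RMSE:", rmse(y_true, model_b))
print("blend RMSE:  ", rmse(y_true, blended))
```

In this toy case the two models make errors in opposite directions, so averaging cancels much of the error. On real data the gain is usually far smaller, which is exactly the hundredths-and-thousandths territory described above.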
KI: Basically, I don’t memorize notebooks when I check them. I just note any new technology and search for information about it so that I am ready to use it in the future. But if we speak about notebooks with good EDA for beginners, we of course need to take a look at the Titanic and House Prices competitions. If you sort the kernels by “Most Votes”, you will see a lot of notebooks with hundreds and even thousands of votes. Most of them are really impressive works, and every beginner will gain a lot of new knowledge from them.
KI: Today there are a lot of opportunities to study, whatever your professional level. I can give just a couple of examples of the ways that I use.
1) Research papers. This allows you to check some new models, methods, and other tools before they become popular and commonly used.
2) Kaggle notebooks and competitions. I think I don’t need to describe how much Kaggle can give you. There are always new methods and models applied to real-world tasks on this platform.
3) Communication within the team and across different Data Science communities. Knowledge sharing within large groups of professionals is the surest way to raise your level.
KI: The first and most important: never stop learning something new. Data Science is a very fast-growing field where updates happen often. Fortunately, we have a lot of opportunities to study today. You can complete online courses, read papers, take part in different competitions, and simply have discussions with your colleagues and other experts in the field of Data Science.
The second one: don’t forget about your programming skills. A good Data Scientist should be a good programmer too.
Also, we need to focus a little bit on soft skills. It is impossible to be a good specialist without the skill to communicate with other people and team members.
And the last one: don’t spend all of your free time on work. Find a good hobby, be happy with your family, and just relax. It is much easier to burn out at work than you might think.
His thoughts and words are enough to get anyone to begin and stay focused on their data science journey. I hope this edition of the Kaggle Grandmaster Series with Kostiantyn adds value to your data science journey.
This is the 21st interview in the Kaggle Grandmaster Series. You can read the previous few at the following links:
What did you learn from this interview? Are there other data science leaders you would want us to interview for the Kaggle Grandmaster Series? Let me know in the comments section below!