“I’ve never felt that had I more degrees or no degree at all to mention in my resume, things would go dramatically different” – Dmytro Danevskyi
We often undervalue ourselves and our skills when we are rejected for a data science role just because we do not have a relevant degree. And so begins the cycle of pursuing a Master’s or a Ph.D.
The Kaggle Grandmasters Series is back with its ninth interview. This time we are joined by Dmytro Danevskyi, who will break this myth for us. We also recommend going through a couple of the previous interviews in the series.
Dmytro is a Kaggle Competitions Grandmaster and currently ranks 67th. He has 5 gold medals to his name along with 8 silver and 2 bronze medals in the Kaggle Competitions category. He is also a Kaggle Discussions Expert.
Furthermore, he has a Bachelor’s Degree in Engineering Physics/Applied Physics from the National Technical University of Ukraine ‘Kyiv Polytechnic Institute’ and currently works as a Machine Learning Engineer at Respeecher.
So, go through this interview and absorb all you can!
Dmytro Danevskyi (DD): I think the major thing I acquired during my physics training is the ability to learn. The major lesson I got is that, given enough time and effort, you can master anything, be it quantum physics or ancient philosophy. That might seem obvious, but I’ve seen many people fail to learn something just because they lack the belief that they can actually do it. I was there too, but the dozens of new skills and subjects I picked up throughout my education taught me there are basically no limits. Time and effort, that’s all you need.
DD: It’s not that different from an ML Engineer position. Probably the major distinction is that a Deep Learning Engineer should be comfortable working with unstructured data, like images, text, or audio. This is indeed different from more conventional machine learning, where datasets are typically structured, e.g. database tables. It means that both the typical solutions and the deployment conditions are somewhat different.
A Deep Learning Engineer is normally assumed to have experience working with various sorts of neural networks. Common tasks include data preparation, model training, model debugging, model interpretation, model compression, or optimization.
Contrary to popular opinion, domain knowledge and classical algorithms are still valuable. Neural networks are powerful, but not omnipotent. Carefully chosen pre- and post-processing is very important and can sometimes turn a model from “just good” into “amazing”. To get final predictions from an object detector, one still has to use a very old algorithm called Non-Maximum Suppression. Neural machine translation models often rely on the Beam Search algorithm to generate a translation. Most deep learning solutions for speech analysis still rely on the Fourier transform, which was invented about 200 years ago.
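To make the Non-Maximum Suppression example concrete, here is a minimal greedy implementation in NumPy. This is a generic textbook sketch, not tied to any particular detector:

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that
    overlap it too much, repeat. boxes is (N, 4) as [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # best score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of box i with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Keep only boxes whose overlap with box i is below the threshold
        order = order[1:][iou < iou_threshold]
    return keep
```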
Being able to deploy a model to production is also a vital skill for a DL engineer. These models are usually compute-heavy and often require a decent amount of optimization or compression before they can be deployed. Familiarity with techniques such as structured pruning, knowledge distillation, and model quantization is therefore a nice plus.
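As one concrete illustration of the quantization route, PyTorch offers post-training dynamic quantization out of the box. A minimal sketch, where the toy model is purely illustrative:

```python
import torch
import torch.nn as nn

# Toy model standing in for a trained network that needs deploying
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).eval()

# Swap Linear layers for int8 dynamically-quantized versions:
# weights are stored as int8, activations are quantized on the fly
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, smaller and often faster on CPU
```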
DD: My opinion is that higher education is a very good thing to have, but from what I see, it’s not mandatory for a career in DS/ML. Yes, many companies care a lot about formal education, but plenty of others care more about actual skills and experience (this is mostly true of tech companies and startups of all sorts). A person with truly good skills will always find a place to apply them. I’ve never felt that had I more degrees or no degree at all to mention in my resume, things would go dramatically different.
I can feel for those who sometimes find themselves undervalued solely because they lack a formal degree. I used to feel the same. Some companies I interviewed with asked for a Ph.D., which I don’t have. But at some point, I realized these companies are typically too bureaucratic for me, and I probably wouldn’t be happy working there even if I had a Ph.D.
DD: A really important thing for me is to have a broad enough view. I hate to think of myself as a “CV engineer” or an “NLP engineer”, despite having some skills and knowledge in both computer vision and natural language processing. Staying up-to-date with progress in ML and DL in general is what gives me an advantage.
DD: The first competition I took seriously was Data Science Bowl 2018. The task was to spot cell nuclei in biological images. I entered the competition about a month before the deadline. It was my first experience with segmentation and detection models, so it took me about 2 weeks just to figure out what was going on. Another challenge was to quickly acquire some domain knowledge, though it wasn’t strictly necessary for this particular competition.
I guess I read every research paper on nuclei segmentation that I could find. I also had to familiarize myself with classical CV algorithms such as Watershed. That gave me some insights, and at some point I was even quite close to the gold zone.
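For context, the distance-transform-plus-watershed recipe commonly used to split touching nuclei looks roughly like this in scikit-image. This is a generic sketch with illustrative parameter values, not code from his solution:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def split_touching_nuclei(binary_mask):
    """binary_mask: boolean array, True where nuclei were detected."""
    # Distance to the background peaks at nucleus centers
    distance = ndi.distance_transform_edt(binary_mask)
    # Local maxima of the distance map act as watershed seeds
    coords = peak_local_max(distance, min_distance=5, labels=binary_mask)
    markers = np.zeros(binary_mask.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    # Flood from the seeds over the inverted distance map,
    # producing one integer label per separated nucleus
    return watershed(-distance, markers, mask=binary_mask)
```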
Another major difficulty was related to the almost complete absence of computing resources. I can still remember training quite heavy convolutional networks on my personal laptop that didn’t have a GPU. That was quite funny but rather slow. However, the dataset wasn’t that big and I managed to do several dozen experiments throughout the competition.
DD: The only motivation for me now to participate in any kind of hackathon is being able to do it as a part of a really cool team. Solo participation is mostly worthless unless you’re super familiar with the domain or just really know what you are doing.
DD:
The task was to spot faults on above-ground electrical lines given voltage measurements from these lines.
I had a unique chance to test my signal processing skills. I did quite well in terms of modeling but failed to find any correlation between my local validation and the leaderboard score, so I eventually gave up on the competition. However, when the private leaderboard was finally revealed, I learned that my local validation was in fact quite reliable and the public leaderboard simply had too few positive samples. Had I trusted my validation and kept working, I would probably have found myself in the gold zone.
The task was to find ships on satellite images.
Our team managed to come up with a two-stage solution: first, we used a binary classifier to filter out images with no ships at all, and then a U-Net model to segment those predicted to contain at least one ship. That gave us an advantage, as training a classifier is typically much faster and easier than training a segmentation model. We believe it was mostly due to this two-stage approach that we finally found ourselves in 4th place, as sketched below.
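A rough sketch of what such a two-stage inference pipeline might look like in PyTorch. Here `classifier` and `unet` are placeholders for the trained models; this illustrates the idea, not the team’s actual code:

```python
import torch

@torch.no_grad()
def predict_masks(images, classifier, unet, ship_threshold=0.5):
    """images: (B, C, H, W) batch; classifier returns one logit per
    image, shape (B, 1); unet returns per-pixel logits, (B, 1, H, W)."""
    # Stage 1: cheap classifier decides which images contain ships
    has_ship = torch.sigmoid(classifier(images)).squeeze(1) > ship_threshold
    # Images judged empty get an all-zero mask for free
    masks = torch.zeros(images.shape[0], 1, *images.shape[2:],
                        device=images.device)
    if has_ship.any():
        # Stage 2: run the expensive U-Net only on the flagged images
        masks[has_ship] = torch.sigmoid(unet(images[has_ship]))
    return masks
```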
But the real fun came later – that competition had an additional challenge for top-scoring teams called the Speed Prize. As the name suggests, the goal was to do the same task as in the main challenge, but as fast as possible without losing predictive quality. That was quite a challenge! We found we could make our models faster by downsampling more aggressively, e.g. by doubling the stride parameter in some layers. In addition, I managed to apply some PyTorch-specific optimizations that gave a nice boost as well. Eventually, we got a solution capable of processing about 15k full-size images in under 3 minutes on a Tesla K80 – over a 10x speedup compared to our original solution. That was just enough to take first place!
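To illustrate the stride trick: doubling the stride of an early convolution halves the spatial size of every downstream feature map, cutting compute in those layers roughly four-fold. A toy PyTorch sketch with purely illustrative layer sizes:

```python
import torch
import torch.nn as nn

# Toy encoder stem; only the first conv's stride differs between variants
def stem(stride):
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, stride=stride, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
    )

x = torch.randn(1, 3, 256, 256)
print(stem(stride=1)(x).shape)  # torch.Size([1, 64, 128, 128])
print(stem(stride=2)(x).shape)  # torch.Size([1, 64, 64, 64]) – 4x fewer pixels
```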
The task was to classify sounds on short audio clips.
My first and, so far, only solo gold medal. I spent almost 3 months working on the competition, tried basically every approach I could come up with, and read dozens if not hundreds of research papers. I learned a lot, despite having already been involved in speech processing projects for about two years.
The task was to develop an extractive question answering system for Wikipedia articles.
One of the hardest competitions on my list. I spent around a month trying to reproduce a public TensorFlow baseline in PyTorch. I came up with an efficient strategy to prefilter answer candidates using a lite version of BERT, which allowed us to fit 4 models instead of 2 into the submission kernel. You can imagine my feelings when we finished 13th and there were exactly 12 gold slots.
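The prefiltering idea can be sketched generically: score all candidates with a cheap model, keep the top few, and run the expensive ensemble only on the survivors. Here `lite_score` and `heavy_score` are hypothetical placeholder functions, not APIs from the actual solution:

```python
def answer(question, candidates, lite_score, heavy_score, keep_top=20):
    """Cascade inference: a cheap model prunes, the heavy ensemble decides."""
    # Cheap pass: rank every candidate with the lite model
    ranked = sorted(candidates,
                    key=lambda c: lite_score(question, c),
                    reverse=True)
    # Expensive pass: only the top candidates reach the full ensemble
    return max(ranked[:keep_top], key=lambda c: heavy_score(question, c))
```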
The task was to create a system for automatic assessment of the quality of Stack Overflow and StackExchange questions and answers.
Jumped into this one right after the previous competition. I was very lucky with the team – each of my teammates brought something unique and valuable to the final solution. We utilized the SoTA NLP models available back then, combined them with powerful semi-supervised techniques, and ended up in 1st place – a huge achievement for me.
DD: It’s actually mostly the reverse – my Kaggle experience helps me in the industry 🙂
DD: There are many options here. I personally value the ability to do things quickly and to understand problems deeply, not just from a technical point of view. I recommend mastering the basics – Python, SQL, math, English (if you’re not a native speaker). Once you feel comfortable and fluent in these, you’ll free up a lot of time to spend on something more creative and important.
DD: That might sound odd, but I rarely use anything besides standard DL frameworks like PyTorch or, occasionally, TensorFlow. What I find more valuable is being up-to-date with the latest progress in DL. For that, I read dozens of research papers each week – I’ve found that’s the only way to keep pace with the field.
Along with many other takeaways, the prominent one has definitely been to keep track of developments in your area of interest via research papers. I hope this helps you in understanding things better.
What did you learn from this interview? Are there other data science leaders you would want us to interview? Let me know in the comments section below!