“Let me tell you a secret: getting the first freelance data science project may require huge patience.” - Raju Kumar Mishra
India is the 2nd largest market for freelancers. Even new fields like data science have a lot of freelance players who can help you with your analysis.
In the 17th edition of the Kaggle Grandmaster Series, we have one such freelance data scientist joining us: Raju Kumar Mishra.
Raju is a Kaggle Discussion Grandmaster and ranks 48th with 51 gold medals to his name. He is also a Kaggle Notebooks Master. He has been very active on Kaggle for the past 8 years.
Raju has a Master’s in Computational Science from IISc (Indian Institute of Science) Bengaluru. He has been a Freelance Data Scientist since 2013.
You can go through the previous Kaggle Grandmaster Series Interviews here.
So without any further ado, let’s begin.
Raju Kumar Mishra (RKM): While preparing for the IIT JEE exam and then studying for my Bachelor of Technology, I developed an intense interest in applied mathematics and in using computers to solve mathematical problems. In mining, one has to go through many subjects that require a good knowledge of applied mathematics.
Rock mechanics, for example, is full of mathematical formulas that have been developed using numerical and statistical knowledge. I also applied neural networks to predict whether a given mine roof is safe to work under. Mining engineering requires knowledge of electrical science too, because many machines run on electricity. In summary, many subjects taught in mining engineering require mathematical and statistical knowledge.
Whenever someone starts working in the mining industry, they are surrounded by a huge amount of data: data related to the production of minerals and coal, data related to manpower, and a huge amount of data generated by heavy earthmoving machinery. This data can be used to gain insight for preventive maintenance, to improve production quality, and to make many other improvements that lead to a more profitable business.
The Master’s in Computational Science course offered by IISc (Indian Institute of Science) Bengaluru was a suitable way to pursue my interest. I applied for it and got selected. The course provided excellent knowledge of how to solve mathematical problems using computers and supercomputers. A problem solver has to write code to solve mathematical problems, which makes that person a better programmer.
I consider data science to be the science of estimation. Using different algorithms, we estimate the value of some quantity and try to minimize the error in that estimate. We estimate a value when the full information required to know it is not available. We also use estimation when calculating the exact value would be costly. Minimizing the error relies on one type of optimization or another. Values have to be estimated in every engineering department, so we can find some use for data science in every working area.
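To make the idea of estimation as error minimization concrete, here is a minimal Python sketch. It is only an illustration, not something taken from the interview: the readings are made-up numbers, the function names are hypothetical, and plain gradient descent on the mean squared error is just one of many possible optimization choices.

```python
def mean_squared_error(estimate, observations):
    """Average squared difference between one estimate and the observations."""
    return sum((estimate - x) ** 2 for x in observations) / len(observations)


def estimate_by_gradient_descent(observations, learning_rate=0.1, steps=200):
    """Estimate a single quantity by minimizing the mean squared error.

    Assumption for illustration: the quantity is one number and the
    observations are noisy measurements of it.
    """
    estimate = 0.0
    n = len(observations)
    for _ in range(steps):
        # Derivative of the mean squared error with respect to the estimate.
        gradient = 2.0 * sum(estimate - x for x in observations) / n
        # Move the estimate a small step in the direction that reduces the error.
        estimate -= learning_rate * gradient
    return estimate


# Made-up noisy readings of some quantity (hypothetical data).
noisy_readings = [9.8, 10.4, 10.1, 9.7, 10.0]

estimate = estimate_by_gradient_descent(noisy_readings)
print("estimate:", round(estimate, 4))
print("error:", round(mean_squared_error(estimate, noisy_readings), 4))
```

For this particular error measure the minimizer is simply the sample mean, so the loop is more machinery than the problem needs. It is written out only to show the estimate-then-reduce-error pattern that also sits behind larger models.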
RKM: As I have mentioned, the Computational Science course gave me excellent knowledge of applied mathematics and of solving mathematical problems using computers. One has to write one's own code to solve mathematical problems. In the process, I gained a good knowledge of programming, because code written for mathematical problems is optimized for data-intensive and computation-intensive conditions.
In my master's course, I studied subjects like numerical methods, numerical linear algebra, parallel computation, simulation, scientific data visualization, statistics, probability, neural networks, stochastic finance, and many more associated with numerical mathematics. I took pattern recognition and reliability courses too. In fact, I come from a data science background. I did not really switch to data science; I simply started working in my core field.
We performed parallel computation using MPI (Message Passing Interface) and a bit of OpenMP. Using a distributed system for computation was not new to me. Therefore, the big data area (in the context of problem-solving) was not unfamiliar to me.
RKM: Teaching is my passion. I mainly provide training in programming and data science. I get corporate training assignments through LinkedIn. I also provide training for groups of individuals. In the course of corporate training, I sometimes get problem-solving assignments.
Working as a freelancer gives me a good amount of flexibility and enough time to read about new concepts.
I started my career as a freelancer at the end of 2013. In the beginning, it was very difficult to get corporate training or any freelance assignment. To people who are eager to offer their services as freelancers, I would say: just have patience, and you will get assignments.
I started my Kaggle journey in all three areas: competitions, notebooks, and discussions. Nowadays, a datasets area has been added as well. Since I am more involved in teaching, taking part in discussions was inevitable. Most of my discussions were about new concepts, new and efficient tools, and study notes that working data scientists and data science aspirants can use to grow their expertise.
I am also a Notebooks Master. Generally, I write notebooks on new packages and new tools. Writing notebooks makes one clearer about how to explain concepts and improves one's storytelling skills. Good notebooks require organizing content in a logical order, so the skill of organizing reports improves as well. Informative notebooks help others learn new concepts. I feel that concentrating on one area at a time helps you focus more and produce valuable work.
RKM: Generally, I used to write in the evening on a daily basis. Whenever I did not have assignments, I wrote for the entire day. Basically, I write whenever I get time.
Writing a book requires knowledge of the subject, the expertise to explain that knowledge, and the skill to identify problems and solve them for readers so that they benefit. Whenever I explain a concept, I try to make sure that anyone can understand it. This is a requirement of my profession as a corporate trainer. Explaining concepts in books certainly helped me improve my explanation skills. Writing books also helps in organizing training material and in providing proper charts and diagrams so that participants understand better.
RKM: Writing books requires knowledge of the topic and the skill to explain it. It also requires identifying and creating problems related to different concepts, and solving them, so that the reader can understand the concepts easily. Explaining concepts requires creativity. Creativity helps in generating new ideas to explain topics. Skills and tools are also required to create relevant charts and diagrams, which help the reader understand the concept.
I cannot come up with many ideas while I am sitting in front of the computer and writing a book. Ideas come to me at different points in time. To overcome this problem, I always used to carry a diary and a pen. Many ideas for writing came to me while I was in a vehicle, walking in the market, explaining a concept in my classes, or in problem discussion sessions. Choosing how to explain something in a better way and organizing chapters is a time-consuming process.
Whenever I got an idea about how to explain a particular topic, a new problem, or a way to solve a problem, I used to write it in the diary. That way, by the evening I had some material to add to my writing every day. If I did not write the ideas down at the time they popped up, I might forget some of them. Writing ideas in the diary whenever they popped up helped me use my time more efficiently, because every day I had some material for my book.
Organizing all the topics you are going to include in a chapter helps create a thought path. Having a clear thought path helps in completing the book early.
Taking a bird’s-eye view of your whole daily routine can help you find slots that can be used for generating new ideas, creating data and problems, and writing.
RKM: I started learning R in my statistics class while I was pursuing my Master of Technology course at IISc. I understood that R can be used to analyze any sort of data using many statistical and mathematical concepts. There is no need to reinvent the wheel, because R includes many statistical and mathematical concepts that have been implemented and tested by many users.
I learned Python when I was working as a software engineer. I use both Python and R in my analysis and for day-to-day programming tasks. Both Python and R have huge, helpful communities. There are many discussions of R vs Python available as blogs and videos. Both languages have pros and cons. Here are some benefits of using R and Python, based on my work experience and my own impressions.
One can use R when the dataset is small (small relative to the available RAM). Using R, a data scientist can apply different ready-to-use algorithms. R is good for exploratory analysis, thanks to its many data analysis and data visualization packages with advanced functionality.
Python becomes useful when dealing with a huge amount of data. It is also easier to use Python when machine learning and deep learning models have to be integrated into GUIs and web pages.
Julia is an impressive programming language for data science. I loved its claim that it is “easy to write code that’s nearly as fast as C”. Its syntax is easy to learn and helps in creating clean and concise code. Whenever people switch from one programming language to another, there is a cost associated with it: the cost of learning the new language, adapting to a new programming environment, and so on.
Code written in Julia is faster in execution, yet as clean and concise as code in a scripting language. Whenever speed becomes more important than the cost of switching to Julia from another programming language, people will start switching to it. I feel that in the coming years data science applications are going to be more computationally intensive. Therefore, there is a real chance that Julia will be used more in data science than other programming languages.
RKM: Let me tell you a secret: getting the first freelance project may require huge patience. A freelancer has to stay positive that the first assignment will come. After getting the first assignment, life might be easier. As a freelancer, always improve your skills, get more certifications, and make your work portfolio rich.
Writing blogs and working on Kaggle can help you make your data science portfolio impressive and connect with data scientists from all over the world. To get freelance work, let people know that you are willing to work as a freelancer. You can start with LinkedIn and Twitter to spread the word that you are ready to work as a freelancer and are waiting for projects. The following companies/websites can be helpful in getting freelance data science assignments-
The flexibility that freelancing provides is very attractive for data science beginners. But with that luxury comes the effort you have to put in to gain popularity in the community and make freelancing work in your favor. We hope Raju’s words and wisdom help you understand these things better.
This is the 17th interview in the Kaggle Grandmasters Series. You can read the previous few in the following links-
What did you learn from this interview? Are there other data science leaders you would want us to interview for the Kaggle Grandmaster Series? Let me know in the comments section below!