Join us in this interview as Sumeet shares his background, his journey from Data Scientist to Software Engineer, and the captivating aspects of his current job. He provides insights into the future of data science and software engineering and offers valuable advice for career transitioners. Let’s dive into our conversation with Sumeet!
Sumeet: Currently, I work as a Software Engineer in Data Engineering at NatWest Group, on NLP and Generative AI use cases such as Summarization, Named Entity Recognition, and Q&A Chatbots in the Risk and Finance domain.
Previously, I worked as a Data Scientist at Cognizant in the Banking and Financial Sector domain, on unstructured scanned documents.
I worked as a Senior Software Developer at Siemens Technology in the Industrial Automation domain, on an internal application portal that integrates and provides streamlined access to various specialized components of that domain.
I also worked as a Business Technology Analyst at Deloitte Consulting on a US state health care client project, where I developed UNIX and Python scripts to automate manual processes.
I hold a Master’s in Data Science from the Higher School of Economics, Moscow, and a Bachelor’s in Computer Engineering from Thapar University, Patiala.
My competencies are in C#/Python/Java/PHP/UNIX/SQL, AWS SageMaker, AWS Textract, spaCy, Machine Learning, Natural Language Processing, Named Entity Recognition, Computer Vision, Artificial Intelligence, and Neural Networks.
Sumeet: I started by obtaining a six-month certification from IIT Kharagpur in the Foundation of Artificial Intelligence and Machine Learning, which gave me a basic understanding of the fundamental mathematical ideas that underlie machine learning algorithms, such as linear algebra, statistics, calculus, and probability.
I found these topics fascinating when studying them in the course and decided to learn more about them in depth. As a result of my interest in the mathematical concepts underlying machine learning, I chose to pursue data science as a profession and to expand my studies.
Sumeet: I wanted to understand the process of handling infrastructure and scaling for large volumes of data, be it for a machine learning application or any software. Knowledge about the architecture of the system will help in addressing its limitations to a certain extent.
Sumeet: Experimenting, thinking outside the box, and taking an unconventional approach to a problem. Also, it has enhanced my ability to multitask and contribute to building machine learning pipelines across domains.
Sumeet: I had some prior experience in Software Engineering and did not face many challenges during the transition, since the two fields go hand in hand.
Sumeet: Software Engineering lets you explore unknown areas in your project work and challenges you in terms of optimizing the code base and the infrastructure used. There is always a better approach and room for improvement in this evolving field.
Sumeet: I worked on an Unstructured Document Segmentation project to extract relevant information using Computer Vision and NLP. We started the groundwork of the project with PoCs, where we tried different techniques, both ML-based and non-ML-based. Finally, we zeroed in on instance segmentation using Mask R-CNN, and we further enhanced it with the capability to generate output files in JSON form to make them parser- and reader-friendly. For each stage of the project, we conducted demos where we gathered useful feedback, both positive and negative, and improved upon it. Some of the challenges involved the types of unstructured documents and their quality. To overcome these, some image processing techniques were applied. Also, some infrastructure problems were overcome by introducing Multiprocessing and Concurrency using AWS Lambda.
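To make the two ideas in this answer concrete (structured JSON output and concurrent processing of pages), here is a minimal Python sketch. It is not the project's actual code: the `Region` schema, the `segment_page` stub, the label names, and the bounding-box values are all hypothetical, and a local thread pool stands in for the AWS Lambda setup mentioned above. A real pipeline would replace the stub with inference from an instance segmentation model.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, asdict


@dataclass
class Region:
    """One segmented region on a page (hypothetical schema)."""
    label: str
    bbox: tuple  # (x_min, y_min, x_max, y_max) in pixels
    score: float


def segment_page(page_id: int) -> dict:
    """Placeholder for model inference on a single page.

    A real pipeline would run a segmentation model here; this stub
    returns fixed, made-up regions so that the sketch is runnable.
    """
    regions = [
        Region("header", (0, 0, 800, 120), 0.98),
        Region("table", (40, 200, 760, 640), 0.91),
    ]
    return {"page": page_id, "regions": [asdict(r) for r in regions]}


def segment_document(page_ids: list, max_workers: int = 4) -> str:
    """Segment pages concurrently and return the results as a JSON string."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so pages stay in sequence.
        pages = list(pool.map(segment_page, page_ids))
    return json.dumps({"pages": pages}, indent=2)


if __name__ == "__main__":
    print(segment_document([1, 2, 3]))
```

Keeping one record per page, with one entry per region, is one plausible way to make the output easy for both parsers and human readers to consume; the exact layout used in the project is not described in the interview.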
Sumeet: Cognizant has a dedicated pipeline for Analytics and Data Science and a large number of projects ranging from Machine Learning to Deep Learning. The company has a plethora of reusable in-house tools and solutions that we can enhance and apply to different project use cases.
Sumeet: In one of the projects, during code review, the technical team lead suggested refactoring a component in the codebase. While working on this, we found that the test-case infrastructure had not been properly designed and implemented, so we suggested an improvement. We implemented the suggestion, resulting in a drastic decrease in the volume of bugs received for that project component.
Sumeet: I generally follow some people on LinkedIn and, through their connections, get to know about the latest technological advancements and useful links around them. For research papers, arxiv.org and ResearchGate are some of the best freely accessible resources.
Sumeet: My tips for people interested in transitioning their career from a Software Engineer to a Data Scientist include keeping up with the most recent developments in the industry and learning the fundamentals of how to apply them. You can accomplish this in a number of ways, such as participating in pertinent conferences, workshops, and training courses; reading blogs and papers on data science; and gaining hands-on experience with datasets and models.
Additionally, I advise developing a Machine Learning use case in the context of software engineering initiatives. This entails locating a challenge or problem in the field of software engineering and using data science tools to resolve it. This will not only give practical experience but also show potential employers one’s ability to apply Data Science techniques to real-world problems.
Sumeet: First of all, the field of data science is still in its early stages and constantly advancing. I believe that people should routinely read research papers in order to keep up with these developments and stay current with the most recent trends. By doing so, individuals can learn how to apply novel ideas, methods, and algorithms to address issues in the real world.
Secondly, the use cases for generative AI in the fields of computer vision and natural language processing are currently receiving more attention. A sort of AI known as “generative AI” can create new content, such as images or text, similar to the data it has been trained on. Numerous industries, including healthcare, entertainment, and banking, could undergo a revolution as a result. People should concentrate on honing their talents in these areas if they want to be in a position to stay ahead of the curve. They can accomplish this by enrolling in Generative AI classes or workshops, taking part in research projects, or creating their own Generative AI models.
This interview highlights the close relationship between the two fields (Data Science and Software Engineering) and the importance of broad skill sets in today’s ever-evolving technological landscape. Sumeet’s insights provide valuable advice for individuals looking to stay ahead of the curve and position themselves for success in the world of data science and software engineering. If you’re on your way to becoming a data scientist in 2023, then here is the roadmap for you.
If you wish to read more interviews like this one that shed light on the journeys of data scientists, engineers, and other people in tech, then check out our website for more.