As a data science practitioner, you have an ever-increasing number of data sources at your disposal. But before you can even start the analysis, you face a huge challenge: bringing your data together. Different sources mean different schemas and extraction logic, a need for de-duplication, and the work of staying in sync as those sources change, among other challenges. That’s where data engineering, and more specifically data integration techniques, come in to help.
This session will walk participants through integrating data sources and the associated best practices. We’ll also cover, at a broad level, the need for data engineering and why it’s important for data science.
Key Takeaways:
- Fundamental understanding of data engineering
- Best practices in data integration