You acquire new information every day. But it’s only after analyzing it that you are able to add it to your knowledge and make more informed decisions. The same is the story with every organization around the world. Organizations collect huge volumes of data and then process it to make sense out of it and take the apt business decision. Unlike humans, an organization has two types of data processing capabilities: OLTP and OLAP.
Contrary to their name, there is a stark difference between the two. While one manages real-time processes, the other helps in analyzing huge volumes of data for improving the capabilities of the organization. Let’s understand this difference between the two in a bit more detail in this article.
OLTP, or Online Transactional Processing, systems handle a large number of transactions happening in real-time. But, what are the transactions?
Well, Transactions are processes that occur in their entirety and in isolation from one another. They either insert, update, or delete data in a database. On successful execution, the changes made by a transaction to a database persist in the database even in the event of a system failure.
The transactional data is stored in Relational Databases that ensure ACID properties for transactions. This data is written and queried at a very high pace to prevent any delay in processing.
OLTP governs transactions because they are the critical processes that we encounter in our everyday life. Online transactions, e-commerce orderings, online hotel bookings, atm transactions, etc. are all managed by OLTP processes.
Imagine you log onto an e-commerce website to book the last pair of your favorite headphones which are currently on sale. Consider the following:
OLTP ensures that such transactions are carried out without any inconsistencies in the database with the help of the ACID (Atomicity, Consistency, Isolation, Durability) properties (that we just discussed).
Organizations have data generated from transactions stored in various OLTP databases. But this data is of no use unless analyze it to derive valuable insights for the organization. However, querying this data directly from OLTP databases is not efficient because of the sheer amount of data and the complexity of the queries that need to be written. Therefore, we store this data in a different database called OLAP databases.
OLAP, or Online Analytical Processing, databases store data in an aggregated form from multiple OLTP databases. This data is then stored within a data warehouse. But, instead of a transaction-level view, it gives a multidimensional view of the data. This means that if the organization wants to view the aggregated sales data, they can view them according to multiple categories and sub-categories- location (region, country, state), time (year, month, day), the customer (gender, age), etc. This enables organizations to perform advanced analytics on their data, thereby giving a deeper understanding of their products.
It goes without saying that the key to the success of OLAP databases is the multidimensional view. But how is it stored? Let’s take an example to understand that.
Taking our previous example of the e-commerce website, imagine that the company now wants to analyze sales for the year that went by. But simply looking at the sales per month is too simplistic. Looking at sales per month categorized by region would be a better approach.
This table is known as an OLAP cube with two dimensions (quarter and region) and the aggregated data stored here is known as a measure.
The data stored in data warehouses is in a similar fashion with as many dimensions as required for analysis.
Here, Time, Region, and Category are dimensions. The shaded cell of the cube shows data for the month of February, in the North region for the Clothes category.
OLTP
(Online Transactional Processing) |
OLAP
(Online Analytical Processing) |
|
Functionality | Manages transactions that modify data in databases. | Used for analytical and reporting purposes. |
Source | Real-time transactions of organizations. | Data is consolidated from various OLTP databases. |
Storage format | Tabular form in Relational Databases. | Multidimensional form in OLAP cubes. |
Operation | Read and write | Read-only |
Response time | Fast processing since queries are simple. | Slower than OLTP |
Users | Executives, Data scientists | Programmers, Database professionals |
To summarize, in this article, we discussed, the difference between OLTP and OLAP. Sure it is a rudimentary topic but an essential one to build your foundation for a data engineering career.
If you looking for more such articles in the data engineering domain, I highly recommend the following articles to kickstart your journey:
I hope this article was able to give clear explanation to the topic. Let me know if you have any queries in the comments below.
Very well articulated. If you could add 1,2 example about table structures mostly used to query (normalized vs denormlized) would be icing on the cake
Nice information my question