Modern Enterprise Data Architecture – Data Lake or Data Warehouse?

In today’s world, data is the catalyst for decision making. In previous years, companies needed Data Analysts to read and interpret the data they collected. Technology has since evolved: Business Intelligence solutions now transform data and present it in a format that most employees can understand.

When we talk about Big Data, we usually refer to the Hadoop ecosystem of technologies. Hadoop uses programming models such as MapReduce to process large data sets across clusters of computers, and it can scale from a single server to hundreds of machines. One limitation of Hadoop is that it doesn’t fit neatly with cloud services; it is better suited to physical hardware.

For data to be transformed into understandable information, it first needs to be stored somewhere accessible. The two data solutions that are commonly used for data management are Data Warehouses and Data Lakes, each solution having its own benefits.

Example of Contemporary Enterprise Data Architecture

Data Lakes

Big Data is stored in corporate Data Lakes, a logical construct in which data can be stored and further manipulated. A Data Lake is a vast pool of raw data that has not been collected for a predefined purpose.

Data Lakes can be based on different technologies, from file storage to an RDBMS. They are flexible and can store all kinds of data: structured, semi-structured, and unstructured. Structured and semi-structured data includes JavaScript Object Notation (JSON) text, CSV files and website logs. An enterprise data lake supports storage of Big Data for real-time analysis and is more flexible than the other common data solution, the Data Warehouse.
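The flexibility described above is often called schema-on-read: raw objects are stored as they arrive and only interpreted when someone reads them. The sketch below illustrates this with an in-memory dictionary standing in for lake storage. The object names, their contents and the helper functions `read_object` and `read_lake` are hypothetical and exist only for this example.

```python
import csv
import io
import json

# Hypothetical raw objects, stored as-is with no upfront schema.
RAW_OBJECTS = {
    "orders.json": '{"order_id": 1, "customer": "Acme", "total": 120.5}',
    "customers.csv": "customer,region\nAcme,WA\nGlobex,NSW\n",
    "web.log": "2024-01-01T10:00:00 GET /pricing 200\n",
}


def read_object(name, raw):
    """Interpret a raw object only at read time (schema-on-read)."""
    if name.endswith(".json"):
        return [json.loads(raw)]
    if name.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(raw)))
    # Anything else is treated as unstructured text, one record per line.
    return [{"line": line} for line in raw.splitlines()]


def read_lake(objects):
    """Parse every object in the lake into a list of records."""
    return {name: read_object(name, raw) for name, raw in objects.items()}


if __name__ == "__main__":
    for name, records in read_lake(RAW_OBJECTS).items():
        print(name, records)
```

Nothing is rejected on the way in; the cost is that every reader has to know, or guess, how to interpret each object.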

Data Warehouse

An Enterprise Data Warehouse (EDW) is a centralised repository where organisations store data from business systems and other sources in a structured format. It holds a subset of organisational data used for reporting, stores structured data only, and is not as flexible as a corporate Data Lake. EDW data sources can include Online Transaction Processing (OLTP), Customer Relationship Management (CRM) and Enterprise Resource Planning (ERP) databases. Because of their structured format, data warehouses cannot be used for Big Data.
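The structured format of a warehouse can be illustrated as schema-on-write: the table definition is fixed first, and rows that do not fit it are rejected at load time. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a warehouse engine. The `sales` table, its columns and the helper functions are assumptions made for illustration, not a real EDW design.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory database with a fixed reporting schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sales (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL,
            total REAL NOT NULL CHECK (total >= 0)
        )
        """
    )
    return conn


def load_row(conn, row):
    """Insert one row; return False if it violates the schema."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sales (order_id, customer, total) VALUES (?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False


def total_by_customer(conn):
    """Run a typical reporting query over the structured data."""
    query = "SELECT customer, SUM(total) FROM sales GROUP BY customer"
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = build_warehouse()
    print(load_row(conn, (1, "Acme", 100.0)))
    print(load_row(conn, (2, None, 10.0)))  # missing customer is rejected
    print(total_by_customer(conn))
```

Because every row already conforms to the schema, reporting queries stay simple, which is the trade-off against the flexibility of a lake.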

Although Enterprise Data Warehouses are less flexible than Data Lakes, and the two are sometimes seen as separate data repositories, EDW data is a subset of the data held in a Data Lake. A well-designed and well-defined Data Lake is able to support EDW functionality.

Future of Big Data

Companies’ growing use of technologies such as AI, together with the emergence of the metaverse, means that data solutions will be reimagined in order to store and process new types of data.

One new concept is Data Mesh, a decentralised approach to data ownership that allows users to access data without it first needing to go into a Data Lake or Data Warehouse. However, for it to be usable as a data solution, it requires a lot of standardisation throughout the enterprise and beyond. While it is a good approach from a practical point of view, it will take some time before the Data Mesh concept can be implemented, particularly on an enterprise scale. In the near future, I believe that the Data Lake and the Data Warehouse will gradually merge into one concept: a Corporate Lake-Warehouse. The Corporate Lake-Warehouse will combine a data lake’s flexibility to store unstructured, semi-structured and structured data with a data warehouse’s structured collection of reporting data.

While the future of data architecture will depend on the requirements of new technology, one thing is certain: we are heading towards an even more data-driven future, and companies need to be prepared.
