Are data mesh and data fabric the latest and greatest initiative, or new buzzwords aimed at selling solutions? It’s hard to say, but these emerging new corporate initiatives have a goal in common – namely dealing with disparate data. You can often achieve more value from your data if you can use disparate data for your analytics without having to copy data excessively and repeatedly. Data mesh and data fabric take different approaches to solving the disparate data problem.
What’s the difference between data mesh and data fabric?
Both data mesh and fabric focus on metadata and a semantic layer to leverage multiple data sources for analytics. However, the major difference seems to be about context.
In layman’s terms, data mesh is about the ability to offer various data sources to an analytical engine. Data mesh counts on the fact that you know the structure of your source data files and that the context of the data is solid. Using data mesh assumes you know the who, when, where, why, and how the data was created. Data mesh might be the strategy you use, for example, if you want to analyze data from several data warehouses in your company. It’s a use case where the original metadata is fairly well-defined.
Data fabric focuses on orchestration, metadata management, and adding additional context to the data. In the data fabric, managing the semantic layer is the focus. Use the semantic layer to represent critical corporate data and develop a common dialect for your data. A semantic layer in a data fabric project might map complex data into familiar business terms such as product, customer, or revenue to offer a unified, consolidated view of data across the organization. Pharmaceutical trials are a good example of where you might use data fabric, since the data from a trial comes from a combination of machines, reports, and other studies where the data has little accurate metadata to rely on. This data may be ‘sparse’ as well, meaning that a significant number of rows and columns are blank or null.