How a Data Mesh Works

Our customers always come to us with a problem… 

“I want to get data from the factory floor to my cloud application”

“My customer wants to run my application on-premise, but we don’t deploy our applications onsite.”

“My data is scattered across multiple data centers and I can’t replicate it centrally”

I could go on and on with the number of ways these stories are told. But they all have one thing in common: the separation of data from the application that relies on it.

In each of these scenarios, the data must remain logically and physically separated from the application, but the application needs to consume this data from sources that it has limited to no control over. 

And while there may be differences in architectures and use cases for these connections, they do share similarities in the challenges they face.

  • Deploying and managing persistent connectivity is complex
  • The availability of a real-time connection must be guaranteed
  • The data’s security and compliance requirements must be met

On the surface, these may seem like networking-related troubles, but the challenges go beyond simply ‘networking’; they require a shift in approach. The realities of hybrid cloud environments (especially when multiple data owners are involved) have forced a rethinking of application architectures, connectivity models and security postures so that they are addressed together rather than one at a time.

Solving these problems requires expertise across multiple disciplines. The concept of a Data Mesh has emerged as a way to address these complexities. By creating a layer of connectivity that also provides centralized management features, a data mesh brings these disciplines together in a way that eliminates the need for highly specialized experts to build and maintain this new data fabric.

Data Meshes are created by combining software-defined networking with data-centric deployment, management, and automation tools. Stitching these together creates a connective tissue between multiple applications and data sources, a job that previously meant managing a variety of point solutions with lots of manual labor.

By creating a layer of secure connectivity integrated with automated management capabilities, users and applications are able to consume data freely without worrying about where the data lives or how the transit is occurring.

How does a data mesh actually work?

While we will save the technical details for another blog post, an overarching view of a data mesh looks like this:

[Figure: data mesh platform]

On the left, you see examples of the types of data sources that a data mesh connects. A data mesh is data source agnostic. This means that the source or schema of the data does not matter. The data source is connected via an adjacent hardware or software node and, if needed, normalized by a connector or adaptor (usually running in a container) on the node. 
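To make the connector idea concrete, here is a minimal Python sketch of how an adaptor on a node might normalize records from differently shaped sources into one common form. Everything in it is an assumption for illustration: the `Reading` type, the `plc` and `sql` source types, and the field names are hypothetical and are not taken from any particular data mesh product.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Reading:
    """Hypothetical common record shape produced by every adaptor."""
    source: str
    key: str
    value: float


def from_plc(raw: Dict[str, Any]) -> Reading:
    # Assumed factory-floor payload: {"tag": ..., "val": ...}
    return Reading(source="plc", key=raw["tag"], value=float(raw["val"]))


def from_sql_row(raw: Dict[str, Any]) -> Reading:
    # Assumed database row: {"sensor_name": ..., "reading": ...}
    return Reading(source="sql", key=raw["sensor_name"], value=float(raw["reading"]))


# One adaptor per source type; supporting a new source means adding an entry here.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], Reading]] = {
    "plc": from_plc,
    "sql": from_sql_row,
}


def normalize(source_type: str, raw: Dict[str, Any]) -> Reading:
    """Convert a source-specific record into the common Reading shape."""
    try:
        adapter = ADAPTERS[source_type]
    except KeyError:
        raise ValueError(f"no adapter registered for source type {source_type!r}") from None
    return adapter(raw)
```

In this sketch, an application on the other side of the mesh only ever sees `Reading` objects, which is one way to read "the source or schema of the data does not matter."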

That data is then connected via a standard internet connection to a virtual point of aggregation. Encryption, access control and other means of security can be applied to these connections to ensure that sensitive data is never exposed to anything other than an approved application.

The right side of the graphic shows the variety of applications that require real-time access to this data. The goal is to create a standardized and virtually centralized pool of data that these applications can access without additional network configuration.

This is done by creating a virtual data warehouse from the distributed data sources. Once it is created, any number of applications can access the data, provided they have been granted access. Access, change, and policy management are controlled centrally to ensure that only the appropriate users and applications are granted permission. Once connected, the flows between data and application happen on-demand, via private encrypted connections.
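The centrally controlled access described above can be sketched as a default-deny policy that is consulted before any flow is opened. This is an illustrative assumption, not a description of how any specific platform implements it: the `AccessPolicy` class, the `open_flow` function, and the application and source names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class AccessPolicy:
    """Hypothetical central policy: a set of (application, data source) grants."""
    _grants: Set[Tuple[str, str]] = field(default_factory=set)

    def grant(self, app: str, source: str) -> None:
        self._grants.add((app, source))

    def revoke(self, app: str, source: str) -> None:
        # discard() is a no-op if the grant does not exist
        self._grants.discard((app, source))

    def is_allowed(self, app: str, source: str) -> bool:
        # Default deny: anything not explicitly granted is refused
        return (app, source) in self._grants


def open_flow(policy: AccessPolicy, app: str, source: str) -> str:
    """Return a flow identifier only if the central policy permits the pairing."""
    if not policy.is_allowed(app, source):
        raise PermissionError(f"{app!r} is not approved to read from {source!r}")
    # A real system would establish the private encrypted connection here.
    return f"{app}->{source}"
```

Because the grants live in one place, turning a connection on or off is a single `grant` or `revoke` call in this sketch rather than a change at each data source.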

In summary, data and applications are allowed to remain where they make the most sense. Administrative controls are applied centrally and engineering or DevOps teams are given the freedom to deploy new connections or turn off connections as needed without relying on internally managed infrastructure.

Learn more about Trustgrid’s Data Mesh Platform