Cloud and the Data Gravity Problem

The term data gravity, coined by GE engineer Dave McCrory in a 2010 blog post, refers to the ability of large amounts of data to attract applications, services, and other data. In this context, the bigger the data, the more applications and services are drawn to it.

Similar to the law of gravity, which states that the attraction between objects is directly proportional to their mass, the bigger or more critical a data set is, the more it tends to stay in place.

It is this force that is causing trouble for cloud adoption in the enterprise.

While applications and services have gravity of their own, it is not as strong as the pull of large data sets. So even as modern applications begin to create their own gravity in the cloud, their adoption can be impeded when they rely on data that refuses to move from the data center.


New data-intensive applications like analytics, machine learning, and IoT depend on large amounts of data that often reside where they cannot be easily centralized to the cloud. This challenge requires new application architecture considerations that revolve around overcoming data gravity.

Let’s look at a few examples:

  • Digital banking applications – fintech applications that serve banks and credit unions require real-time access to core banking data held in an on-premises data center. As these applications move to the cloud, they must still rely on core banking systems that remain in the customer’s data center.
  • Analytics platforms – platforms such as Splunk, Hadoop, and TensorFlow want to own the data. Data migration has historically been a precursor to running analytics: data lakes must be built near the application, which introduces complexity for some applications and data sets.
  • Industrial IoT (IIoT) – data from sensors and other systems on a factory floor needs to be analyzed in real time, but it is produced at a rate that makes it uneconomical to aggregate in the cloud for machine learning and analytics applications.

In each of these examples, cloud software providers are faced with two choices:

  1. Avoid the challenge and limit the addressable market for their cloud applications by not working with companies that must keep their data in place.
  2. Brute-force a solution to the problem with cumbersome networking solutions and dedicated networking experts, an approach that grows in cost and complexity with each deployment.

These options could be characterized as fight-or-flight responses: flee, leaving part of the market unaddressed, or fight through the pain and hope to charge the customer enough to make it worthwhile. Neither is desirable, as cloud application providers seek to scale their installed base with as little friction and cost as possible.

Introducing Data Anti-Gravity

While data gravity works to keep data in place, the answer for the opposing pull of cloud applications is a solution that provides Data Anti-Gravity.

Data Mesh technologies have emerged to invert the centralized data lake model and create a layer of secure connectivity between data and applications that was not previously available. A Data Mesh acts as Data Anti-Gravity by creating a data fabric that forms persistent, real-time connections between disparate data sources and the cloud applications that rely on them.

With a Data Mesh, accessing data behind a firewall becomes as easy as having the data in the same data center as the application. The mesh creates a seamless layer of connectivity that allows an application to consume data in real time and on demand while securing it in transit. Monitoring, access controls, and policy enforcement are all handled centrally, while the data can be managed as multi-tenant or pooled as a single virtual data warehouse. This means that data is free to live wherever it chooses.
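To make this concrete, here is a minimal sketch of what that can look like from the application's point of view. It assumes a hypothetical mesh node that publishes an on-premises core banking database at a local endpoint; the hostname, database name, table, and credentials below are illustrative assumptions, not Trustgrid's actual API. The point is that the cloud application writes ordinary local database access code, while encryption, access control, and policy enforcement are handled by the mesh underneath.

```python
# Illustrative sketch only: assumes a data mesh node publishes the on-premises
# core banking database at a local endpoint ("mesh-gateway.internal:5432").
# Hostname, database, table, and credentials are hypothetical examples.
import psycopg2  # standard PostgreSQL driver; any database driver works the same way


def fetch_account_balance(account_id: str):
    # The cloud application connects exactly as if the database lived in the
    # same data center; the mesh provides the secure path back on premises.
    conn = psycopg2.connect(
        host="mesh-gateway.internal",  # local endpoint published by the mesh node
        port=5432,
        dbname="core_banking",
        user="fintech_app",
        password="example-only",       # in practice, injected from a secrets manager
    )
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT balance FROM accounts WHERE account_id = %s",
                (account_id,),
            )
            row = cur.fetchone()
            return row[0] if row else None
    finally:
        conn.close()
```

From the application's perspective, nothing in this code distinguishes the on-premises database from one sitting in its own cloud environment, and the data itself never had to be migrated.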

What results is a hybrid cloud environment that allows data and applications to reside where they are best suited and gives application providers the freedom to stop managing the complexities of connectivity.

In the past, all of a company’s data was created and used behind its corporate firewall, so the data warehouse, applications, and administrators also lived within the company’s walls. With the rise of the cloud, all of this has changed, and the gravity of these once-centralized systems is being challenged. Data Anti-Gravity solutions like Data Meshes have arrived to ensure peaceful coexistence between these two technical forces of nature.

Learn more about Trustgrid’s Data Mesh Platform