The Vexing Problem of Data Entropy

Chris Hallenbeck

Posted by Chris Hallenbeck on

SVP Database & Data Management


Imagine a library that houses a vast collection of books—centuries of print from across the globe.

Now imagine that this collection is housed not under one stately roof, but across dozens of smaller buildings spread out over thousands of miles, and instead of one centralized database that allows users to locate and access the books they need, each building maintains its own database with its own cataloging and storage logic. What started as a scholar’s dream—an abundance of useful information—quickly became a nightmare—none of it easily accessible, most of it just out of reach. Information without order can be crippling. That is the place where many companies now, unfortunately, find themselves—rich with information but hungry for order.

Data is the lifeblood of an enterprise. Without it, a business will wither and die. Businesses are accumulating more data than ever before, data that should allow them to be more efficient, more flexible, and more profitable; however, companies report quite the opposite is happening. This abundance of data brings not only infinite opportunities but also infinite challenges.

Without a comprehensive, secure, logical, and efficient data management system, a heart to keep the blood flowing to all corners of a business, essential data can become useless or, even worse, a drain on an enterprise as it struggles to access and interpret its mountains of increasingly complex and varied information. A traditional on-premise enterprise data warehouse used to be the defining solution for analyzing data. While never perfect, these systems covered the basics, such as Customer 360. But as data multiplied with the addition of new systems (departmental, personal, machine, mobile) and through the adoption of cloud solutions and other line-of-business uses, it quickly spread beyond the four physical walls of the organization, creating a modern data landscape that is difficult to manage.

Today’s data landscapes suffer from data entropy. Data gravitates toward its simplest and lowest-cost storage. Businesses are awash in information (CRM data, purchase history and entitlements, web click log data, and IoT data), but that information is increasingly decentralized, stored in various clouds and on-premise repositories. Companies want and need a consistent view of their operations, customers, suppliers, and partners. The data to create that view exists, but as it becomes more difficult to locate, companies feel that they are losing their ability to understand not only their customers but also their own businesses. This is the real cost of data entropy. Without a way to access its information efficiently and effectively, a company’s abundant data can become a liability.

Data entropy has an interesting property: the less value-dense the data is, the further it sits from the enterprise. The most value-dense data is largely still on-premise or on a cloud with a dedicated network connection back to the enterprise. Click log data, which encodes how and what customers look for online and in their mobile applications, is locked in cloud block-storage systems. IoT data, which encodes how customers use products, is tied up in distributed databases and/or in thousands of edge databases.

 

Crippling decentralization of structured information points to a losing battle against data entropy. Given the sheer complexity and volume of data that businesses now have access to, the notion of an enterprise data warehouse that can easily store top-line aggregate data for analysis has been rendered moot; companies know they can’t go back to a 100% on-premise data management system. Some vendors claim to offer an easy solution: “bring all of your data to our cloud, and we’ll provide the insight.” As appealing as this one-stop cloud solution may sound in theory, it is not tenable in practice. Bandwidth between clouds is slow and outbound data movement is expensive, making this approach non-viable for increasingly real-time and cost-conscious IT departments.

Solving the problem of data entropy calls for a solution that makes vast amounts of data efficient to use and easy to analyze. Organizations must be able to determine the location of the information they need at any given time, all while moving the smallest possible amount of data between sources. A modern data management solution is needed for companies to regain a firm grasp of their customers and their business, one that is enriched rather than crippled by an abundance of data.

Stay tuned to this blog series as we unveil how SAP tackles the problem of data entropy.

Next blog in series: The Challenge of Data Sprawl: Accessibility and Quality

Learn how today’s leading businesses tackle their data management challenges by joining us in Orlando at SAPPHIRE NOW 2018.

 

Join in the conversation #SAPPHIRENOW
