Given the new paradigm in enterprise computing that has unfolded with HANA and the power of in-memory technology as the platform for business, I have been thinking a lot about how we can more deeply and accurately measure the value of enterprise systems. I am also keenly aware of grand claims from some vendors, claims that don’t provide context or insight into actual value. Companies struggle to make sense of these claims and in the end are left to guess.
I believe there can be a new way to benchmark, one that gives an accurate picture of value and does away with the divides of the past. Limitations of databases and other information systems created the unnecessarily separate worlds of OLTP, OLAP, event-processing and unstructured data management systems. Unfortunately, benchmarks reflected these divides: each silo brought its own benchmark family, inherently legitimizing the divides that HANA now does away with.
Additionally, the new way to benchmark must follow the same thinking with regard to systems themselves – they must be timeless. That is, benchmarks should factor in that systems are very long-lived and extremely complex, that systems must absorb new innovation at a very rapid pace, that they must be real-time, that they must bring incredible new value, and that all of this must happen non-disruptively.
To be more precise, the real value in decision-making comes from our ability to efficiently take important decisions within their relevant time-window. With HANA, and with the massive shift in the software industry toward in-memory databases, we need to take a new look at how to define this. It is no longer purely about speed; it is about value.
To understand value, we can look at basic economics and weigh the benefits a system delivers against the costs it imposes.
In software, the benefits offered by a data-processing system are defined by five core abilities. These five distinct and orthogonal dimensions make up the benefits and costs. To perform highly, a data-processing system needs to push the boundaries of each dimension:
(1) going deep (the benefit of allowing unrestricted query complexity)
(2) going broad (the benefit of allowing unrestricted data volume and variety)
(3) in real-time (the benefit of including the most recent data into the analysis)
(4) within a given window of opportunity (the benefit of rapid response time)
(5) without pre-processing of data (the cost of data preparation)
A system that achieves high performance on all five of these dimensions would bring very high value to our decision-making. Traditional enterprise systems today make compromises on each of these dimensions due to fundamental technological or architectural limitations. This means we don’t get the detailed, up-to-date answers we seek at the precise moment we need them. Solving this has been our goal with HANA.
To measure value, we can take this one step further and define these things more quantitatively, with a value-metric. I believe we can follow basic physics and apply real-world values:
For our purposes, the value of a database system depends on its ability to serve data to applications when and where needed, as efficiently and cheaply as possible. We can measure the performance of an information system in this manner, as the ratio of a question’s “information distance” from its answer to the amount of time involved in obtaining the answer. The two variables in this ratio are built from the five basic constructs mentioned earlier. This simple framework gives us a new approach to quantifying value using concrete, real-world measurements.
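The ratio described above can be written compactly. What follows is only a sketch in LaTeX notation: the symbols P, D and T are labels assumed here for illustration and do not come from the original text, and how the five dimensions combine into D and T is left open. One plausible reading, which is an assumption, is that depth and breadth feed into D, while data freshness, response time and preparation effort feed into T.

```latex
% Assumed symbols (not from the original text):
%   P : performance, i.e. the value metric of the information system
%   D : "information distance" between a question and its answer
%   T : time involved in obtaining the answer
P = \frac{D}{T}
```

Read like speed in physics, a system scores higher when it covers more information distance in less time.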
While the “traditional” industry-standard performance benchmarks have been instrumental in enhancing RDBMS technologies over the last two decades, they do not match the new performance needs of the real-time enterprise. Further, there are now deep mathematical and highly scalable workloads in the mix that deserve a fair test. Still, we use “raw horsepower” to measure success, which is a major flaw in current benchmarking systems. It is no longer sufficient to compare DBMS platforms simply on their speed in going from point A to point B. Industry benchmarks must reflect the reality of mixed OLAP and OLTP workloads and the necessity for real-time data access across heterogeneous data sources. Companies depend on industry benchmarks to assess the suitability of new hardware and software capabilities, and they need more than a speed test on an artificial series of activities that they will never experience in the real world. Now is the time to move benchmarking to where the technology is and where it is going, for the benefit of our customers who depend on the benchmarks.
In essence, this new benchmark will reflect the real-world performance of DBMS products and illustrate the business value companies can realistically expect when those products are deployed into production. This OLEP (On Line Everything Processing) benchmark will establish the performance paradigm of the future. With in-memory data processing, the technology industry has created its own version of the electric car. Relying on the proverbial metric of miles per gallon is therefore outdated, and a new measure based on these comprehensive capabilities is needed.
Let us demonstrate the groundbreaking power of these complex IT systems and have them properly benchmarked, thus dispelling false claims. We have all done this before with TPC, and we look forward to doing it again by establishing a similarly vendor-agnostic method of benchmarking. HANA is inspiring everyone to follow her, and in this HANA-led world, it behooves us not to be bound by the limitations of yesteryear, but to demand that our systems deliver the value we need and deserve.
The massive shift to in-memory databases requires a new way of benchmarking systems. I believe it is time to challenge the status quo. Let us join together!