Yesterday, Lenovo released a new record for the SAP BW-EML Benchmark and I thought that some might be interested in the purpose, history and details behind the benchmark.
The SAP HANA platform was designed to be a data platform on which to build the business applications of the future. One of the interesting impacts of this is that the benchmarks of the past (e.g. Sales/Distribution) were not the right metric by which to measure SAP HANA.
As a result, in 2012, SAP went ahead and created a new benchmark, the BW Enhanced Mixed Workload, or BW-EML for short. The BW-EML benchmark was designed to take into account the changing direction of data warehouses – a move towards more real-time data, and ad-hoc reporting capabilities.
BW-EML is quite a straightforward benchmark, and quite elegant in some ways. It was designed to meet the data warehousing requirements of customers in 2012, and there were two key goals:
The SAP BW-EML Benchmark looked to achieve these goals with the following constructs:
The benchmark result is the total number of query navigations run over 60 minutes. What’s nice about BW-EML is that whilst the queries are random, they are chosen with groups of cardinality, so the variance in runtime has been shown to be <1%.
Anything a customer might do is fair game: any supported configuration and platform, plus indexes, aggregates, and any other performance constructs.
There are pros and cons to this, but it drives a behavior of irrational optimization in many benchmarks. CPU manufacturers have been known to tune microprocessors to run SPEC benchmarks faster. Database vendors create parameters to turn off important functionality, to gain a few extra points in TPC-C.
One thing that surprised me on the SAP HANA database is that the configuration used by published results is basically the stock installation. What’s more, there are no performance constructs like additional indexes or aggregates in use.
From a benchmarking perspective that’s insignificant (benchmarkers routinely spend months tuning a database), but it is hugely significant from a customer DBA and TCO perspective.
Getting back to the benchmark, the Scale Factor is an important point of note. BW-EML is like TPC-H, and runs at a factor. The minimum factor is 500m (50m per object), and this grows to 1bn (100m per object), 2bn (200m per object) and beyond.
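The relationship between the scale factor and the per-object row counts quoted above can be sketched in a few lines. This is only an illustration: the figure of 10 objects is my inference from the 500m / 50m ratio, and the function name is made up for this example.

```python
# Assumption: the total row count is spread evenly over 10 objects,
# inferred from the quoted figures (500m total, 50m per object).
OBJECTS_PER_BENCHMARK = 10


def rows_per_object(scale_factor_rows: int, objects: int = OBJECTS_PER_BENCHMARK) -> int:
    """Return the number of rows each object holds at a given scale factor."""
    return scale_factor_rows // objects


# Print the per-object row count for the scale factors mentioned in the text.
for scale_factor in (500_000_000, 1_000_000_000, 2_000_000_000):
    print(f"{scale_factor:>13,} total rows -> {rows_per_object(scale_factor):>11,} rows per object")
```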
Caution: as a general rule, benchmark results with different Scale Factors cannot be compared! SAP HANA results are the exception, because HANA performs linearly with respect to data volumes. Therefore you can safely assume that if you get 200k navigational steps with the 1bn scale factor, you will get ~100k steps with the 2bn scale factor.
But you cannot apply this logic with other databases, because you cannot assume linearity of performance. For instance there is an IBM i-Series result at 500m scale factor, and a SAP HANA result at 1bn scale factor. These results are not directly comparable – who knows whether i-Series will be linear.
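For anyone who wants to play with the numbers, here is a minimal sketch of that linear projection. It assumes throughput is inversely proportional to data volume, which is the behavior described above for SAP HANA and is not something you can assume for other databases. The function name is hypothetical.

```python
def project_steps(measured_steps: float, measured_scale: float, target_scale: float) -> float:
    """Project navigation steps per hour to another scale factor.

    Assumption: steps per hour fall in inverse proportion to data volume.
    Do not apply this to a database whose scaling behavior is unknown.
    """
    if measured_scale <= 0 or target_scale <= 0:
        raise ValueError("scale factors must be positive")
    return measured_steps * measured_scale / target_scale


# The example from the text: 200k steps at 1bn projected to the 2bn scale factor.
print(project_steps(200_000, 1e9, 2e9))  # 100000.0
```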
Lenovo’s new result is a world record in BW-EML, with > 1.5m navigational steps per hour at the 1bn scale factor. That’s an incredible 417 sustained queries/second, benchmarked over an hour.
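The per-second figure is simple arithmetic on the hourly result. A quick check, using 1.5m as a round lower bound (the published result is higher than that):

```python
SECONDS_PER_HOUR = 3600


def steps_per_second(steps_per_hour: float) -> float:
    """Convert an hourly navigation-step count into a sustained per-second rate."""
    return steps_per_hour / SECONDS_PER_HOUR


# 1.5m steps per hour works out to roughly 417 per second.
print(round(steps_per_second(1_500_000)))  # 417
```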
Even more impressive is that it beats the previous highest single-node result by >10x, despite only having 7x the hardware. The remainder of the improvement comes from innovations in BW 7.4 SPS09 and SAP HANA SPS09, which is equally impressive.
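To put a rough number on the software contribution, you can divide the overall speedup by the hardware ratio. This is a back-of-the-envelope sketch that assumes throughput would otherwise scale linearly with hardware, and it uses the rounded 10x and 7x figures, so treat the output as indicative only.

```python
def residual_gain(total_speedup: float, hardware_ratio: float) -> float:
    """Return the speedup left over after accounting for extra hardware.

    Assumption: with no software changes, throughput scales linearly
    with the amount of hardware.
    """
    if hardware_ratio <= 0:
        raise ValueError("hardware ratio must be positive")
    return total_speedup / hardware_ratio


# A 10x result on 7x the hardware leaves about 1.43x for software improvements.
print(f"{residual_gain(10, 7):.2f}x")  # 1.43x
```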
It’s worth noting that the larger scale-factors have not yet been run for BW-EML. For instance, many of my customers have typical DSOs of 1bn+ rows each, which means a scale-factor of 10bn would provide very interesting results. If the linear scaling holds, the 1.5m result at 1bn implies around 150k navigation steps per hour at 10bn, which would be deeply impressive on a large-scale data warehouse.
Some benchmarks (e.g. TPC-H) require the hardware cost to be listed as part of the benchmark submission. That’s a pretty important metric, but not one which BW-EML requires.
A rule of thumb for BW on HANA hardware is that $100k/TB is a good target price. For other databases, the hardware price will no doubt vary significantly.
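As a sketch of how that rule of thumb turns into a budget figure, the snippet below multiplies system size by the $100k/TB target. The 2 TB system size is purely a made-up example, not a figure from any benchmark submission.

```python
TARGET_USD_PER_TB = 100_000  # rule-of-thumb target price for BW on HANA hardware


def hardware_budget(size_tb: float, usd_per_tb: float = TARGET_USD_PER_TB) -> float:
    """Estimate a target hardware price from system size in TB."""
    if size_tb < 0:
        raise ValueError("size must not be negative")
    return size_tb * usd_per_tb


# Hypothetical 2 TB system, chosen only for illustration.
print(f"${hardware_budget(2):,.0f}")  # $200,000
```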
The fascinating thing about benchmarks is that they become out of date as technology and business needs change.
In the market right now there is a clear move towards hybrid-workload or OLTAP systems in the Enterprise, which combine transactional processing and analytic processing. This represents a challenge though because SAP’s strategy for this is HANA-only, and there are two orthogonal goals to benchmarking.
The first is to size systems. Given that the future of SAP Business Applications is on S/4HANA, I expect that there will be a S/4HANA S4-2tier benchmark, like the SD-2tier that came before it. This will allow customers to use the SAP QuickSizer to estimate the size of SAP HANA required. Let’s call it a unit of S4PS, which will relate to on-premise deployments and cloud deployments. In all likelihood it will be mixed-workload with analytics and core transactions, driven by the SAP Gateway API layer.
The second is to perform cross-vendor bake-offs, either for hardware or for the database layer. Customers really want to benchmark HANA against other databases, or even to know which HANA appliance is faster. Given that the future of SAP applications is HANA-only, it seems likely that the BW-EML benchmark will be the last cross-vendor benchmark.
The BW-EML benchmark isn’t perfect. It doesn’t have enough result submissions, especially at scale (I hope the incredible Lenovo result will cause other vendors to submit). It’s also really quite hard to run and requires both benchmarking and SAP BW expertise.
There are also no certified BW-EML benchmark results from Oracle, IBM DB2 BLU or Microsoft SQL Server. One might assume this is because they can’t get good results, but that is conjecture.
Hopefully in the future those vendors will make submissions, so that customers can understand the relative performance of those systems against SAP HANA. And because the SAP Benchmark Council has a full disclosure policy on all benchmark methodology, customers would also see the amount of tuning effort required to get good performance.