By Vishal Sikka
At a recent public webinar, a competitor showed their lack of understanding of SAP HANA and passed this limited and inaccurate picture on to industry and financial analysts. At SAP, we normally don’t respond to this type of thing. But seeing this drama unfold got me thinking – we must be doing something right! I mean, could a leading and distinguished builder of databases be so fundamentally unaware of next-generation database technology? Or are they simply unwilling to acknowledge the reality, and therefore resorting to sharing misinformation?
The HANA team is a group of people who share a passion for three things: making HANA customers successful, having fun, and questioning the status quo. In this spirit, on behalf of the SAP HANA team, I’d like to set the record straight on some inaccurate info from our competition and have some fun while doing so.
The SAP HANA Advantage
I want to start by saying (a) the comparisons to yesterday’s technological approaches miss the point completely, (b) HANA customers are enjoying phenomenal benefits of modern technology without disruption, and (c) HANA represents a next generation in enterprise computing, especially in database technology. It is a modern data platform for real-time analytics and applications. It enables organizations to analyze business operations based on large volumes and a wide variety of detailed data in real time, as it happens, eliminating the latency and layers between OLTP and OLAP systems for “real” real-time. The HANA Advantage is a tightly integrated system with different components that are fully transactional and well integrated into the system optimizer. Scale up and scale out work seamlessly for all components such as OLTP, OLAP (operational as well as warehouse operations), text, planning and pure application development. It allows easy deployment with no server zoo, no internal replication, no materialized aggregates and no stack of engines!
In my analysis below, I will try to be precise in communicating all this, and I will update this blog regularly to continuously provide the truth on SAP HANA. Here are some of the corrections.
#1: The Future Design Center for Databases
Wrong: “In-Memory DBMS will not replace many or all relational DBMS”.
Right: In-Memory DBMS is a future design center for databases. It is already replacing large parts of the database market – especially in analytics, planning, simulations and real-time applications (e.g. in gaming markets). It is based on sound research in academia and designed to support OLAP and OLTP. It’s already prevalent in markets where performance is key and will transform the enterprise markets in the same way for cloud as well as transactional apps.
#2: Role of the Database Platform
Wrong: In-memory database can only do a few things such as MOLAP, Operational reporting, Query & Analysis, Planning & Budgeting, Unstructured Information Discovery.
Right: SAP HANA is a general purpose in-memory database platform – bringing fresh data, capturing transactions with full ACID compliance, analyzing them as they happen, doing in-database processing, pushing down business, predictive and planning logic into the database and even serving clients such as analytics, cloud and mobile applications. It has mainstream application far beyond the niche use cases being described for an in-memory database. You don’t need to add multiple technologies and duplicate engines into one box for different uses (e.g. Endeca – text, Essbase – planning, TimesTen – caching, analytics).
#3: Scale Out with Growing Data Volumes & NUMA Support
Wrong: SAP HANA has limited support for data marts and data warehouses and can’t scale-out.
Right: SAP HANA can scale out to an unlimited number of cores/nodes and hardware prices continue to fall. At our SAPPHIRE NOW conference in May last year, Hasso already showed 32 nodes /1000+ cores running SAP HANA (at 13.45 minutes into his speech). Incidentally we have 3 such 1000+ core systems now running around the world. We have live HANA customers on multi-node scale-out systems and several partners, including IBM and HP, are already making scale-out appliances.
Also, our recent 100TB benchmark runs on 16 nodes of IBM’s X5 servers each with ½ TB of main memory and processes 100TB of BW data in 300ms-500ms for operational reporting scenarios and 800ms-2s for ad-hoc analytical queries. In addition, data can be swapped out with standard HANA mechanisms such as aging criteria. NUMA architecture is supported in HANA. By contrast, public documentation highlights a 1 TB limit on Oracle Exalytics, and they have publicly said a significant portion of this is used for working memory for all the products they have put together (Essbase+Endeca+TimesTen).
#4: OLTP & OLAP
Wrong: There is “limited write performance” with SAP HANA.
Right: SAP HANA is a single foundation for OLTP+OLAP on one hardware and one operating system; it scales up and out (from a mini to 1000+ cores across multiple nodes) and it dynamically adjusts to workloads. We are the only in-memory database with inserts on a columnar store with high write performance AND high analytical performance. This is a key differentiator for SAP HANA.
#5: Stores – Row, Column, and Text
Wrong: SAP HANA has “no unstructured data support” and does not provide “row and column compression”.
Right: SAP HANA has row, column and text stores in one database and it natively supports unstructured data. Furthermore, these are integrated and thus simplify transactional and analytic operations across all the stores. In fact, SAP HANA’s foundation was in unstructured search. It handles standard search and text mining as well as text-like search on structured data. With Inxight technology, linguistic features such as tagging, feature extraction, entity extraction and sentiment analysis will also be included in SAP HANA. Inxight is the best text analysis software in the market.
SAP HANA supports heavy compression in the column store. Heavy compression is not required for the row store, because it is used only as a buffer for the column store and for tables where compression is irrelevant. The benefit of SAP HANA is the intelligent integration with the application stack, which makes row store compression irrelevant.
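To make the buffer idea concrete, here is a minimal toy sketch in Python. It is not HANA's implementation: the class name `BufferedColumn`, the merge threshold, and the use of dictionary encoding as the compression scheme are all assumptions chosen for illustration. New values land in an uncompressed write buffer, and a periodic merge folds them into a compressed, read-optimized main part.

```python
class BufferedColumn:
    """Toy column (hypothetical): appends go to a buffer, then merge into a compressed main part."""

    def __init__(self, merge_threshold=3):
        self.dictionary = []   # sorted distinct values of the main part
        self.main = []         # integer codes pointing into self.dictionary
        self.buffer = []       # recent inserts, kept uncompressed for cheap writes
        self.merge_threshold = merge_threshold  # assumed tuning knob

    def insert(self, value):
        """Cheap write: append to the buffer, merge once it grows large enough."""
        self.buffer.append(value)
        if len(self.buffer) >= self.merge_threshold:
            self.merge()

    def merge(self):
        """Rebuild the dictionary and re-encode main part plus buffered values."""
        values = [self.dictionary[c] for c in self.main] + self.buffer
        self.dictionary = sorted(set(values))
        index = {v: i for i, v in enumerate(self.dictionary)}
        self.main = [index[v] for v in values]
        self.buffer = []

    def scan(self):
        """Readers see the compressed main part followed by the fresh buffer."""
        return [self.dictionary[c] for c in self.main] + list(self.buffer)


col = BufferedColumn(merge_threshold=3)
for status in ["open", "paid", "open", "open"]:
    col.insert(status)
```

After the four inserts above, three values have been merged into the encoded main part and the fourth is still waiting in the buffer, yet `col.scan()` returns all four in insertion order.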
#6: Data Caching & Query Optimization
Wrong: Both SAP HANA and TimesTen do data caching.
Right: Previous generation databases use caching to improve performance. HANA is a pure in-memory DB based on a new architectural paradigm. Since the entire database is in memory, you don’t cache data in HANA. SAP HANA has a world-class query optimizer that natively enables massively parallel query execution, including inter- and intra-operator parallelism.
#7: ACID Properties and Transactional Integrity
Wrong: SAP HANA does not have “transactional integrity/correctness and lacks multi-version concurrency (MVCC)”.
Right: SAP HANA is fully ACID compliant; we use permanent storage systems for backup and persistence. It fully supports MVCC, with standard capabilities such as statement-level and snapshot isolation.
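For readers less familiar with MVCC, the idea behind snapshot isolation can be sketched in a few lines of Python. This is a generic teaching toy, not HANA code: `MVCCStore`, `Version` and the integer commit clock are hypothetical names. Writers add new versions instead of overwriting, and each reader sees only versions committed at or before its snapshot.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    value: object
    commit_ts: int  # logical time at which this version became visible


@dataclass
class MVCCStore:
    """Toy multi-version store (hypothetical): every key keeps all committed versions."""
    clock: int = 0
    versions: dict = field(default_factory=dict)

    def write(self, key, value):
        """Commit a new version; older versions remain readable by older snapshots."""
        self.clock += 1
        self.versions.setdefault(key, []).append(Version(value, self.clock))
        return self.clock

    def snapshot(self):
        """Return the timestamp that defines a reader's consistent view."""
        return self.clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        visible = [v for v in self.versions.get(key, []) if v.commit_ts <= snapshot_ts]
        return visible[-1].value if visible else None


store = MVCCStore()
store.write("credit_limit", 1000)
snap = store.snapshot()             # a long-running report starts here
store.write("credit_limit", 2500)   # a concurrent transaction commits

old_view = store.read("credit_limit", snap)              # still sees 1000
new_view = store.read("credit_limit", store.snapshot())  # sees 2500
```

In this toy, statement-level isolation would correspond to taking a fresh `snapshot()` before every statement, while snapshot isolation keeps one snapshot for the whole transaction.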
#8: Aggregates and Materialized Views
Wrong: You need materialized views of aggregated data for high performance.
Right: Another Ha! Electric cars don’t need spark plugs. On-the-fly aggregates on detailed data held in memory are much higher in performance. Aggregates are outdated technology now, as they require a lot of effort to create, store redundantly and keep in sync with changes. SAP HANA does not need indices for performance like traditional databases; the whole in-memory database, across all dimensions of the data set, itself acts like an index.
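A small Python sketch shows what computing aggregates on the fly means in practice. The column names and figures are invented for illustration and the code says nothing about HANA internals: the detailed data is held column-wise in memory, and any grouping or filter is answered by a scan at query time rather than from a precomputed table.

```python
from collections import defaultdict

# Detailed data held column-wise: one list per column, aligned by position.
# All values are made up for illustration.
region = ["EMEA", "APJ", "EMEA", "AMER", "APJ", "EMEA"]
product = ["A", "A", "B", "A", "B", "A"]
revenue = [100, 250, 75, 300, 125, 50]


def aggregate(group_col, measure_col, predicate=None):
    """Sum measure_col grouped by group_col, computed at query time in one scan."""
    totals = defaultdict(int)
    for i, (group, measure) in enumerate(zip(group_col, measure_col)):
        if predicate is None or predicate(i):
            totals[group] += measure
    return dict(totals)


# Three different "cubes" from the same detail rows, none of them stored.
by_region = aggregate(region, revenue)
by_product = aggregate(product, revenue)
emea_by_product = aggregate(product, revenue, predicate=lambda i: region[i] == "EMEA")
```

Because nothing is materialized, a new row appended to the three columns shows up in every aggregate on the next query, with no redundant copy to refresh.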
#9: Business Intelligence Clients
Wrong: SAP HANA provides limited support to few BI clients.
Right: SAP BusinessObjects is optimized to run on SAP HANA. In addition, numerous third-party clients can be used today (e.g. Tableau, Tibco Spotfire) and we will continue being completely open to third-party BI clients on SAP HANA.
#10: Planning Applications and Analytical Functions
Wrong: SAP HANA has limited support for planning and budgeting applications.
Right: SAP HANA provides complete support for planning applications; many SAP Enterprise Performance Management applications run on SAP HANA. SAP HANA has native planning support inside the DB with the planning engine. Operators like disaggregation, copy and others are part of the relational algebra inside SAP HANA. Additionally, we support the SAP planning language FOX natively inside the DB.
Planning is a huge argument for SAP HANA, not the other way around. SAP HANA does not need the standard cube operations, because we calculate on the fly. SAP HANA includes major analytic and business functions such as math functions, currency conversion, unit conversion, exceptional aggregation, time series analysis, hierarchy handling and predictive functions in its library, and has extended support to other libraries. With SAP BW on HANA, we don’t have layers; we push planning calculations down to the HANA DB, delivering high performance. SAP HANA will support all transactional apps, business warehouse, BI and all cloud apps of SAP. We also have third-party partners developing planning apps built natively on SAP HANA.
#11: Operational Reporting & Data Sources
Wrong: SAP has limited operational reporting capabilities due to “limited Data Sources” with its replication and ETL technologies.
Right: SAP has a solution extremely well suited to real-time operational reporting from multiple sources (e.g. SAP CO-PA Accelerator); many of them are non-SAP application data sources. SAP Data Services and SAP Sybase Replication Server are market-leading ETL and replication technologies for bringing in data from non-SAP and SAP data sources. HANA has an extremely high insert rate for bulk inserts due to massive parallelism. It supports all data sources and has been tested to 2TB+/hour data movement into SAP HANA.
Wrong: “SAP HANA is 5-50X more expensive than Exalytics.” 1 TB HANA H/W would cost $362K, and SAP HANA software would cost $3.75 M.
Right: For 1 TB we expect H/W to cost $40-$60K (not $362K) and software to also be dramatically less expensive than touted here. Also, SAP HANA is available at price points for different market segments.
It ranges from HANA edge for small businesses with appropriate prices (e.g. $12K for a single node H/W + $2K for HANA for SAP B1 Analytics on HANA) to very large scenarios of greater than 100TB of memory. Customers can also buy SAP HANA for an app (BW, BPC, CO-PA, Smart Meter Analytics, etc.) or for data marts and data warehousing. BW on HANA for data warehouse sizes of 40 TB is very competitive. Also, HANA pricing is inclusive of everything customers need – test, development and QA environments and support. There is no need to buy other software for data loading and movement, storage acceleration (e.g. Exadata) etc. Considering all this – for a 512 GB usage configuration, SAP HANA total cost is less than roughly 50% of the cost of competitor products.
#13: Standards and Openness
Wrong: “HANA only works with SAP tools, and has limited or non-standard SQL”.
Right: A significant percentage of customers use HANA for non-SAP data and use cases – it is for both SAP and non-SAP application use cases. It works on standard SQL and MDX and has standard interfaces for any application. It’s open across every layer:
- Open choice of H/W vendor, bringing new chip-level innovations to market ahead of the competition
- Open choice of BI clients
- Open to all applications and platforms
- There are hundreds of custom (non-SAP) applications under development on SAP HANA
For example, Oracle Apps and Oracle BI run without any changes on SAP HANA. Existing stored procedures in Oracle are translated into SQLScript for IP reasons, which shows the complete openness of SAP HANA.
#14: SAP HANA on disk
Wrong: SAP HANA does not support data stored on disk.
Right: SAP HANA supports data stored on disk through prioritization techniques such as Least Recently Used (LRU). SAP HANA can keep relevant data in-memory, and data from disk can be loaded on request.
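As a rough illustration of Least Recently Used prioritization, here is a minimal Python sketch. The class `ColumnResidency`, its capacity of two columns, and the `disk` dictionary standing in for disk storage are all hypothetical, and this is not a description of how HANA actually manages memory. Columns are loaded on request, and when memory is full the column that has gone unused the longest is dropped.

```python
from collections import OrderedDict


class ColumnResidency:
    """Toy LRU policy (hypothetical): keep at most `capacity` columns in memory."""

    def __init__(self, capacity, load_from_disk):
        self.capacity = capacity
        self.load_from_disk = load_from_disk  # assumed loader callback
        self.in_memory = OrderedDict()        # ordered from least to most recently used
        self.disk_loads = 0

    def get(self, column_name):
        if column_name in self.in_memory:
            self.in_memory.move_to_end(column_name)  # mark as most recently used
            return self.in_memory[column_name]
        data = self.load_from_disk(column_name)      # load on request
        self.disk_loads += 1
        self.in_memory[column_name] = data
        if len(self.in_memory) > self.capacity:
            self.in_memory.popitem(last=False)       # drop the least recently used column
        return data


# A plain dict stands in for disk storage in this sketch.
disk = {"orders": [1, 2, 3], "customers": [4, 5], "archive_2009": [6]}
memory = ColumnResidency(capacity=2, load_from_disk=disk.__getitem__)

memory.get("orders")
memory.get("customers")
memory.get("orders")         # already in memory, no disk access
memory.get("archive_2009")   # memory is full, so "customers" is dropped
resident = list(memory.in_memory)
```

The frequently used `orders` column stays in memory throughout, while rarely touched data is only paid for when someone actually asks for it.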
#15: Query Speed
Wrong: SAP HANA does not execute queries faster than other databases.
Right: SAP HANA keeps all data in the column store in integer format and is optimized to take advantage of the latest Intel CPU innovations, such as vector operations. SAP HANA’s next-generation architecture and chip-level innovations make it faster than any of the competing databases in the market. For example, we have 4 customers who have crossed a 100,000X improvement in the speed of a business process with SAP HANA. The leader of the pack is MKI, showing a 408,000X improvement on retail/logistics data analysis.
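One common way to keep column data in integer format is dictionary encoding, and the Python sketch below assumes that technique purely for illustration; the function names and the sample city values are hypothetical. Each distinct value gets a small integer ID, so a filter becomes a comparison of integers, which is the kind of tight loop that vector instructions on real hardware can process many values at a time.

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer ID; return (dictionary, codes)."""
    dictionary = sorted(set(values))
    index = {value: code for code, value in enumerate(dictionary)}
    codes = [index[value] for value in values]
    return dictionary, codes


def count_equal(dictionary, codes, needle):
    """Count rows equal to needle by comparing integers rather than strings."""
    try:
        code = dictionary.index(needle)  # one lookup in the small dictionary
    except ValueError:
        return 0                         # value never occurs in the column
    return sum(1 for c in codes if c == code)


city = ["Walldorf", "Palo Alto", "Walldorf", "Bangalore", "Palo Alto", "Walldorf"]
dictionary, codes = dictionary_encode(city)
walldorf_rows = count_equal(dictionary, codes, "Walldorf")
```

The pure-Python loop here is of course not vectorized; the point is only that after encoding, the scan touches nothing but integers.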
#16: SAP HANA is slower than Exadata & Exalytics
Wrong: “SAP HANA does not run faster than Exadata, leave alone Exalytics”.
Right: In one example in a customer’s infrastructure, SAP HANA was 15,000X faster than Oracle Exadata on the credit check and credit limit verification business process, on the same data and query. Compare this real-time performance to multiple redundant boxes for transactional, analytical, in-memory acceleration and text processing that have inherent latency in their architecture. We see this at several customers and use this one as an example.
[Figures: “The Current Approach in the Market” and “The New Approach with SAP HANA”]
#17: Installation and Implementation Experience
Wrong: SAP HANA takes days to install and months/years to implement.
Right: SAP HANA installs within minutes to an hour in a data center. In fact, soon you will be able to provision it from our or our partners’ clouds. Provimi has gone live on profitability analysis in as little as 3 weeks.
Wrong: It is hard for databases to show time-based reports without significant overhead.
Right: With SAP HANA you can do time traversal on your reports (e.g. compare actual vs. predicted by day). For example, you can use a slider to move along the time axis, and reports are constructed on the fly without the need to store separate indices or views.
I have presented the facts and request that you understand the truth behind SAP HANA. The SAP HANA performance is disrupting the traditional database market. It is a single foundation for OLTP+OLAP on one hardware and one operating system and runs from a Mac mini to a 1000+ core server cluster. It delivers across the attributes we really care about, such as:
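The slider idea can be sketched in Python as follows. The rows, dates and the function name `report_as_of` are invented for illustration and do not represent a HANA API: each slider position simply triggers a fresh computation over the detail rows, so nothing has to be stored per point in time.

```python
from datetime import date

# (day, actual, predicted) detail rows; all values are made up for illustration.
rows = [
    (date(2012, 1, 1), 120, 100),
    (date(2012, 1, 2), 90, 110),
    (date(2012, 1, 3), 130, 125),
    (date(2012, 1, 4), 80, 95),
]


def report_as_of(rows, slider_day):
    """Build the actual-vs-predicted report for all days up to slider_day."""
    visible = [(d, a, p) for d, a, p in rows if d <= slider_day]
    return {
        "days": len(visible),
        "actual": sum(a for _, a, _ in visible),
        "predicted": sum(p for _, _, p in visible),
        "variance": sum(a - p for _, a, p in visible),
    }


# Moving the slider just means calling the function with another day.
jan2 = report_as_of(rows, date(2012, 1, 2))
jan4 = report_as_of(rows, date(2012, 1, 4))
```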
- Exploding data volumes (yes, scales as you grow and works with disk based stores)
- Multi-structured data (yes, including text and machine data)
- Real-time analysis on fresh data (yes, real, real-time)
- High speed of interaction (yes, at the speed of human thought)
- No efforts to tune databases (yes, it’s simpler and cheaper)
These provide orders of magnitude improvements in performance. SAP HANA creates immense business and competitive advantage for companies by revolutionizing their customer interactions and their financial and supply chain performance. Customers like Nongfu Spring (22k improvement on Oracle DB) are turning off Oracle.
We are also charting new frontiers, in healthcare, for instance, in revolutionizing genome analysis, or in bringing commerce and real-time banking to hundreds of millions around the world, and in other great challenges of our time. Times are calling for all of us to go after these new horizons, to not think of the world as an increment of the past, but as something amazing that can be created based on what we know to be possible. Life is too short for us to be held back by misinformation.