Massively Parallel Processing on HANA

Posted by Robert Klopp on April 22, 2013


Parallel processing is a “divide and conquer” strategy.

Imagine you were handed a basket of playing cards with a request to sort the cards into suits. By yourself it might take an hour to respond to this GROUP BY suit query. If you divided the basket into four equal parts and distributed the cards to four people, the query might take 15 minutes, plus a minute to divide the basket and a minute to merge the results. If the size of the basket doubled, then the query time would double to 30 minutes plus. If the number of people doubled as well, then you might maintain the 15-minute response. In other words, the level of parallelization affects performance directly and immediately.
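
If the card-sorting analogy reads better as code, here is a minimal sketch of the same GROUP BY done in parallel with a Python multiprocessing pool; the deck size, suit names, and worker count are illustrative and have nothing to do with HANA internals. The divide, conquer, and merge steps map directly onto the basket, the four people, and the final tally.

    from collections import Counter
    from multiprocessing import Pool
    import random

    SUITS = ["hearts", "diamonds", "clubs", "spades"]

    def count_by_suit(chunk):
        # Each "person" groups their share of the basket by suit.
        return Counter(chunk)

    def parallel_group_by(deck, workers=4):
        # Divide the basket into roughly equal parts, one per worker.
        size = (len(deck) + workers - 1) // workers
        chunks = [deck[i:i + size] for i in range(0, len(deck), size)]
        with Pool(workers) as pool:
            partials = pool.map(count_by_suit, chunks)  # conquer in parallel
        merged = Counter()
        for partial in partials:  # merge the per-worker results
            merged.update(partial)
        return merged

    if __name__ == "__main__":
        deck = [random.choice(SUITS) for _ in range(1_000_000)]
        print(parallel_group_by(deck, workers=4))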

In a computing system the equivalent of a person is a core in a CPU, as this is the unit that sorts cards. Each core represents a unit of parallelization that can accept and sort a card (sort of). In some CPUs, such as the Intel Xeon processors we use with HANA, each core can sort two cards at a time (using hyper-threading), so each core represents two units of parallelism.

On the hardware certified today, with 40 hyper-threaded Xeon cores per server, HANA provides 80 units of parallelization.
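
To see how many units of parallelization a given box exposes, a quick and purely illustrative Python check is to ask the operating system for its logical CPU count, which already includes hyper-threading, and size a worker pool to match:

    import os
    from multiprocessing import Pool

    # os.cpu_count() reports logical CPUs, so a hyper-threaded 40-core server
    # shows up as 80 units of parallelization.
    units = os.cpu_count() or 1
    print(f"units of parallelization: {units}")

    # Give every unit a "card" to work on by sizing the pool to match.
    with Pool(processes=units) as pool:
        results = pool.map(abs, range(-units, 0))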

Other products struggle to provide this level of parallelization because they are constrained by their ability to feed cards to their cores fast enough (see here for a review of parallelization in some other database products). HANA is relatively constraint-free as a result of its in-memory architecture.

The net effect is that HANA, with 80 units of parallelization, can deploy from 2X to over 4X more compute per server per query than the other major competitors. This provides a 2X-4X performance advantage based just on the power of parallel processing. Further, the next generation of multi-core processors coming later this year from Intel will include up to 120 cores, providing HANA with 240 units of parallelization. This 3X jump will immediately advantage HANA, giving our customers the opportunity to deploy 6X-12X more people per server for their card-sorting queries.

Other products can overcome this disadvantage by deploying more servers: a product with only 16 units of parallelization per server would need to deploy 15 servers to HANA's one to match those 240 units of parallelization. But the cost in servers, licenses, power, and floor space will be significant.
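
For the record, here is the back-of-the-envelope arithmetic behind these figures, using only the unit counts quoted above (assumed for illustration, not measured results):

    # Figures quoted in the post: 40 hyper-threaded cores today, up to 120 in
    # the next generation, and a hypothetical competitor with 16 units per server.
    hana_units_today = 40 * 2            # 80 units of parallelization
    hana_units_next_gen = 120 * 2        # 240 units of parallelization
    competitor_units = 16

    print(hana_units_today // competitor_units)      # 5 servers to match HANA today
    print(hana_units_next_gen // competitor_units)   # 15 servers to match next gen
    print(hana_units_next_gen // hana_units_today)   # 3X generational jump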

Big MPP is just one of the performance advantages that HANA provides, as this 6X-12X advantage does not include the savings from eliminating disk reads or from our columnar implementation. It is just one very straightforward piece of the performance puzzle that makes HANA both wicked fast and cost effective.
