BW on HANA and Very Large Tables

Klaus Nagel

Director - Development

There have been a lot of questions regarding table sizes and partitioning of tables in the SAP HANA database and SAP applications like SAP NetWeaver BW on HANA. This blog provides a short overview of the requirements and best practices.

Best practice from a design and system administration perspective is to split your data into physically partitioned objects (using semantic criteria) so that no table holds more than 500 million records. A customer already maintaining an even larger table is best advised to consider remodeling the current data model: tables of that size are problematic in every respect, including load and reporting scalability, data aging and archiving, pruning, delta merges, and backup & recovery.

The BW functionalities Semantic Partitioned Object (SPO) and MultiProvider offer an easy way to combine many such “tables” while ensuring semantic integrity, so you can report on any amount of data independent of its actual physical storage in the system.

An additional attractive way of reducing the size of a table (and optimizing the in-memory sizing requirements) is BW’s Nearline Storage (NLS) capability (e.g. Sybase IQ): you define a data slice that is moved to the NLS server and then deleted from the HANA DB table. Based on the definition of the data slice in the NLS server and the filter conditions in the query, BW’s Analytic Manager determines whether the NLS server has to be accessed – completely transparently to the end user.

If a customer has an extremely large table (>2 billion records) and no opportunity to remodel the table short-term, several optimization steps have to be taken into account when migrating to BW on HANA:

InfoCube: The fact table of an InfoCube is split into 4 database partitions: uncompressed requests, compressed requests, reference points, and historic movements – the latter two are only relevant for InfoCubes with inventory key figures and are otherwise empty.

  • SingleNode HANA system: Each partition MUST be smaller than 2 billion records
  • MultiNode* HANA system: Each of the above partitions is automatically split into multiple parts, depending on the table size (during the R3LOAD import procedure) and the settings of the landscape organizer** – since this partitioning is round-robin, each part has essentially the same size. Each part MUST be smaller than 2 billion records.
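The effect of round-robin partitioning can be illustrated with a minimal sketch (the function name and record representation are hypothetical – HANA performs the actual distribution internally during the import): rows are assigned to parts in strict rotation, so the part sizes never differ by more than one record.

```python
def round_robin_partition(rows, num_parts):
    """Distribute rows across num_parts in strict rotation (illustrative only)."""
    parts = [[] for _ in range(num_parts)]
    for i, row in enumerate(rows):
        # Rotation: row 0 -> part 0, row 1 -> part 1, ..., wrapping around.
        parts[i % num_parts].append(row)
    return parts

parts = round_robin_partition(list(range(10)), 3)
sizes = [len(p) for p in parts]
print(sizes)  # [4, 3, 3] -- sizes differ by at most one
```

Because the assignment ignores the row content entirely, round-robin gives the most even size distribution possible, which is exactly why each part stays close to the average and the 2 billion limit is easy to monitor.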

DSO: A DSO can be partitioned on the database using a time characteristic – this is modeled in BW.

  • SingleNode HANA system: Each partition MUST be smaller than 2 billion records
  • MultiNode* HANA system: Each of the semantic partitions is automatically split into multiple parts, depending on the table size (during the R3LOAD import procedure) and the settings of the landscape organizer** – this partitioning is based on a hash key over the key fields, so in general the parts are of nearly the same size. Each part MUST be smaller than 2 billion records.
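Hash partitioning on the key fields can be sketched as follows (a hypothetical illustration – the function name, the use of MD5, and the example key fields are assumptions, not HANA's actual hash function): the key-field values are hashed, and the hash value modulo the number of parts selects the part. A reasonably uniform hash yields parts of nearly equal size while guaranteeing that all rows with the same key land in the same part.

```python
import hashlib

def hash_partition(key_fields, num_parts):
    """Map a tuple of key-field values to a part number (illustrative only)."""
    # Concatenate the key fields into one string and hash it.
    key = "|".join(str(v) for v in key_fields)
    digest = hashlib.md5(key.encode()).hexdigest()
    # The hash modulo the part count picks the part deterministically.
    return int(digest, 16) % num_parts

# Hypothetical DSO row keyed by (document number, item number).
part = hash_partition(("4711", "0010"), 3)
```

Unlike round-robin, this scheme is deterministic per key, which is what a DSO needs: activation and lookups by key can always go straight to the right part.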

PSA tables: As part of the “BW post-migration procedure” a new PSA version (and therefore a new table) is created. The new PSA tables are automatically partitioned based on the technical field PARTNO, and new partitions are created automatically once a partition exceeds a certain size, thus avoiding the 2 billion limit. A PSA table MUST NOT contain more than 2 billion records before the DB migration to HANA.
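The PARTNO mechanism can be sketched like this (a hypothetical illustration – the function name and the tiny threshold are assumptions; the real threshold is an internal setting well below 2 billion): records fill the current partition until it reaches the threshold, then a new partition with the next PARTNO is opened.

```python
def assign_partno(record_counts, threshold):
    """Return the PARTNO for the next record, opening a new partition if needed.

    record_counts holds the current record count per partition (illustrative only).
    """
    if not record_counts or record_counts[-1] >= threshold:
        record_counts.append(0)  # current partition is full: open a new one
    record_counts[-1] += 1
    return len(record_counts) - 1  # 0-based PARTNO of the current partition

counts = []
partnos = [assign_partno(counts, threshold=3) for _ in range(7)]
print(partnos)  # [0, 0, 0, 1, 1, 1, 2]
```

With a threshold comfortably below 2 billion, no single partition can ever approach the hard limit, which is why only pre-migration PSA tables need a manual check.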

Masterdata tables: As part of the data quality and referential integrity check in BW, a unique integer value – the “SID” – is created for each characteristic value. Since SIDs are of type INT4, the Masterdata tables in BW can’t become larger than 2 billion records. (Remark: with BW 7.40 the SID creation can be omitted for very high cardinality InfoObjects with more than 2 billion distinct characteristic values. The Masterdata tables are then automatically partitioned to avoid hitting the 2 billion limit per part/table.)
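The “2 billion” figure throughout this post is not arbitrary: it comes from the signed 32-bit integer range. An INT4 column can hold values up to 2^31 − 1, which is just over 2.1 billion:

```python
# INT4 is a signed 32-bit integer: one bit for the sign, 31 bits for the value.
INT4_MAX = 2**31 - 1
print(INT4_MAX)  # 2147483647 -- i.e. ~2.147 billion
```

Since every characteristic value consumes one SID, a Masterdata table necessarily runs out of SIDs at that boundary, independent of any partitioning.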

(*) A “real” MultiNode HANA system has at least 3 active nodes: 1 master + 2 worker/slave nodes. A 1+1 system is not recommended (see e.g. http://scn.sap.com/docs/DOC-39682 ).

(**) A partition/table is split into at most “n-1” parts, where “n” is the number of active nodes.

[Image: tabledistribution.jpg]

Picture: “Table1” is range-partitioned into 3 parts (2010, 2011, 2012), and each part is again hash-partitioned into 3 parts. The parts are automatically distributed across the nodes to spread load and memory consumption evenly.
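The two-level scheme in the picture can be sketched as follows (a hypothetical illustration – the function name, the use of MD5, and the example key are assumptions): a row is first routed to a range part by its year, and that part is then sub-divided by a hash of the key, giving 3 × 3 = 9 parts in total to distribute across the nodes.

```python
import hashlib

def two_level_part(year, key, years=(2010, 2011, 2012), hash_parts=3):
    """Return (range_part, hash_part) for a row (illustrative only)."""
    # Level 1: range partitioning by year.
    range_part = years.index(year)
    # Level 2: hash sub-partitioning within the year's range part.
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return range_part, int(digest, 16) % hash_parts

rp, hp = two_level_part(2011, "4711")
# rp is 1 (the 2011 range); hp depends on the key's hash value.
```

Combining both levels gives the best of each: time-based pruning for queries that filter on year, plus evenly sized hash sub-parts for load balancing across nodes.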
