With the launch of SAP Data Hub on September 25, 2017, we know you have lots of questions. So, we have developed the following FAQ to provide answers to your most pressing questions. For those of you who want to dive deeper, you can find all the details on sap.com/datahub.
What is it?
SAP Data Hub is a data sharing, pipelining, and orchestration solution that helps companies accelerate and expand the flow of data across their modern, diverse data landscapes.
SAP Data Hub provides visibility and access to a broad range of data systems and assets; allows the easy and fast creation of powerful, organization-spanning data pipelines; and optimizes data pipeline execution speed with a “push-down” distributed processing approach at each step.
SAP Data Hub meets the governance and security needs of the enterprise, ensuring that appropriate policy measures are in place to meet regulatory and corporate requirements.
Why is this product necessary? What is the market need?
There is more data, and there are more ways to store and use it, than ever before. While this data holds business opportunity, corporate data landscapes are growing increasingly complex. It is getting harder and costlier for organizations not only to understand the data they have, but also to work across all the different systems that need to use it and to apply the end-to-end governance needed to capture the maximum value.
Key Pain Points:
How is SAP Data Hub different from other offerings for integration, pipelining, or orchestration?
SAP Data Hub accelerates and expands your data projects by easily and quickly creating powerful data pipelines in a single, visual design environment
In a single design environment, data stewards can easily and quickly create powerful data pipelines that access, harmonize, transform, process, and move information from a variety of sources across the organization. Pipeline creators can easily activate powerful libraries for computation or machine learning, for example; rapidly connect data of a wide variety of types, such as social media, customer, and product information; and leverage existing processing investments, such as capabilities in SAP HANA, Apache Hadoop, SAP Vora, or Apache Spark. Pipeline models can be easily copied, modified, and re-used to accelerate pipeline deployment and leverage best practices.
SAP Data Hub accelerates business results with innovative “push-down” processing to power more agile, comprehensive data-driven applications
SAP Data Hub not only accelerates the creation and management of data pipelines that span varied data sources, it also provides fast execution of the pipeline activities themselves by distributing computational tasks to the native environments where the data reside. This federated “push-down” distributed processing ensures that the activities of the pipeline complete as rapidly as possible, delivering fast results to the business. This data processing approach allows customers to take advantage of serverless computing in the cloud, potentially reducing the overall cost of data pipelining and data management.
Other solutions often require you to centralize your data. Some companies offer a pipelining and orchestration solution, but only for the data held in their solution. They want you to move all your data into one place to create and execute advanced data pipelines.
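To make the contrast between centralizing data and "push-down" processing concrete, here is a minimal, generic sketch. It is not SAP Data Hub code: an in-memory SQLite database stands in for a remote data store, and the table, column, and function names are hypothetical. The centralized variant pulls every row to the caller before filtering; the pushed-down variant sends the filter and aggregation to the store and receives a single row back.

```python
import sqlite3


def make_store(rows):
    """Create an in-memory SQLite database standing in for a remote data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def centralized_total(conn, region):
    """Pull every row to the caller, then filter and sum locally.

    Returns (total, number_of_rows_transferred).
    """
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    total = sum(amount for r, amount in rows if r == region)
    return total, len(rows)


def pushed_down_total(conn, region):
    """Let the store filter and aggregate; only the result row is transferred.

    Returns (total, number_of_rows_transferred).
    """
    rows = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchall()
    return rows[0][0], len(rows)


# Hypothetical sample data for illustration only.
store = make_store([("EMEA", 100.0), ("APJ", 50.0), ("EMEA", 25.0)])
central = centralized_total(store, "EMEA")
pushed = pushed_down_total(store, "EMEA")
```

Both calls compute the same total, but the pushed-down query moves one row instead of the whole table. With real data volumes, that difference in data movement is what the push-down approach is meant to avoid.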
Who benefits from SAP Data Hub?
When is it generally available?
SAP Data Hub is already generally available, as of September 1, 2017.
What are the planned deployment options?
For the initial release, SAP Data Hub will be offered as an on-premise application that can connect to and process data in cloud environments (e.g., data lakes in Amazon Web Services). Its architecture is cloud-ready, and PaaS and SaaS versions will follow in future releases.
Why is it called SAP Data Hub? Does it centralize data?
SAP Data Hub gets its name from the fact that it offers centralized governance and pipelining capabilities – a unified view and data management of the complex data landscape.
Part of the power of the solution resides in its ability to leave the data where it is. The data does not have to be mass centralized with SAP Data Hub. This provides advantages in terms of ease of management and speed of data pipeline execution. Customers leverage their existing data stores and existing processing capabilities.
Is data stored in SAP Data Hub?
No. SAP Data Hub does not offer its own data storage. It is a platform to orchestrate and manage data across existing data stores; it is not a data warehouse, data mart, or data lake in its own right.
Is SAP Data Hub yet another ETL or Streaming tool?
No. SAP Data Hub goes beyond classical batch ETL and real-time streaming. It modernizes these functions and focuses on integrating new technologies that operate in distributed landscapes (e.g., Hadoop clusters or public cloud storage). The main paradigm is to bring the logic to where the data resides and to leverage the compute power of the cluster. SAP Data Hub therefore adds processing and integration on top of the existing landscape.
What key functionality does SAP Data Hub v1.0 include?
With its first version, SAP Data Hub will allow the enterprise to achieve:
What is the relationship between SAP Data Hub and SAP Vora?
SAP Vora capabilities are included in SAP Data Hub; however, SAP Data Hub and SAP Vora are designed to address different use cases, based on customers’ specific needs.
SAP Data Hub simplifies the orchestration of complex data processes while providing governance across modern and diverse landscapes including big data stores, enterprise data stores, enterprise applications and cloud solutions.
SAP Vora is an enterprise-ready, easy-to-use, in-memory distributed computing engine that helps organizations uncover actionable insights from Big Data, typically stored in Hadoop and NoSQL solutions. It is positioned both for data scientists and as part of a multi-tier data strategy with Hadoop.
What is the relationship to SAP Data Services, SAP HANA smart data integration (SDI), and SAP HANA smart data quality (SDQ)?
SAP Data Hub will leverage existing customer investments: it will execute SAP HANA SDI/SDQ flowgraphs that run on SAP HANA systems, and it will trigger SAP Data Services jobs that run on existing Data Services job servers. It will not replace the existing use cases of these products.
SAP Data Hub is designed as a central place to orchestrate, monitor, and model integration flows, where SAP Data Services jobs, SAP HANA SDI and SDQ tasks, and Big Data flows can be brought together. These SAP EIM products will continue to be developed and offered separately from SAP Data Hub.
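The idea of a central place that triggers and monitors work running in other engines can be sketched generically as follows. This is an illustration only, not SAP Data Hub's API: the `Step` class, the `run_pipeline` function, the engine labels, and the step names are all hypothetical, and each step's callable stands in for a call to an external system such as a job server.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    name: str                 # label recorded in the monitoring log
    engine: str               # hypothetical label for where the work runs
    run: Callable[[], None]   # stand-in for triggering the external job


def run_pipeline(steps: List[Step]) -> List[Tuple[str, str, str]]:
    """Run steps in order, record one status per step, stop at the first failure."""
    log = []
    for step in steps:
        try:
            step.run()
            log.append((step.name, step.engine, "ok"))
        except Exception:
            log.append((step.name, step.engine, "failed"))
            break
    return log


def failing_job() -> None:
    """Simulate an external job that reports an error."""
    raise RuntimeError("simulated job failure")


# Hypothetical flow mixing work that runs in different engines.
status_log = run_pipeline([
    Step("load_customers", "data-services-job", lambda: None),
    Step("cleanse_addresses", "hana-flowgraph", failing_job),
    Step("aggregate_clicks", "big-data-flow", lambda: None),
])
```

The orchestrator does no data processing itself. It only sequences the steps and collects their status, which mirrors the division of labor described above: the existing products keep executing the work, and the central layer coordinates and monitors it.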
What is the relation to SAP Agile Data Preparation (ADP)?
SAP Data Hub has some built-in profiling capabilities, but it can be complemented with SAP ADP as a self-service data preparation tool. For this use case, SAP ADP offers business users the capabilities to search and access their data sources, visually manipulate the data to make it ready for reporting, and publish it. SAP ADP will interact closely with SAP Data Hub to bring this self-service capability to Big Data scenarios. In later releases, SAP ADP will leverage the metadata repository of SAP Data Hub.
What is the relation to SAP Analytics?
SAP Data Hub helps drive the value of analytics by optimizing the data pipeline for speed and security, enabling organizations to act on the right information in the moment. SAP is the only vendor in the market that can offer an end-to-end software portfolio across Data, Analytics, and Business Applications. SAP Analytics Cloud, a cloud-based solution for all analytics (built on SAP Cloud Platform), will take advantage of the data orchestration capabilities of SAP Data Hub, allowing organizations to enhance their analytical use cases by controlling, managing, and optimizing their data environments.
How is this part of SAP Leonardo?
SAP Leonardo is a digital innovation system that enables customers to rapidly innovate and then rapidly scale that innovation to redefine their business for the digital world. SAP’s Big Data solutions (SAP Data Hub, SAP Vora, and SAP Cloud Platform Big Data Services) are relevant to the Leonardo offering because they are key to scale and innovation. As such, they are offered in the Leonardo Big Data packages.
SAP Data Hub resonates with the core themes of Leonardo, because:
How do I buy SAP Data Hub?
Please contact your SAP Account Executive to get started, or contact us at: https://www.sap.com/registration/contact.html