Article appears in SAP Startup Focus Newsletter – Issue 6
One of the biggest challenges facing today’s enterprises is making data (structured, unstructured and semi-structured) easily searchable for people, and thereby increasing their productivity. Most enterprises are still searching for a Unified Information Access platform: one that can process and analyze all available information without scalability limits.
Such a technology platform is critical for developing search-based applications, which rely on semantic technologies to aggregate and normalize data and on on-the-fly text analytics. These applications provide real-time information access and faceted search over huge volumes of structured and unstructured text data.
A typical search-based application is an e-commerce website that offers a simple Google-like search box: the user types in her interest, and the site returns matching items from the online store, combined with attributes such as customer reviews, language options, further suggestions based on affinity analysis, associated tags, a popularity index and so on.
SAP HANA is one such Unified Information Access platform. One of the hidden gems in the SAP HANA technology platform is its ability to perform exploratory search and text analysis on unstructured data. The HANA database contains a core search engine that handles text search, fuzzy text matching, entity ranking, facets and snippets.
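To make two of these terms concrete, here is a minimal Python sketch of fuzzy text matching and facets on a made-up product list. It uses the standard library's difflib and is only a conceptual illustration; it does not reflect how the HANA search engine is implemented. The product data, the function names fuzzy_search and facet_counts, and the 0.8 threshold are all assumptions.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical catalog rows; in HANA these would live in a table.
PRODUCTS = [
    {"name": "wireless keyboard", "brand": "Acme", "language": "EN"},
    {"name": "wireless mouse", "brand": "Acme", "language": "DE"},
    {"name": "wired keyboard", "brand": "Globex", "language": "EN"},
]


def fuzzy_search(term, rows, threshold=0.8):
    """Return rows where some word in the name is similar enough to the term."""
    hits = []
    for row in rows:
        # Best similarity between the search term and any word of the name.
        best = max(
            SequenceMatcher(None, term.lower(), word).ratio()
            for word in row["name"].split()
        )
        if best >= threshold:
            hits.append((best, row))
    # Highest similarity first, a crude stand-in for ranking.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in hits]


def facet_counts(rows, attribute):
    """Count result rows per attribute value, as a facet panel would show."""
    return Counter(row[attribute] for row in rows)


results = fuzzy_search("keybord", PRODUCTS)  # misspelled on purpose
print([row["name"] for row in results])
print(facet_counts(results, "brand"))
```

The misspelled term still finds both keyboards, and the facet counts show how the hits are distributed across brands, which is the kind of summary a faceted search UI displays next to the result list.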
This search engine, along with the Text Processor, is used to build a search information model, a runtime development artifact used for further UI development. HANA also contains an HTML5 UI toolkit for SAP HANA Info Access that provides UI building blocks for developing browser-based search apps on SAP HANA. The toolkit enables a free-style search of an SAP HANA attribute view, as well as display and analysis of the result set.
The toolkit provides UI elements (widgets) such as a search box, a result list with a detailed view, and charts for basic analytics on the result set. The widgets are interconnected and adapt in real time to user entries and mouse-over (hover) selections.
Very broadly, the steps to build a search-based application in SAP HANA are:
1. Get the data into HANA: Load the unstructured data or create the infrastructure to stream unstructured data into SAP HANA.
2. Create a Full-Text Index: Use the SAP HANA Studio to enable the table columns containing text for full-text search.
3. Run Text Analysis: Use the SAP HANA Studio to extract structured data based on predefined rules and dictionaries.
4. Create a Search Model: Use the SAP HANA Studio to define the data model and to specify the search behavior.
5. Build the Search-Based Application Frontend: Use a text editor and the HANA Info Access UI toolkit to define the layout and data for the application.
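Steps 2 and 3 can also be expressed in SQL rather than through HANA Studio dialogs. The sketch below, written in Python only so that the statements can be assembled and inspected, builds a CREATE FULLTEXT INDEX statement with text analysis switched on and a fuzzy CONTAINS query. The table REVIEWS, the column REVIEW_TEXT, the index name and the helper function names are hypothetical, and the available options depend on the HANA revision, so treat the statements as an illustration to check against the Developer Guide rather than as a verified script.

```python
def fulltext_index_sql(index, table, column, configuration="EXTRACTION_CORE"):
    """Steps 2 and 3: a full-text index with text analysis enabled."""
    return (
        f'CREATE FULLTEXT INDEX "{index}" ON "{table}" ("{column}") '
        f"TEXT ANALYSIS ON CONFIGURATION '{configuration}'"
    )


def fuzzy_query_sql(table, column, term, fuzziness=0.8):
    """A ranked fuzzy search over the indexed column."""
    # Double any single quotes so the term stays one SQL string literal.
    # Production code should prefer whatever parameter binding its driver offers.
    safe_term = term.replace("'", "''")
    return (
        f'SELECT SCORE() AS score, * FROM "{table}" '
        f"WHERE CONTAINS(\"{column}\", '{safe_term}', FUZZY({fuzziness})) "
        f"ORDER BY score DESC"
    )


# Hypothetical object names, for illustration only.
print(fulltext_index_sql("IDX_REVIEWS", "REVIEWS", "REVIEW_TEXT"))
print(fuzzy_query_sql("REVIEWS", "REVIEW_TEXT", "keybord"))
```

Running the generated statements requires a SQL console or client connected to a HANA system; the sketch itself only produces the text of the statements.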
For more technical details, visit section 12.3 in the Developer Guide.
Suneet Agera, Solution Architect, SAP HANA Platform at SAP Labs India Pvt. Ltd.