
SAP HANA

BW on SAP HANA
Understanding HANA technology, what it means to your business, and what to expect during data migration

The SAP Business Warehouse (BW) is a core part of the SAP NetWeaver technology. Serving as a powerful Enterprise Data Warehouse application platform, BW provides flexible reporting and analysis tools. Businesses are able to make well-founded decisions on the basis of this analysis. Business information from SAP and external data sources is integrated and consolidated in BW on HANA. SAP HANA (HANA) is a new database and analytics engine. Data now resides in main memory (RAM) and no longer on a hard disk. Complex calculations on data are not carried out in the application layer, but are moved to the database. By running BW on HANA, your business will experience significant gains in speed when running analytical queries and reports.

In this section, we introduce core concepts of SAP HANA in-memory computing, and how these concepts can help your SAP NetWeaver 7.3 Business Warehouse run better. We also consider some of the technical implications of upgrading your current data warehouse to version 7.3, and of migrating it to the SAP HANA database. Lastly, we cover what you might expect during the process of transitioning to this new technology.

Row vs. Column Data Storage


Relational databases typically use row-based data storage. However, column-based storage may be more suitable for some business applications. As shown in the figure below, a database table is conceptually a two-dimensional structure composed of cells arranged in rows and columns. Because computer memory is structured linearly, there are two options for the sequences of cell values stored in contiguous memory locations:

Row Storage - The data sequence consists of the data fields in one table row.
Column Storage - The data sequence consists of the entries in one table column.

Traditional databases store data simply in rows. The HANA in-memory database stores data in both rows and columns. It is this combination of both storage approaches that produces the speed, flexibility and performance of the HANA database. In row storage, OLAP queries on huge amounts of data take a lot of time because every single row is touched to collect the data for the query response. In columnar tables, the values of one column are stored physically next to each other, which significantly increases the speed of certain data queries. Data is also compressed, enabling shorter loading times. The following example shows the different usage of column and row storage, and positions them relative to row and column queries. Column storage is most useful for OLAP queries because these queries get just a few attributes from every data entry. But for traditional OLTP queries, it is more advantageous to store all attributes side-by-side in row tables. HANA combines the benefits of both row- and column-storage tables.
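The two memory layouts described above can be sketched in a few lines of Python. This is only an illustration, not how HANA is implemented: the table contents, column names and helper functions are all made up for the example. It shows that summing one column in row layout has to step across every row, while in column layout the same values form one contiguous slice.

```python
# Hypothetical three-column table (country, product, quantity); illustrative only.
rows = [
    ("DE", "Widget", 120),
    ("US", "Gadget", 75),
    ("DE", "Gadget", 30),
]

def to_row_layout(table):
    """Row storage: the fields of one row are adjacent in the linear sequence."""
    return [value for row in table for value in row]

def to_column_layout(table):
    """Column storage: the entries of one column are adjacent in the linear sequence."""
    return [value for column in zip(*table) for value in column]

def sum_column_row_layout(flat, n_cols, col_index):
    """Sum one column in row layout: pick every n_cols-th value, striding over all rows."""
    return sum(flat[col_index::n_cols])

def sum_column_column_layout(flat, n_rows, col_index):
    """Sum one column in column layout: read a single contiguous slice."""
    start = col_index * n_rows
    return sum(flat[start:start + n_rows])

row_flat = to_row_layout(rows)
col_flat = to_column_layout(rows)
row_total = sum_column_row_layout(row_flat, n_cols=3, col_index=2)
col_total = sum_column_column_layout(col_flat, n_rows=3, col_index=2)
print(row_flat)
print(col_flat)
print(row_total, col_total)
```

Both functions return the same total; the difference lies in which memory locations have to be visited to compute it.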

In-memory Technology
In-memory technology moves data and information sources from remote databases into local memory so the results of analyses and transactions are available immediately. The elements of in-memory computing are not new. However, dramatically improved hardware economics and software technology innovations have made it possible to realize The Realtime Enterprise with in-memory business applications.

The cost of main memory has decreased significantly. It is now cost effective to store all data of a large enterprise in main memory. The SAP HANA Appliance is a combination of in-memory software and SAP-partner hardware that allows you to query multiple types of sources at speeds and in volumes as never before. All data are kept in main memory and can be processed at an incredible speed. HANA's real-time platform combines high-volume transactions with analytics to help create solutions that take your business performance to the next level. The HANA in-memory database can help your applications zero-in on the information you need without wasting time sifting through irrelevant data. The result: Instant answers to your complex queries, and better decision making across your enterprise. With optimized loading routines, system data can be restored quickly in case of power failures. The SAP HANA Appliance can fail over to a cold standby server to guarantee high availability.

Multi-core Processors
Processor performance no longer depends primarily on clock speed but rather on the degree of system parallelism. Modern server boards have many CPUs with several cores each. The HANA database is optimized to use the capabilities of multi-core processors in order to enable incredibly fast queries. Parallelism can be achieved on different levels, from the application level to query execution on the database level. Processing multiple queries at the same time is handled by multi-threaded applications which map each query to a single core. Query processing also involves data processing, i.e., the database needs to be queried in parallel. HANA distributes the workload across multiple cores of a single system.

Using column-based tables enables easier data partitioning, and parallel processing wherever allowed. HANA uses multi-core systems on different layers to achieve highly-parallelized query execution.
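As a rough sketch of the idea that a column can be partitioned and the partitions processed in parallel, the following Python example splits a column into chunks, aggregates each chunk in its own worker, and combines the partial results. The function names and the worker count are assumptions made for illustration, and Python threads only demonstrate the structure of the approach; they say nothing about how HANA schedules work across cores.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(column, n_parts):
    """Split a column into at most n_parts contiguous chunks."""
    size = max(1, -(-len(column) // n_parts))  # ceiling division, at least 1
    return [column[i:i + size] for i in range(0, len(column), size)]

def parallel_sum(column, n_workers=4):
    """Aggregate each partition in a separate worker, then combine the partial sums."""
    parts = partition(column, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, parts))
    return sum(partial_sums)

quantities = list(range(1, 101))  # hypothetical column of 100 values
total = parallel_sum(quantities)
print(total)
```

The same pattern works for any aggregate that can be computed per partition and then merged, such as counts, minima and maxima.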

Tools: HANA Studio and Modeler


The two most important tools that come with BW on SAP HANA are the HANA Studio, the system management application, and the HANA Modeler, the data migration and optimization application.

SAP HANA Studio
SAP HANA Studio is pre-installed on the SAP HANA Appliance. Studio is used for data modeling and provisioning.

SAP HANA Modeler
SAP HANA Modeler is a graphical data modeling tool used to design analytical models and, later, analytical privileges that govern access to those models. SAP HANA Modeler supports:

ERP table metadata upload:
- Mass ERP table metadata upload using the Load Controller API
- Selective ERP table metadata import using Data Services integration

Extractor metadata upload:
- Extractor table metadata upload using the Load Controller API
- Selective Extractor table metadata import using Data Services integration

Business Value

BW on HANA enables

- Faster information for better, more timely business decisions
- Lower Total Cost of Ownership (TCO)
- Tight integration with other parts of your SAP landscape
- Simplified configuration and operation
- Improved BW performance
- Answering many important business questions immediately
- Significantly faster analytics and reporting
- Access to the most current and complete business information
- Realtime access to transactional data
- Development of deeper insights into your business
- Elimination of data aggregation
- Cost effective management of large volumes of data
- New possibilities applying groundbreaking in-memory hardware innovations to your business needs

BUSINESS VALUE:

Faster decision-making
Having the right information when you need it
Increasingly sophisticated business decision models depend on fast access to and manipulation of massive data stores. Insight into business operations demands data volumes and velocity that are beyond the capabilities of traditional disk-based systems. HANA helps your SAP NetWeaver 7.3 Business Warehouse run better than ever. HANA enables you to analyze large amounts of data, from virtually any source, in near real time, making it possible to access reports with up-to-the-minute information. As an example, having the most current order and logistics information makes it possible to manage your inventory more efficiently, and to predict Available to Promise (ATP) more accurately.

Lower Total Cost of Ownership (TCO)


Reducing costs through simplification
In-memory computing with HANA becomes the primary persistence model for the enterprise business warehouse and will enable significant rationalization of existing BW landscapes, resulting in lower overall TCO. System administration is simplified through one set of tools.

Simplification of Models, Modeling, and Re-Modeling
BW on HANA allows for simplification of existing models since, in many cases, layers in your Enterprise Data Warehouse can be eliminated because the speed of loading and querying makes some objects unnecessary. The structure of the physical models that are implemented for in-memory optimized InfoCubes and Datastore Objects (DSOs) has also been simplified, eliminating redundant storage and the need to unload and reload data when remodeling these objects. This simplification of remodeling makes it possible to respond quicker and with less disruption when new requirements are identified by the business.

Reduced Data Redundancy
The physical models implemented by BW on HANA for InfoCubes and DSOs eliminate redundancy within a model, and the enhanced performance for both loading and querying makes it possible to eliminate entire models in many cases. The result is that the same amount of data for BW on HANA requires significantly less storage.

Simplified Operations and Monitoring
With the integration of basic HANA administration capabilities with the BW Admin Cockpit it is possible to perform and monitor most common database and data warehouse functions from one place. This reduces the number of tools that have to be installed, learned, and maintained, and reduces the skill set and training required to create, operate, and maintain your data warehouse.

Tight product integration


- Increased flexibility because client tools provide more options
- Tighter integration with SAP BusinessObjects Data Services
- Enhanced integration with SAP BusinessObjects Metadata Management
- Rapid prototyping of ad hoc scenarios via BW Workspaces
- Webi, BEx, Xcelsius empower business users with powerful, yet easy-to-use business intelligence
- Intuitive, web-based interface with offline capabilities
- Start from a blank slate or use an existing analysis or report
- Multi-source access
- Interactivity with filtering, ranking, sorting, calculations, and more
- Data lineage
- Lightens the IT workload
- Self-service analysis and reporting
- Controlled and secure access with tight BI platform integration
- Intuitive, business-centric view of information with universes

With Xcelsius Dashboards business users can conduct what-if analyses with sliders and other controls, and can drill down into data details. Dashboards can then be customized with pre-built components, skins, maps, charts, gauges, and selectors.

Simplified configuration and operational management


Non-disruptive innovation and advanced administrative tools
Your current business processes inside BW can stay as they are and will mesh perfectly with HANA. System operation stays as it is, and process chains do not need to be remodeled. With HANA there is no need to retrain end users familiar with BW. There is still the same BW application process but, with BW on SAP HANA, it is now possible to run queries, updates and reports much faster than before. Expert users do not need to get retrained because they can continue to use their current BI or other frontend tools.

In addition, HANA supports the BW Analysis Authorization Concept, and can be integrated with NetWeaver Identity Management to ensure security remains intact.

Migration Options
Various approaches to system implementation
Three options exist for implementing BW on SAP HANA. All three achieve the same result: copying your BW data into an SAP HANA database.

1. The easiest option: Create a totally new BW instance, and connect it to the SAP HANA database.
2. A more complex option: Upgrade an existing BW system to version 7.3 SPS5, then change the underlying database from a traditional disk-based relational database to the new in-memory HANA system.
3. The most popular option: Keep the current BW system running on a traditional database, while creating a new BW instance running on the HANA database.

This third option is important for companies that already have an active BW system which must function continuously and without interruption. If this is the case for your company, SAP strongly recommends you follow a parallel approach to data migration: keeping your production landscape in place while bringing up the BW on SAP HANA system. You can implement SAP NetWeaver BW scenario by scenario, with the assurance that the existing production landscape is still available as a fallback.
A parallel approach mitigates risk while simultaneously enabling you to familiarize yourself with the administration and capabilities of HANA. SAP also strongly recommends you consider the high availability and backup/recovery procedures of HANA before starting to use it in production systems.

1.1 What is SAP HANA?


SAP HANA is a general purpose and ANSI standards-compliant in-memory database. Because of its design it allows transactional and OLAP reporting in a single system, which makes it simpler, and much faster, than traditional RDBMS systems like Oracle.

1.2 Is SAP HANA an appliance?


SAP HANA comes shipped as a pre-configured appliance from your hardware vendor and the license is bought from SAP. SAP HANA is an analytics appliance that consists of certified hardware, an In-Memory DataBase (IMDB), an Analytics Engine and some tooling for getting data in and out of HANA. You build the logic and structures yourself, and use a tool, e.g. SAP BusinessObjects, to visualise or analyse data.

1.3 How is SAP HANA licensed?


With SAP HANA, you pay based on the size of productive usage. All test, demo, HA and DR licenses are included in this price and there are no hidden extras like CPU or user licenses. It is one simple price based on appliance size. There are volume discounts, so as you buy more HANA, the unit price decreases. SAP HANA is priced by the 64GB unit right now, and there is some discounting based on volume. As usual with SAP licenses, it's best to contact your account exec directly and talk to them. The minimum purchase amount is currently 64GB, and the smallest appliance is 128GB, which is upgradeable to 256GB. This means if you buy 64GB today, you can easily incrementally expand up to 256GB. Note that Steve Lucas from SAP has given some HANA prices for BW to the market - What Oracle won't tell you about SAP HANA - saying that it can cost as little as 13k per 64GB unit.

1.4 Why is SAP HANA versionless and what is innovation without disruption?
SAP HANA was originally going to be numbered 1.0, 1.2, 1.5 and 2.0 and you will see this in some early literature. But what SAP have done is really interesting: they have removed the versions and provide innovations automatically when you update HANA. For the purposes of information and marketing, SAP HANA has patches - SP01 which was the ramp-up, SP02 which was the generally available version, SP03 which provided support for BW and SP04 which provides support for Text Analytics and High Availability. But the patches are just to let people know about the new features - there is no release of SP04. But the reality is that SAP HANA only comes released in Revisions. And for example, Revision 28 is SP04. So when last week, I had all our SAP HANA systems updated to SAP HANA Revision 28, we got the innovations from SP04 included. And this update takes about 10 minutes and can be done online in High Availability environments. This is what SAP call innovation without disruption and it seems to work really nicely.

1.5 What are the key benefits of SAP HANA Patches?


SAP HANA SP01 (Revision 10) is the initial release of SAP HANA to ramp-up.
SAP HANA SP02 (Revision 12) is the general-availability release of SAP HANA to the market.
SAP HANA SP03 (Revision 20) brought:

- Support for the SAP NetWeaver BW database
- Information Composer

SAP HANA SP04 (Revision 28) brought:

- Loading Data from Flat Files (CSV, XLS, XLSX) including automatic table creation in HANA Studio
- Enhancements for Attribute/Calculation Views, Usability, Security, Multi-language and technical
- High Availability
- ETL-based Data Acquisition by SAP HANA Direct Extractor Connection
- Predictive Analytics Library (PAL)
- R Programming Language Integration

1.6 What is SAP NetWeaver BW on HANA?


SAP now supports SAP HANA as the underlying database for its first Business Suite product, the NetWeaver BW Data Warehouse. I have broken this out into a separate article - The SAP BW on HANA FAQ

1.7 What is SAP ERP on HANA?


SAP planned from the start to allow customers to run their ERP or Business Suite on SAP HANA. However, out of the box, ERP on HANA does not provide the same level of benefits that BW on HANA does. This is because ERP is predominantly transactional (OLTP) and SAP HANA does not optimise large transactional volumes to the extent that it does the OLAP functions of SAP BW. It will still run faster than ERP on Oracle or DB2, but not 100 or 1000 times faster.

SAP ERP is not optimised for any particular database and this was a deliberate decision. ERP barely makes use of database stored procedures. However, to optimise ERP on HANA it is necessary to push the logic down into the database and make use of the SAP HANA stored procedure language SQLScript. This work is in progress. In addition, SAP wanted to prove the reliability of SAP HANA and its ability to support business critical applications.

From a technology perspective, it is already possible to run the Business Suite on IMDB and SAP has trialled moving some large databases into HANA already. In fact, it runs its own ERP system, affectionately called "NSP" by employees, on HANA in parallel. SAP ERP on HANA is expected to be released into ramp-up in Q4 2012. CRM, SCM and PLM will follow.

1.8 What is SAP HANA great at?


The best thing that HANA brings to the table is the ability to aggregate large data volumes in near real-time - and to have the data updated in near real-time. SAP's demos show hundreds of billions of records of data being aggregated in a matter of seconds. SAP has built a set of Analytics Apps on top of HANA and these are set to be great point use cases to get customers up and running quickly. The really great SAP HANA apps that have been created mix three big performance improvements: first, the performance of in-memory analytics; second, fixing an inefficient design; and third, a change in process that allows further improvements. This is what SAP's CTO Vishal Sikka affectionately calls the "100,000x club". In addition, SAP NetWeaver BW 7.3, powered by SAP HANA, looks like it will be a no-brainer for the majority of SAP's 14,000 BW customers. The improvements in performance and flexibility it allows resolve many of the classic data warehouse problems that have plagued the market for 20 years.

1.9 Where might SAP HANA not provide a benefit?


SAP HANA improves the biggest bottleneck that exists in standard database platforms - the spinning disks. In-memory technology is typically 100-1000x faster than disk for this reason. The biggest examples of where I have seen SAP HANA not able to provide a benefit are where it is compared feature for feature as a drop-in replacement for an existing transactional system. The reason is that SAP HANA provides opportunities to simplify the architecture of the existing solution, and simply replacing the database does not take advantage of them. For example, SAP HANA does not in this instance require a separate data warehouse for analytics - you can just build real-time virtual OLAP functions on top of your transactional OLTP store. So the analytics functions are real-time where they were replicated before and, what's more, because of the high analytical performance of SAP HANA, they are likely to be massively faster.

1.10 How does SAP HANA compare to Oracle Exalytics?


This is a perfect example of the simplification example I gave in the last question. With Oracle, you need to build your transactional database in Exadata, then you replicate this into the Exalytics Times-Ten database for reporting and into Essbase for forecasting. By contrast if you use SAP HANA, you store the information once in the SAP HANA appliance. From that one store you can do transaction processing, analytical reporting, forecasting and predictives. With HANA you are not moving information around the whole time and this simplifies the solution, enables the solution to be more easily changed and more agile. And you do not pay a performance penalty because everything happens in-memory.

1.11 What happens if hardware or power fails?


Intel has a comprehensive collection of Reliability, Availability and Serviceability (RAS) features in their SAP HANA hardware and this includes predictive memory failure, fault tolerance and recovery of failed memory. This is designed to avoid hardware failure but obviously hardware does fail from time to time. In case of hardware failure, SAP HANA fully supports High Availability scenarios and standby nodes. If one node fails, another will replace it. It also supports Disaster Recovery using disk mirroring to an alternative location, in case of power failure in the main site. In addition, SAP HANA writes a copy of what is happening in memory to disk, using a combination of save-points and log files. If the power goes out, it will reload the last save point and then apply the log files when you switch it back on.
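The save-point and log mechanism described above can be illustrated with a toy in-memory store. This is a minimal sketch under my own assumptions, not SAP HANA's actual persistence layer: the class name TinyStore and its methods are hypothetical, and the "disk" is simulated by two attributes that survive the simulated crash.

```python
class TinyStore:
    """Toy key-value store: volatile memory plus a simulated disk (save point + log)."""

    def __init__(self):
        self.memory = {}     # volatile: lost on power failure
        self.savepoint = {}  # simulated disk: last full snapshot of memory
        self.log = []        # simulated disk: changes made since the last save point

    def write(self, key, value):
        self.log.append((key, value))  # record the change durably first
        self.memory[key] = value       # then apply it in memory

    def take_savepoint(self):
        self.savepoint = dict(self.memory)  # snapshot the current state
        self.log.clear()                    # older log entries are no longer needed

    def crash(self):
        self.memory = {}  # power failure: memory contents are gone

    def recover(self):
        self.memory = dict(self.savepoint)  # reload the last save point
        for key, value in self.log:         # replay the logged changes in order
            self.memory[key] = value

store = TinyStore()
store.write("a", 1)
store.take_savepoint()
store.write("b", 2)
store.write("a", 3)
store.crash()
state_after_crash = dict(store.memory)
store.recover()
print(state_after_crash, store.memory)
```

After recovery the store holds both the snapshot contents and the two changes made after the snapshot, which is the behaviour the paragraph above describes.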

1.12 What does SAP HANA cost?


SAP HANA is priced by the 64GB unit right now, and there is some discounting based on volume. As usual with SAP licenses, it's best to contact your account exec directly and talk to them. The minimum purchase amount is currently 64GB, and the smallest appliance is 128GB, which is upgradeable to 256GB. This means if you buy 64GB today, you can easily incrementally expand up to 256GB. Note that Steve Lucas from SAP has given some HANA prices for BW to the market -What Oracle won't tell you about SAP HANA - saying that it can cost as little as 13k per 64GB unit.

1.13 Why is SAP HANA so fast?


Regular RDBMS technologies put the information on spinning plates of iron (hard disks) from which the information is retrieved. HANA stores information in electronic memory, which is some 50x faster (depending on how you calculate). HANA stores a copy on magnetic disk, in case of power failure or the like. In addition, most SAP systems have the database on one system and a calculation engine on another, and they pass information between them. With HANA, this all happens within the same machine.

1.14 Does SAP HANA replace Oracle?


It's the elephant in the room, but once the Business Suite runs on IMDB, Oracle won't be needed any more by SAP customers who purchase HANA. This doesn't affect anything in the short term because many of those people buying HANA today will still need an Oracle ERP system. However if you run an Oracle or DB2 data mart that performs poorly, you could replace this outright with SAP HANA and that would allow you to actually eliminate some licenses today. The same applies if you buy your SAP BW licenses from another database vendor directly.

1.15 What compression can I expect as compared to alternatives?


The answer is it really depends on the number of unique values in your data. The fewer unique values, the better the compression. If you have raw flat files or uncompressed databases like DB2 or Oracle then I generally see 10x compression to be a good starting point. If you are using DB2 or Oracle compression then you can expect that to reduce to 5x compression with HANA in an average scenario. Note that this is missing the point, because HANA allows simplification. At one customer I have dealt with, they had 27TB of SAP BW database, but 20TB of this was aggregates and indexes used to improve performance. So when the database was moved to SAP HANA, they started with 7TB and got 5x compression. In real life this means compression of 27TB down to about 1.5TB, or roughly 18:1.
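The customer example can be checked with a few lines of arithmetic. The figures come from the text above; the variable names are mine. Taken exactly, 7TB at 5x gives 1.4TB and a ratio a little above 19:1, so the 1.5TB and 18:1 quoted above are rounded values.

```python
total_tb = 27.0                   # original SAP BW database size
aggregates_and_indexes_tb = 20.0  # no longer needed after the move to HANA
compression_factor = 5.0          # compression achieved on the remaining data

base_data_tb = total_tb - aggregates_and_indexes_tb  # data that is actually migrated
hana_tb = base_data_tb / compression_factor          # size after compression
overall_ratio = total_tb / hana_tb                   # reduction relative to the original

print(f"{base_data_tb:.1f} TB migrated, {hana_tb:.1f} TB in HANA, "
      f"about {overall_ratio:.0f}:1 overall")
```

Most of the overall reduction comes from dropping the aggregates and indexes rather than from compression itself, which is the simplification point made above.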

1.16 What is the wider market opportunity for in-memory technologies?


I think that this is the biggest challenge that SAP HANA provides today: because it simplifies and changes the way in which computer solutions can be designed, which requires a change in the design philosophy of computer systems. I have been talking to a number of people that see the potential and the key is this: you move all your data into one place. You transact, report, plan, forecast and consolidate on a single version of the truth. If you can make the mental jump of what that would mean to your organisation then you can see the potential.

2. SAP HANA database hardware

2.1 What hardware is supported right now?
I have broken out the SAP HANA Hardware guide into a separate FAQ - The SAP HANA Hardware FAQ There is a supported hardware list on SAP's website at: http://service.sap.com/pam (login required).

2.2 Why doesn't SAP HANA run on blades?


Running SAP HANA on blades is only relevant in multi-node systems. SAP HANA does run on blades from Cisco and HP. Fujitsu and IBM currently do not have a blade solution and IBM have stated that it is not their current strategy. This is because their GPFS filesystem requires local disk storage in the system and blades cannot hold this.

2.3 Does SAP make its own SAP HANA hardware?


Yes, but only in the labs so far. There are no public plans to compete against IBM/HP/Dell in this space, but it may make sense for SAP to enter the appliance market, especially in the context of Data Centres and even more so in the context of the SAP Business byDesign cloud offering, which will run on HANA.

2.4 How big does SAP HANA scale?


The largest certified appliance is 16TB and there are 100TB appliances in the lab. Remember that you do get compression on this, so at roughly 10x compression a 16TB appliance is equivalent to 160TB of raw data. But for "big data" fans, HANA currently only scales to the small end of Big Data, which refers to the kind of huge datasets that Facebook or Google have to store - not Terabytes, but rather Petabytes. These volumes remain the domain of solutions like Hadoop. That said, given that we moved from 1TB to 16TB certified appliances in the last year, you can expect much larger appliances to be certified by 2013.

3. Technical FAQ

3.1 What source databases does SAP HANA support in real-time?


There are two mechanisms that HANA supports for near-real-time data loads. The first is the Sybase Replication Server (SRS), which works with SAP or non-SAP source systems running on Microsoft, IBM or Oracle databases. This was expected to be the most common mechanism for SAP data sources, but there remain some license challenges around replicating data out of Microsoft and Oracle databases, depending on how you license the database layer of SAP. If you buy your database license direct from the vendor then you are fine, but if you buy it through SAP then you may have a restricted license that does not allow for usage of SRS.

For those scenarios, SAP have a second choice of replication mechanism called System Landscape Transformation (SLT). SLT is also near-real-time and works from a trigger from the SAP Business Suite products. This is both database-independent and pretty neat, because it allows for application-layer transformations and therefore greater flexibility than the SRS model. Note that SLT has now been extended to work with non-SAP source systems.

In addition there is a new model, the Direct Extractor Connection (DXC). This provides a means to work with Business Content DataSources, which send data from an SAP Business Suite system to SAP HANA. With DXC, the Business Content extractors are redirected, and instead of flowing into SAP Business Warehouse, extracted data flows into SAP HANA directly.

SRS has additional restrictions which are worth bearing in mind. It can only replicate Unicode data and does not support IBM DB2 compressed tables at this time.

3.2 What source databases does SAP HANA support for batch loads?
If you use SAP BusinessObjects Data Services 4.0 for bulk loads then pretty much anything. BO-DS is a very flexible Extract, Transform & Load tool that supports many databases. Data Services was previously called Data Integrator and, before that, Acta, prior to being acquired by Business Objects. You can reasonably load into HANA using Data Services every 10 minutes, and Data Services allows for excellent flexibility because you can take care of complex business transformations, including e.g. address verification, outside of HANA, which may allow simplified modelling within HANA. I hear that SAP plan to open up a certification for third-party ETL tools later in 2012. However, there are plans to move the Data Services ETL engine into SAP HANA, which would allow transformations to happen in-memory. This would provide a significant benefit over any other ETL tool.

3.3 What BI Platforms does SAP HANA support?


SAP HANA supports the ODBC, JDBC and MDX standards for BI (or other connections). Today, only the SAP BI4 suite and Analysis for Excel client are supported. However I have tested a number of different tools on top of HANA and they generally work well - including the SAP Mobility Platform for real-time replication to mobile devices. Again there is set to be a certification process starting later in 2012 that will allow third-party vendors to certify their software.

4. Follow-ons, corrections & credits


This is a work in progress and your help correcting me, clarifying some things I may have not explained so well or even just asking a question that I haven't covered would be really useful for the wider market. Let me know and I'll expand this as the months go on!

Q. What is the difference between SAP Business Warehouse Accelerator (SAP BWA) & SAP HANA?
SAP BW Accelerator (SAP BWA) is an in-memory accelerator for BW. HANA is a full-featured in-memory platform. BWA was specifically designed to accelerate BW queries by reducing the data acquisition time by persisting copies of the InfoCube data in-memory. SAP BWA is focused on improving the query performance of SAP NetWeaver BW. SAP BWA can be used today with any SAP BW 7.0 release and above.

SAP HANA is an in-memory appliance and platform for delivering high-performance analytics and applications. As such, it includes a full-featured in-memory database. Data can be loaded into SAP HANA from SAP & non-SAP data sources and viewed using SAP BusinessObjects front end tools. In the near future, SAP HANA will also act as an in-memory database that will power SAP NetWeaver BW 7.3 and above. In this way it will be able to dramatically improve the overall performance of SAP NetWeaver BW by combining the value proposition of both the database & BWA into a single platform.

HANA & BW 7.3 PART -1

Over the past few months, I've presented on this topic to many customers and colleagues in and outside Walldorf. As there seems to be such a high demand, I've decided to convert the underlying slide presentation into two blogs, with the first focusing on the motivation, scenarios and use cases while HANA and BW 7.30 - Part 2 looks at the combination of HANA and BW from a technical angle. Before I start with the first blog please note that the usual disclaimer applies. Everything here has been announced at some SAP event - see The SAP Run Better Tour - BW Roadmap, for example. So I'm focusing on bringing pieces into context rather than revealing something that has not been known before.

Overview Part 1
- Review In-Memory
- In-Memory @ SAP
- How HANA affects Data Warehousing
- HANA Scenarios
- HANA as BWA
- Conclusion

Review In-Memory
For a start, let's review the fundamentals behind in-memory computing. To that end, let's have a look at the table in figure 1 that I've gratefully borrowed from Andy Bechtolsheim's presentation at HPTS 2009. It shows what the semiconductor industry predicts on how the listed components will evolve - see the ITRS.


Figure 1: CPU module roadmap

It is sufficient to look at the first two lines, the clock rate and the cores. Two things can be concluded from that:

A. Moore's law will continue to apply.
B. However, it will be based on scaling the number of CPU cores rather than the CPU clock rate - with power efficiency being the main reason for this change.

The "however" part (B) is fundamental and carries a big mandate for the software industry, namely that parallelism will be key on those future CPU architectures. SAP's response to this is what has been labeled in-memory computing. However, this term over-emphasizes the aspect of main memory and falls a bit short on some other aspects that are at the heart of the performance benefits achieved in this context. The logic goes along the following lines:

- parallelism: as seen in figure 1, supporting multi-core architectures via software parallelism is key
- in-memory: a prerequisite for parallelism is to have the related data located close to the cores in local memory
- columnar data structures: this, in turn, is a prerequisite for fitting data into main memory; the columnar approach is extremely I/O efficient and is an enabler for the next bullet
- compression: columnar data can be compressed more efficiently than row-based data because values repeat more often within a column
- application-awareness: this is separate from the previous four technology arguments and comes down to building an engine tailored towards the SAP applications; the second blog will provide examples in the context of BW for this.

In my opinion, the last item is one of the most overlooked and undervalued in the current debate. Actually, it is something that many other companies already do successfully, namely exploiting inherent properties of the underlying applications to relieve some of the traditional RDBMS constraints in order to build innovative data processing clusters, e.g. based on MySQL nodes or Hadoop. The CAP theorem is an instance of that; a few examples have been implemented by eBay. SAP's BWA is another good example as it is tailored towards the BW schema.
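
The compression argument in the list above can be illustrated with a minimal sketch. It uses run-length encoding as one simple compression scheme; this is an assumption made for illustration, not a statement about which schemes IMCE uses, and the sample table and all names are made up.

```python
from itertools import groupby

# Hypothetical sample table (country, currency, amount); values are made up.
rows = [
    ("DE", "EUR", 100),
    ("DE", "EUR", 250),
    ("DE", "EUR", 75),
    ("US", "USD", 300),
    ("US", "USD", 120),
    ("US", "USD", 90),
]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# Row storage: the cells of one row are neighbours in memory. Adjacent
# cells belong to different attributes, so consecutive values rarely repeat.
row_layout = [cell for row in rows for cell in row]

# Column storage: the cells of one column are neighbours in memory.
# Adjacent cells belong to the same attribute, so repeats are common.
country, currency, amount = (list(col) for col in zip(*rows))

encoded_row_layout = run_length_encode(row_layout)
encoded_country = run_length_encode(country)

print(len(row_layout), "->", len(encoded_row_layout))  # 18 -> 18
print(len(country), "->", len(encoded_country))        # 6 -> 2
```

In this toy data the row-wise sequence does not shrink at all, while the country column collapses to two runs. Real data will be less tidy, but the effect points in the same direction whenever a column holds few distinct values.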

In-Memory @ SAP

SAP's response to the imperative for a new software architecture is its In-Memory Computing Engine (IMCE; aka NewDB). I don't want to embark on a deep essay on IMCE and think that, for simplicity, you can look at IMCE as

- an evolution of BWA, albeit not tied to BW alone anymore,
- SAP's implementation of an in-memory DB, tailored towards SAP applications,
- a full, stand-alone SQL database,
- an OLAP processor for MDX queries.

Now, HANA is the acronym for High Performance Analytical Appliance. Also, in a simplified (albeit not 100% technically correct) way, you can look at HANA as roughly:

- IMCE as an appliance
- however, it comprises more than just IMCE
- HANA is the term you likely hear in public
- for the remainder of this presentation: IMCE = HANA (to avoid too much confusion)

How HANA affects Data Warehousing

The following pseudo equations originate from some jokey internal discussions that we had but have proven to be helpful:

1. Today: EDW = RDBMS + X

This means that an enterprise data warehouse (EDW) is not equal to a database system but requires a complement (here: X). Under X you can imagine code that is manually written or generated by tools, e.g.

- extraction programs
- DDL code (like CREATE TABLE statements)
- constraints, validation rules
- data transformations and harmonization
- process definitions, schedules and monitoring, failure handling (especially consistent restart)
- KPI definitions
- business semantics like rules on how to convert currencies or fiscal year definitions
- management of shared and private dimensions, including hierarchies
- defining and interpreting semantics on top of tables and columns, e.g.
  o column X is the parent column of a parent-child hierarchy H associated to dimension D
  o column Y is a unit key figure with the associated unit stored in column U
  o column Z is an attribute of dimension members whose key is compound in columns A and B
  o table T holds natural language descriptions for dimension member keys, whereby column L indicates the language and column C the description
  o column P in table Q is a foreign key of members of dimension D; referential integrity is guaranteed (yes/no)
  o time and calendar semantics, e.g. based on hierarchies like day - month - quarter - year, week - year
- table and data management like defining standards on how to store a dimension (tables and their respective layouts), how to index and/or partition those tables
- (meta data) lifecycle of models and tables, like versioning, changes including impact analysis and propagation, development / test / production setup
- (data) lifecycle: archiving and the underlying management of archives (what has been archived and what not, avoid overlapping data containers, etc.)
- security, especially modeling and management based on higher conceptual levels like dimensions, members, hierarchies
- logging, auditing and other compliance-related features
- etc.

In summary, X addresses those requirements. It can be a bundle of generated code, meta data definitions, manually written programs etc. BW is an off-the-shelf instance of X.

2. Now: RDBMS → HANA

This indicates that traditional RDBMS technology gets overhauled by in-memory computing as implemented in HANA.

3. Thus: (new) EDW = HANA + Y

Now, 1. and 2. get combined into 3. As HANA is not an exact 1:1 replacement of an RDBMS and as the constraints and "physical rules" of in-memory computing change - especially the performance cost model - the software that sits on top (i.e. previously the X) needs to be adjusted to accommodate those new constraints and rules. This is indicated by moving from X in 1. to a Y in 3. Still, Y needs to address the same requirements as X but in a different way. Beyond that, the new constraints and rules open up new opportunities, meaning that many more options are possible in Y in comparison to X. It is a paradigm shift similar to moving from analog to digital photography. Simply think of all the additional things that are possible with digital photography today!
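
To make the notion of X (and Y) a little more concrete, here is a minimal sketch of two items from the list under equation 1: DDL generated from a model, and semantics defined on top of columns ("column Y is a unit key figure with the associated unit stored in column U"). The classes `Column` and `TableModel`, the table name `SALES` and the SQL types are hypothetical and chosen for illustration only; this is not how BW implements its meta data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    sql_type: str
    # Semantic annotation: name of the column that stores this key figure's unit.
    unit_column: Optional[str] = None


@dataclass
class TableModel:
    name: str
    columns: List[Column] = field(default_factory=list)

    def to_ddl(self) -> str:
        """Generate a CREATE TABLE statement from the model."""
        cols = ",\n".join(f"  {c.name} {c.sql_type}" for c in self.columns)
        return f"CREATE TABLE {self.name} (\n{cols}\n)"

    def validate(self) -> List[str]:
        """Return the problems found in the semantic annotations."""
        names = {c.name for c in self.columns}
        return [
            f"{c.name}: unit column {c.unit_column} is not defined"
            for c in self.columns
            if c.unit_column is not None and c.unit_column not in names
        ]


# Hypothetical model: Y is a key figure whose unit is stored in column U.
sales = TableModel(
    name="SALES",
    columns=[
        Column("Y", "DECIMAL(17,2)", unit_column="U"),
        Column("U", "VARCHAR(5)"),
    ],
)

print(sales.to_ddl())
print(sales.validate())  # an empty list means the annotations are consistent
```

The point of the sketch is that the database only ever sees the generated CREATE TABLE statement. The knowledge that Y and U belong together lives in the layer on top, which is the part the equations call X today and Y on HANA.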
BW will follow this transformation from X to Y by tailoring it towards HANA. First steps will be visible with the BW 7.3 enablement of HANA planned for end of 2011.

HANA Scenarios

From my experience, the slides shown in figures 2, 3 and 4 are extremely helpful as they trigger fruitful discussions with customers. Essentially, I've discussed figures 2 and 3 in my blog on The BW - HANA Relationship. Please note that there is no "best" scenario; each of the scenarios in figure 2 emphasizes a certain property at the expense of another one. So, there are trade-off decisions behind those scenarios. This confuses many people who would like SAP to give a simple answer. But, I guess, it's like when you buy a car: you need to trade off various aspects to choose the right model for your specific purposes.

Figure 2: HANA scenarios.

Figure 3: HANA scenarios and their respective trade-offs.

HANA as BWA

There have been many questions on the BWA-HANA relationship, e.g. whether there will be new releases of BWA, whether investments into BWA would be safe etc. The basic plan is to enable HANA to play the role of a BWA in the future. In other words: in 2012 (plan!), it should be possible to buy a HANA box that can be set up and configured to run as an accelerator next to a BW like BWA did before. This offers two options to bring HANA into an existing BW 7.3 landscape - note that release 7.3 is a prerequisite for running BW with HANA:

- "conservative approach" (the two small arrows in figure 4): you bring in HANA as an accelerator for your existing BW. That way, you gain confidence with HANA, learn to operate HANA and already see a large share of the benefits. For example, HANA has a calculation engine that has been improved in comparison to the one in BWA.
- "progressive approach" (the long arrow in figure 4): this translates into migrating the DBMS server underlying your BW system to HANA. BWA as accelerator becomes obsolete as HANA already incorporates the BWA calculation capabilities.

Figure 4: Migration options for a classic towards a HANA-based BW.

Conclusion

This concludes this first part. Hopefully, it has clarified what role HANA will play in a BW context. It should have become obvious that BW remains a significant complement to HANA, even though, on a technical level, performance-critical operators that are today implemented in the BW application stack are moved into the HANA engine. BW will eventually become pure management software implementing a best-practice approach that orchestrates the heavy data lifting inside HANA. HANA and BW 7.30 - Part 2 will describe some examples of what is possible.
