
Parallel Cursor – To speed up performance of Nested LOOP


By Naimesh Patel | December 13, 2009 | Performance | 43,586 | 6

Technique to speed up the performance of the nested LOOP: Parallel Cursor

Today, we will tackle one of the biggest performance issues around nested loops.

Preface
Traditionally in ABAP, we write nested loops as a LOOP with a WHERE clause inside another
LOOP. This kind of nested loop is very common in day-to-day programming, but it is
expensive: on a standard table, the inner LOOP .. WHERE scans the whole second table for
every row of the first. This cost becomes a key issue when working with huge tables, e.g.
BKPF & BSEG, VBAK & VBAP, MKPF & MSEG. Sometimes it grows to the point where the
program fails to finish.

ABAP offers the Parallel Cursor technique to overcome this hurdle and reduce the cost.
Both tables are sorted by the common key. Inside the LOOP on the first table, we use
READ .. WITH KEY .. BINARY SEARCH to check whether an entry exists in the second table.
We then use the record number returned in SY-TABIX to LOOP on the second table using
LOOP .. FROM index, and leave that loop as soon as the key changes.

This code snippet gives us an idea of the time taken by both the nested loops and the
parallel cursor loops.

Code Snippet
In this code snippet, I have the “classical” nested loop and the Parallel Cursor nested
loop. You will notice that using Parallel Cursor improves performance a lot.

*&---------------------------------------------------------------------*
*& Report ZTEST_NP_PARALLEL_CURSOR
*& Purpose: Illustration on how to use Parallel Cursor
*&---------------------------------------------------------------------*
*
REPORT ztest_np_parallel_cursor.
*
TYPES: ty_t_vbak TYPE STANDARD TABLE OF vbak.
DATA: it_vbak TYPE ty_t_vbak .
*
TYPES: ty_t_vbap TYPE STANDARD TABLE OF vbap.
DATA: it_vbap TYPE ty_t_vbap.
*
FIELD-SYMBOLS: <lfs_vbak> LIKE LINE OF it_vbak,
<lfs_vbap> LIKE LINE OF it_vbap.
*
* necessary data selection
SELECT * FROM vbak
INTO TABLE it_vbak
UP TO 1000 ROWS.
CHECK it_vbak IS NOT INITIAL.
SELECT * FROM vbap
INTO TABLE it_vbap
FOR ALL ENTRIES IN it_vbak
WHERE vbeln = it_vbak-vbeln.
*
DATA: lv_start_time TYPE timestampl,
lv_end_time TYPE timestampl,
lv_diff TYPE timestampl.
DATA: lv_tabix TYPE i.
*
*...... Normal Nested Loop .................................
* Get the Start Time
GET TIME STAMP FIELD lv_start_time.
*
* Nested Loop
LOOP AT it_vbak ASSIGNING <lfs_vbak>.
  LOOP AT it_vbap ASSIGNING <lfs_vbap>
       WHERE vbeln = <lfs_vbak>-vbeln.
  ENDLOOP.
ENDLOOP.
*
* Get the end time
GET TIME STAMP FIELD lv_end_time.
*
* Actual time Spent:
lv_diff = lv_end_time - lv_start_time.
WRITE: /(50) 'Time Spent on Nested Loop', lv_diff.
*
CLEAR: lv_start_time, lv_end_time, lv_diff.
*
*....... Parallel Cursor with Nested Loop .......................
* Get the Start Time
GET TIME STAMP FIELD lv_start_time.
*
* Starting the Parallel Cursor
SORT: it_vbak BY vbeln,
it_vbap BY vbeln.
LOOP AT it_vbak ASSIGNING <lfs_vbak>.
*
* Read the second internal table with BINARY SEARCH
  READ TABLE it_vbap TRANSPORTING NO FIELDS
       WITH KEY vbeln = <lfs_vbak>-vbeln
       BINARY SEARCH.
* No item for this header: continue with the next header
  IF sy-subrc <> 0.
    CONTINUE.
  ENDIF.
* Get the TABIX number
  lv_tabix = sy-tabix.
* Start the LOOP from the first accessed record in
* previous READ i.e. LV_TABIX
  LOOP AT it_vbap FROM lv_tabix ASSIGNING <lfs_vbap>.
*
* End the LOOP, when there is no more record with similar key
    IF <lfs_vbap>-vbeln <> <lfs_vbak>-vbeln.
      EXIT.
    ENDIF.
* Rest of the logic would go from here...
*
  ENDLOOP.
*
ENDLOOP.
*
* Get the end time
GET TIME STAMP FIELD lv_end_time.
*
* Actual time Spent:
lv_diff = lv_end_time - lv_start_time.
WRITE: /(50) 'Time Spent on Parallel Cursor Nested loops:', lv_diff.
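
One caveat on the measurement itself: a TIMESTAMPL value is a packed number in the form YYYYMMDDhhmmss.sssssss, so plain subtraction only yields a meaningful number of seconds while both stamps fall within the same minute. Below is a minimal sketch of an alternative, assuming a classic report and using GET RUN TIME FIELD, which returns elapsed microseconds as an integer. The report name, the variable names and the stand-in DO loop are made up for illustration.

```abap
REPORT ztest_runtime_sketch.  " hypothetical report name

DATA: lv_t0      TYPE i,
      lv_t1      TYPE i,
      lv_elapsed TYPE i,
      lv_dummy   TYPE i.

* The first call in the program sets the counter to 0;
* later calls return the microseconds elapsed since then
GET RUN TIME FIELD lv_t0.

* Code under test: a trivial loop stands in for the nested LOOPs
DO 100000 TIMES.
  lv_dummy = lv_dummy + 1.
ENDDO.

GET RUN TIME FIELD lv_t1.

* Plain integer subtraction is safe here
lv_elapsed = lv_t1 - lv_t0.
WRITE: / 'Elapsed microseconds:', lv_elapsed.
```

If you prefer to stay with time stamps, the class CL_ABAP_TSTMP offers a SUBTRACT method that returns the difference in seconds.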

Some statistics
I ran this program multiple times and captured these statistics.
Parallel Cursor – 2: without using READ

Parallel Cursor without using READ

In the previous post, Parallel Cursor – To speed up performance of Nested LOOP, we saw a
technique to speed up nested LOOP constructs in ABAP. In today’s post, we will see another
variant of the Parallel Cursor technique. Here, the inner LOOP is exited as soon as the two
keys no longer match, and the current LOOP index is saved in a variable first. On the next
pass of the outer LOOP, the inner LOOP starts from that saved index. Initially, the index
variable is set to 1. The statistics show that this variant is faster than the technique
from the previous post, which uses READ TABLE.

Here is the code snippet to achieve this Parallel Cursor technique:

*&---------------------------------------------------------------------*
*& Report ZTEST_NP_PARALLEL_CURSOR
*& Purpose: Illustration on how to use Parallel Cursor
*&---------------------------------------------------------------------*
*
REPORT ztest_np_parallel_cursor.
*
TYPES: ty_t_vbak TYPE STANDARD TABLE OF vbak.
DATA: it_vbak TYPE ty_t_vbak .
*
TYPES: ty_t_vbap TYPE STANDARD TABLE OF vbap.
DATA: it_vbap TYPE ty_t_vbap.
*
FIELD-SYMBOLS: <lfs_vbak> LIKE LINE OF it_vbak,
<lfs_vbap> LIKE LINE OF it_vbap.
*
* necessary data selection
SELECT * FROM vbak
INTO TABLE it_vbak
UP TO 100 ROWS.
CHECK it_vbak IS NOT INITIAL.
SELECT * FROM vbap
INTO TABLE it_vbap
FOR ALL ENTRIES IN it_vbak
WHERE vbeln = it_vbak-vbeln.
*
DATA: lv_start_time TYPE timestampl,
lv_end_time TYPE timestampl,
lv_diff TYPE timestampl.
DATA: lv_tabix TYPE i.
*
*....... Parallel Cursor with Nested Loop .......................
* Get the Start Time
GET TIME STAMP FIELD lv_start_time.
*
* Starting the Parallel Cursor
SORT: it_vbak BY vbeln,
it_vbap BY vbeln.
LOOP AT it_vbak ASSIGNING <lfs_vbak>.
*
* Read the second internal table with BINARY SEARCH
  READ TABLE it_vbap TRANSPORTING NO FIELDS
       WITH KEY vbeln = <lfs_vbak>-vbeln
       BINARY SEARCH.
* No item for this header: continue with the next header
  IF sy-subrc <> 0.
    CONTINUE.
  ENDIF.
* Get the TABIX number
  lv_tabix = sy-tabix.
* Start the LOOP from the first accessed record in
* previous READ i.e. LV_TABIX
  LOOP AT it_vbap FROM lv_tabix ASSIGNING <lfs_vbap>.
*
* End the LOOP, when there is no more record with similar key
    IF <lfs_vbap>-vbeln <> <lfs_vbak>-vbeln.
      EXIT.
    ENDIF.
* Rest of the logic would go from here...
*
  ENDLOOP.
*
ENDLOOP.
*
* Get the end time
GET TIME STAMP FIELD lv_end_time.
*
* Actual time Spent:
lv_diff = lv_end_time - lv_start_time.
WRITE: /(50) 'Time Spent on Parallel Cursor Nested loops:', lv_diff.
CLEAR: lv_start_time, lv_end_time, lv_diff.
*
*....... Parallel Cursor - 2 with Nested Loop ...................
CLEAR lv_tabix.
* Get the Start Time
GET TIME STAMP FIELD lv_start_time.
*
* Starting the Parallel Cursor
SORT: it_vbak BY vbeln,
it_vbap BY vbeln.
lv_tabix = 1. " Set the starting index 1
LOOP AT it_vbak ASSIGNING <lfs_vbak>.
*
* Start the nested LOOP from the index
  LOOP AT it_vbap FROM lv_tabix
       ASSIGNING <lfs_vbap>.
* Save index & Exit the loop, if the keys are not same
    IF <lfs_vbak>-vbeln <> <lfs_vbap>-vbeln.
      lv_tabix = sy-tabix.
      EXIT.
    ENDIF.
* Rest of the logic would go from here...
*
  ENDLOOP.
ENDLOOP.
*
* Get the end time
GET TIME STAMP FIELD lv_end_time.
*
* Actual time Spent:
lv_diff = lv_end_time - lv_start_time.
WRITE: /(50) 'Time Spent on Parallel Cursor 2 Nested loops', lv_diff.
CLEAR: lv_start_time, lv_end_time, lv_diff.

These statistics and the graph take the time used by the nested LOOP as 100%. For 1000
VBAK records, the parallel cursor technique with READ takes 1.84% of the time of the
nested LOOPs. The technique without READ TABLE takes only 1.05%.
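
One assumption behind the variant without READ is worth spelling out: every item row must have a matching header row. That holds in the program above, because IT_VBAP is selected with FOR ALL ENTRIES from IT_VBAK. If an item with a lower key than the current header ever appeared, the inner LOOP would exit on that row every time and never move past it. Below is a minimal sketch of a more defensive loop, assuming both tables are sorted by VBELN and the header keys are unique. The report name, the small local types and the sample values are invented for illustration.

```abap
REPORT ztest_parallel_cursor_sketch.  " hypothetical report name

TYPES: BEGIN OF ty_head,
         vbeln TYPE c LENGTH 10,
       END OF ty_head,
       BEGIN OF ty_item,
         vbeln TYPE c LENGTH 10,
         posnr TYPE n LENGTH 6,
       END OF ty_item.

DATA: lt_head  TYPE STANDARD TABLE OF ty_head,
      lt_item  TYPE STANDARD TABLE OF ty_item,
      ls_head  TYPE ty_head,
      ls_item  TYPE ty_item,
      lv_tabix TYPE i VALUE 1,
      lv_count TYPE i.

FIELD-SYMBOLS: <ls_head> TYPE ty_head,
               <ls_item> TYPE ty_item.

* Invented sample data: item 0000000001 has no header (orphan row)
ls_head-vbeln = '0000000002'. APPEND ls_head TO lt_head.
ls_head-vbeln = '0000000003'. APPEND ls_head TO lt_head.
ls_item-vbeln = '0000000001'. ls_item-posnr = '000010'. APPEND ls_item TO lt_item.
ls_item-vbeln = '0000000002'. ls_item-posnr = '000010'. APPEND ls_item TO lt_item.
ls_item-vbeln = '0000000002'. ls_item-posnr = '000020'. APPEND ls_item TO lt_item.
ls_item-vbeln = '0000000003'. ls_item-posnr = '000010'. APPEND ls_item TO lt_item.

SORT: lt_head BY vbeln,
      lt_item BY vbeln.

LOOP AT lt_head ASSIGNING <ls_head>.
  LOOP AT lt_item FROM lv_tabix ASSIGNING <ls_item>.
    IF <ls_item>-vbeln < <ls_head>-vbeln.
*     Orphan item with a lower key: skip it
      CONTINUE.
    ELSEIF <ls_item>-vbeln > <ls_head>-vbeln.
*     Item belongs to a later header: save the index and stop
      lv_tabix = sy-tabix.
      EXIT.
    ENDIF.
*   Keys match: the real item logic would go here
    lv_count = lv_count + 1.
  ENDLOOP.
ENDLOOP.

WRITE: / 'Matched items:', lv_count.
```

With the sample rows above, the orphan item is skipped and the three remaining items are matched to their headers.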
ABAP Internal Table Performance for STANDARD, SORTED and HASHED Table
By Naimesh Patel | February 25, 2013 | Performance | 47,489 | 2

The standard table is the most widely used table type. It has performance drawbacks if not
used properly.

A few months ago, I asked a question:

Your Preferred table kind for your ITAB

A total of 236 people voted in the poll.

Demo Program on ITAB Performance


Before we jump into the poll result, check out this small program and its performance result.
Here I’m comparing the performance of READ on internal tables: Standard vs Sorted vs
Hashed vs Standard with BINARY SEARCH.

Performance Comparison - ITAB TYPES

*&---------------------------------------------------------------------*
*& Purpose : Performance Comparison between READ on various TYPEs of
*& internal tables
*& Author : Naimesh Patel
*& URL : http://zevolving.com/?p=1861
*&---------------------------------------------------------------------*
REPORT ZTEST_NP_TABLE_TYPE_PERF.
*

TYPES:
BEGIN OF ty_vbpa,
vbeln TYPE vbpa-vbeln,
posnr TYPE vbpa-posnr,
parvw TYPE vbpa-parvw,
kunnr TYPE vbpa-kunnr,
END OF ty_vbpa.

DATA: t_std TYPE STANDARD TABLE OF ty_vbpa.


DATA: t_sorted TYPE SORTED TABLE OF ty_vbpa WITH UNIQUE KEY vbeln.
DATA: t_hashed TYPE HASHED TABLE OF ty_vbpa WITH UNIQUE KEY vbeln.

TYPES:
BEGIN OF ty_vbak,
vbeln TYPE vbak-vbeln,
found TYPE flag,
END OF ty_vbak.
DATA: t_vbak TYPE STANDARD TABLE OF ty_vbak.
DATA: lt_vbak TYPE STANDARD TABLE OF ty_vbak.
FIELD-SYMBOLS: <lfs_vbak> LIKE LINE OF t_vbak.

DATA: lv_flag TYPE flag,
      lv_sta_time TYPE timestampl,
      lv_end_time TYPE timestampl,
      lv_diff TYPE p DECIMALS 5.

DATA: lv_num_main TYPE i,
      lv_num_sub TYPE i.

START-OF-SELECTION.
lv_num_main = 50000. " Change for different number of records
lv_num_sub = lv_num_main / 2.
*
SELECT vbeln
FROM vbak
INTO TABLE t_vbak
UP TO lv_num_main ROWS.
*
SELECT vbeln posnr parvw kunnr
INTO TABLE t_std
FROM vbpa
UP TO lv_num_sub ROWS
FOR ALL ENTRIES IN t_vbak
WHERE vbeln = t_vbak-vbeln
AND parvw = 'AG'.

* Copying into the Sorted and Hashed table
t_sorted = t_std.
t_hashed = t_std.
*----
* READ on Standard Table
GET TIME STAMP FIELD lv_sta_time.
LOOP AT t_vbak ASSIGNING <lfs_vbak>.
READ TABLE t_std TRANSPORTING NO FIELDS
WITH KEY vbeln = <lfs_vbak>-vbeln.
IF sy-subrc EQ 0.
"..
ENDIF.
ENDLOOP.
GET TIME STAMP FIELD lv_end_time.
lv_diff = lv_end_time - lv_sta_time.
WRITE: /(30) 'Standard Table', lv_diff.
*
*----
* READ on Standard table with Binary Search
GET TIME STAMP FIELD lv_sta_time.
SORT t_std BY vbeln.
LOOP AT t_vbak ASSIGNING <lfs_vbak>.
READ TABLE t_std TRANSPORTING NO FIELDS
WITH KEY vbeln = <lfs_vbak>-vbeln
BINARY SEARCH.
IF sy-subrc EQ 0.
"..
ENDIF.
ENDLOOP.
GET TIME STAMP FIELD lv_end_time.
lv_diff = lv_end_time - lv_sta_time.
WRITE: /(30) 'Standard Table - Binary', lv_diff.
*
*----
* READ on Sorted Table
GET TIME STAMP FIELD lv_sta_time.
LOOP AT t_vbak ASSIGNING <lfs_vbak>.
READ TABLE t_sorted TRANSPORTING NO FIELDS
WITH KEY vbeln = <lfs_vbak>-vbeln.
IF sy-subrc EQ 0.
"..
ENDIF.
ENDLOOP.
GET TIME STAMP FIELD lv_end_time.
lv_diff = lv_end_time - lv_sta_time.
WRITE: /(30) 'Sorted Table', lv_diff.
*
*----
* READ on HASHED table
GET TIME STAMP FIELD lv_sta_time.
LOOP AT t_vbak ASSIGNING <lfs_vbak>.
READ TABLE t_hashed TRANSPORTING NO FIELDS
WITH TABLE KEY vbeln = <lfs_vbak>-vbeln.
IF sy-subrc EQ 0.
"..
ENDIF.
ENDLOOP.
GET TIME STAMP FIELD lv_end_time.
lv_diff = lv_end_time - lv_sta_time.
WRITE: /(30) 'Hashed Table', lv_diff.

I ran the program multiple times for different sets of records. Here are the average values
based on the performance readings:
READ-ONLY attribute vs GETTER methods
READ-ONLY is something new for people who have worked with OOP in other languages. Java,
for example, has no READ-ONLY addition. So there is a lively discussion in the SAP ABAP
Objects world about why READ-ONLY exists and what its purpose is. Let’s unleash
READ-ONLY (and its power over GETTER methods).

As good object-oriented programming suggests, we should hide our attributes from the
outside world and allow access to them only through PUBLIC methods. These methods
are:

• SETTER – To set the attribute. Usually this method has a single parameter, typically
named VALUE. The implementation sets the respective attribute from the parameter value.
We should use the naming convention SET_attr( ).

• GETTER – To retrieve the value of the attribute. This method should have a single
RETURNING parameter, typically named VALUE. The implementation passes the attribute
value back to the caller in that parameter. As with the SETTER methods, we should name
the GETTER methods GET_attr( ).

Now, let’s see what READ-ONLY is:

READ-ONLY is an addition for PUBLIC attributes. It allows the caller to read the
attribute but not to change its value. Refer to the keyword documentation on READ-ONLY
on SAP Help: READ-ONLY addition

The caller can simply access the attribute directly with the proper operator, like this:

*&---------------------------------------------------------------------*
*& Developer : Naimesh Patel
*& Purpose : READ-ONLY Demo
*&---------------------------------------------------------------------*
REPORT ztest_np.
*
CLASS lcl_example DEFINITION.
PUBLIC SECTION.
DATA: v_speed TYPE i READ-ONLY.
METHODS:
constructor IMPORTING speed TYPE i.
ENDCLASS. "lcl_example DEFINITION
*
CLASS lcl_example IMPLEMENTATION.
METHOD constructor.
me->v_speed = speed.
ENDMETHOD. "constructor
ENDCLASS. "lcl_example IMPLEMENTATION

START-OF-SELECTION.
DATA: o_test TYPE REF TO lcl_example.
CREATE OBJECT o_test
EXPORTING
speed = 10.
* o_test->v_speed = 100. " Syntax Error
WRITE: 'Accessing V_SPEED:', o_test->v_speed.

In simple terms, we can reduce the amount of coding by using the READ-ONLY addition
rather than creating a GET_attr( ) method for each attribute.

GETTER vs READ-ONLY Performance: Who wins?


I created this sample program to find out which is better. Here is the code:

*&---------------------------------------------------------------------*
*& Developer : Naimesh Patel
*& Purpose : READ-ONLY vs GETTER Performance
*&---------------------------------------------------------------------*
REPORT ztest_np.
*
CLASS lcl_car DEFINITION.
PUBLIC SECTION.
DATA: v_speed TYPE i READ-ONLY.
METHODS: get_speed RETURNING value(return) TYPE i.
METHODS: set_speed IMPORTING value TYPE i.
ENDCLASS. "lcl_car DEFINITION

*
CLASS lcl_car IMPLEMENTATION.
METHOD get_speed.
return = v_speed.
ENDMETHOD. "get_speed
METHOD set_speed.
v_speed = value.
ENDMETHOD. "set_speed
ENDCLASS. "lcl_Car IMPLEMENTATION
START-OF-SELECTION.
DATA: o_bmw TYPE REF TO lcl_car.
CREATE OBJECT o_bmw.
DATA: lv_speed TYPE i.
DATA: lv_temp TYPE i.

DATA: lv_flag TYPE flag,
      lv_sta_time TYPE timestampl,
      lv_end_time TYPE timestampl,
      lv_diff_w TYPE p DECIMALS 5,
      lv_diff_f LIKE lv_diff_w,
      lv_save LIKE lv_diff_w.

* Getter Performance
GET TIME STAMP FIELD lv_sta_time.
DO 1000 TIMES.
o_bmw->set_speed( sy-index ).
lv_temp = lv_temp + o_bmw->get_speed( ).
ENDDO.
GET TIME STAMP FIELD lv_end_time.
lv_diff_w = lv_end_time - lv_sta_time.
WRITE: /(15) 'With getter', lv_diff_w.

CLEAR: lv_sta_time, lv_end_time, lv_temp.

* READ-ONLY Performance
GET TIME STAMP FIELD lv_sta_time.
DO 1000 TIMES.
o_bmw->set_speed( sy-index ).
lv_temp = lv_temp + o_bmw->v_speed.
ENDDO.
GET TIME STAMP FIELD lv_end_time.
lv_diff_f = lv_end_time - lv_sta_time.
WRITE: /(15) 'With READ-ONLY', lv_diff_f.

* Saved time
lv_save = lv_diff_w - lv_diff_f.
WRITE: /(15) 'Saved time', lv_save.

When I ran the report, it showed that we can save around 67% of the time by using
READ-ONLY compared to a GETTER( ) method. If the GETTER( ) method takes 100% of the
time, direct attribute access finishes the job in about 33%. This is useful when designing
a complex system that calls many GETTER methods.
FOR ALL ENTRIES – Why you need to include KEY fields
By Naimesh Patel | March 20, 2014 | Performance | 28,228 | 3

ABAP FOR ALL ENTRIES is handy, but it can create a lot of data inconsistencies if you don’t
use it properly.

Basic
In the simplest use of FOR ALL ENTRIES, you write your SELECT query with FOR ALL
ENTRIES and use one or more fields from the internal table in the WHERE condition.

IF t_ids IS NOT INITIAL.
  SELECT *
    INTO TABLE t_t100_all
    FROM t100
    FOR ALL ENTRIES IN t_ids
    WHERE arbgb LIKE '0%'
    AND msgnr = t_ids-table_line.
ENDIF.

Problem
When your table has few key fields, you generally tend to select them even though you
don’t need them. But when there are many key fields, you tend not to include them in your
internal table and subsequently in your SELECT query.

For example, let’s look at this query:

Incorrect FOR ALL ENTRIES

SELECT ryear
drcrk
rpmax
rtcur
racct
rbukrs
rcntr
kokrs
hsl01
hsl02
hsl03
hsl04
hsl05
hsl06
hsl07
hsl08
hsl09
hsl10
hsl11
hsl12
FROM faglflext
INTO TABLE lt_fagl
FOR ALL ENTRIES IN lt_skb1
WHERE ryear IN lr_gjahr
AND racct = lt_skb1-saknr
AND rbukrs = lt_skb1-bukrs.

Since this query doesn’t select all the key fields, you will run into issues when you try
to reconcile its result with a standard transaction such as FAGLB03. The issue shows up
when you have the same posting for the same GL account and the same cost center in
different periods. In that case, the system brings back only one entry instead of several.
The standard transaction, on the other hand, gets all the entries and shows a different total.

When you debug the standard transaction and reach its SELECT, you think: this is the
same query I have, so why doesn’t mine work? So you download the data from both ITABs,
your program’s and the standard transaction’s. You compare them and notice that a few
entries are definitely missing. After a few rounds of VLOOKUP and comparing in Excel, you
suddenly realize that FOR ALL ENTRIES is dropping a few entries. By that time, you are
already an expert in Excel.

To fix this, you then realize that you need to include more key fields in the SELECT.
The next hurdle is to explain to your functional counterpart and your users why it was
not working.

If this program was not developed by you, I can imagine how much you hate that person
once you realize the root cause. A silly mistake! On the bright side, you are now an Excel
expert.

Why you need Key fields

To improve performance, you definitely want unique values in the ITAB that you use for
FOR ALL ENTRIES. If you don’t pass unique values, the DB reselects the records for each
duplicate value. Finally, the duplicates are removed from the result set before it is
handed to you.

This removal of duplicates creates data inconsistencies if the key fields are not part of
the field list of your SELECT query. If you assume that the fields you select will always
make up a unique value without the key fields, you are inviting trouble in the future.

Adding the key fields of the table ensures that all the selected records are unique. If
the SELECT is a join, make sure you include all the key fields from all the tables included
in the join.
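
The first point, unique values in the driver table, can be handled before the SELECT. Below is a minimal sketch, assuming a driver table of message numbers like the T_IDS table used in this post. It sorts the driver table and removes duplicate values so that FOR ALL ENTRIES does not process the same value twice. The report name and the LT_ names are hypothetical.

```abap
REPORT ztest_fae_driver_sketch.  " hypothetical report name

DATA: lt_ids    TYPE STANDARD TABLE OF t100-msgnr,
      lt_t100   TYPE STANDARD TABLE OF t100,
      lv_ids    TYPE i,
      lv_result TYPE i.

APPEND '001' TO lt_ids.
APPEND '002' TO lt_ids.
APPEND '001' TO lt_ids.   " duplicate value

* Make the driver table unique before using it
SORT lt_ids.
DELETE ADJACENT DUPLICATES FROM lt_ids.

* Guard against an empty driver table: with no entries,
* the WHERE condition is ignored and all rows are read
IF lt_ids IS NOT INITIAL.
  SELECT *
    INTO TABLE lt_t100
    FROM t100
    FOR ALL ENTRIES IN lt_ids
    WHERE arbgb LIKE '0%'
      AND msgnr = lt_ids-table_line.
ENDIF.

lv_ids    = lines( lt_ids ).
lv_result = lines( lt_t100 ).
WRITE: / 'Driver entries:', lv_ids,
       / 'Rows selected:', lv_result.
```

Because SELECT * includes every key field of T100, no result rows are lost in this sketch. The de-duplication here only concerns the driver table.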

Compare
Let me show you this simple example:

T100 SELECT without SPRSL (language key)

TYPES:
BEGIN OF ty_t100,
arbgb TYPE t100-arbgb,
msgnr TYPE t100-msgnr,
text TYPE t100-text,
END OF ty_t100.

DATA: t_ids TYPE STANDARD TABLE OF t100-msgnr.
DATA: t_t100_all TYPE STANDARD TABLE OF t100.
DATA: t_t100 TYPE STANDARD TABLE OF ty_t100.

APPEND '001' TO t_ids.
APPEND '002' TO t_ids.

IF t_ids IS NOT INITIAL.
  SELECT arbgb
         msgnr
         text "comment out to see even more records dropped
    INTO TABLE t_t100
    FROM t100
    FOR ALL ENTRIES IN t_ids
    WHERE arbgb LIKE '0%'
    AND msgnr = t_ids-table_line.
  WRITE: / 'Without All Key Fields', sy-dbcnt.
ENDIF.

Now, the same code with all the key fields:

T100 with SPRSL (with all key fields)

IF t_ids IS NOT INITIAL.
  SELECT *
    INTO TABLE t_t100_all
    FROM t100
    FOR ALL ENTRIES IN t_ids
    WHERE arbgb LIKE '0%'
    AND msgnr = t_ids-table_line.
  WRITE: / 'With ALL Key Fields', sy-dbcnt.
ENDIF.

If you comment out the TEXT field in the first query, you will see even more records being
dropped. This is because only unique records are retained: each selected row is compared
with all other rows across all selected fields, and identical rows are dropped, just like
DELETE ADJACENT DUPLICATES FROM itab COMPARING ALL FIELDS.
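
That comparison can be reproduced on an internal table without touching the database. Below is a small sketch with invented sample rows, assuming two T100 entries that differ only in the language key, which is not part of the selected structure. The report name, the type name and the texts are hypothetical.

```abap
REPORT ztest_fae_dup_sketch.  " hypothetical report name

TYPES: BEGIN OF ty_msg,
         arbgb TYPE t100-arbgb,
         msgnr TYPE t100-msgnr,
         text  TYPE t100-text,
       END OF ty_msg.

DATA: lt_msg   TYPE STANDARD TABLE OF ty_msg,
      ls_msg   TYPE ty_msg,
      lv_lines TYPE i.

* Invented rows: the same message in two languages with identical text.
* Without the language key the two rows cannot be told apart.
ls_msg-arbgb = '00'.
ls_msg-msgnr = '001'.
ls_msg-text  = 'Sample text A'.
APPEND ls_msg TO lt_msg.   " stands for the row in language 1
APPEND ls_msg TO lt_msg.   " stands for the row in language 2

ls_msg-msgnr = '002'.
ls_msg-text  = 'Sample text B'.
APPEND ls_msg TO lt_msg.

* Same effect as the duplicate removal of FOR ALL ENTRIES
SORT lt_msg.
DELETE ADJACENT DUPLICATES FROM lt_msg COMPARING ALL FIELDS.

lv_lines = lines( lt_msg ).
WRITE: / 'Rows left:', lv_lines.
```

Three rows go in and two remain, because the two identical rows collapse into one.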

Based on entries in system where I executed this:

Conclusion
When you use FOR ALL ENTRIES, among other things, make sure that all the key fields are
in the internal table and in the field list of your SELECT, to avoid data
inconsistencies.
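
As a minimal sketch of that rule for the T100 example, assuming you want an explicit field list rather than SELECT *: the structure below adds the language key SPRSL so that, together with ARBGB and MSGNR, every primary key field of T100 is selected. The report name, the type name and the LT_ names are hypothetical.

```abap
REPORT ztest_fae_keys_sketch.  " hypothetical report name

TYPES: BEGIN OF ty_t100_key,
         sprsl TYPE t100-sprsl,   " language key, part of the primary key
         arbgb TYPE t100-arbgb,
         msgnr TYPE t100-msgnr,
         text  TYPE t100-text,
       END OF ty_t100_key.

DATA: lt_ids  TYPE STANDARD TABLE OF t100-msgnr,
      lt_t100 TYPE STANDARD TABLE OF ty_t100_key.

APPEND '001' TO lt_ids.
APPEND '002' TO lt_ids.

IF lt_ids IS NOT INITIAL.
* All key fields are in the field list, so no two result rows
* can be identical and none is dropped as a duplicate
  SELECT sprsl arbgb msgnr text
    INTO TABLE lt_t100
    FROM t100
    FOR ALL ENTRIES IN lt_ids
    WHERE arbgb LIKE '0%'
      AND msgnr = lt_ids-table_line.
  WRITE: / 'With explicit key fields', sy-dbcnt.
ENDIF.
```

On the same system, the record count should match the SELECT * variant, while only the fields you actually need are transferred.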
