Vous êtes sur la page 1sur 6

Converting Complex Procedure Logic to Set SQL Using Derived Tables

Page 1 of 6

Published on Teradata Developer Exchange (http://developer.teradata.com)


Home > Blogs > GeorgeColeman's blog > Converting Complex Procedure Logic to Set SQL Using Derived Tables

Converting Complex Procedure Logic to Set SQL Using Derived Tables


By GeorgeColeman Created Jul 10 2011 - 4:17pm

Print All Blogs GeorgeColeman's Blog

Converting Complex Procedure Logic to Set SQL Using Derived Tables


Blog entry by GeorgeColeman on 10 Jul 2011 comments array logic set sql sql sqlpl t-sql cursor Loop plsql set When Flat SQL looping logic involves complex processing of data elements from multiple tables, the loops can almost always be transformed into Set SQL with derived tables. By eliminating looping logic, the performance benefits of parallel processing can be exploited. Derived tables provide an easy technique to copy logic that is expressed in a cursor or a simple select statement in Flat SQL and paste it into Set SQL as part of the set logic. Expressions like this need to be modified slightly to enable joining of the derived tables to other tables in the Set SQL statement. Sometimes a derived table created in this way is so simple that it can be eliminated from the Set SQL by rewriting it as a join to a base table. In other cases, as the following example demonstrates, the derived table performs a necessary function that cannot be streamlined further. This is an example of a PL/SQL procedure that uses data from more than one table to make decisions and compute an updated quantity for certain rows in a target table. Other rows in the target table are deleted.
DECLARE CURSOR C1 IS SELECT RowID, LLID, QtyOrd, DistQtyOrd, ShipQty, Unit_Price, DistID FROM UNINVOICED WHERE PER_KEY = :PerKey; DIST_AMT NUMBER(22,5); QTY_REC NUMBER(22,5); AP_AMT NUMBER(22,5); BEGIN FOR UN_LIST IN C1 LOOP

http://developer.teradata.com/print/8349

7/31/2011

Converting Complex Procedure Logic to Set SQL Using Derived Tables


-- SUM Invoice Dist SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE) ,1,0, NVL(QtyInvoiced,0))) INTO AP_AMT FROM INVOICE_DIST WHERE DistID = UN_LIST.DistID; END IF; -- SUM RCV Txn SELECT SUM(NVL(RCT.DocQty,0)) INTO QTY_REC FROM RCV_TRANSACTIONS RCT WHERE RCT.TXN_TYPE IN ('RECEIVE','MATCH') AND RCT.TXN_DATE < :END_DATE + 1 AND PO_LLID = UN_LIST.LLID; TOT_QTY := QTY_REC; IF (TOT_QTY > AP_AMT) THEN IF (UN_LIST.ShipQty = 0) THEN DIST_AMT := UN_LIST.Unit_Price * ( (TOT_QTY - AP_AMT) * (UN_LIST.DistQtyOrd / UN_LIST.QtyOrd) ); IF (DIST_AMT != 0) THEN UPDATE UNINVOICED SET func_curr_dist_amount = DIST_AMT WHERE RowID = UN_LIST.RowID; ELSE DELETE FROM UNINVOICED WHERE RowID = UN_LIST.RowID; END IF; ELSE DELETE FROM UNINVOICED WHERE RowID = UN_LIST.RowID; END IF; END IF; END LOOP; END;

Page 2 of 6

To run this procedure efficiently on Teradata, the procedural logic has to be converted to Set SQL. Begin by locating the first insert, update or delete statement. At the end of this procedure is an UPDATE UNINVOICED statement. Following are two Delete statements, but the Update statement has to be handled first. Note that there is only one cursor in this procedure, and that cursor selects from the table that is being Updated and Deleted. The normal approach for translating Insert statements is to copy and paste all necessary cursors into derived table expressions in the Set SQL. For Update and Delete statements it is not usually necessary to copy the cursor on the table being updated, because Flat SQL tends to use that cursor only as a placeholder for doing a single-row update or delete. The goal in Set SQL is to avoid single-row updates and deletes. http://developer.teradata.com/print/8349 7/31/2011

Converting Complex Procedure Logic to Set SQL Using Derived Tables

Page 3 of 6

Under certain conditions the DIST_AMT column in the UNINVOICED table is updated with values that come from three sources: the UNINVOICED table itself (by way of the Cursor C1) and sums from two other tables. In the Set SQL, these two sums can be expressed as derived tables. Each of these sums is computed within the loop on the Cursor C1, based on values that are in the current row of the UNINVOICED table. Therefore, we have a way to join UNINVOICED to the two derived tables in the Set SQL. The first derived table summarizes INVOICE_DIST:
(SELECT SUM(DECODE( SIGN(AcctgDt - :END_DATE) ,1,0, NVL(QtyInvoiced,0))) INTO AP_AMT FROM INVOICE_DIST WHERE DistID = UN_LIST.DistID) SumInvoice

Note that the derived table needs an alias. A term like SumInvoice helps to remind the reader what this does. The DECODE expression can be simplified with a CASE operation as follows:
(SELECT SUM ( ( case when AcctgDt > :END_DATE then 0 else ( coalesce ( QtyInvoiced ,0 ) ) ) end ) INTO AP_AMT FROM INVOICE_DIST WHERE DistID = UN_LIST.DistID) SumInvoice

There are a couple of things in this expression that need to be modified before it can be used as a derived table in Set SQL. First, a derived table cannot select INTO a host variable; rather, the result of this sum is going to be used in a higher expression, so it must be aliased to something that can be referenced outside the derived table. Second, the DistID in "UN_LIST" is not available for a predicate within the derived table; therefore, DistID must be added to the select list in this derived table so it can be used for a join in the outer expression. A useable form of the derived table would look like this:
(SELECT SUM ( ( case when AcctgDt > :END_DATE then 0 else ( coalesce ( QtyInvoiced, 0 ) ) ) end ) ,DIST_ID GROUP BY DIST_ID FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID )

Now AP_AMT is available for use in the outer expression. The other calculated value needed for the Update statement is TOT_QTY. TOT_QTY comes from QTY_REC, which comes form another summarization:
-- SUM RCV Txn SELECT SUM(NVL(RCT.DocQty,0)) INTO QTY_REC FROM RCV_TRANSACTIONS RCT WHERE RCT.TXN_TYPE IN ('RECEIVE','MATCH')

http://developer.teradata.com/print/8349

7/31/2011

Converting Complex Procedure Logic to Set SQL Using Derived Tables


AND RCT.TXN_DATE < :END_DATE + 1 AND PO_LLID = UN_LIST.LLID;

Page 4 of 6

As with AP_AMT, this can be expressed as a derived table in this form :


(SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) ) ,PO_LLID GROUP BY PO_LLID FROM RCV_TRANSACTIONS WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' ) AND RCT.TXN_DATE < :END_DATE + 1 ) RcvTxn ( QTY_REC, PO_LLID )

Here PO_LLID will be used as a column to join to UNINVOICED. Putting the derived tables together and adding the join predicates to the Update statement produces this:
UPDATE UNINVOICED UN_LIST FROM ( SELECT SUM ( ( case when AcctgDt > :END_DATE then 0 else ( coalesce ( QtyInvoiced, 0 ) ) ) end ) ,DIST_ID GROUP BY DIST_ID FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID ) ,( SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) ) ,PO_LLID GROUP BY PO_LLID FROM RCV_TRANSACTIONS WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' ) AND RCT.TXN_DATE < :END_DATE + 1 ) RcvTxn ( QTY_REC, PO_LLID ) WHERE SumInvoice.DistID = UN_LIST.DistID AND RcvTxn.PO_LLID = UN_LIST.LLID

There is a condition on the target table which is expressed in the cursor in original flat logic: WHERE PER_KEY = :PerKey - this must be added to the final predicate. There are also conditions under which the update is to take place, and these too must be expressed in the predicate. Finally, note that there is a complex computation for func_curr_dist_amount. This expression must be copied into the SET clause of the Update statement. Putting all this together produces:

http://developer.teradata.com/print/8349

7/31/2011

Converting Complex Procedure Logic to Set SQL Using Derived Tables


UPDATE UNINVOICED UN_LIST FROM ( SELECT SUM ( ( case when AcctgDt > :END_DATE then 0 else ( coalesce ( QtyInvoiced, 0 ) ) ) end ) ,DIST_ID GROUP BY DIST_ID FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID ) ,( SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) ) ,PO_LLID GROUP BY PO_LLID FROM RCV_TRANSACTIONS WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' ) AND RCT.TXN_DATE < :END_DATE + 1 ) RcvTxn ( QTY_REC, PO_LLID ) WHERE SumInvoice.DistID = UN_LIST.DistID AND RcvTxn.PO_LLID = UN_LIST.LLID AND AND AND AND UN_LIST.PER_KEY = :PerKey RcvTxn.QTY_REC > SumInvoice.AP_AMT UN_LIST.ShipQty = 0 UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT ) * ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd ) <> 0

Page 5 of 6

SET func_curr_dist_amount = UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT ) * ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd )

Now that the Set SQL statement is complete, it can be scanned for ways to simplify it. Sometimes the logic in case operations and where clauses can be reduced to shorter statements. In this example, we already simplified the one complex Decode function. Another way to simplify SQL is to eliminate simple derived tables, but these derived tables are not that simple: they are performing necessary aggregate functions. This statement cannot be streamlined any further. Meanwhile, back to the procedure at hand, there are two more SQL statements that need to be handled. An examination of the logic shows that Delete statements are executed whenever either ShipQty is not 0 or DIST_AMT is 0. All the other data sources, logic and predicates are the same. So this requires a Delete statement that is similar to the Update statement but with a slightly different predicate:
DELETE FROM UNINVOICED U WHERE U.PER_KEY = :PerKey AND U.LLID IN ( Select LLID FROM UNINVOICED UN_LIST ,(

http://developer.teradata.com/print/8349

7/31/2011

Converting Complex Procedure Logic to Set SQL Using Derived Tables


SELECT SUM ( ( case when AcctgDt > :END_DATE then 0 else ( coalesce ( QtyInvoiced, 0 ) ) ) end ) ,DIST_ID GROUP BY DIST_ID FROM INVOICE_DIST ) SumInvoice ( AP_AMT, DIST_ID ) ,( SELECT SUM ( ( coalesce(RCT.DocQty, 0) ) ) ,PO_LLID GROUP BY PO_LLID FROM RCV_TRANSACTIONS WHERE RCT.TXN_TYPE IN ( 'RECEIVE' , 'MATCH' ) AND RCT.TXN_DATE < :END_DATE + 1 ) RcvTxn ( QTY_REC, PO_LLID ) WHERE SumInvoice.DistID = UN_LIST.DistID AND RcvTxn.PO_LLID = UN_LIST.LLID AND RcvTxn.QTY_REC > SumInvoice.AP_AMT

Page 6 of 6

AND ( UN_LIST.ShipQty <> 0 OR UN_LIST.Unit_Price * ( RcvTxn.QTY_REC - SumInvoice.AP_AMT ) * ( UN_LIST.DistQtyOrd / UN_LIST.QtyOrd ) = 0 ) )

It may seem inefficient that this set logic has to issue two DML statements, an Update and a Delete, which results in two scans of the UNINVOICED table. The flat logic is designed to scan it just once. However, Teradata will execute these two Set SQL statements many times faster than it would execute the given procedure with its cursor-based loop. That is the power of massive parallel processing, which breaks each task into tens, hundreds or thousands of smaller pieces that can all be executed concurrently, and thus reduces the time to process large amounts of data by orders of magnitude. The more complex flat logic is, the harder it is to visualize at first glance just how it could be expressed in Set SQL. The simple approach is to start developing the set logic from the flat logic with a step-by-step process as shown here. Once the Set SQL is complete, it is good practice to compare the Flat SQL and Set SQL side by side. This not only helps the developer verify that the two logics accomplish the same thing, it also helps train the mind to more rapidly conceive solutions in set logic. In a future post we will examine an example of flat logic that is so complex it seems to defy the step-by-step approach to the translation to set logic. Sometimes, one has to just study the code until there is sufficient understanding of its intent and then rewrite it in Set SQL from scratch. array cursor Loop plsql set logic set sql sql sqlpl t-sql

Source URL (retrieved on Jul 31 2011 - 2:02pm): http://developer.teradata.com/blog/georgecoleman/2011/07/converting-complex-procedure-logic-to-set-sql-usingderived-tables

http://developer.teradata.com/print/8349

7/31/2011

Vous aimerez peut-être aussi