
Building a Transactional Delta Table with SAP Data Services

What is a transactional delta table?


A transactional delta table is a way to look back at the history of your data and determine what changes
have been made to a particular transaction over time. It will show when a transaction was initially
created, and then how and when that transaction changed after the initial creation. Building one with
SAP Data Services is a simple process that I’ll outline using a fact table containing invoice data.

There are a few important components to a delta table. The first is that nothing is ever deleted from a
delta table regardless of what happened in the source transactional table. Instead, a “negation” record
is inserted into the delta table, which effectively zeroes out the sum of the value fields for that record.
For example, if I have an invoice for 10 widgets that gets deleted from the source table, I will insert a
new record into the delta table that contains the key information for that invoice as well as the negative
quantity of the original record. In this case, that will be a record with the same invoice number and line
item and a quantity of -10. If I were to perform a query of the delta table that sums up the quantity for
that invoice, the sum of the two records will return zero.

Similarly, if a change occurs to a record I need to do two things for the delta table: insert a new record
with the updated information of that record and then insert a “negation” record. If in the previous
example I changed the quantity of the invoice from 10 widgets to 5, I will need to insert a new record
with a quantity of 5 and insert a negation record with a quantity of -10. Performing a query on the delta
table to sum up the quantity on that invoice will return the sum of those three records, which is 5.
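
The arithmetic in the two examples above can be checked with a short sketch. It is only an illustration in Python, not part of the Data Services job; the invoice number 1000, the line number 1, and the tuple layout are assumptions chosen for the example.

```python
# Each delta row is (invoice_number, line, quantity); this layout is hypothetical.

def net_quantity(delta_rows, invoice_number, line):
    """Sum the quantity of every delta row belonging to one invoice line."""
    return sum(q for inv, ln, q in delta_rows if (inv, ln) == (invoice_number, line))

# Deletion: the original row plus its negation row.
deleted = [(1000, 1, 10), (1000, 1, -10)]

# Update from 10 to 5: the original row, the negation row, and the new row.
updated = [(1000, 1, 10), (1000, 1, -10), (1000, 1, 5)]

print(net_quantity(deleted, 1000, 1))  # 0
print(net_quantity(updated, 1000, 1))  # 5
```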

So what is the point of constructing a delta table? The real beauty comes into play when a record in the
source table has been changed multiple times. A transactional table such as an invoice table will
commonly carry a “Last Update Date” field, but that doesn’t tell you how many times that record was
changed or what exactly was changed on it. A delta table can give you both of these things to a large
degree. So, the second component to a delta table is a date field that signifies the insertion date of the
delta record into the delta table. Typically a delta table will be appended to about once per day, so
either the “Last Update Date” field can be used from the source or the ETL process can generate the
date and time that the process was run and use that in the record. It is important to note that the
records being inserted into the delta table carry all changes that occurred in the source table since the
last time the process was run to append to the delta table. So while the delta table won’t carry every
change that occurred in the source, when run on a schedule it will contain the changes that have
occurred to the record since the last scheduled run. This is insightful and powerful information when
the process runs regularly.
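
To illustrate how the insertion date exposes history, here is a minimal Python sketch. The dates, the field order, and the function names are hypothetical; the quantities mirror an invoice line whose quantity changes from 5 to 10 between two runs. Detecting a change by looking for a negative quantity assumes that the original quantities are positive.

```python
from datetime import date

# (invoice_number, line, quantity, insert_date); the dates are made up for illustration.
delta_rows = [
    (1001, 2, 5, date(2024, 1, 1)),    # initial load
    (1001, 2, -5, date(2024, 1, 2)),   # negation of the old version
    (1001, 2, 10, date(2024, 1, 2)),   # updated version
]

def quantity_as_of(rows, invoice_number, line, as_of):
    """Net quantity of one invoice line using only rows inserted on or before as_of."""
    return sum(q for inv, ln, q, d in rows
               if (inv, ln) == (invoice_number, line) and d <= as_of)

def change_dates(rows, invoice_number, line):
    """Insert dates on which a negation row (negative quantity) was written."""
    return sorted({d for inv, ln, q, d in rows
                   if (inv, ln) == (invoice_number, line) and q < 0})

print(quantity_as_of(delta_rows, 1001, 2, date(2024, 1, 1)))  # 5
print(quantity_as_of(delta_rows, 1001, 2, date(2024, 1, 2)))  # 10
print(len(change_dates(delta_rows, 1001, 2)))                 # 1
```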

For sample data, we will use the following three tables. Table 1 is a view of the invoice line fact table on
Day 1, Table 2 is a view of the invoice line fact table on Day 2, and Table 3 is a view of the invoice line
delta table on Day 2.

Invoice Number  Line  Customer  Item  Quantity  Price
1000            1     A         Z     10        100
1001            1     B         Y     20        500
1001            2     B         X     5         20
1001            3     B         W     15        300
1002            1     C         V     3         60
1002            2     C         U     25        50
Table 1 Invoice Line Fact table on Day 1

Invoice Number  Line  Customer  Item  Quantity  Price
1000            1     A         Z     10        100
1001            1     B         Y     20        500
1001            2     B         X     10        40
1001            3     B         W     15        300
1002            1     C         V     4         80
1003            1     D         T     20        20
1003            2     D         S     40        100
Table 2 Invoice Line Fact table on Day 2

Invoice Number  Line  Customer  Item  Quantity  Price
1000            1     A         Z     10        100
1001            1     B         Y     20        500
1001            2     B         X     5         20
1001            2     B         X     -5        -20
1001            2     B         X     10        40
1001            3     B         W     15        300
1002            1     C         V     3         60
1002            1     C         V     -3        -60
1002            1     C         V     4         80
1002            2     C         U     25        50
1002            2     C         U     -25       -50
1003            1     D         T     20        20
1003            2     D         S     40        100
Table 3 Invoice Line Delta table on Day 2

The rows with negative quantities are the “source record negated” insertion records. The rows for
invoice 1001 line 2 with quantity 10 and invoice 1002 line 1 with quantity 4 are the “source record
updated” insertion records, and the two rows for invoice 1003 are the “new source record” insertion
records. Notice that each updated source record produces two insertions (a negation record and a record
with the new values), while the deleted source record, invoice 1002 line 2, produces a single negation
record.

Constructing a transactional delta table

The first step to creating a delta table is to make a straight copy of the fact table into the delta table.
This should only be done once as an initial setup step to populate the “base” data of the table. A simple
SAP Data Services dataflow like the one below can accomplish this.

Figure 1 Initial dataflow for Fact Invoice Line Delta

If we run this dataflow on Day 1 after the invoice fact table has been updated, both the
FACT_INVOICE_LINE table and FACT_INVOICE_LINE_DELTA table contain the data that is in Table 1.

The next step is similar to the first, but now we will copy the current fact table into a work table. This
should occur as often as you update your fact table and should take place directly before that update. A
simple dataflow like the one below accomplishes this.

Figure 2 Clear out all data from the work table before loading it

If we run this dataflow on Day 2 before the invoice fact table is updated, both the FACT_INVOICE_LINE
table and WT_FACT_INVOICE_LINE_COPY work table contain the data that exists in Table 1 above.
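
The clear-then-copy pattern named in the Figure 2 caption can be sketched with Python's built-in sqlite3 module. This is only an illustration of the pattern, not how the Data Services dataflow is defined; the column names, the column types, and the refresh_work_table function are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = ("invoice_number INTEGER, line INTEGER, customer TEXT, "
        "item TEXT, quantity INTEGER, price INTEGER")
conn.execute(f"CREATE TABLE FACT_INVOICE_LINE ({cols})")
conn.execute(f"CREATE TABLE WT_FACT_INVOICE_LINE_COPY ({cols})")

# Load the fact table with the Day 1 rows from Table 1.
conn.executemany(
    "INSERT INTO FACT_INVOICE_LINE VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1000, 1, "A", "Z", 10, 100),
        (1001, 1, "B", "Y", 20, 500),
        (1001, 2, "B", "X", 5, 20),
        (1001, 3, "B", "W", 15, 300),
        (1002, 1, "C", "V", 3, 60),
        (1002, 2, "C", "U", 25, 50),
    ],
)

def refresh_work_table(conn):
    """Empty the work table, then copy the current fact table into it."""
    with conn:  # commit both statements together
        conn.execute("DELETE FROM WT_FACT_INVOICE_LINE_COPY")
        conn.execute(
            "INSERT INTO WT_FACT_INVOICE_LINE_COPY SELECT * FROM FACT_INVOICE_LINE"
        )

refresh_work_table(conn)
refresh_work_table(conn)  # a second run does not duplicate rows

count = conn.execute("SELECT COUNT(*) FROM WT_FACT_INVOICE_LINE_COPY").fetchone()[0]
print(count)  # 6
```

Clearing the work table first is what keeps the copy an exact snapshot of the fact table as it stood before the update.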

Next, you should update the fact table from the source so that FACT_INVOICE_LINE now contains the
data that is in Table 2. I’ll assume that an ETL process already exists for this and move forward to the
third step: inserting records correctly into the invoice delta table. We’ll need to A) compare the newly
updated fact table with our work table and insert any new or updated source records into the delta
table. Next we’ll B) create a work table to store any updated or deleted source records so that we can
C) insert the negation records into the delta table for the updated or deleted source records. Each of
steps A, B, and C gets its own dataflow, and the three dataflows are placed in series within a workflow.

Let’s break down each of these dataflows. The first, DF_FACT_INVOICE_LINE_DELTA_1_D, compares
the newly updated fact table with our work table and inserts any new or updated source records into the
delta table. So, we use the invoice fact table as the source and perform a Table Compare against the
work table made in the second step. We are effectively comparing the fact table to what it contained
before it was updated, which is yesterday’s data in our case. A Map Operation maps Insert and Update
records to Normal while Normal and Delete records are discarded. The query that follows passes all of
the columns through to a second Map Operation and also populates an Insert Date with the sysdate()
function. The final Map Operation maps the rows to Insert and uses the invoice delta table as a target.
The invoice delta table now contains:

Invoice Number  Line  Customer  Item  Quantity  Price
1000            1     A         Z     10        100
1001            1     B         Y     20        500
1001            2     B         X     5         20
1001            3     B         W     15        300
1002            1     C         V     3         60
1002            2     C         U     25        50
1001            2     B         X     10        40
1002            1     C         V     4         80
1003            1     D         T     20        20
1003            2     D         S     40        100
Table 4 The last four rows have been inserted into the invoice delta table
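
The row selection performed by the first dataflow can be imitated in plain Python. This sketch is not the Table Compare transform itself; the dictionaries, the classify function, and the lowercase labels are hypothetical stand-ins for the Insert, Update, Normal, and Delete outcomes described above.

```python
# Rows keyed by (invoice_number, line); values are (customer, item, quantity, price).
previous = {  # WT_FACT_INVOICE_LINE_COPY, i.e. Table 1
    (1000, 1): ("A", "Z", 10, 100),
    (1001, 1): ("B", "Y", 20, 500),
    (1001, 2): ("B", "X", 5, 20),
    (1001, 3): ("B", "W", 15, 300),
    (1002, 1): ("C", "V", 3, 60),
    (1002, 2): ("C", "U", 25, 50),
}
current = {  # FACT_INVOICE_LINE after the update, i.e. Table 2
    (1000, 1): ("A", "Z", 10, 100),
    (1001, 1): ("B", "Y", 20, 500),
    (1001, 2): ("B", "X", 10, 40),
    (1001, 3): ("B", "W", 15, 300),
    (1002, 1): ("C", "V", 4, 80),
    (1003, 1): ("D", "T", 20, 20),
    (1003, 2): ("D", "S", 40, 100),
}

def classify(current, previous):
    """Label every key as insert, update, normal, or delete."""
    ops = {}
    for key, row in current.items():
        if key not in previous:
            ops[key] = "insert"
        elif row != previous[key]:
            ops[key] = "update"
        else:
            ops[key] = "normal"
    for key in previous:
        if key not in current:
            ops[key] = "delete"
    return ops

ops = classify(current, previous)

# Keep only insert and update rows; these are the rows appended to the delta table.
rows_for_delta = [key + current[key]
                  for key, op in ops.items() if op in ("insert", "update")]
for row in rows_for_delta:
    print(row)
```

The four rows printed match the last four rows of Table 4.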

In the second dataflow, we need to create a work table to store any updated or deleted source records.
DF_FACT_INVOICE_LINE_DELTA_2_1_D is very similar to the first in general structure; however, after the
Table Compare we will now map Update and Delete records to Normal and discard the Normal and
Insert records. The query following the first Map Operation passes through only the key columns of the
invoice fact table. The second Map Operation inserts these records into another work table which is
essentially just the list of invoice fact records that have been updated or deleted. We’ll use this list in
the next dataflow to perform the “negation” of that record that is in the invoice delta table.

The work table WT_FACT_INVOICE_LINE_UPD_AND_DEL_RECORDS now contains:

Invoice Number  Line
1001            2
1002            1
1002            2
Table 5 Invoice 1001 line 2 and invoice 1002 line 1 were updated in the source; invoice 1002 line 2 was deleted
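
The key list in Table 5 can be derived with a few lines of Python. This is a simplified sketch: it compares only quantity and price, and the function name and dictionary layout are hypothetical rather than anything Data Services provides.

```python
# key (invoice_number, line) -> (quantity, price)
previous = {  # yesterday's copy of the fact table
    (1000, 1): (10, 100), (1001, 1): (20, 500), (1001, 2): (5, 20),
    (1001, 3): (15, 300), (1002, 1): (3, 60), (1002, 2): (25, 50),
}
current = {  # the fact table after today's update
    (1000, 1): (10, 100), (1001, 1): (20, 500), (1001, 2): (10, 40),
    (1001, 3): (15, 300), (1002, 1): (4, 80), (1003, 1): (20, 20),
    (1003, 2): (40, 100),
}

def updated_or_deleted_keys(current, previous):
    """Keys that existed yesterday and are either missing or different today."""
    return sorted(key for key, row in previous.items()
                  if key not in current or current[key] != row)

upd_and_del_keys = updated_or_deleted_keys(current, previous)
print(upd_and_del_keys)  # [(1001, 2), (1002, 1), (1002, 2)]
```

New records such as invoice 1003 never appear in this list because there is no earlier version of them to negate.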

The third and final dataflow uses the two work tables that were previously created to insert a negation
record into the invoice delta table. Remember, WT_FACT_INVOICE_LINE_COPY contains yesterday’s
version of the invoice fact table, and WT_FACT_INVOICE_LINE_UPD_AND_DEL_RECORDS contains the
keys of the invoice fact records that were updated or deleted since yesterday. Joining these two tables
on the key columns and multiplying the value fields by (-1) produces negation records with the correct
values, which are then inserted into the invoice delta table.

The invoice delta table now contains:

Invoice Number  Line  Customer  Item  Quantity  Price
1000            1     A         Z     10        100
1001            1     B         Y     20        500
1001            2     B         X     5         20
1001            3     B         W     15        300
1002            1     C         V     3         60
1002            2     C         U     25        50
1001            2     B         X     10        40
1002            1     C         V     4         80
1003            1     D         T     20        20
1003            2     D         S     40        100
1001            2     B         X     -5        -20
1002            1     C         V     -3        -60
1002            2     C         U     -25       -50
Table 6 The last three rows are the negation records inserted into the invoice delta table
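
The join-and-negate step of the third dataflow can be sketched as follows. The dictionary, the key list, and the negation_rows function are hypothetical Python stand-ins for the two work tables and the query; the sketch assumes quantity and price are the only value fields.

```python
# WT_FACT_INVOICE_LINE_COPY: key -> (customer, item, quantity, price)
previous = {
    (1000, 1): ("A", "Z", 10, 100),
    (1001, 1): ("B", "Y", 20, 500),
    (1001, 2): ("B", "X", 5, 20),
    (1001, 3): ("B", "W", 15, 300),
    (1002, 1): ("C", "V", 3, 60),
    (1002, 2): ("C", "U", 25, 50),
}

# WT_FACT_INVOICE_LINE_UPD_AND_DEL_RECORDS: keys only
upd_and_del_keys = [(1001, 2), (1002, 1), (1002, 2)]

def negation_rows(previous, keys):
    """Inner-join the key list to yesterday's rows and flip the sign of the value fields."""
    rows = []
    for key in keys:
        if key in previous:  # keys without a match are dropped, as in an inner join
            customer, item, quantity, price = previous[key]
            rows.append(key + (customer, item, -quantity, -price))
    return rows

negations = negation_rows(previous, upd_and_del_keys)
for row in negations:
    print(row)
```

The negation uses yesterday's values, not today's, because the row being cancelled out is the one already stored in the delta table.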

Note that the quantities for invoice number 1002 line 2 sum to zero. The quantities for invoice 1001
line 2 and invoice 1002 line 1 sum to the correct new quantities, which are contained in Table 2
above.
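
This check can be run over the whole delta table. The sketch below is an assumption-labeled illustration in Python: it uses the Day 2 delta rows from Table 3, reduced to the key and value fields, and the net_by_key function is hypothetical.

```python
from collections import defaultdict

# (invoice_number, line, quantity, price) for every row of the Day 2 delta table
delta = [
    (1000, 1, 10, 100), (1001, 1, 20, 500), (1001, 2, 5, 20),
    (1001, 3, 15, 300), (1002, 1, 3, 60), (1002, 2, 25, 50),
    (1001, 2, 10, 40), (1002, 1, 4, 80), (1003, 1, 20, 20),
    (1003, 2, 40, 100), (1001, 2, -5, -20), (1002, 1, -3, -60),
    (1002, 2, -25, -50),
]

def net_by_key(rows):
    """Sum quantity and price per (invoice_number, line)."""
    totals = defaultdict(lambda: [0, 0])
    for invoice_number, line, quantity, price in rows:
        totals[(invoice_number, line)][0] += quantity
        totals[(invoice_number, line)][1] += price
    return {key: tuple(values) for key, values in totals.items()}

net = net_by_key(delta)
print(net[(1002, 2)])  # (0, 0)
print(net[(1001, 2)])  # (10, 40)
print(net[(1002, 1)])  # (4, 80)
```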

Wrapping all of these dataflows into an overall job could look like the one below. Just remember to
leave out the initial invoice delta dataflow, which is to be run only once.
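
The run order described in this article can be summarized in a short Python sketch. Everything here is hypothetical: the step descriptions paraphrase the text, and placeholder stands in for launching a real dataflow. The point is only the sequence, in particular that the work table copy happens before the fact table is updated.

```python
def run_in_order(steps):
    """Run each (name, callable) pair in sequence and return the names in run order."""
    executed = []
    for name, step in steps:
        step()
        executed.append(name)
    return executed

def placeholder():
    """Stand-in for launching a dataflow; does nothing in this sketch."""

# The one-time initial load of the delta table is deliberately not in this list.
daily_steps = [
    ("copy fact table to WT_FACT_INVOICE_LINE_COPY", placeholder),
    ("update FACT_INVOICE_LINE from the source", placeholder),
    ("insert new and updated rows into the delta table", placeholder),
    ("store keys of updated and deleted rows", placeholder),
    ("insert negation rows into the delta table", placeholder),
]

executed = run_in_order(daily_steps)
for position, name in enumerate(executed, start=1):
    print(position, name)
```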

Rich Hauser, Senior Business Intelligence Consultant
Decision First Technologies
Richard.Hauser@decisionfirst.com

Rich is a senior business intelligence consultant specializing in Enterprise Information Management. He
has delivered customized SAP BusinessObjects solutions for customers of all sizes across a variety of
industries. With Decision First Technologies, Rich utilizes SAP Data Services and SAP Information
Steward.
