
Teradata Utilities-Breaking the Barriers, First Edition

Chapter 2: BTEQ
An Introduction to BTEQ
Why is it called BTEQ?
Why is BTEQ available on every Teradata system ever built? Because the Batch TEradata Query (BTEQ) tool was the
original way that SQL was submitted to Teradata as a means of getting an answer set in a desired format. This is the
utility that I used for training at Wal*Mart, AT&T, Anthem Blue Cross and Blue Shield, and SouthWestern Bell back in
the early 1990's. BTEQ is often referred to as the Basic TEradata Query and is still used today and continues to be an
effective tool.
Here is what is excellent about BTEQ:

BTEQ can be used to submit SQL in either a batch or interactive environment. Interactive users can submit
SQL and receive an answer set on the screen. Users can also submit BTEQ jobs from batch scripts, have error
checking and conditional logic, and allow for the work to be done in the background.

BTEQ outputs a report format, where Queryman outputs data in a format more like a spreadsheet. This allows
BTEQ a great deal of flexibility in formatting data, creating headings, and utilizing Teradata extensions, such as
WITH and WITH BY, which Queryman has problems handling.

BTEQ is often used to submit SQL, but is also an excellent tool for importing and exporting data.
o Importing Data: Data can be read from a file on either a mainframe or LAN-attached computer and
used for substitution directly into any Teradata SQL using the INSERT, UPDATE or DELETE statements.
o Exporting Data: Data can be written to either a mainframe or LAN-attached computer using a SELECT
from Teradata. You can also pick the format you desire, ranging from data files to printed reports to Excel
formats.
There are other utilities that are faster than BTEQ for importing or exporting data. We will talk about these in future
chapters, but BTEQ is still used for smaller jobs.

Logging on to BTEQ
Before you can use BTEQ, you must have user access rights to the client system and privileges to the Teradata DBS.
Normal system access privileges include a userid and a password. Some systems may also require additional user
identification codes depending on company standards and operational procedures. Depending on the configuration of
your Teradata DBS, you may need to include an account identifier (acctid) and/or a Teradata Director Program
Identifier (TDPID).
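As a sketch of what an interactive logon looks like (the TDPID and userid here match the example used later in this chapter; your banner text and system names will differ), BTEQ prompts for the password after the .LOGON command:

```
C:\> bteq
 Teradata BTEQ for WIN32. Enter your logon or BTEQ command:
.LOGON cdw/sql00
Password:
```

The portion before the slash is the TDPID and the portion after it is the userid; the password is typed at the prompt and is not echoed to the screen.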

Using BTEQ to submit queries


Submitting SQL in BTEQ's Interactive Mode
Once you logon to Teradata through BTEQ, you are ready to run your queries. Teradata knows the SQL is finished
when it finds a semi-colon, so don't forget to put one at the end of your query. Below is an example of a Teradata table
to demonstrate BTEQ operations.
Employee_Table

Figure 2-1
BTEQ execution

Page 1 of 105


Figure 2-2
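As a hedged sketch of the kind of interactive query shown above (assuming the Employee_Table columns used throughout this chapter), note the semi-colon that tells Teradata the SQL is finished:

```
SELECT Employee_No
      ,Last_name
      ,First_name
      ,Salary
      ,Dept_No
FROM   Employee_Table
WHERE  Dept_No = 400;
```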

Submitting SQL in BTEQ's Batch Mode


On network-attached systems, BTEQ can also run in batch mode under UNIX (IBM AIX, Hewlett-Packard HP-UX,
NCR MP-RAS, Sun Solaris), DOS, Macintosh, Microsoft Windows and OS/2 operating systems. To submit a job in
Batch mode do the following:
1. Invoke BTEQ
2. Type in the input file name
3. Type in the location and output file name.
The following example shows how to invoke BTEQ from a DOS command. In order for this to work, the directory
called Program Files\NCR\Teradata Client\bin must be established in the search path.

Figure 2-3
Notice that the BTEQ command is immediately followed by the '<BatchScript.txt' to tell BTEQ which file contains the
commands to execute. Then, the '>Output.txt' names the file where the output messages are written. Here is an
example of the contents of the BatchScript.txt file.
BatchScript.txt File

Figure 2-4
The above illustration shows how BTEQ can be manually invoked from a command prompt and displays how to
specify the name and location of the batch script file to be executed.
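Since Figures 2-3 and 2-4 are not reproduced here, the following is a hedged sketch of the DOS invocation and a minimal BatchScript.txt (the logon string matches the mylogon.txt example later in this section):

```
C:\> bteq < BatchScript.txt > Output.txt

/* Contents of BatchScript.txt */
.LOGON cdw/sql00,whynot
SELECT * FROM Employee_Table;
.LOGOFF
.QUIT
```

The '<' redirects the script file into BTEQ as input, and the '>' redirects all output messages into Output.txt.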
The previous examples show that when logging onto BTEQ in interactive mode, the user actually types in a logon
string and then Teradata will prompt for a password. However, in batch mode, Teradata requires both a logon and
password to be directly stored as part of the script.
Since putting this sensitive information into a script is scary for security reasons, inserting the password directly into a
script that is to be processed in batch mode may not be a good idea. It is generally recommended and a common
practice to store the logon and password in a separate file that can be secured. That way, it is not in the script for
anyone to see.
For example, the contents of a file called "mylogon.txt" might be:
.LOGON cdw/sql00,whynot.
Then, the script should contain the following command instead of a .LOGON, as shown below and again in the
following script: .RUN FILE=mylogon.txt
This command opens and reads the file. It then executes every record in the file.
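Putting these pieces together, a hedged sketch of a batch script that uses the secured logon file might look like this:

```
/* Contents of mylogon.txt (secured file) */
.LOGON cdw/sql00,whynot

/* Start of the batch script itself */
.RUN FILE=mylogon.txt
SELECT * FROM Employee_Table;
.LOGOFF
.QUIT
```

Only mylogon.txt needs restrictive file permissions; the script itself can be shared freely.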


Using BTEQ Conditional Logic


Below is a BTEQ batch script example. The initial steps of the script establish the logon and the database, and then
delete all the rows from the Employee_Table. If the table does not exist, the BTEQ conditional logic will instruct
Teradata to create it. However, if the table already exists, then Teradata will move forward and insert data.
Note
In script examples, the left panel contains BTEQ base commands and the right panel provides a
brief description of each command.

Figure 2-5
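Because Figure 2-5 is not reproduced here, the following hedged sketch shows the kind of conditional logic described above, using .IF, .GOTO, and .LABEL commands (the database name SQL_CLASS and the column layout are assumptions based on the Employee_Table used in this chapter):

```
.RUN FILE=mylogon.txt
DATABASE SQL_CLASS;

DELETE FROM Employee_Table;
.IF ERRORCODE = 0 THEN .GOTO INSERTS

CREATE TABLE Employee_Table
 (Employee_No  INTEGER
 ,Last_name    CHAR(20)
 ,First_name   VARCHAR(12)
 ,Salary       DECIMAL(8,2)
 ,Dept_No      SMALLINT)
UNIQUE PRIMARY INDEX (Employee_No);

.LABEL INSERTS
INSERT INTO Employee_Table VALUES (1232578,'Chambers','Mandee',56177.50,100);
.LOGOFF
.QUIT
```

If the DELETE succeeds (ERRORCODE = 0), the table exists and control jumps past the CREATE TABLE; otherwise the table is created first and then the inserts proceed.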

Using BTEQ to Export Data


BTEQ allows data to be exported directly from Teradata to a file on a mainframe or network-attached computer. In
addition, the BTEQ export function has several export formats that a user can choose depending on the desired
output. Generally, users will export data to a flat file, and the format is determined by the mode chosen: record mode,
field mode, indicator mode, or DIF mode. Below is an expanded explanation of the different mode options.
Format of the EXPORT command:
.EXPORT <mode> {FILE | DDNAME } = <filename> [, LIMIT=n]
Record Mode: (also called DATA mode): This is set by .EXPORT DATA. This will bring data back as a flat file. Each
parcel will contain a complete record. Since it is not a report, there are no headers or white space between the data
contained in each column and the data is written to the file (e.g., disk drive file) in native format. For example, this
means that INTEGER data is written as a 4-byte binary field. Therefore, it cannot be read and understood using a
normal text editor.
Field Mode (also called REPORT mode): This is set by .EXPORT REPORT. This is the default mode for BTEQ and
brings the data back as if it was a standard SQL SELECT statement. The output of this BTEQ export would return the
column headers for the fields, white space, expanded packed or binary data (for humans to read) and can be
understood using a text editor.
Indicator Mode: This is set by .EXPORT INDICDATA. This mode writes the data in data mode, but also provides host
operating systems with the means of recognizing missing or unknown data (NULL) fields. This is important if the data
is to be loaded into another Relational Database System (RDBMS).
The issue is that there is no standard character defined to represent either a numeric or character NULL. So, every
system uses a zero for a numeric NULL and a space or blank for a character NULL. If this data is simply loaded into
another RDBMS, it is no longer a NULL, but a zero or space.
To remedy this situation, INDICDATA puts a bitmap at the front of every record written to the disk. This bitmap contains
one bit per field/column. When a Teradata column contains a NULL, the bit for that field is turned on by setting it to a
"1". Likewise, if the data is not NULL, the bit remains a zero. Therefore, the loading utility reads these bits as
indicators of NULL data and identifies the column(s) as NULL when data is loaded back into the table, where
appropriate.



Since both DATA and INDICDATA store each column on disk in native format with known lengths and characteristics,
they are the fastest method of transferring data. However, it becomes imperative that you be consistent. When it is
exported as DATA, it must be imported as DATA and the same is true for INDICDATA.
Again, this internal processing is automatic and potentially important. Yet, on a network-attached system, being
consistent is our only responsibility. However, on a mainframe system, you must account for these bits when defining
the LRECL in the Job Control Language (JCL). Otherwise, your length is too short and the job will end with an error.
To determine the correct length, remember that one indicator bit is needed per field selected. However, computers
allocate data in bytes, not bits, so the bits are allocated in groups of eight (8 bits per byte). Therefore, referencing from
one to eight columns in the SELECT adds one byte to the length of the record, and selecting nine to sixteen columns
adds two bytes. For example, for nine columns selected, 2 bytes are added even though only nine bits are needed.
When executing on non-mainframe systems, the record length is automatically maintained. However, when exporting
to a mainframe, the JCL (LRECL) must account for this addition length.
DIF Mode: Known as Data Interchange Format, which allows users to export data from Teradata to be directly utilized
for spreadsheet applications like Excel, FoxPro and Lotus.
The optional LIMIT tells BTEQ to stop returning rows after a specific number (n) of rows. This can be handy in a
test environment to stop BTEQ before transferring all the rows to the file.

BTEQ EXPORT Example Using Record (DATA) Mode


The following is an example that displays how to utilize the export Record (DATA) option. Notice the periods (.) at the
beginning of some of the script lines. A period starting a line indicates a BTEQ command. If there is no period, then the
statement is an SQL statement.
When doing an export on a Mainframe or a network-attached (e.g., LAN) computer, there is one primary difference in
the .EXPORT command. The difference is the following:

Mainframe syntax:

.EXPORT DATA DDNAME = data definition state name (JCL)

LAN syntax:

.EXPORT DATA FILE = actual file name

The following example uses a Record (DATA) Mode format. The output of the exported data will be a flat file.
Employee_Table

Figure 2-6
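Since Figure 2-6 is not reproduced here, a hedged sketch of a Record (DATA) Mode export script follows (the output file name employee.dat is hypothetical):

```
.LOGON cdw/sql00,whynot
.EXPORT DATA FILE = employee.dat
SELECT * FROM Employee_Table;
.EXPORT RESET
.LOGOFF
.QUIT
```

The .EXPORT RESET command closes the export file and returns BTEQ to normal screen output. Because this is DATA mode, employee.dat will contain native-format binary data that a text editor cannot display meaningfully.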

BTEQ EXPORT Example Using Field (Report) Mode


The following is an example that displays how to utilize the export Field (Report) option. Notice the periods (.) at the
beginning of some of the script lines. A period starting a line indicates a BTEQ command and needs no semi-colon.
Likewise, if there is no period, then the statement is an SQL statement and requires a semi-colon.


Figure 2-7
After this script has completed, the following report will be generated on disk.
Employee_No  Last_name   First_name  Salary    Dept_No
-----------  ----------  ----------  --------  -------
2000000      Jones       Squiggy     32800.50
1256349      Harrison    Herbert     54500.00      400
1333454      Smith       John        48000.00      200
1121334      Strickling  Cletus      54500.00      400
1324657      Coffing     Billy       41888.88      200
2341218      Reilly      William     36000.00      400
1232578      Chambers    Mandee      56177.50      100
1000234      Smythe      Richard     64300.00       10
2312225      Larkins     Loraine     40200.00      300

I remember when my mom and dad purchased my first Lego set. I was so excited about building my first space station
that I ripped the box open, and proceeded to follow the instructions to complete the station. However, when I was
done, I was not satisfied with the design and decided to make changes. So I built another space ship and constructed
another launching station. BTEQ export works in the same manner: as the basic EXPORT knowledge is acquired, the
more we can build on that foundation.
With that being said, the following is an example that displays a more robust example of utilizing the Field (Report)
option. This example will export data in Field (Report) Mode format. The output of the exported data will appear like a
standard output of a SQL SELECT statement. In addition, aliases and a title have been added to the script.

Figure 2-8
After this script has been completed, the following report will be generated on disk.



Employee Profiles

Employee Number  Last Name   First Name  Salary    Department Number
---------------  ----------  ----------  --------  -----------------
2000000          Jones       Squiggy     32800.50
1256349          Harrison    Herbert     54500.00                400
1333454          Smith       John        48000.00                200
1121334          Strickling  Cletus      54500.00                400
1324657          Coffing     Billy       41888.88                200
2341218          Reilly      William     36000.00                400
1232578          Chambers    Mandee      56177.50                100
1000234          Smythe      Richard     64300.00                 10
2312225          Larkins     Loraine     40200.00                300

From the above example, a number of BTEQ commands were added to the export script. Below is a review of those
commands.

The WIDTH specifies the width of screen displays and printed reports, based on characters per line.

The FORMAT command allows the ability to enable/inhibit the page-oriented format option.

The HEADING command specifies a header that will appear at the top of every page of a report.
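Because Figure 2-8 is not reproduced here, a hedged sketch of such a script follows (the file name, width value, and column aliases are assumptions consistent with the report shown above):

```
.LOGON cdw/sql00,whynot
.SET WIDTH 80
.SET FORMAT ON
.SET HEADING 'Employee Profiles'
.EXPORT REPORT FILE = employee.rpt
SELECT Employee_No  (TITLE 'Employee Number')
      ,Last_name    (TITLE 'Last Name')
      ,First_name   (TITLE 'First Name')
      ,Salary
      ,Dept_No      (TITLE 'Department Number')
FROM   Employee_Table
ORDER BY Dept_No;
.EXPORT RESET
.LOGOFF
.QUIT
```

The TITLE phrases are standard Teradata SQL for aliasing report column headers, and the .SET commands control the page-oriented formatting described above.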

BTEQ IMPORT Example


BTEQ can also read a file from the hard disk and incorporate the data into SQL to modify the contents of one or more
tables. In order to do this processing, the name and record description of the file must be known ahead of time. These
will be defined within the script file.
Format of the IMPORT command:
.IMPORT { FILE | DDNAME } = <filename> [, SKIP=n]
The script below introduces the IMPORT command with the Record (DATA) option. Notice the periods (.) at the
beginning of some of the script lines. A period starting a line indicates a BTEQ command. If there is no period, then the
statement is an SQL statement.
The SKIP option is used when you wish to bypass the first records in a file. For example, a mainframe tape may have
header records that should not be processed. Other times, maybe the job started and loaded a few rows into the table
with a UPI defined. Loading them again will cause an error. So, you can skip over them using this option.
The following example will use a Record (DATA) Mode format. The input of the imported data will populate the
Employee_Table.


Figure 2-9
From the above example, a number of BTEQ commands were added to the import script. Below is a review of those
commands.

.QUIET ON limits BTEQ output to reporting only errors and request processing statistics. Note: Be careful how
you spell .QUIET; forgetting the E turns it into .QUIT, and BTEQ will do exactly that.

.REPEAT * causes BTEQ to keep reading records until EOF; without it, the default is one record. Using
.REPEAT 10 would perform the loop 10 times.

The USING defines the input data fields and their associated data types coming from the host.
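Since Figure 2-9 is not reproduced here, a hedged sketch of such an import script follows (the input file name and the field names prefixed with IN_ are assumptions based on the Employee_Table layout):

```
.LOGON cdw/sql00,whynot
.IMPORT DATA FILE = employee.dat
.QUIET ON
.REPEAT *
USING (IN_Employee_No  INTEGER
      ,IN_Last_name    CHAR(20)
      ,IN_First_name   VARCHAR(12)
      ,IN_Salary       DECIMAL(8,2)
      ,IN_Dept_No      SMALLINT)
INSERT INTO Employee_Table
VALUES (:IN_Employee_No
       ,:IN_Last_name
       ,:IN_First_name
       ,:IN_Salary
       ,:IN_Dept_No);
.QUIET OFF
.LOGOFF
.QUIT
```

The colon (:) in front of each name in the VALUES clause tells Teradata to substitute the value read from the input file for that field.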
The following builds upon the IMPORT Record (DATA) example above. The example below will still utilize the Record
(DATA) Mode format. However, this script will add a CREATE TABLE statement. In addition, the imported data will
populate the newly created Employee_Profile table.


Figure 2-10
Notice that some of the scripts have a .LOGOFF and .QUIT. The .LOGOFF is optional because when BTEQ quits, the
session is terminated. A logoff makes it a friendly departure and also allows you to logon with a different user name
and password.

Determining Output Record Lengths


Some hosts, such as IBM mainframes, require the correct LRECL (Logical Record Length) parameter in the JCL, and
will abort if the value is incorrect. The following page will discuss how to figure out the record lengths.
There are three issues involving record lengths and they are:

Fixed columns

Variable columns

NULL indicators
Fixed Length Columns: For fixed length columns you merely count the length of the column. The lengths are:

INTEGER        4 bytes
SMALLINT       2 bytes
BYTEINT        1 byte
CHAR(10)      10 bytes
CHAR(4)        4 bytes
DATE           4 bytes
DECIMAL(7,2)   4 bytes (packed data, total digits / 2 + 1)
DECIMAL(12,2)  8 bytes


Variable columns: Variable length columns should be calculated as the maximum value plus two. These two bytes
hold the binary length of the field. In reality you can save much space because trailing blanks are
not kept. The logical record will assume the maximum and add two bytes as a length field per column.

VARCHAR(8)   10 bytes
VARCHAR(10)  12 bytes

Indicator columns: As explained earlier, the indicators utilize a single bit for each field. If your record has 8 fields
(which require 8 bits), then you add one extra byte to the total length of all the fields. If your record has 9-16 fields,
then add two bytes.
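Putting the three rules together, here is a hedged worked example of the LRECL calculation for the Employee_Table layout used in this chapter (assuming the table is exported in INDICDATA mode; the column types are the ones assumed throughout this chapter):

```
Employee_No   INTEGER        4 bytes
Last_name     CHAR(20)      20 bytes
First_name    VARCHAR(12)   14 bytes  (12 maximum + 2-byte length field)
Salary        DECIMAL(8,2)   5 bytes  (8 digits / 2 + 1)
Dept_No       SMALLINT       2 bytes
Indicators    5 fields       1 byte   (1-8 fields round up to one byte)
                           ---------
LRECL                       46 bytes
```

If the same export were run in DATA mode instead of INDICDATA, the indicator byte would be dropped and the LRECL would be 45 bytes.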

BTEQ Return Codes


Return codes are two-digit values that BTEQ returns to the user after completing each job or task. The value of the
return code indicates the completion status of the job or task as follows:
Return Code  Description
00           Job completed with no errors.
02           User alert to log on to the Teradata DBS.
04           Warning error.
08           User error.
12           Severe internal error.


You can override the standard error codes at the time you terminate BTEQ. This might be handy for debug purposes.
The error code or "return code" can be any number you specify using one of the following:

Override Code
.QUIT 15
.EXIT 15
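For example, a batch script can test the error code after a statement and terminate with a custom return code; a hedged sketch (the value 15 is arbitrary):

```
SELECT * FROM Employee_Table;
.IF ERRORCODE > 0 THEN .QUIT 15
.QUIT 0
```

A batch scheduler can then inspect the return code to decide whether downstream jobs should run.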

BTEQ Commands
The BTEQ commands in Teradata are designed for flexibility. These commands are not used directly on the data
inside the tables. However, these 60 different BTEQ commands are utilized in four areas.

Session Control Commands

File Control Commands

Sequence Control Commands

Format Control Commands


Session Control Commands

Figure 2-11

File Control Commands


These BTEQ commands are used to specify the formatting parameters of incoming and outgoing information. This
includes identifying sources and determining I/O streams.


Figure 2-12

Sequence Control Commands


These commands control the sequence in which Teradata commands operate.

Figure 2-13

Format Control Commands


These commands control the formatting for Teradata and present the data in a report mode to the screen or printer.


Figure 2-14


Chapter 3: FastExport
An Introduction to FastExport
Why is it called "FAST" Export
FastExport is known for its lightning speed when it comes to exporting vast amounts of data from Teradata and
transferring the data into flat files on either a mainframe or network-attached computer. In addition, FastExport has the
ability to accept OUTMOD routines, which give the user the capability to select, validate, and preprocess
the exported data. Part of this speed is achieved because FastExport takes full advantage of Teradata's parallelism.
In this book, we have already discovered how BTEQ can be utilized to export data from Teradata in a variety of
formats. As the demand to store data increases, so does the requirement for tools that can export massive amounts of
data.
This is the reason why FastExport (FEXP) is brilliant by design. A good rule of thumb is that if you have more than half
a million rows of data to export to either a flat file format or with NULL indicators, then FastExport is the best choice to
accomplish this task.
Keep in mind that FastExport is designed as a one-way utility; that is, the sole purpose of FastExport is to move data
out of Teradata. It does this by harnessing the parallelism that Teradata provides.
FastExport is extremely attractive for exporting data because it takes full advantage of multiple sessions, which
leverages Teradata parallelism. FastExport can also export from multiple tables during a single operation. In addition,
FastExport utilizes the Support Environment, which provides a job restart capability from a checkpoint if an error
occurs during the process of executing an export job.

How FastExport Works


When FastExport is invoked, the utility logs onto the Teradata database and retrieves the rows that are specified in the
SELECT statement and puts them into SPOOL. From there, it must build blocks to send back to the client. In
comparison, BTEQ starts sending rows immediately for storage into a file.
If the output data is sorted, FastExport may be required to redistribute the selected data two times across the AMP
processors in order to build the blocks in the correct sequence. Remember, a lot of rows fit into a 64K block and both
the rows and the blocks must be sequenced. While all of this redistribution is occurring, BTEQ continues to send rows.
At first, FastExport falls behind in the processing. However, when FastExport starts sending the rows back a block at a
time, it quickly overtakes and passes BTEQ's row-at-a-time processing.
The other advantage is that if BTEQ terminates abnormally, all of your rows (which are in SPOOL) are discarded. You
must rerun the BTEQ script from the beginning. However, if FastExport terminates abnormally, all the selected rows
are in worktables and it can continue sending them where it left off. Pretty smart and very fast!
Also, if there is a requirement to manipulate the data before storing it on the computer's hard drive, an OUTMOD
routine can be written to modify the result set after it is sent back to the client on either the mainframe or LAN. Just like
the BASF commercial states, "We don't make the products you buy, we make the products you buy better". FastExport
is designed on the same premise: it does not make the SQL SELECT statement faster, but it does take the SQL
SELECT statement and process the request with lightning-fast parallel processing!

FastExport Fundamentals
#1: FastExport EXPORTS data from Teradata. The reason they call it FastExport is because it takes data off of
Teradata (Exports Data). FastExport does not import data into Teradata. Additionally, like BTEQ it can output multiple
files in a single run.
#2: FastExport only supports the SELECT statement. The only DML statement that FastExport understands is
SELECT. You SELECT the data you want exported and FastExport will take care of the rest.
#3: Choose FastExport over BTEQ when Exporting Data of more than half a million+ rows. When a large
amount of data is being exported, FastExport is recommended over BTEQ Export. The only drawback is the total
number of FastLoads, FastExports, and MultiLoads that can run at the same time, which is limited to 15. BTEQ Export


does not have this restriction. Of course, FastExport will work with less data, but the speed may not be much faster
than BTEQ.
#4: FastExport supports multiple SELECT statements and multiple tables in a single run. You can have multiple
SELECT statements with FastExport, and each SELECT can join information from up to 64 tables.
#5: FastExport supports conditional logic, conditional expressions, arithmetic calculations, and data
conversions. FastExport is flexible and supports the above conditions, calculations, and conversions.
#6: FastExport does NOT support error files or error limits. FastExport does not record particular error types in a
table. The FastExport utility will terminate after a certain number of errors have been encountered.
#7: FastExport supports user-written routines INMODs and OUTMODs. FastExport allows you to write INMOD and
OUTMOD routines so you can select, validate, and preprocess the exported data.

FastExport Supported Operating Systems


The FastExport utility is supported on either the mainframe or on LAN. The information below illustrates which
operating systems are supported for each environment:
The LAN environment supports the following Operating Systems:

UNIX MP-RAS

Windows 2000

Windows 95

Windows NT

UNIX HP-UX

AIX

Solaris SPARC

Solaris Intel
The Mainframe (Channel Attached) environment supports the following Operating Systems:

MVS

VM

Maximum of 15 Loads
The Teradata RDBMS will only support a maximum of 15 simultaneous FastLoad, MultiLoad, or FastExport utility jobs.
This maximum value is determined and configured by the DBS Control record. This value can be set from 0 to 15.
When Teradata is initially installed, this value is set at 5.
The reason for this limitation is that FastLoad, MultiLoad, and FastExport all use large blocks to transfer data. If more
than 15 simultaneous jobs were supported, a saturation point could be reached on the availability of resources. In this
case, Teradata does an excellent job of protecting system resources by queuing up additional FastLoad, MultiLoad,
and FastExport jobs that are attempting to connect.
For example, if the maximum number of utilities on the Teradata system has been reached and another job attempts
to run, that job does not start. This limitation should be viewed as a safety control feature. A tip for remembering how
the load limit applies is this: "If the name of the load utility contains either the word 'Fast' or the word 'Load', then there
can be only a total of fifteen of them running at any one time".
BTEQ does not have this load limitation. FastExport is clearly the better choice when exporting data. However, if too
many load jobs are running, BTEQ is an alternative choice for exporting data.

FastExport Support and Task Commands


FastExport accepts both FastExport commands and a subset of SQL statements. The FastExport commands can be
broken down into support and task activities. The table below highlights the key FastExport commands and their
definitions. These commands provide flexibility and control during the export process.


Support Environment Commands (see Support Environment chapter for details)

Figure 3-1

Task Commands

Figure 3-2

FastExport Supported SQL Commands


FastExport accepts the following Teradata SQL statements. Each has been placed in alphabetic order for your
convenience.


SQL Commands

Figure 3-3

A FastExport in its Simplest Form


The hobby of racecar driving can be extremely frustrating, challenging, and rewarding all at the same time. I always
remember my driving instructor coaching me during a practice session in a new car around a road course racetrack.
He said to me, "Before you can learn to run, you need to learn how to walk." This same philosophy can be applied
when working with FastExport. If FastExport is broken into steps, then several things that appear to be complicated
are really very simple. With this being stated, FastExport can be broken into the following steps:

Logging onto Teradata

Retrieves the rows you specify in your SELECT statement

Exports the data to the specified file or OUTMOD routine

Logs off of Teradata


Figure 3-4

Sample FastExport Script


Now that the first steps have been taken to understand FastExport, the next step is to journey forward and review
another example that builds upon what we have learned. In the script below, Teradata comment lines have
been placed inside the script (/* ... */). In addition, FastExport and SQL commands are written in upper case in order to
highlight them. Another note is that the column names are listed vertically. The recommendation is to place the comma
separator in front of the following column. Coding this way makes reading or debugging the script easier to
accomplish.

Figure 3-5
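Since Figure 3-5 is not reproduced here, a hedged sketch of such a FastExport script follows (the logtable name, session count, and output file name are assumptions; note the comment lines and the comma separators placed in front of each following column, as recommended):

```
/* Sample FastExport script */
.LOGTABLE SQL01.CDW_Log;          /* restart log table for the job   */
.LOGON cdw/sql00,whynot;
.BEGIN EXPORT SESSIONS 12;        /* number of Teradata sessions     */
.EXPORT OUTFILE employee.dat;     /* output flat file on the client  */
SELECT Employee_No
      ,Last_name
      ,First_name
      ,Salary
      ,Dept_No
FROM   Employee_Table;
.END EXPORT;
.LOGOFF;
```

Unlike BTEQ dot commands, FastExport commands are terminated with semi-colons; the Support Environment commands (.LOGTABLE, .LOGON, .LOGOFF) frame the export task defined between .BEGIN EXPORT and .END EXPORT.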

FastExport Modes and Formats


FastExport Modes
FastExport has two modes: RECORD or INDICATOR. In the mainframe world, only use RECORD mode. In the UNIX
or LAN environment, RECORD mode is the default, but you can use INDICATOR mode if desired. The difference
between the two modes is INDICATOR mode will set the indicator bits to 1 for column values containing NULLS.


Both modes return data in a client internal format with variable-length records. Each individual record has a value for
all of the columns specified by the SELECT statement. All variable-length columns are preceded by a two-byte control
value indicating the length of the column data. NULL columns have a value that is appropriate for the column data
type. Remember, INDICATOR mode will set bit flags that identify the columns that have a null value.

FastExport Formats
FastExport has many possible formats in the UNIX or LAN environment. The FORMAT statement specifies the format
for each record being exported which are:

FASTLOAD

BINARY

TEXT

UNFORMAT
The default FORMAT is FASTLOAD in a UNIX or LAN environment.
FASTLOAD Format is a two-byte integer, followed by the data, followed by an end-of-record marker. It is called
FASTLOAD because the data is exported in a format ready for FASTLOAD.
BINARY Format is a two-byte integer, followed by data.
TEXT is an arbitrary number of bytes followed by an end-of-record marker.
UNFORMAT is exported as it is received from CLIv2 without any client modifications.

A FastExport Script using Binary Mode

Figure 3-6
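Since Figure 3-6 is not reproduced here, a hedged sketch of a binary-format export follows; it adds MODE and FORMAT options to the .EXPORT command (the logtable and file names are assumptions):

```
.LOGTABLE SQL01.CDW_Log;
.LOGON cdw/sql00,whynot;
.BEGIN EXPORT SESSIONS 8;
.EXPORT OUTFILE employee.bin
        MODE RECORD FORMAT BINARY;
SELECT * FROM Employee_Table;
.END EXPORT;
.LOGOFF;
```

With FORMAT BINARY, each exported record is a two-byte integer length followed by the data, as described above.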


Chapter 4: FastLoad
An Introduction to FastLoad
Why is it called "FAST" Load
FastLoad is known for its lightning-like speed in loading vast amounts of data from flat files from a host into empty
tables in Teradata. Part of this speed is achieved because it does not use the Transient Journal. You will see some
more of the reasons enumerated below. But, regardless of the reasons that it is fast, know that FastLoad was
developed to load millions of rows into a table.
The way FastLoad works can be illustrated by home construction, of all things! Let's look at three scenarios from the
construction industry to provide an amazing picture of how the data gets loaded.
Scenario One: Builders prefer to start with an empty lot and construct a house on it, from the foundation right on up to
the roof. There is no pre-existing construction, just a smooth, graded lot. The fewer barriers there are to deal with, the
quicker the new construction can progress. Building custom or spec houses this way is the fastest way to build them.
Similarly, FastLoad likes to start with an empty table, like an empty lot, and then populate it with rows of data from
another source. Because the target table is empty, this method is typically the fastest way to load data. FastLoad will
never attempt to insert rows into a table that already holds data.
Scenario Two: The second scenario in this analogy is when someone buys the perfect piece of land on which to build
a home, but the lot already has a house on it. In this case, the person may determine that it is quicker and more
advantageous just to demolish the old house and start fresh from the ground up, allowing for brand new construction.
FastLoad also likes this approach to loading data. It can just 1) drop the existing table, which deletes the rows, 2)
replace its structure, and then 3) populate it with the latest and greatest data. When dealing with huge volumes of new
rows, this process will run much quicker than using MultiLoad to populate the existing table. Another option is to
DELETE all the data rows from a populated target table and reload it. This requires less updating of the Data
Dictionary than dropping and recreating a table. In either case, the result is a perfectly empty target table that
FastLoad requires!
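The two approaches might look like this in SQL. This is a sketch only; the database, table, and column names are hypothetical:

```sql
/* Option 1: keep the structure, throw away the rows.
   This touches the Data Dictionary less than a drop-and-recreate. */
DELETE SQL01.Employee_Profile ALL;

/* Option 2: demolish and rebuild from the ground up */
DROP TABLE SQL01.Employee_Profile;
CREATE TABLE SQL01.Employee_Profile
 ( Employee_No INTEGER
 , Last_Name   CHAR(20)
 , Salary      DECIMAL(10,2) )
UNIQUE PRIMARY INDEX (Employee_No);
```

Either way, FastLoad ends up with the perfectly empty target table it requires.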
Scenario Three: Sometimes, a customer has a good house already but wants to remodel a portion of it or to add an
additional room. This kind of work takes more time than the work described in Scenario One. Such work requires
some tearing out of existing construction in order to build the new section. Besides, the builder never knows what he
will encounter beneath the surface of the existing home. So you can easily see that remodeling or additions can take
more time than new construction. In the same way, existing tables with data may need to be updated by adding new
rows of data. To load populated tables quickly with large amounts of data while maintaining the data currently held in
those tables, you would choose MultiLoad instead of FastLoad. MultiLoad is designed for this task but, like renovating
or adding onto an existing house, it may take more time.

How FastLoad Works


What makes FastLoad perform so well when it is loading millions or even billions of rows? It is because FastLoad
assembles data into 64K blocks (65,536 bytes) to load it and can use multiple sessions simultaneously, taking further
advantage of Teradata's parallel processing.
This is different from BTEQ and TPump, which load data at the row level. It has been said, "If you have it, flaunt it!"
FastLoad does not like to brag, but it takes full advantage of Teradata's parallel architecture. In fact, FastLoad will
create a Teradata session for each AMP (Access Module Processor-the software processor in Teradata responsible
for reading and writing data to the disks) in order to maximize parallel processing. This advantage is passed along to
the FastLoad user in terms of awesome performance. Teradata is the only data warehouse product in the world that
loads data, processes data and backs up data in parallel.

FastLoad Has Some Limits


There are more reasons why FastLoad is so fast. Many of these take the form of restrictions and therefore cannot slow
it down. For instance, can you imagine a sprinter wearing cowboy boots in a race? Of course not! Because of its speed,
FastLoad, too, must travel light! This means that it has limitations that may or may not apply to other load utilities.
Remembering this short list will save you much frustration from failed loads and angry colleagues. It may even foster
your reputation as a smooth operator!


Rule #1: No Secondary Indexes are allowed on the Target Table. For high performance, FastLoad utilizes only the
Primary Index when loading. The reason for this is that the Primary Index (UPI or NUPI) is used in Teradata to
distribute the rows evenly across the AMPs, and only data rows need to be built. A secondary index is stored in a subtable
block and many times on a different AMP from the data row. This would slow FastLoad down and they would have to
call it: get ready now, HalfFastLoad. Therefore, FastLoad does not support them. If Secondary Indexes exist already,
just drop them. You may easily recreate them after completing the load.
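For example, with a hypothetical NUSI named Last_Name_Idx on the target table, the drop-and-recreate bracket around the load could look like this:

```sql
/* Before the load: remove the secondary index */
DROP INDEX Last_Name_Idx ON SQL01.Employee_Profile;

/* ... run the FastLoad job here ... */

/* After the load: rebuild the secondary index */
CREATE INDEX Last_Name_Idx (Last_Name) ON SQL01.Employee_Profile;
```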
Rule #2: No Referential Integrity is allowed. FastLoad cannot load data into tables that are defined with Referential
Integrity (RI). Enforcing referential constraints against a different table would require too much system checking, and
FastLoad does one table only. In short, RI constraints will need to be dropped from the target table prior to the use of
FastLoad.
Rule #3: No Triggers are allowed at load time. FastLoad is much too focused on speed to pay attention to the
needs of other tables, which is what Triggers are all about. Additionally, these require more than one AMP and more
than one table. FastLoad does one table only. Simply ALTER the Triggers to the DISABLED status prior to using
FastLoad.
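Assuming a hypothetical trigger named Emp_Audit_Trg on the target table, the bracketing statements would be along these lines:

```sql
/* Before the load: take the trigger out of play */
ALTER TRIGGER Emp_Audit_Trg DISABLED;

/* ... run the FastLoad job here ... */

/* After the load: put the trigger back in service */
ALTER TRIGGER Emp_Audit_Trg ENABLED;
```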
Rule #4: Duplicate Rows (in Multi-Set Tables) are not supported. Multiset tables are tables that allow duplicate
rows, that is, rows in which the values in every column are identical. When FastLoad finds duplicate rows, they are
discarded. So while FastLoad can load data into a multiset table, it will never load the duplicate rows into it, because
FastLoad discards them!
Rule #5: No AMPs may go down (i.e., go offline) while FastLoad is processing. The down AMP must be repaired
before the load process can be restarted. Other than this, FastLoad can recover from system glitches and perform
restarts. We will discuss Restarts later in this chapter.
Rule #6: No more than one data type conversion is allowed per column during a FastLoad. Why just one? Data
type conversion is a highly resource-intensive job on the system, which requires a "search and replace" effort. And that
takes more time. Enough said!

Three Key Requirements for FastLoad to Run


FastLoad can be run from either an MVS/Channel (mainframe) or a Network (LAN) host. In either case, FastLoad requires
three key components. They are a log table, an empty target table and two error tables. The user must name these at
the beginning of each script.
Log Table: FastLoad needs a place to record information on its progress during a load. It uses the table called Fastlog
in the SYSADMIN database. This table contains one row for every FastLoad running on the system. In order for your
FastLoad to use this table, you need INSERT, UPDATE and DELETE privileges on that table.
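A DBA could grant those privileges with a statement along these lines (the user name loaduser is an assumption):

```sql
/* Allow this load user to write checkpoint entries to the FastLoad log */
GRANT INSERT, UPDATE, DELETE ON SYSADMIN.Fastlog TO loaduser;
```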
Empty Target Table: We have already mentioned the absolute need for the target table to be empty. FastLoad does
not care how this is accomplished. After an initial load of an empty target table, you are now looking at a populated
table that will likely need to be maintained.
If you require the phenomenal speed of FastLoad, it is usually preferable, both for the sake of speed and for less
interaction with the Data Dictionary, just to delete all the rows from that table and then reload it with fresh data. The
syntax DELETE <databasename>.<tablename> should be used for this. But sometimes, as in some of our FastLoad
sample scripts below (see Figure 4-1), you want to drop that table and recreate it versus using the DELETE option. To
do this, FastLoad has the ability to run the DDL statements DROP TABLE and CREATE TABLE. The problem with
putting DDL in the script is that the script is no longer restartable, and you are required to rerun the FastLoad from the
beginning. Therefore, we recommend that you have a script for an initial run and a different script for a restart.


Figure 4-1
Two Error Tables: Each FastLoad requires two error tables. These are error tables that will only be populated should
errors occur during the load process. These are required by the FastLoad utility, which will automatically create them
for you; all you must do is to name them. The first error table is for any translation errors or constraint violations. For
example, a row with a column containing a wrong data type would be reported to the first error table. The second error
table is for errors caused by duplicate values for Unique Primary Indexes (UPI). FastLoad will load just one
occurrence for every UPI. The other occurrences will be stored in this table. However, if the entire row is a duplicate,
FastLoad counts it but does not store the row. These tables may be analyzed later for troubleshooting should errors
occur during the load. For specifics on how you can troubleshoot, see the section below titled, "What Happens When
FastLoad Finishes."

Maximum of 15 Loads
The Teradata RDBMS will only run a maximum number of fifteen FastLoads, MultiLoads, or FastExports at the same
time. This maximum is determined by a value stored in the DBS Control record. It can be any value from 0 to 15.
When Teradata is first installed, this value is set to 5 concurrent jobs.
Since these utilities all use the large blocking of rows, a saturation point is reached at which Teradata protects the
system resources available by queuing up any extra load jobs. For example, if the maximum number of jobs is currently
running on the system and you attempt to run one more, that job will not be started. You should view this limit as a
safety control. Here is a tip for remembering how the load limit applies: If the name of the load utility contains either
the word "Fast" or the word "Load", then there can be only a total of fifteen of them running at any one time.

FastLoad Has Two Phases


Teradata is famous for its end-to-end use of parallel processing. Both the data and the tasks are divided up among the
AMPs. Then each AMP tackles its own portion of the task with regard to its portion of the data. This same "divide and
conquer" mentality also expedites the load process. FastLoad divides its job into two phases, both designed for speed.
They have no fancy names but are typically known simply as Phase 1 and Phase 2. Sometimes they are referred to
as Acquisition Phase and Application Phase.

PHASE 1: Acquisition
The primary function of Phase 1 is to transfer data from the host computer to the Access Module Processors (AMPs)
as quickly as possible. For the sake of speed, the Parsing Engine of Teradata does not take the time to hash
each row of data based on the Primary Index. That will be done later. Instead, it does the following:
When the Parsing Engine (PE) receives the INSERT command, it uses one session to parse the SQL just once. The
PE is the Teradata software processor responsible for parsing syntax and generating a plan to execute the request. It
then opens a Teradata session from the FastLoad client directly to the AMPs. By default, one session is created for
each AMP. Therefore, on large systems, it is normally a good idea to limit the number of sessions using the
SESSIONS command. This capability is shown below.
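For example, a script on a large system might cap itself at eight sessions before logging on (the session count and logon string are illustrative assumptions):

```sql
SESSIONS 8;                       /* request at most 8 AMP sessions       */
LOGON ProdSys/SQL01,mypassword;   /* then log on to Teradata as usual     */
```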



Simultaneously, all but one of the client sessions begin loading raw data in 64K blocks for transfer to the AMPs. The
first priority of Phase 1 is to get the data onto the AMPs as fast as possible. To accomplish this, the rows are packed,
unhashed, into large blocks and sent to the AMPs without any concern for which AMP gets the block. The result is that
data rows arrive on different AMPs than the ones where they would live had they been hashed.
So how do the rows get to the correct AMPs where they will permanently reside? Following the receipt of every data
block, each AMP hashes its rows based on the Primary Index, and redistributes them to the proper AMP. At this point,
the rows are written to a worktable on the AMP but remain unsorted until Phase 1 is complete.
Phase 1 can be compared loosely to the preferred method of transfer used in the parcel shipping industry today. How
do the key players in this industry handle a parcel? When the shipping company receives a parcel, that parcel is not
immediately sent to its final destination. Instead, for the sake of speed, it is often sent to a shipping hub in a seemingly
unrelated city. Then, from that hub it is sent to the destination city. FastLoad's Phase 1 uses the AMPs in much the
same way that the shipper uses its hubs. First, all the data blocks in the load get rushed randomly to any AMP. This
just gets them to a "hub" somewhere in Teradata country. Second, each AMP forwards them to their true destination.
This is like the shipping parcel being sent from a hub city to its destination city!

PHASE 2: Application
Following the scenario described above, the shipping vendor must do more than get a parcel to the destination city.
Once the packages arrive at the destination city, they must then be sorted by street and zip code, placed onto local
trucks and be driven to their final, local destinations.
Similarly, FastLoad's Phase 2 is mission critical for getting every row of data to its final address (i.e., where it will be
stored on disk). In this phase, each AMP sorts the rows in its worktable. Then it writes the rows into the table space on
disks where they will permanently reside. Rows of a table are stored on the disks in data blocks. The AMP uses the
block size as defined when the target table was created. If the table is Fallback protected, then the Fallback will be
loaded after the Primary table has finished loading. This enables the Primary table to become accessible as soon as
possible. FastLoad is so ingenious, no wonder it is the darling of the Teradata load utilities!

FastLoad Commands
Here is a table of some key FastLoad commands and their definitions. They are used to provide flexibility in control of
the load process. Consider this your personal ready reference guide! You will notice that there are only a few SQL
commands that may be used with this utility (CREATE TABLE, DROP TABLE, DELETE and INSERT). This keeps FastLoad from
becoming encumbered with additional functions that would slow it down.

A FastLoad Example in its Simplest Form


The load utilities often scare people because there are many things that appear complicated. In actuality, the load
scripts are very simple. Think of FastLoad as:

Logging onto Teradata

Defining the Teradata table that you want to load (target table)

Defining the INPUT data file

Telling the system to start loading

Telling the system to stop loading


This first script example is designed to show FastLoad in its simplest form. The actual script is in the left column and
our comments are on the right.


Figure 4-2

Sample FastLoad Script


Let's look at an actual FastLoad script that you might see in the real world. In the script below, every comment line is
placed inside the normal Teradata comment syntax, /* ... */. FastLoad and SQL commands are written in upper case
in order to make them stand out. In reality, Teradata utilities, like Teradata itself, are by default not case sensitive. You
will also note that when column names are listed vertically we recommend placing the comma separator in front of the
following column. Coding this way makes reading or debugging the script easier for everyone. The purpose of this
script is to update the Employee_Profile table in the SQL01 database. The input file used for the load is named
EMPS.TXT. Below the sample script each step will be described in detail.
Normally it is not a good idea to put the DROP and CREATE statements in a FastLoad script. The reason is that when
any of the tables that FastLoad is using are dropped, the script cannot be restarted. It can only be rerun from the
beginning. Since FastLoad has restart logic built into it, a restart is normally the better solution if the initial load attempt
should fail. However, for purposes of this example, it shows the table structure and the description of the data being
read.


Figure 4-4

Step One: Before logging onto Teradata, it is important to specify how many sessions you need. The syntax is
[SESSIONS {n}].

Step Two: Next, you LOGON to the Teradata system. You will quickly see that the utility commands in
FastLoad are similar to those in BTEQ. FastLoad commands were designed from the underlying commands in
BTEQ. However, unlike BTEQ, most of the FastLoad commands do not allow a dot ["."] in front of them and
therefore need a semicolon. At this point we chose to have Teradata tell us which version of FastLoad is being
used for the load. Why would we recommend this? We do because as FastLoad's capabilities get enhanced with
newer versions, the syntax of the scripts may have to be revisited.

Step Three: If the input file is not in FastLoad format, then before you describe the INPUT FILE structure in the
DEFINE statement, you must first SET the RECORD layout type for the file being passed to FastLoad. We have
used VARTEXT in our example with a comma delimiter. The available options are FASTLOAD, TEXT, UNFORMATTED,
or VARTEXT. You need to know this about your input file ahead of time.

Step Four: Next, comes the DEFINE statement. FastLoad must know the structure and the name of the flat
file to be used as the input FILE, or source file for the load.

Step Five: FastLoad makes no assumptions from the DROP TABLE statements with regard to what you want
loaded. In the BEGIN LOADING statement, the script must name the target table and the two error tables for the
load. Did you notice that there is no CREATE TABLE statement for the error tables in this script? FastLoad will
automatically create them for you once you name them in the script. In this instance, they are named "Emp_Err1"
and "Emp_Err2". Phase 1 uses "Emp_Err1" because it comes first and Phase 2 uses "Emp_Err2". The names
are arbitrary, of course. You may call them whatever you like. At the same time, they must be unique within a
database, so using a combination of your userid and target table name helps ensure this uniqueness between
multiple FastLoad jobs occurring in the same database.
In the BEGIN LOADING statement we have also included the optional CHECKPOINT parameter. We included
[CHECKPOINT 100000]. Although not required, this optional parameter performs a vital task with regard to the
load. In the old days, children were always told to focus on the three 'R's' in grade school ("reading, 'riting, and
'rithmetic"). There are two very different, yet equally important, R's to consider whenever you run FastLoad. They
are RERUN and RESTART. RERUN means that the job is capable of running all the processing again from the
beginning of the load. RESTART means that the job is capable of running the processing again from the point
where it left off when the job was interrupted, causing it to fail. When CHECKPOINT is requested, it allows
FastLoad to resume loading from the first row following the last successful CHECKPOINT. We will learn more
about CHECKPOINT in the section on Restarting FastLoad.
Step Six: FastLoad focuses on its task of loading data blocks to AMPs like little Yorkshire terriers do when
playing with a ball! It will not stop unless you tell it to stop. Therefore, it will not proceed to Phase 2 without the
END LOADING command.
In reality, this provides a very valuable capability for FastLoad. Since the table must be empty only at the start of the
job, deferring END LOADING allows rows to keep arriving at different times, say, from different time zones. To
accomplish this processing, simply omit the END LOADING from the load job. Then, you can run the same FastLoad
multiple times and continue loading the worktables until the last file is received. Then run the last FastLoad job with an
END LOADING, and you have partitioned your load jobs into smaller segments instead of one huge job. This makes
FastLoad even faster!

Of course to make this work, FastLoad must be restartable. Therefore, you cannot use the DROP or CREATE
commands within the script. Additionally, every script is exactly the same with the exception of the last one, which
contains the END LOADING causing FastLoad to proceed to Phase 2. That's a pretty clever way to do a
partitioned type of data load.
Step Seven: All that goes up must come down. And all the sessions must LOGOFF. This will be the last utility
command in your script. At this point the table lock is released and if there are no rows in the error tables, they
are dropped automatically. However, if a single row is in one of them, you are responsible to check it, take the
appropriate action and drop the table manually.
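Putting Steps One through Seven together, a minimal script of the kind described above might look like the following sketch. The database, table, file, and error table names come from the text; the field definitions, session count, and logon string are illustrative assumptions:

```sql
/* Step 1: limit the number of sessions (count is an assumption)       */
SESSIONS 8;
/* Step 2: log on and display the FastLoad version being used          */
LOGON ProdSys/SQL01,mypassword;
SHOW VERSIONS;
/* Step 3: the input file is comma-delimited variable-length text      */
SET RECORD VARTEXT ",";
/* Step 4: describe the input file (field layout is an assumption)     */
DEFINE Employee_No (VARCHAR(11))
      ,Last_Name   (VARCHAR(20))
      ,First_Name  (VARCHAR(12))
      ,Salary      (VARCHAR(10))
FILE = EMPS.TXT;
/* Step 5: name the target table, both error tables, and a checkpoint  */
BEGIN LOADING SQL01.Employee_Profile
   ERRORFILES SQL01.Emp_Err1, SQL01.Emp_Err2
   CHECKPOINT 100000;
INSERT INTO SQL01.Employee_Profile
VALUES (:Employee_No, :Last_Name, :First_Name, :Salary);
/* Step 6: tell FastLoad Phase 2 may begin                             */
END LOADING;
/* Step 7: release the sessions                                        */
LOGOFF;
```

Note that omitting the END LOADING statement, as described in Step Six, would leave this job open so the same script could be run again against additional files.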

Converting Data Types with FastLoad


Converting data is easy. Just define the input data types in the input file. Then, FastLoad will compare that to the
column definitions in the Data Dictionary and convert the data for you! But the cardinal rule is that only one data type
conversion is allowed per column. In the example below, notice how the columns in the input file are converted from
one data type to another simply by redefining the data type in the CREATE TABLE statement.
FastLoad allows six kinds of data conversions. Here is a chart that displays them:

Figure 4-5
When we said that converting data is easy, we meant that it is easy for the user. It is actually quite resource intensive,
thus increasing the amount of time needed for the load. Therefore, if speed is important, keep the number of columns
being converted to a minimum!

A FastLoad Conversion Example


This next script example is designed to show how FastLoad converts data automatically when the INPUT data type
differs from the Target Teradata Table data type. The actual script is in the left column and our comments are on the
right.
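As a sketch of the idea (the column names and data types here are illustrative assumptions), the input fields can be described as VARCHAR text while the target table carries the final data types; FastLoad then performs the single allowed conversion per column on the way in:

```sql
/* Target table: the columns carry the real data types */
CREATE TABLE SQL01.Employee_Profile
 ( Employee_No INTEGER
 , Salary      DECIMAL(10,2)
 , Hire_Date   DATE )
UNIQUE PRIMARY INDEX (Employee_No);

/* Input file: every field arrives as delimited text, so each    */
/* column undergoes exactly one conversion, for example          */
/* VARCHAR -> INTEGER, VARCHAR -> DECIMAL, VARCHAR -> DATE       */
SET RECORD VARTEXT ",";
DEFINE Employee_No (VARCHAR(11))
      ,Salary      (VARCHAR(12))
      ,Hire_Date   (VARCHAR(10))
FILE = EMPS.TXT;
```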


Figure 4-6

When You Cannot RESTART FastLoad


There are two types of FastLoad scripts: those that you can restart and those that you cannot without modifying the
script. If any of the following conditions are true of the FastLoad script that you are dealing with, it is NOT restartable:

The Error Tables are DROPPED

The Target Table is DROPPED

The Target Table is CREATED


Can you tell from the following sample FastLoad script why it is not restartable?

Figure 4-7



Why might you have to RESTART a FastLoad job, anyway? Perhaps you might experience a system reset or some
glitch that stops the job halfway through. Maybe the mainframe went down. Well, it is not really a big deal
because FastLoad is so lightning-fast that you could probably just RERUN the job for small data loads.
However, when you are loading a billion rows, this is not a good idea because it wastes time. So the most common
way to deal with these situations is simply to RESTART the job. But what if the normal load takes 4 hours, and the
glitch occurs when you already have two thirds of the data rows loaded? In that case, you might want to make sure
that the job is totally restartable. Let's see how this is done.

When You Can RESTART FastLoad


If all of the following conditions are true, then FastLoad is ALWAYS restartable:

The Error Tables are NOT DROPPED in the script

The Target Table is NOT DROPPED in the script

The Target Table is NOT CREATED in the script

You have defined a checkpoint


So, if you need to drop or create tables, do it in a separate job using BTEQ. Imagine that you have a table whose data
changes so much that you typically drop it monthly and build it again. Let's go back to the script we just reviewed
above and see how we can break it into the two parts necessary to make it fully RESTARTABLE. It is broken up
below.
STEP ONE: Run the following SQL statements in Queryman or BTEQ before you start FastLoad:

Figure 4-8
First, you ensure that the target table and error tables, if they existed previously, are blown away. If there had been no
errors in the error tables, they would be automatically dropped. If these tables did not exist, you have not lost anything.
Next, if needed, you create the empty table structure needed to receive a FastLoad.
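A sketch of the kind of pre-load SQL that Figure 4-8 describes, run in BTEQ or Queryman before FastLoad starts (the table, column, and error table names are assumptions):

```sql
/* Blow away the target table and error tables if they exist */
DROP TABLE SQL01.Employee_Profile;
DROP TABLE SQL01.Emp_Err1;
DROP TABLE SQL01.Emp_Err2;

/* Recreate the empty target table structure for the load */
CREATE TABLE SQL01.Employee_Profile
 ( Employee_No INTEGER
 , Last_Name   CHAR(20)
 , Salary      DECIMAL(10,2) )
UNIQUE PRIMARY INDEX (Employee_No);
```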

STEP TWO: Run the FastLoad script


This is the portion of the earlier script that carries out these vital steps:

Defines the structure of the flat file

Tells FastLoad where to load the data and store the errors

Specifies the checkpoint so a RESTART will not go back to row one

Loads the data


If these conditions are true, all you need to do is resubmit the FastLoad job, and it starts loading data again with the
next record after the last checkpoint. Now, with that said, if you did not request a checkpoint, the output message will
normally indicate how many records were loaded.
You may optionally use the RECORD command to manually restart on the next record after the one indicated in the
message.
Now, if the FastLoad job aborts in Phase 2, you can simply submit a script with only the BEGIN LOADING and END
LOADING. It will then restart right into Phase 2.
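A sketch of such a Phase 2 restart script (the logon string and names are assumptions carried over from the earlier example):

```sql
LOGON ProdSys/SQL01,mypassword;
/* No DEFINE or INSERT is needed: the data is already on the AMPs */
BEGIN LOADING SQL01.Employee_Profile
   ERRORFILES SQL01.Emp_Err1, SQL01.Emp_Err2;
/* Kick off Phase 2: the sort and build of the target table       */
END LOADING;
LOGOFF;
```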

What Happens When FastLoad Finishes


You Receive an Outcome Status
The most important thing to do is verify that FastLoad completed successfully. This is accomplished by looking at the
last output in the report and making sure that it is a return code or status code of zero (0). Any other value indicates
that something wasn't perfect and needs to be fixed.


The locks will not be removed and the error tables will not be dropped without a successful completion. This is
because FastLoad assumes that it will need them for its restart. At the same time, the lock on the target table will not
be released either. When running FastLoad, you realistically have two choices once it is started: either get it to run to a
successful completion, or rerun it from the beginning. As you can imagine, the best course of action is normally to get
it to finish successfully via a restart.

You Receive a Status Report


What happens when FastLoad finishes running? Well, you can expect to see a summary report on the success of the
load. Following is an example of such a report.

Figure 4-9
The first line displays the total number of records read from the input file. Were all of them loaded? Not really. The
second line tells us that there were fifty rows with constraint violations, so they were not loaded. Corresponding to this,
fifty entries were made in the first error table. Line 3 shows that there were zero entries into the second error table,
indicating that there were no duplicate Unique Primary Index violations. Line 4 shows that there were 999950 rows
successfully loaded into the empty target table. Finally, there were no duplicate rows. Had there been any duplicate
rows, the duplicates would only have been counted. They are not stored in the error tables anywhere. When FastLoad
reports on its efforts, the number of rows in lines 2 through 5 should always total the number of records read in line 1.
Note on duplicate rows: Whenever FastLoad experiences a restart, there will normally be duplicate rows that are
counted. This is due to the fact that an error seldom occurs exactly on a checkpoint (a quiet or quiescent point) when
nothing is happening within FastLoad. Therefore, some number of rows will be sent to the AMPs again because the restart starts
on the next record after the value stored in the checkpoint. Hence, when a restart occurs, the first row after the
checkpoint and some of the consecutive rows are sent a second time. These will be caught as duplicate rows after the
sort. This restart logic is the reason that FastLoad will not load duplicate rows into a MULTISET table. It assumes they
are duplicates because of this logic.

You can Troubleshoot


In the example above, we know that the load was not entirely successful. But that is not enough. Now we need to
troubleshoot in order to identify the errors and correct them. FastLoad generates two error tables that will enable us to
find the culprits. The first error table, which we will call Errortable1, contains just three columns: the column ErrorCode
contains the Teradata FastLoad code number for the corresponding translation or constraint error. The second column,
named ErrorFieldName, specifies which column in the table contained the error. The third column, DataParcel, contains the
row with the problem. The two error tables track different types of errors.
As a user, you can select from either error table. To check errors in Errortable1 you would use this syntax:
SELECT DISTINCT ErrorCode, ErrorFieldName FROM Errortable1;
Corrected rows may be inserted into the target table using another utility that does not require an empty table.
To check errors in Errortable2 you would use the following syntax:
SELECT * FROM Errortable2;
The definition of the second error table is exactly the same as the target table with all the same columns and data
types.

Restarting FastLoad: A More In-Depth Look


How the CHECKPOINT Option Works
The CHECKPOINT option defines the points in a load job where the FastLoad utility pauses to record that Teradata has
processed a specified number of rows. When the parameter CHECKPOINT [n] is included in the BEGIN LOADING
clause, the system will stop loading momentarily at increments of [n] rows.


At each CHECKPOINT, the AMPs will all pause and make sure that everything is loading smoothly. Then FastLoad
sends a checkpoint report (entry) to the SYSADMIN.Fastlog table. This log contains a list of all currently running
FastLoad jobs and the last successfully reached checkpoint for each job. Should an error occur that requires the load
to restart, FastLoad will merely go back to the last successfully reported checkpoint prior to the error. It will then restart
from the record immediately following that checkpoint and start building the next block of data to load. If such an error
occurs in Phase 1, with CHECKPOINT 0, FastLoad will always restart from the very first row.

Restarting with CHECKPOINT


Sometimes you may need to restart FastLoad. If the FastLoad script requests a CHECKPOINT (other than 0), then it
is restartable from the last successful checkpoint. Therefore, if the job fails, simply resubmit the job. Here are the two
options: Suppose Phase 1 halts prematurely; the Data Acquisition phase is incomplete. Resubmit the FastLoad script.
FastLoad will begin from RECORD 1 or the first record past the last checkpoint. If you wish to manually specify where
FastLoad should restart, locate the last successful checkpoint record by referring to the SYSADMIN.FASTLOG table.
To specify where a restart will start from, use the RECORD command. Normally, it is not necessary to use the
RECORD command-let FastLoad automatically determine where to restart from.
If the interruption occurs in Phase 2, the Data Acquisition phase has already completed. We know that the error is in
the Application Phase. In this case, resubmit the FastLoad script with only the BEGIN and END LOADING
Statements. This will restart in Phase 2 with the sort and building of the target table.

Restarting without CHECKPOINT (i.e., CHECKPOINT 0)


When a failure occurs and the FastLoad Script did not utilize the CHECKPOINT (i.e., CHECKPOINT 0), one
procedure is to DROP the target table and error tables and rerun the job. Here are some other options available to
you:
1. Resubmit the job again and hope there is enough PERM space for all the rows already sent to the unsorted target
table plus all the rows that are going to be sent again to the same target table. Beyond using extra space, these
rows will be rejected as duplicates. As you can imagine, this is not the most efficient way, since it processes
many of the same rows twice.
2. If CHECKPOINT wasn't specified, then CHECKPOINT defaults to 100,000. You can perform a manual restart
using the RECORD statement. If the output print file shows that checkpoint 100000 occurred, use something
like the following command: [RECORD 100001;]. This statement will skip records 1 through 100000 and resume
on record 100001.

Using INMODs with FastLoad


When you find that FastLoad does not read the file type you have or you wish to control the access for any reason,
then it might be desirable to use an INMOD. An INMOD (Input Module), is fully compatible with FastLoad in either
mainframe or LAN environments, providing that the appropriate programming languages are used. However, INMODs
replace the normal mainframe DDNAME or LAN defined FILE name with the following statement: DEFINE
INMOD=<INMOD-name>. For a more in-depth discussion of INMODs, see the chapter of this book titled, "INMOD
Processing".


Chapter 5: MultiLoad
An Introduction to MultiLoad
Why it is called "Multi" Load
If we were going to be stranded on an island with a Teradata Data Warehouse and we could only take along one
Teradata load utility, clearly, MultiLoad would be our choice. MultiLoad has the capability to load multiple tables at one
time from either a LAN or Channel environment. This is in stark contrast to its fleet-footed cousin, FastLoad, which can
only load one table at a time. And it gets better, yet!
This feature rich utility can perform multiple types of DML tasks, including INSERT, UPDATE, DELETE and UPSERT
on up to five (5) empty or populated target tables at a time. These DML functions may be run either solo or in
combinations, against one or more tables. For these reasons, MultiLoad is the utility of choice when it comes to
loading populated tables in the batch environment. As the volume of data being loaded or updated in a single block
increases, the performance of MultiLoad improves. MultiLoad shines when it can impact more than one row in every
data block. In other words, MultiLoad looks at massive amounts of data and says, "Bring it on!"
Leo Tolstoy once said, "All happy families resemble each other." Like happy families, the Teradata load utilities
resemble each other, although they may have some differences. You are going to be pleased to find that you do not
have to learn all new commands and concepts for each load utility. MultiLoad has many similarities to FastLoad. It has
even more commands in common with TPump. The similarities will be evident as you work with them. Where there are
some quirky differences, we will point them out for you.

Two MultiLoad Modes: IMPORT and DELETE


MultiLoad provides two types of operations via modes: IMPORT and DELETE. In MultiLoad IMPORT mode, you
have the freedom to "mix and match" up to twenty (20) INSERTs, UPDATEs or DELETEs on up to five target tables.
The execution of the DML statements is not mandatory for all rows in a table. Instead, their execution hinges upon the
conditions contained in the APPLY clause of the script. Once again, MultiLoad demonstrates its user-friendly flexibility.
For UPDATEs or DELETEs to be successful in IMPORT mode, they must reference the Primary Index in the WHERE
clause.
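For instance, an IMPORT-mode update could be labeled like this sketch. The table and column names are illustrative, and Employee_No is assumed to be the Primary Index:

```
.DML LABEL UPDATES;
UPDATE SQL01.Employee_Dept
SET    Dept_Name   = :Dept_Name
WHERE  Employee_No = :Employee_No;   /* WHERE references the Primary Index */
```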
The MultiLoad DELETE mode is used to perform a global (all AMP) delete on just one table. The reason to use
.BEGIN DELETE MLOAD is that it bypasses the Transient Journal (TJ) and can be RESTARTed if an error causes it to
terminate prior to finishing. When performing in DELETE mode, the DELETE SQL statement cannot reference the
Primary Index in the WHERE clause. This is because a primary index access goes to a specific AMP, whereas this is a
global (all-AMP) operation.
The other factor that makes a DELETE mode operation so good is that it examines an entire block of rows at a time.
Once all the eligible rows have been removed, the block is written one time and a checkpoint is written. So, if a restart
is necessary, it simply starts deleting rows from the next block without a checkpoint. This is a smart way to continue.
Remember, when using the TJ, all deleted rows are put back into the table from the TJ as a rollback. A rollback can
take longer to finish than the delete. MultiLoad does not do a rollback; it does a restart.



In the above diagram, monthly data is being stored in a quarterly table. To keep the contents limited to four months,
monthly data is rotated in and out. At the end of every month, the oldest month of data is removed and the new
month is added. The cycle is "add a month, delete a month, add a month, delete a month." In our illustration, that
means that January data must be deleted to make room for May's data.
Here is a question for you: What if there was another way to accomplish this same goal without consuming all of these
extra resources? To illustrate, let's consider the following scenario: Suppose you have TableA that contains 12 billion
rows. You want to delete a range of rows based on a date and then load in fresh data to replace these rows. Normally,
the process is to perform a MultiLoad DELETE to DELETE FROM TableA WHERE <date-column> < '2002-02-01'. The
final step would be to INSERT the new rows for May using MultiLoad IMPORT.

Block and Tackle Approach


MultiLoad never loses sight of the fact that it is designed for functionality, speed, and the ability to restart. It tackles the
proverbial I/O bottleneck problem like FastLoad by assembling data rows into 64K blocks and writing them to disk on
the AMPs. This is much faster than writing data one row at a time like BTEQ. Fallback table rows are written after the
base table has been loaded. This allows users to access the base table immediately upon completion of the MultiLoad
while fallback rows are being loaded in the background. The benefit is reduced time to access the data.
Amazingly, MultiLoad has full RESTART capability in all of its five phases of operation. Once again, this demonstrates
its tremendous flexibility as a load utility. Is it pure magic? No, but it almost seems so. MultiLoad makes effective use
of two error tables to save different types of errors and a LOGTABLE that stores built-in checkpoint information for
restarting. This is why MultiLoad does not use the Transient Journal, thus averting time-consuming rollbacks when a
job halts prematurely.
Here is a key difference to note between MultiLoad and FastLoad. Sometimes an AMP (Access Module Processor)
fails and the system administrators say that the AMP is "down" or "offline." When using FastLoad, you must restart the
AMP to restart the job. MultiLoad, however, can RESTART when an AMP fails, if the table is fallback protected. At the
same time, you can use the AMPCHECK option to make it work like FastLoad if you want.

MultiLoad Imposes Limits


Rule #1: Unique Secondary Indexes are not supported on a Target Table. Like FastLoad, MultiLoad does not
support Unique Secondary Indexes (USIs). But unlike FastLoad, it does support the use of Non-Unique Secondary
Indexes (NUSIs) because the index subtable row is on the same AMP as the data row. MultiLoad uses every AMP
independently and in parallel. If two AMPs must communicate, they are not independent. Therefore, a NUSI (same
AMP) is fine, but a USI (different AMP) is not.
Rule #2: Referential Integrity is not supported. MultiLoad will not load data into tables that are defined with
Referential Integrity (RI). Like a USI, this requires the AMPs to communicate with each other. So, RI constraints must
be dropped from the target table prior to using MultiLoad.
Rule #3: Triggers are not supported at load time. Triggers cause actions on related tables based upon what
happens in a target table. Again, this is a multi-AMP operation and to a different table. To keep MultiLoad running
smoothly, disable all Triggers prior to using it.
Rule #4: No concatenation of input files is allowed. MultiLoad does not want you to do this because it could impact
a restart if the files were concatenated in a different sequence or if data was deleted between runs.
Rule #5: The host will not process aggregates, arithmetic functions or exponentiation. If you need data
conversions or math, you might be better off using an INMOD to prepare the data prior to loading it.

Error Tables, Work Tables and Log Tables


Besides target table(s), MultiLoad requires the use of four special tables in order to function. They consist of two error
tables (per target table), one worktable (per target table), and one log table. In essence, the Error Tables will be used
to store any conversion, constraint or uniqueness violations during a load. Work Tables are used to receive and sort
data and SQL on each AMP prior to storing them permanently to disk. A Log Table (also called a "Logtable") is used
to store successful checkpoints during load processing in case a RESTART is needed.
HINT: Sometimes a company wants all of these load support tables to be housed in a particular database. When
these tables are to be stored in any database other than the user's own default database, then you must give them a
qualified name (<databasename>.<tablename>) in the script or use the DATABASE command to change the current
database.
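For example, a script could qualify the support tables like this sketch. The CORP_WORK work database and the table names are purely illustrative:

```
.LOGTABLE CORP_WORK.CDW_Log;          /* log table kept in a shared work database */
.BEGIN IMPORT MLOAD
   TABLES      SQL01.Employee_Dept
   WORKTABLES  CORP_WORK.WT_Employee_Dept
   ERRORTABLES CORP_WORK.ET_Employee_Dept CORP_WORK.UV_Employee_Dept;
```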
Where will you find these tables in the load script? The Logtable is generally identified immediately prior to the
.LOGON command. Worktables and error tables can be named in the BEGIN MLOAD statement. Do not
underestimate the value of these tables. They are vital to the operation of MultiLoad. Without them a MultiLoad job can
not run. Now that you have had the "executive summary", let's look at each type of table individually.
Two Error Tables: Here is another place where FastLoad and MultiLoad are similar. Both require the use of two error
tables per target table. MultiLoad will automatically create these tables. Rows are inserted into these tables only when
errors occur during the load process. The first error table is the acquisition Error Table (ET). It contains all translation
and constraint errors that may occur while the data is being acquired from the source(s).
The second is the Uniqueness Violation (UV) table that stores rows with duplicate values for Unique Primary
Indexes (UPI). Since a UPI must be unique, MultiLoad can only load one occurrence into a table. Any duplicate value
will be stored in the UV error table. For example, you might see a UPI error that shows a second employee number
"99." In this case, if the name for employee "99" is Kara Morgan, you will be glad that the row did not load since Kara
Morgan is already in the Employee table. However, if the name showed up as David Jackson, then you know that
further investigation is needed, because employee numbers must be unique.
Each error table does the following:

Identifies errors

Provides some detail about the errors

Stores the actual offending row for debugging


You have the option to name these tables in the MultiLoad script (shown later). Alternatively, if you do not name them,
they default to ET_<target_table_name> and UV_<target_table_name>. In either case, MultiLoad will not accept error
table names that are the same as target table names. It does not matter what you name them. It is recommended that
you standardize on the naming convention to make it easier for everyone on your team. For more details on how
these error tables can help you, see the subsection in this chapter titled, "Troubleshooting MultiLoad Errors."
Log Table: MultiLoad requires a LOGTABLE. This table keeps a record of the results from each phase of the load so
that MultiLoad knows the proper point from which to RESTART. There is one LOGTABLE for each run. Since
MultiLoad will not resubmit a command that has been run previously, it will use the LOGTABLE to determine the last
successfully completed step.
Work Table(s): MultiLoad will automatically create one worktable for each target table. This means that in IMPORT
mode you could have one or more worktables. In the DELETE mode, you will only have one worktable since that
mode only works on one target table. The purpose of worktables is to hold two things:
1. The Data Manipulation Language (DML) tasks
2. The input data that is ready to APPLY to the AMPs
The worktables are created in a database using PERM space. They can become very large. If the script uses multiple
SQL statements for a single data record, the data is sent to the AMP once for each SQL statement. This replication
guarantees fast performance and that no SQL statement will ever be done more than once. So, this is very important.
However, there is no such thing as a free lunch; the cost is space. Later, you will see that using a FILLER field can
help reduce this disk space by not sending unneeded data to an AMP. In other words, the efficiency of the MultiLoad
run is in your hands.

Supported Input Formats


Data input files come in a variety of formats but MultiLoad is flexible enough to handle many of them. MultiLoad
supports the following five format options: BINARY, FASTLOAD, TEXT, UNFORMAT and VARTEXT.


Figure 5-1

MultiLoad Has Five IMPORT Phases


MultiLoad IMPORT has five phases, but don't be fazed by this! Here is the short list:

Phase 1: Preliminary Phase

Phase 2: DML Transaction Phase

Phase 3: Acquisition Phase

Phase 4: Application Phase

Phase 5: Cleanup Phase


Let's take a look at each phase and see what it contributes to the overall load process of this magnificent utility.
Should you memorize every detail about each phase? Probably not. But it is important to know the essence of each
phase because sometimes a load fails. When it does, you need to know in which phase it broke down since the
method for fixing the error to RESTART may vary depending on the phase. And if you can picture what MultiLoad
actually does in each phase, you will likely write better scripts that run more efficiently.

Phase 1: Preliminary Phase


The ancient oriental proverb says, "Measure one thousand times; Cut once." MultiLoad uses Phase 1 to conduct
several preliminary set-up activities whose goal is to provide a smooth and successful climate for running your load.
The first task is to be sure that the SQL syntax and MultiLoad commands are valid. After all, why try to run a script
when the system will just find out during the load process that the statements are not useable? MultiLoad knows that it
is much better to identify any syntax errors, right up front. All the preliminary steps are automated. No user intervention
is required in this phase.
Second, all MultiLoad sessions with Teradata need to be established. The default is the number of available AMPs.
Teradata quickly establishes this number, using a factor of 16 as the basis for the number of sessions to create. The
general rule of thumb for the number of sessions on smaller systems is this: use the number of
AMPs plus two more. For larger systems with hundreds of AMP processors, the SESSIONS option is available to
lower the default. Remember, these sessions are running on your poor little computer as well as on Teradata.
Each session loads the data to Teradata across the network or channel. Every AMP plays an essential role in the
MultiLoad process. They receive the data blocks, hash each row and send the rows to the correct AMP. When the
rows come to an AMP, it stores them in worktable blocks on disk. But, lest we get ahead of ourselves, suffice it to say
that there is ample reason for multiple sessions to be established.
What about the extra two sessions? Well, the first one is a control session to handle the SQL and logging. The second
is a back up or alternate for logging. You may have to use some trial and error to find what works best on your system
configuration. If you specify too few sessions it may impair performance and increase the time it takes to complete
load jobs. On the other hand, too many sessions will reduce the resources available for other important database
activities.
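As a sketch, the SESSIONS option can be supplied with the .BEGIN MLOAD command to lower the default; the table name and session count here are purely illustrative, and exact placement can vary by release:

```
.BEGIN IMPORT MLOAD
   TABLES   SQL01.Employee_Dept
   SESSIONS 8;     /* fewer data sessions than one per AMP on a large system */
```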
Third, the required support tables are created. They are the following:


Figure 5-2
The final task of the Preliminary Phase is to apply utility locks to the target tables. Initially, access locks are placed on
all target tables, allowing other users to read or write to the table for the time being. However, this lock does prevent
the opportunity for a user to request an exclusive lock. Although these locks still allow the MultiLoad user to drop
the table, no one else may DROP or ALTER a target table while it is locked for loading. This leads us to Phase 2.

Phase 2: DML Transaction Phase


In Phase 2, all of the SQL Data Manipulation Language (DML) statements are sent ahead to Teradata. MultiLoad
allows the use of multiple DML functions. Teradata's Parsing Engine (PE) parses the DML and generates a step-by-step
plan to execute the request. This execution plan is then communicated to each AMP and stored in the
appropriate worktable for each target table. In other words, each AMP is going to work off the same page.
Later, during the Acquisition phase the actual input data will also be stored in the worktable so that it may be applied in
Phase 4, the Application Phase. Next, a match tag is assigned to each DML request that will match it with the
appropriate rows of input data. The match tags will not actually be used until the data has already been acquired and
is about to be applied to the worktable. This is somewhat like a student who receives a letter from the university in the
summer that lists his courses, professor's names, and classroom locations for the upcoming semester. The letter is a
"match tag" for the student to his school schedule, although it will not be used for several months. This matching tag
for SQL and data is the reason that the data is replicated for each SQL statement using the same data record.

Phase 3: Acquisition Phase


With the proper set-up complete and the PE's plan stored on each AMP, MultiLoad is now ready to receive the INPUT
data. This is where it gets interesting! MultiLoad now acquires the data in large, unsorted 64K blocks from the host
and sends it to the AMPs.
At this point, Teradata does not care about which AMP receives the data block. The blocks are simply sent, one after
the other, to the next AMP in line. For their part, the AMPs begin to deal with the blocks that they have been dealt. It
is like a game of cards: you take the cards that you have received and then play the game. You want to keep some
and give some away.
Similarly, the AMPs will keep some data rows from the blocks and give some away. The AMP hashes each row on the
primary index and sends it over the BYNET to the proper AMP where it will ultimately be used. But the row does not
get inserted into its target table, just yet. The receiving AMP must first do some preparation before that happens. Don't
you have to get ready before company arrives at your house? The AMP puts all of the hashed rows it has received
from other AMPs into the worktables where it assembles them into the SQL. Why? Because once the rows are
reblocked, they can be sorted into the proper order for storage in the target table. Now the utility places a load lock on
each target table in preparation for the Application Phase. Of course, there is no Acquisition Phase when you perform
a MultiLoad DELETE task, since no data is being acquired.

Phase 4: Application Phase


The purpose of this phase is to write, or APPLY, the specified changes to both the target tables and NUSI subtables.
Once the data is on the AMPs, it is married up to the SQL for execution. To accomplish this substitution of data into
SQL, when sending the data, the host has already attached some sequence information and five (5) match tags to
each data row. Those match tags are used to join the data with the proper SQL statement based on the SQL
statement within a DML label. In addition to associating each row with the correct DML statement, match tags also
guarantee that no row will be updated more than once, even when a RESTART occurs.
The following five columns are the matching tags:


Figure 5-3
Remember, MultiLoad allows for the existence of NUSI processing during a load. Every hash-sequence sorted block
from Phase 3 and each block of the base table is read only once to reduce I/O operations to gain speed. Then, all
matching rows in the base block are inserted, updated or deleted before the entire block is written back to disk, one
time. This is why the match tags are so important. Changes are made based upon corresponding data and DML
(SQL) based on the match tags. They guarantee that the correct operation is performed for the rows and blocks with
no duplicate operations, a block at a time. And each time a table block is written to disk successfully, a record is
inserted into the LOGTABLE. This permits MultiLoad to avoid starting again from the very beginning if a RESTART is
needed.
What happens when several tables are being updated simultaneously? In this case, all of the updates are scripted as
a multi-statement request. That means that Teradata views them as a single transaction. If there is a failure at any
point of the load process, MultiLoad will merely need to be RESTARTed from the point where it failed. No rollback is
required. Any errors will be written to the proper error table.

Phase 5: Clean Up Phase


Those of you reading these paragraphs that have young children or teenagers will certainly appreciate this final
phase! MultiLoad actually cleans up after itself. The utility looks at the final Error Code (&SYSRC). MultiLoad believes
the adage, "All is well that ends well." If the last error code is zero (0), all of the job steps have ended successfully
(i.e., all has certainly ended well). This being the case, all empty error tables, worktables and the log table are
dropped. All locks, both Teradata and MultiLoad, are released. The statistics for the job are generated for output
(SYSPRINT) and the system count variables are set. After this, each MultiLoad session is logged off. So what
happens if the final error code is not zero? Stay tuned. Restarting MultiLoad is a topic that will be covered later in this
chapter.

MultiLoad Commands
Two Types of Commands
You may see two types of commands in MultiLoad scripts: tasks and support functions. MultiLoad tasks are
commands that are used by the MultiLoad utility for specific individual steps as it processes a load. Support functions
are those commands that involve the Teradata utility Support Environment (covered in Chapter 9), are used to set
parameters, or are helpful for monitoring a load.
The chart below lists the key commands, their type, and what they do.


Figure 5-4

Parameters for .BEGIN IMPORT MLOAD


Here is a list of components or parameters that may be used in the .BEGIN IMPORT command. Note: The parameters
do not require the usual dot prior to the command since they are actually sub-commands.


Figure 5-5

Parameters for .BEGIN DELETE MLOAD


Here is a list of components or parameters that may be used in the BEGIN DELETE command. Note: The parameters
do not require the usual dot prior to the command since parameters are actually sub-commands.

A Simple MultiLoad IMPORT Script


MultiLoad can be somewhat intimidating to the new user because there are many commands and phases. In reality,
the load scripts are understandable when you think through what the IMPORT mode does:

Setting up a Logtable

Logging onto Teradata

Identifying the Target, Work and Error tables

Defining the INPUT flat file

Defining the DML activities to occur

Naming the IMPORT file

Telling MultiLoad to use a particular LAYOUT

Telling the system to start loading

Finishing loading and logging off of Teradata


This first script example is designed to show MultiLoad IMPORT in its simplest form. It depicts the loading of a
three-column Employee table. The actual script is in the left column and our comments are on the right. Below the script
is a step-by-step description of how this script works.
Step One: Setting up a Logtable and Logging onto Teradata- MultiLoad requires you specify a log table right at the
outset with the .LOGTABLE command. We have called it CDW_Log. Once you name the Logtable, it will be
automatically created for you. The Logtable may be placed in the same database as the target table, or it may be
placed in another database. Immediately after this you log onto Teradata using the .LOGON command. The order of
these two commands is interchangeable, but it is recommended to define the Logtable first and then to Log on,


second. If you reverse the order, Teradata will give a warning message. Notice that the commands in MultiLoad
require a dot in front of the command key word.
Step Two: Identifying the Target, Work and Error tables- In this step of the script you must tell Teradata which
tables to use. To do this, you use the .BEGIN IMPORT MLOAD command. Then you will preface the names of these
tables with the sub-commands TABLES, WORKTABLES and ERRORTABLES. All you must do is name the tables
and specify what database they are in. Work tables and error tables are created automatically for you. Keep in mind
that you get to name and locate these tables. If you do not do this, Teradata might supply some defaults of its own!
At the same time, these names are optional. If the WORKTABLES and ERRORTABLES had not specifically been
named, the script would still execute and build these tables. They would have been built in the default database for the
user. The name of the worktable would be WT_EMPLOYEE_DEPT1 and the two error tables would be called
ET_EMPLOYEE_DEPT1 and UV_EMPLOYEE_DEPT1, respectively.
Sometimes, large Teradata systems have a work database with a lot of extra PERM space. One customer calls this
database CORP_WORK. This is where all of the logtables and worktables are normally created. You can use a
DATABASE command to point all table creations to it or qualify the names of these tables individually.
Step Three: Defining the INPUT flat file record structure- MultiLoad is going to need to know the structure of the
INPUT flat file. Use the .LAYOUT command to name the layout. Then list the fields and their data types used in your
SQL as a .FIELD. Did you notice that an asterisk is placed between the column name and its data type? It tells
MultiLoad to calculate the starting position of this field automatically, based on the previous field's length. If you are
listing fields in order and need to skip a few bytes in the record, you can either use a .FILLER (as above) to position
the cursor at the next field, or the "*" on the Dept_No field could have been replaced with the number 132
( CHAR(11)+CHAR(20)+CHAR(100)+1 ). Then, the .FILLER is not needed. Also, if the input record fields are exactly
the same as the table, the .TABLE command can be used to automatically define all the .FIELDs for you. The
LAYOUT name will be referenced later in the .IMPORT command. If the input file is created with INDICATORS, it is
specified in the LAYOUT.
Step Four: Defining the DML activities to occur- The .DML LABEL names and defines the SQL that is to execute. It
is like setting up executable code in a programming language, but using SQL. In our example, MultiLoad is being told
to INSERT a row into the SQL01.Employee_Dept table. The VALUES come from the data in each FIELD because it is
preceded by a colon (:). Are you allowed to use multiple labels in a script? Sure! But remember this: every label must
be referenced in an APPLY clause of the .IMPORT command.
Step Five: Naming the INPUT file and its format type- This step is vital! Using the .IMPORT command, we have
identified the INFILE data as being contained in a file called "CDW_Join_Export.txt". Then we list the FORMAT type
as TEXT. Next, we referenced the LAYOUT named FILEIN to describe the fields in the record. Finally, we told
MultiLoad to APPLY the DML LABEL called INSERTS - that is, to INSERT the data rows into the target table. This is
still a sub-component of the .IMPORT MLOAD command. If the script is to run on a mainframe, the INFILE name is
actually the name of a JCL Data Definition (DD) statement that contains the real name of the file.
Notice that the .IMPORT goes on for 4 lines of information. This is possible because it continues until it finds the
semicolon that defines the end of the command. This is how MultiLoad separates one operation from another. The
semicolon is therefore very important; without it, MultiLoad would have attempted to process the END LOADING as
part of the .IMPORT, and it wouldn't work.
Step Six: Finishing loading and logging off of Teradata- This is the closing ceremony for the load. MultiLoad wraps
things up, closes the curtains, and logs off of the Teradata system.
Important note: Since the script above in Figure 5-6 does not DROP any tables, it is completely capable of being
restarted if an error occurs. Compare this to the next script in Figure 5-7. Do you think it is restartable? If you said no,
pat yourself on the back.

Figure 5-6
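Pulling the six steps together, a minimal script of this shape might be sketched as follows. The Logtable, layout, label and file names come from the description above; the field names and data types, and the logon credentials, are assumptions for illustration:

```
.LOGTABLE CDW_Log;                      /* Step 1: name the restart log table  */
.LOGON tdp1/squser1,secret;
.BEGIN IMPORT MLOAD TABLES SQL01.Employee_Dept;   /* Step 2: target table     */
.LAYOUT FILEIN;                         /* Step 3: describe the input record   */
.FIELD  Employee_No  *  CHAR(11);       /* "*" = starts after previous field   */
.FIELD  Dept_Name    *  CHAR(20);
.FILLER Extra_Data   *  CHAR(100);      /* skipped bytes; never sent to an AMP */
.FIELD  Dept_No      *  CHAR(6);
.DML LABEL INSERTS;                     /* Step 4: the DML to execute          */
INSERT INTO SQL01.Employee_Dept
VALUES (:Employee_No, :Dept_Name, :Dept_No);
.IMPORT INFILE CDW_Join_Export.txt      /* Step 5: name the input file         */
        FORMAT TEXT
        LAYOUT FILEIN
        APPLY  INSERTS;
.END MLOAD;                             /* Step 6: finish and log off          */
.LOGOFF;
```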

Figure 5-7

MultiLoad IMPORT Script


Let's take a look at a MultiLoad IMPORT script that comes from real life. This sample script will look much more like
what you might encounter at your workplace. It is more detailed. The notes to the right are brief and to the point.
They will help you grasp the essence of what is happening in the script.


Figure 5-8

Error Treatment Options for the .DML LABEL Command


MultiLoad allows you to tailor how it deals with different types of errors that it encounters during the load process, to fit
your needs. Here is a summary of the options available to you:

Figure 5-9
In IMPORT mode, you may specify as many as five distinct error-treatment options for one .DML statement. For
example, if there is more than one instance of a row, do you want MultiLoad to IGNORE the duplicate row, or to MARK
it (list it) in an error table? If you do not specify IGNORE, then MultiLoad will MARK, or record, all of the errors. Imagine
you have a standard INSERT load that you know will end up recording about 20,000 duplicate row errors. Using the
following syntax "IGNORE DUPLICATE INSERT ROWS;" will keep them out of the error table. By ignoring those
errors, you gain three benefits:
1. You do not need to see all the errors.
2. The error table is not filled up needlessly.
3. MultiLoad runs much faster since it is not conducting a duplicate row check.
When doing an UPSERT, there are two rules to remember:

The default is IGNORE MISSING UPDATE ROWS (MARK is the default for all other operations). When doing an
UPSERT, you anticipate that some rows are missing; otherwise, why do an UPSERT? So, this keeps these rows
out of your error table.

The DO INSERT FOR MISSING UPDATE ROWS is mandatory. This tells MultiLoad to insert a row from the
data source if that row does not exist in the target table because the update didn't find it.
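An UPSERT label following these two rules might be sketched like this; the table and column names are illustrative, with Employee_No assumed to be the Primary Index:

```
.DML LABEL UPSERTER
     DO INSERT FOR MISSING UPDATE ROWS;   /* mandatory for an UPSERT          */
UPDATE SQL01.Employee_Dept
SET    Dept_Name   = :Dept_Name
WHERE  Employee_No = :Employee_No;        /* tried first, via the UPI         */
INSERT INTO SQL01.Employee_Dept           /* done only if the UPDATE missed   */
VALUES (:Employee_No, :Dept_Name, :Dept_No);
```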


The table that follows shows you, in more detail, how flexible your options are:

Figure 5-10

An IMPORT Script with Error Treatment Options


The command .DML LABEL names any DML options (INSERT, UPDATE or DELETE) that immediately follow it in
the script. Each label must be given a name. In IMPORT mode, the label will be referenced for use in the APPLY
Phase when certain conditions are met. The following script provides an example of just one such possibility:

Figure 5-11

An IMPORT Script that Uses Two Input Data Files

Figure 5-12

Redefining the INPUT


Sometimes, instead of using two different INPUT DATA files, which require two separate LAYOUTs, you can combine
them into one INPUT DATA file. And you can use that one file, with just one LAYOUT, to load more than one table!
You see, a flat file may contain more than one type of data record. As long as each record has a unique code to
identify it, MultiLoad can check this code and know which fields of the layout to use, since different names in the
same layout can describe the same bytes. To do this you will need to REDEFINE the INPUT.
You do this by redefining a field's position in the .FIELD or .FILLER section of the LAYOUT. Unlike the asterisk (*),
which means that a field simply follows the previous one, redefining cites a number that tells MultiLoad to jump back
toward the beginning of the record, to the redefined position, and reinterpret that portion of the INPUT record.

A Script that Uses Redefining the Input


The following script uses the ability to define two record types in the same input data file. It uses a .FILLER to define
the code since it is never used in the SQL, only to determine which SQL to run.
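A hedged sketch of such a script follows; the record codes, field positions, file name, and table names are all illustrative, not taken from the original figure:

```
.LOGTABLE SQL01.CDW_Log;
.LOGON CDW/SQL01,mypasswd;
.BEGIN IMPORT MLOAD TABLES SQL01.Student_Profile, SQL01.Course_Table;

.LAYOUT TWO_REC_LAYOUT;
.FILLER Rec_Type    1 CHAR(1);    /* code identifying the record type; never used in the SQL */
.FIELD  Student_ID  2 CHAR(11);
.FIELD  Last_Name   * CHAR(20);
.FIELD  Course_ID   2 CHAR(10);   /* REDEFINED: jumps back to position 2 of the record */
.FIELD  Course_Name * CHAR(30);

.DML LABEL INS_STUDENT;
INSERT INTO SQL01.Student_Profile (Student_ID, Last_Name)
VALUES (:Student_ID, :Last_Name);

.DML LABEL INS_COURSE;
INSERT INTO SQL01.Course_Table (Course_ID, Course_Name)
VALUES (:Course_ID, :Course_Name);

.IMPORT INFILE CDW_Export.txt
  LAYOUT TWO_REC_LAYOUT
  APPLY INS_STUDENT WHERE Rec_Type = 'S'
  APPLY INS_COURSE  WHERE Rec_Type = 'C';

.END MLOAD;
.LOGOFF;
```

The APPLY ... WHERE clauses test the .FILLER code so that each record type drives its own label.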

Figure 5-13

A DELETE MLOAD Script Using a Hard Coded Value


The next script demonstrates how to use the MultiLoad DELETE task. In this example, students no longer enrolled in
the university are being removed from the Student_Profile table, based upon the registration date. The profile of any
student who enrolled prior to this date will be removed.
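In outline (the logon, column name, and cutoff date are illustrative), such a DELETE task might look like:

```
.LOGTABLE SQL01.CDW_Log;
.LOGON CDW/SQL01,mypasswd;
.BEGIN DELETE MLOAD TABLES SQL01.Student_Profile;

/* SQL, not a MultiLoad command, so no dot prefix */
DELETE FROM SQL01.Student_Profile
WHERE Register_Date < '2000-08-15';

.END MLOAD;    /* "DELETE" need not be repeated here */
.LOGOFF;
```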

Figure 5-14
How many differences from a MultiLoad IMPORT script readily jump off of the page at you? Here are a few that we saw:
- At the beginning, you must specify the word "DELETE" in the .BEGIN MLOAD command. You need not specify it in the .END MLOAD command.
- You will readily notice that this mode has no .DML LABEL command. Since it is focused on just one absolute function, no APPLY clause is required, so you see no .DML LABEL.
- Notice that the DELETE with a WHERE clause is an SQL function, not a MultiLoad command, so it has no dot prefix.
- Since default names are available for worktables (WT_<target_tablename>) and error tables (ET_<target_tablename> and UV_<target_tablename>), they need not be specifically named, but be sure to define the Logtable.

Do not confuse the DELETE MLOAD task with the SQL delete task that may be part of a MultiLoad IMPORT. The
IMPORT delete is used to remove small volumes of data rows based upon the Primary Index. On the other hand, the
MultiLoad DELETE does global deletes on tables, bypassing the Transient Journal. Because there is no Transient
Journal, there are no rollbacks when the job fails for any reason. Instead, it may be RESTARTed from a
CHECKPOINT. Also, the MultiLoad DELETE task is never based upon the Primary Index.
Because we are not importing any data rows, there is neither a need for worktables nor an Acquisition Phase. One
DELETE statement is sent to all the AMPs with a match tag parcel. That statement will be applied to every table row. If
the condition is met, then the row is deleted. Using the match tags, each target block is read once and the appropriate
rows are deleted.

A DELETE MLOAD Script Using a Variable


This illustration demonstrates how passing the values of a data row rather than a hard coded value may be used to
help meet the conditions stated in the WHERE clause. When you are passing values, you must add some additional
commands that were not used in the DELETE example with hard coded values. You will see .LAYOUT and .IMPORT
INFILE in this script.
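A sketch of the same DELETE task driven by a value read from a one-record input file (the layout, file, and column names are again illustrative):

```
.LOGTABLE SQL01.CDW_Log;
.LOGON CDW/SQL01,mypasswd;
.BEGIN DELETE MLOAD TABLES SQL01.Student_Profile;

.LAYOUT CUTOFF_LAYOUT;
.FIELD Cutoff_Date * CHAR(10);

DELETE FROM SQL01.Student_Profile
WHERE Register_Date < :Cutoff_Date;

.IMPORT INFILE Cutoff_Date.txt
  LAYOUT CUTOFF_LAYOUT;

.END MLOAD;
.LOGOFF;
```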

Figure 5-15

An UPSERT Sample Script


The following sample script is provided to demonstrate how to do an UPSERT, that is, to update a table and, if a row from
the data source table does not exist in the target table, then insert a new row. In this instance we are loading the
Student_Profile table with new data for the next semester. The clause "DO INSERT FOR MISSING UPDATE ROWS"
indicates an UPSERT. The DML statements that follow this option must be in the order of a single UPDATE statement
followed by a single INSERT statement.

Figure 5-16

What Happens when MultiLoad Finishes


MultiLoad Statistics

Figure 5-17

MultiLoad Output From an UPSERT

Figure 5-18

Troubleshooting MultiLoad Errors-More on the Error Tables


The output statistics in the above example indicate that the load was entirely successful. But that is not always the
case. Now we need to troubleshoot in order to identify the errors and correct them, if desired. Earlier on, we noted that
MultiLoad generates two error tables, the Acquisition Error and the Application error table. You may select from these
tables to discover the problem and research the issues.
For the most part, the Acquisition error table logs errors that occur during that processing phase. The Application error
table lists Unique Primary Index violations, field overflow errors on non-PI columns, and constraint errors that occur in
the APPLY phase. MultiLoad error tables not only list the errors they encounter, they also have the capability to
STORE those errors. Do you remember the MARK and IGNORE parameters? This is where they come into play.
MARK will ensure that the error rows, along with some details about the errors, are stored in the error table. IGNORE
does neither; it is as if the error never occurred.

Figure 5-19

Figure 5-20

RESTARTing MultiLoad
Who hasn't experienced a failure at some time when attempting a load? Don't take it personally! Failures can and do
occur on the host or Teradata (DBC) for many reasons. MultiLoad has the impressive ability to RESTART from failures
in either environment. In fact, it requires almost no effort to continue or resubmit the load job. Here are the factors that
determine how it works:
First, MultiLoad will check the Restart Logtable and automatically resume the load process from the last successful
CHECKPOINT before the failure occurred. Remember, the Logtable is essential for restarts. MultiLoad uses neither
the Transient Journal nor rollbacks during a failure. That is why you must designate a Logtable at the beginning of
your script. MultiLoad either restarts by itself or waits for the user to resubmit the job. Then MultiLoad takes over right
where it left off.
Second, suppose Teradata experiences a reset while MultiLoad is running. In this case, the host program will restart
MultiLoad after Teradata is back up and running. You do not have to do a thing!
Third, if a host mainframe or network client fails during a MultiLoad, or the job is aborted, you may simply resubmit the
script without changing a thing. MultiLoad will find out where it stopped and start again from that very spot.
Fourth, if MultiLoad halts during the Application Phase it must be resubmitted and allowed to run until complete.
Fifth, during the Acquisition Phase the CHECKPOINT (n) you stipulated in the .BEGIN MLOAD clause will be enacted.
The results are stored in the Logtable. During the Application Phase, CHECKPOINTs are logged each time a data
block is successfully written to its target table.
HINT: The default number for CHECKPOINT is 15 minutes, but if you specify the CHECKPOINT as 60 or less,
minutes are assumed. If you specify the checkpoint at 61 or above, the number of records is assumed.

RELEASE MLOAD: When You DON'T Want to Restart MultiLoad


What if a failure occurs but you do not want to RESTART MultiLoad? Since MultiLoad has already updated the table
headers, it assumes that it still "owns" them. Therefore, it limits access to the table(s). So what is a user to do? Well
there is good news and bad news. The good news is that you may use the RELEASE MLOAD command to release the locks and roll back the job. The bad news is that if you have been loading multiple millions of rows, the rollback may take a lot of time. For this reason, most customers would rather just go ahead and RESTART.
Before V2R3: In the earlier days of Teradata it was NOT possible to use RELEASE MLOAD if one of the following three conditions was true:
- In IMPORT mode, once MultiLoad had reached the end of the Acquisition Phase you could not use RELEASE MLOAD. This is sometimes referred to as the "point of no return."
- In DELETE mode, the point of no return was when Teradata received the DELETE statement.
- If the job halted in the Apply Phase, you had to RESTART the job.
With and since V2R3: The advent of V2R3 brought new possibilities with regard to using the RELEASE MLOAD command. It can NOW be used in the APPLY Phase, if:
- You are running Teradata V2R3 or a later version
- You use the correct syntax:
  RELEASE MLOAD <target-table> IN APPLY
- The load script has NOT been modified in any way
- The target tables either:
  o Must be empty, or
  o Must have no Fallback, no NUSIs, and no Permanent Journals

You should be very cautious using the RELEASE command. It could potentially leave your table half updated.
Therefore, it is handy for a test environment, but please don't get too reliant on it for production runs. They should be
allowed to finish to guarantee data integrity.

MultiLoad and INMODs


INMODs, or Input Modules, may be called by MultiLoad in either mainframe or LAN environments, providing the
appropriate programming languages are used. INMODs are user written routines whose purpose is to read data from
one or more sources and then convey it to a load utility, here MultiLoad, for loading into Teradata. They allow
MultiLoad to focus solely on loading data by doing data validation or data conversion before the data is ever touched
by MultiLoad. INMODs replace the normal MVS DDNAME or LAN file name with the following statement:
.IMPORT INMOD=<INMOD-NAME>
You will find a more detailed discussion on how to write INMODs for MultiLoad in the chapter of this book titled,
"INMOD Processing".

How MultiLoad Compares with FastLoad

Figure 5-21

Chapter 6: TPump
An Introduction to TPump
The chemistry of relationships is very interesting. Frederick Buechner once stated, "My assumption is that the story of
any one of us is in some measure the story of us all." In this chapter, you will find that TPump has similarities with the
rest of the family of Teradata utilities. But this newer utility has been designed with fewer limitations and many
distinguishing abilities that the other load utilities do not have.
Do you remember the first Swiss Army knife you ever owned? Aside from its original intent as a compact survival
tool, this knife has thrilled generations with its multiple capabilities. TPump is the Swiss Army knife of the Teradata
load utilities. Just as this knife was designed for small tasks, TPump was developed to handle batch loads with low
volumes. And, just as the Swiss Army knife easily fits in your pocket when you are loaded down with gear, TPump is
a perfect fit when you have a large, busy system with few resources to spare. Let's look in more detail at the many
facets of this amazing load tool.

Why It Is Called "TPump"


TPump is the shortened name for the load utility Teradata Parallel Data Pump. To understand this, you must know
how the load utilities move the data. Both FastLoad and MultiLoad assemble massive volumes of data rows into 64K
blocks and then move those blocks. Picture in your mind the way that huge ice blocks used to be floated down long
rivers to large cities prior to the advent of refrigeration. There they were cut up and distributed to the people. TPump
does NOT move data in the large blocks. Instead, it loads data one row at a time, using row hash locks. Because it
locks at this level, and not at the table level like MultiLoad, TPump can make many simultaneous, or concurrent,
updates on a table.
Envision TPump as the water pump on a well. Pumping in a very slow, gentle manner results in a steady trickle of
water that could be pumped into a cup. But strong and steady pumping results in a powerful stream of water that
would require a larger container. TPump is a data pump which, like the water pump, may allow either a trickle-feed of
data to flow into the warehouse or a strong and steady stream. In essence, you may "throttle" the flow of data based
upon your system and business user requirements. Remember, TPump is THE PUMP!

TPump Has Many Unbelievable Abilities


Just in Time: Transactional systems, such as those implemented for ATM machines or Point-of-Sale terminals, are
known for their tremendous speed in executing transactions. But how soon can you get the information pertaining to
that transaction into the data warehouse? Can you afford to wait until a nightly batch load? If not, then TPump may be
the utility that you are looking for! TPump allows the user to accomplish near real-time updates from source systems
into the Teradata data warehouse.
Throttle-switch Capability: What about the throttle capability that was mentioned above? With TPump you may
stipulate how many updates may occur per minute. This is also called the statement rate. In fact, you may change the
statement rate during the job, "throttling up" the rate with a higher number, or "throttling down" the number of
updates with a lower one. An example: Having this capability, you might want to throttle up the rate during the period
from 12:00 noon to 1:30 PM when most of the users have gone to lunch. You could then lower the rate when they
return and begin running their business queries. This way, you need not have such clearly defined load windows, as
the other utilities require. You can have TPump running in the background all the time, and just control its flow rate.
DML Functions: Like MultiLoad, TPump does DML functions, including INSERT, UPDATE and DELETE. These can
be run solo, or in combination with one another. Note that it also supports UPSERTs like MultiLoad. But here is one
place that TPump differs vastly from the other utilities: FastLoad can only load one table and MultiLoad can load up to
five tables. But, when it pulls data from a single source, TPump can load more than 60 tables at a time! And the
number of concurrent instances in such situations is unlimited. That's right, not 15, but unlimited for Teradata! Well OK,
maybe limited by your computer. I cannot imagine my laptop running 20 TPump jobs, but Teradata does not care.
How could you use this ability? Well, imagine partitioning a huge table horizontally into multiple smaller tables and
then performing various DML functions on all of them in parallel. Keep in mind that TPump places no limit on the
number of sessions that may be established. Now, think of ways you might use this ability in your data warehouse
environment. The possibilities are endless.
More benefits: Just when you think you have pulled out all of the options on a Swiss Army knife, there always
seems to be just one more blade or tool you had not noticed. Similar to the knife, TPump always seems to have
another advantage in its list of capabilities. Here are several that relate to TPump requirements for target tables.
TPump allows both Unique and Non-Unique Secondary Indexes (USIs and NUSIs), unlike FastLoad, which allows
neither, and MultiLoad, which allows just NUSIs. Like MultiLoad, TPump allows the target tables to either be empty or
to be populated with data rows. Tables allowing duplicate rows (MULTISET tables) are allowed. Besides this,
Referential Integrity is allowed and need not be dropped. As to the existence of Triggers, TPump says, "No problem!"
Support Environment compatibility: The Support Environment (SE) works in tandem with TPump to enable the
operator to have even more control in the TPump load environment. The SE coordinates TPump activities, assists in
managing the acquisition of files, and aids in the processing of conditions for loads. The Support Environment aids in
the execution of DML and DDL that occur in Teradata, outside of the load utility.
Stopping without Repercussions: Finally, this utility can be stopped at any time and all locks may be dropped
with no ill consequences. Is this too good to be true? Are there no limits to this load utility? TPump does not like to
steal any thunder from the other load utilities, but it just might become one of the most valuable survival tools for
businesses in today's data warehouse environment.

TPump Has Some Limits


TPump has rightfully earned its place as a superstar in the family of Teradata load utilities. But this does not mean that
it has no limits. It has a few that we will list here for you:
- Rule #1: No concatenation of input data files is allowed. TPump is not designed to support this.
- Rule #2: TPump will not process aggregates, arithmetic functions or exponentiation. If you need data conversions or math, you might consider using an INMOD to prepare the data prior to loading it.
- Rule #3: The use of the SELECT function is not allowed. You may not use SELECT in your SQL statements.
- Rule #4: No more than four IMPORT commands may be used in a single load task. This means that at most, four files can be directly read in a single run.
- Rule #5: Dates before 1900 or after 1999 must be represented by the yyyy format for the year portion of the date, not the default format of yy. This must be specified when you create the table. Any dates using the default yy format for the year are taken to mean 20th century years.
- Rule #6: On some network attached systems, the maximum file size when using TPump is 2GB. This is true for a computer running under a 32-bit operating system.
- Rule #7: TPump performance will be diminished if Access Logging is used. The reason for this is that TPump uses normal SQL to accomplish its tasks. Besides the extra overhead incurred, if you use Access Logging for successful table updates, then Teradata will make an entry in the Access Log table for each operation. This can cause the potential for row hash conflicts between the Access Log and the target tables.

Supported Input Formats


TPump, like MultiLoad, supports the following five format options: BINARY, FASTLOAD, TEXT, UNFORMAT and
VARTEXT. But TPump is quite finicky when it comes to data format errors. Such errors will generally cause TPump to
terminate. You have got to be careful! In fact, you may specify an Error Limit to keep TPump from terminating
prematurely when faced with a data format error. You can specify a number (n) of errors that are to be tolerated before
TPump will halt. Here is a data format chart for your reference:

Figure 6-1

TPump Commands and Parameters


Each command in TPump must begin on a new line, preceded by a dot. It may utilize several lines, but must always
end in a semi-colon. Like MultiLoad, TPump makes use of several optional parameters in the .BEGIN LOAD
command. Some are the same ones used by MultiLoad. However, TPump has other parameters. Let's look at each
group.

LOAD Parameters IN COMMON with MultiLoad

Figure 6-2

.BEGIN LOAD Parameters UNIQUE to TPump

Figure 6-3

A Simple TPump Script-A Look at the Basics

1. Setting up a Logtable and Logging onto Teradata
2. Beginning the load process, adding Parameters, and naming the error table
3. Defining the INPUT flat file
4. Defining the DML activities to occur
5. Naming the IMPORT file and defining its FORMAT
6. Telling TPump to use a particular LAYOUT
7. Telling the system to start loading data rows
8. Finishing loading and logging off of Teradata

The following script assumes the existence of a Student_Names table in the SQL01 database. You may use preexisting target tables when running TPump or TPump may create the tables for you. In most instances you will use
existing tables. The CREATE TABLE statement for this table is listed for your convenience.
CREATE TABLE SQL01.Student_Names
( Student_ID    INTEGER
 ,Last_Name     CHAR (20)
 ,First_Name    VARCHAR (14))
UNIQUE PRIMARY INDEX ( Student_ID);


Much of the TPump command structure should look quite familiar to you. It is quite similar to MultiLoad. In this
example, the Student_Names table is being loaded with new data from the university's registrar. It will be used as an
associative table for linking various tables in the data warehouse.

Figure 6-4
Step One: Setting up a Logtable and Logging onto Teradata-First, you define the Logtable using the .LOGTABLE
command. We have named it LOG_PUMP in the WORK_DB database. The Logtable is automatically created for you.
It may be placed in any database by qualifying the table name with the name of the database by using syntax like this:
<databasename>.<tablename>
Next, the connection is made to Teradata. Notice that the commands in TPump, like those in MultiLoad, require a dot
in front of the command key word.
Step Two: Begin load process, add Parameters, naming the Error Table- Here, the script reveals the parameters
requested by the user to assist in managing the load for smooth operation. It also names the one error table, calling it
SQL01.ERR_PUMP. Now let's look at each parameter:

- ERRLIMIT 5 says that the job should terminate after encountering five errors. You may set the limit that is tolerable for the load.
- CHECKPOINT 1 tells TPump to pause and evaluate the progress of the load in increments of one minute. If the factor is between 1 and 60, it refers to minutes. If it is over 60, then it refers to the number of rows at which the checkpointing should occur.
- SESSIONS 64 tells TPump to establish 64 sessions with Teradata.
- TENACITY 2 says that if there is any problem establishing sessions, then to keep on trying for a period of two hours.
- PACK 40 tells TPump to "pack" 40 data rows and load them at one time.
- RATE 1000 means that 1,000 data rows will be sent per minute.
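Assembled into the script, that parameter block might be sketched as follows (the error table name follows the example above; exact parameter ordering within .BEGIN LOAD is flexible):

```
.BEGIN LOAD
  ERRORTABLE SQL01.ERR_PUMP
  ERRLIMIT   5
  CHECKPOINT 1
  SESSIONS   64
  TENACITY   2
  PACK       40
  RATE       1000;
```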

Step Three: Defining the INPUT flat file structure- TPump, like MultiLoad, needs to know the structure of the INPUT flat file record. You use the .LAYOUT command to name the layout. Following that, you list the columns and data types of the INPUT file using the .FIELD, .FILLER or .TABLE commands. Did you notice that an asterisk is placed between the column name and its data type? This means to automatically calculate the next byte in the record. It is used to designate the starting location for this data based on the previous field's length. If you are listing fields in order and need to skip a few bytes in the record, you can either use a .FILLER with the correct number of bytes as characters to position the cursor at the next field, or the "*" can be replaced by a number that equals the lengths of all previous fields added together plus 1 extra byte. When you use the latter technique, the .FILLER is not needed. In our example, this says to begin with Student_ID, continue on to load Last_Name, and finish when First_Name is loaded.
Step Four: Defining the DML activities to occur- At this point, the .DML LABEL names and defines the SQL that is
to execute. It also names the columns receiving data and defines the sequence in which the VALUES are to be
arranged. In our example, TPump is to INSERT a row into the SQL01.Student_NAMES. The data values coming in
from the record are named in the VALUES with a colon prior to the name. This provides the PE with information on
what substitution is to take place in the SQL. Each LABEL used must also be referenced in an APPLY clause of the
.IMPORT clause.
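Drawing on the Student_Names example above, such a label might be sketched as:

```
.DML LABEL INSREC;
INSERT INTO SQL01.Student_Names
  (Student_ID, Last_Name, First_Name)
VALUES (:Student_ID, :Last_Name, :First_Name);
```

The colons tell the PE to substitute the values read from the input record into the SQL.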
Step Five: Naming the INPUT file and defining its FORMAT- Using the .IMPORT INFILE command, we have
identified the INPUT data file as "CDW_Export.txt". The file was created using the TEXT format.
Step Six: Associate the data with the description- Next, we told the IMPORT command to use the LAYOUT called,
"FILELAYOUT."
Step Seven: Telling TPump to start loading- Finally, we told TPump to APPLY the DML LABEL called INSREC, that is, to INSERT the data rows into the target table.
Step Eight: Finishing loading and logging off of Teradata- The .END LOAD command tells TPump to finish the load process. Finally, TPump logs off of the Teradata system.

TPump Script with Error Treatment Options

Figure 6-5

TPump Output Statistics


This illustration shows the actual TPump statistics for the sample script above. Notice how well TPump breaks out
what happened during each part of the load process.

Figure 6-6

A TPump Script that Uses Two Input Data Files

Figure 6-7

A TPump UPSERT Sample Script

Figure 6-8
The following is the output from the above UPSERT:

Figure 6-9
NOTE: The above UPSERT uses the same syntax as MultiLoad. This continues to work. However, there might soon
be another way to accomplish this task. NCR has built an UPSERT and we have tested the following statement,
without success:
UPDATE SQL01.Student_Profile
SET  Last_Name   = :Last_Name
    ,First_Name  = :First_Name
    ,Class_Code  = :Class_Code
    ,Grade_Pt    = :Grade_Pt
WHERE Student_ID = :Student_ID
ELSE INSERT INTO SQL01.Student_Profile
VALUES (:Student_ID
       ,:Last_Name
       ,:First_Name
       ,:Class_Code
       ,:Grade_Pt);
We are not sure if this will be a future technique for coding a TPump UPSERT, or if it is handled internally. For now,
use the original coding technique.

Monitoring TPump
TPump comes with a monitoring tool called the TPump Monitor. This tool allows you to check the status of TPump
jobs as they run and to change (remember "throttle up" and "throttle down?") the statement rate on the fly. Key to this
monitor is the "SysAdmin.TpumpStatusTbl" table in the Data Dictionary Directory. If your Database Administrator
creates this table, TPump will update it on a minute-by-minute basis when it is running. You may update the table to
change the statement rate for an IMPORT. If you want TPump to run unmonitored, then the table is not needed.
You can start a monitor program under UNIX with the following command:
tpumpmon [-h] [TDPID/] <UserName>,<Password>[,<AccountID>]
Below is a chart that shows the Views and Macros used to access the "SysAdmin.TpumpStatusTbl" table. Queries
may be written against the Views. The macros may be executed.

Figure 6-9

Handling Errors in TPump Using the Error Table


One Error Table
Unlike FastLoad and MultiLoad, TPump uses only ONE Error Table per target table, not two. If you name the table,
TPump will create it automatically. Entries are made to these tables whenever errors occur during the load process.
Like MultiLoad, TPump offers the option to either MARK errors (include them in the error table) or IGNORE errors (pay
no attention to them whatsoever). These options are listed in the .DML LABEL sections of the script and apply ONLY
to the DML functions in that LABEL. The general default is to MARK. If you specify nothing, TPump will assume the
default. When doing an UPSERT, this default does not apply.
The error table does the following:
- Identifies errors
- Provides some detail about the errors
- Stores a portion of the actual offending row for debugging

When compared to the error tables in MultiLoad, the TPump error table is most similar to the MultiLoad Acquisition
error table. Like that table, it stores information about errors that take place while it is trying to acquire data. It is the errors that occur while the data is being moved, such as data translation problems, that TPump will want to report on. It will also want to report any difficulties compiling valid Primary Indexes. Remember, TPump has less tolerance for errors than FastLoad or MultiLoad.

Figure 6-10

Common Error Codes and What They Mean


TPump users often encounter three error codes that pertain to:
- Missing data rows
- Duplicate data rows
- Extra data rows

Become familiar with these error codes and what they mean. This could save you time getting to the root of some
common errors you could see in your future!
#1: Error 2816: Failed to insert duplicate row into TPump Target Table. Nothing is wrong when you see this error.
In fact, it can be a very good thing. It means that TPump is notifying you that it discovered a DUPLICATE row. This
error jumps to life when one of the following options has been stipulated in the .DML LABEL:

- MARK DUPLICATE INSERT ROWS
- MARK DUPLICATE UPDATE ROWS
Note that the original row will be inserted into the target table, but the duplicate row will not.
#2: Error 2817: Activity count greater than ONE for TPump UPDATE/DELETE.
Sometimes you want to know if there were too many "successes." This is the case when there are EXTRA rows when TPump is attempting an UPDATE or DELETE.
TPump will log an error whenever it sees an activity count greater than one for any such extra rows if you have specified either of these options in a .DML LABEL:

- MARK EXTRA UPDATE ROWS
- MARK EXTRA DELETE ROWS
At the same time, the associated UPDATE or DELETE will be performed.
#3: Error 2818: Activity count zero for TPump UPDATE or DELETE.
Sometimes, you want to know if a data row that was supposed to be updated or deleted wasn't! That is when you want
to know that the activity count was zero, indicating that the UPDATE or DELETE did not occur. To see this error, you
must have used one of the following parameters:

- MARK MISSING UPDATE ROWS
- MARK MISSING DELETE ROWS
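For instance, a label written to surface error 2818 would carry the MARK MISSING option, sketched here (the label, table, and column names are illustrative):

```
.DML LABEL UPDREC
MARK MISSING UPDATE ROWS;
UPDATE SQL01.Student_Names
  SET Last_Name = :Last_Name
  WHERE Student_ID = :Student_ID;
```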

RESTARTing TPump
Like the other utilities, a TPump script is fully restartable as long as the log table and error tables are not dropped. As
mentioned earlier you have a choice of setting ROBUST either ON (default) or OFF. There is more overhead using
ROBUST ON, but it provides a higher degree of data integrity at the cost of some performance.

TPump and MultiLoad Comparison Chart

Figure 6-11

Chapter 7: INMOD Processing


What is an INMOD
When data is being loaded or incorporated into the Teradata Relational Database Management System (RDBMS), the
processing of the data is performed by the utility. All of the NCR Teradata RDBMS utilities are able to read files that
contain a variety of formatted and unformatted data. They are able to read from disk and from tape. These files and
devices must support a sequential access method. Then, the utility is responsible for incorporating the data into SQL
for use by Teradata. However, there are times when it is advantageous or even necessary to use a different access
technique or a special device.
When special input processing is desired, then an INMOD (acronym for INput MODule) is a potential approach to solving the problem. An INMOD is written to perform the input of the data from a data source. It removes the responsibility of reading the input data from the utility. Many times an INMOD is written because the utility is not capable of performing the particular input processing. Other times, it is written for convenience.
The INMOD is a user written routine to do the specialized access from the file system, device or database. The
INMOD does not replace the utility; it becomes a part of and an extension of the utility. The major difference is that
instead of the utility receiving the data directly, it receives the data from the INMOD. An INMOD can be written to work
with FastLoad, MultiLoad, TPump and FastExport.
As an example, an INMOD might be written to access the data directly from another RDBMS besides Teradata. It
would be written to do the following steps:
1. Connect to the RDBMS
2. Retrieve a row using a SELECT or DECLARE CURSOR
3. Pass the row to the utility
4. Loop back and do steps 2 & 3 until there is no more data
5. When there is no more data, disconnect from the RDBMS

How an INMOD Works


An INMOD is sometimes called an exit routine. This is because the utility exits itself by calling the INMOD and passing control to it. The INMOD performs its processing and then exits back to the utility, which is its method for passing the data back.
The following diagram illustrates the normal logic flow when using the utility:

The following diagram illustrates the logic flow when using an INMOD with the utility:

As seen in the above diagrams, there is an extra step involved with the processing of an INMOD. On the other hand, it can eliminate the need to create an intermediate file by literally using another RDBMS as its data source. However, the user still scripts and executes the utility just as when using a file; that portion does not change.
The following chart shows the appropriate languages that an INMOD can be written in for mainframe and network-attached systems:

Figure 7-1

Calling an INMOD from FastLoad


As shown in the diagrams above, the user still executes the utility and the utility is responsible for calling the INMOD.
Therefore, the utility needs an indication from the user that it is supposed to call the INMOD instead of reading a file.
Normally the utility script contains the name of the file or JCL statement (DDNAME). When using an INMOD, the file
designation is no longer specified. Instead, the name of the program to call is defined in the script.
The following chart indicates the appropriate statement to define the INMOD:
Figure 7-2

Writing an INMOD
The writing of an INMOD is primarily concerned with processing an input data source. However, it cannot do the
processing haphazardly. It must wait for the utility to tell it what and when to perform every operation.
It has been previously stated that the INMOD returns data to the utility. At the same time, the utility needs to know that
it is expecting to receive the data. Therefore, a high degree of handshake processing is necessary for the two
components (INMOD and utility) to know what is expected.
As well as passing the data, a status code is sent back and forth between the utility and the INMOD. As with all
processing, we hope for a successful completion. Earlier in this book, it was shown that a zero status code indicates a
successful completion. That same situation is true for communications between the utility and the INMOD.
Therefore, a memory area must be allocated that is shared between the INMOD and the utility. The area contains the
following elements:
1. The return or status code
2. The length of the data that follows
3. The data area

Writing for FastLoad


The following charts show the various programming statements to define the data elements, status codes and other
considerations for the various programming languages.
Parameter definition for FastLoad

Figure 7-3
Return/status codes from FastLoad to the INMOD

Figure 7-4
Return/status codes for the INMOD to FastLoad

Figure 7-5
Entry point for FastLoad used in the DEFINE:
Figure 7-6
NCR Corporation provides two examples for writing a FastLoad INMOD. The first is called BLKEXIT.C, which does not
contain the checkpoint and restart logic, and the other is BLKEXITR.C that does contain both checkpoint and restart
logic.

Writing for MultiLoad, TPump and FastExport


The following charts show the data statements used to define the two parameter areas for the various languages.
First Parameter definition for MultiLoad, TPump and FastExport to the INMOD

Figure 7-7
Second Parameter definition for INMOD to MultiLoad, TPump and FastExport

Figure 7-8
Return/status codes for MultiLoad, TPump and FastExport to the INMOD

Figure 7-9
The following chart shows how to use the return codes of 6 and 7:


Return/status codes for the INMOD to MultiLoad, TPump and FastExport:


Figure 7-10
Entry point for MultiLoad, TPump and FastExport:
Figure 7-11

Migrating an INMOD
As seen in figures 7-4 and 7-9, many of the return codes are the same. However, it should also be noted that FastLoad must remember the record count in case a restart is needed, whereas the other utilities send the record count to the INMOD. If the INMOD fails to accept the record count when it is sent, the job will abort or hang and never finish successfully.
This means that if a FastLoad INMOD is used in one of the other utilities, it will work only as long as the utility never requests that a checkpoint take place. Remember that unlike FastLoad, the newer utilities default to a checkpoint every 15 minutes. The only way to turn checkpointing off is to set the CHECKPOINT option of the .BEGIN to a number that is higher than the number of records being processed.
Therefore, it is not the best practice to simply use a FastLoad INMOD as if it were interchangeable. It is better to modify the INMOD logic for the restart and checkpoint processing necessary to receive the record count and use it for the repositioning operation.

Writing a NOTIFY Routine


As seen earlier in this book, there is a NOTIFY statement. If the standard values are acceptable, you should use them.
However, if they are not, you may write your own NOTIFY routine.
If you choose to do this, refer to the NCR Utilities manual for guidance on writing this processing. We just want you to
know here that it is something you can do.

Sample INMOD
Below is an example of the PROCEDURE DIVISION commands that might be used in an INMOD for MultiLoad, TPump or FastExport.
PROCEDURE DIVISION USING PARM-1, PARM-2.
BEGIN.
MAIN. {specific user processing goes here, followed by:}
    IF RETCODE = 0 THEN
        DISPLAY "INMOD RECEIVED - RETURN CODE 0 - INITIALIZE & READ"
        PERFORM 100-OPEN-FILES
        PERFORM 200-READ-INPUT
        GOBACK
    ELSE
    IF RETCODE = 1 THEN
        DISPLAY "INMOD RECEIVED - RETURN CODE 1 - READ"
        PERFORM 200-READ-INPUT
        GOBACK
    ELSE
    IF RETCODE = 2 THEN
        DISPLAY "INMOD RECEIVED - RETURN CODE 2 - RESTART"
        PERFORM 900-GET-REC-COUNT
        PERFORM 950-FAST-FORWARD-INPUT
        GOBACK
    ELSE
    IF RETCODE = 3 THEN
        DISPLAY "INMOD RECEIVED - RETURN CODE 3 - CHECKPOINT"
        PERFORM 600-SAVE-REC-COUNT
        GOBACK
    ELSE
    IF RETCODE = 5 THEN
        DISPLAY "INMOD RECEIVED - RETURN CODE 5 - DONE"
        MOVE 0 TO RETLENGTH
        MOVE 0 TO RETCODE
        GOBACK
    ELSE
        DISPLAY "INMOD RECEIVED - INVALID RETURN CODE"
        MOVE 0 TO RETLENGTH
        MOVE 16 TO RETCODE
        GOBACK.
100-OPEN-FILES.
    OPEN INPUT DATA-FILE.
    MOVE 0 TO RETCODE.
200-READ-INPUT.
    READ DATA-FILE INTO DATA-AREA1
        AT END GO TO END-DATA.
    ADD 1 TO NUMIN.
    MOVE 80 TO RETLENGTH.
    MOVE 0 TO RETCODE.
    ADD 1 TO NUMOUT.
END-DATA.
    CLOSE DATA-FILE.
    DISPLAY "NUMBER OF INPUT RECORDS = " NUMIN.
    DISPLAY "NUMBER OF OUTPUT RECORDS = " NUMOUT.
    MOVE 0 TO RETLENGTH.
    MOVE 0 TO RETCODE.
    GOBACK.

Chapter 8: OUTMOD Processing


What is an OUTMOD
The FastExport utility is able to write a file that contains a variety of formatted and unformatted data. It can write the
data to disk and to tape. This works because these files and devices all support a sequential access method.
However, there are times when it is necessary or even advantageous to use some other technique or a special device.
When special output processing is desired, then an OUTMOD (acronym for OUTput MODule) is a potential solution. It is a user-written routine to do the specialized access to the file system, device or database. The OUTMOD does not replace the utility. Instead, it becomes like a part of the utility. An OUTMOD can only be written to work with FastExport.
As an example, an OUTMOD might be written to move the data from Teradata and directly into an RDBMS or test
database. Therefore, it must be written to do the following steps:
1. Connect to the RDBMS
2. Receive a row from the FastExport
3. Send the row to another database as an INSERT
4. Loop back and do steps 2 & 3 until there is no more data
5. When there is no more data, disconnect from the database

How an OUTMOD Works


The OUTMOD is written to perform the output of the data to a data source. It removes the responsibility of performing
output from the utility. Many times an OUTMOD is written because the utility is not capable of performing the particular
output processing. Other times, it is written for convenience.
When data is being unloaded from the Teradata Relational Database Management System (RDBMS), the processing
of the data is performed by the utility. The utility is responsible for retrieving the data via an SQL SELECT from
Teradata. This is still the situation when using an OUTMOD. The major difference is that instead of the utility writing
the data directly, the data is sent to the OUTMOD.
An OUTMOD is sometimes called an exit routine. This is because the utility exits itself by passing control to the
OUTMOD. The OUTMOD performs its processing and exits back to the utility after storing the data.
The following diagram illustrates the normal logic flow when using the utility:

As seen in the above diagram, there is an extra step involved with the processing of an OUTMOD. On the other hand, it eliminates the need to create an intermediate file. The data destination can be another RDBMS. However, the user still executes the utility; that portion does not change.
The following chart shows the available languages for mainframe and network-attached systems:
Figure 8-1

Calling an OUTMOD from FastExport


As shown in the diagrams above, the user still executes the utility and the utility is responsible for calling the OUTMOD. Therefore, the utility needs an indicator from the user that it is supposed to call the OUTMOD instead of writing a file.
Normally the utility script contains the name of the file or JCL statement (DDNAME). When using an OUTMOD, the
FILE designation is no longer specified. Instead, the name of the program to call is defined in the script.


The following chart indicates the appropriate statement to define the OUTMOD:
Figure 8-2

Writing an OUTMOD
The writing of an OUTMOD is primarily concerned with processing the output data destination. However, it cannot do
the processing haphazardly. It must wait for the utility to tell it what and when to perform every operation.
It has been previously stated that the OUTMOD receives data from the utility. At the same time, the OUTMOD needs to know when to expect the data. Therefore, a high degree of handshake processing is necessary for the two components (OUTMOD and FastExport) to know what is expected.
As well as passing the data, a status code is sent back and forth between them. Just like all processing, we hope for a
successful completion. Earlier in this book, it was shown that a zero status code indicates a successful completion.
A memory area must be allocated that is shared between the OUTMOD and the utility. The area contains the following
elements:
1. The return or status code
2. The sequence number of the SELECT within FastExport
3. The length of the data area in bytes
4. The response row from Teradata
5. The length of the output data record
6. The output data record
The following chart shows the various programming language definitions for the parameters:

Figure 8-3
Return/status codes from FastExport to the OUTMOD

Figure 8-4
Return/status codes for the OUTMOD to FastExport
Figure 8-5
Entry point for FastExport
Figure 8-6

Writing a NOTIFY Routine


As seen earlier in this book, there is a NOTIFY statement. If the standard values are acceptable, you should use them.
However, if they are not, you may write your own NOTIFY routine.
If you choose to do this, refer to the NCR Utilities manual for guidance on writing this processing. We just want you to
know here that it is something you can do.

Sample OUTMOD
Below is an example of the LINKAGE SECTION and PROCEDURE DIVISION commands that might be used in a FastExport OUTMOD.
LINKAGE SECTION.
01 OUTCODE PIC S9(5) COMP.
01 OUTSEQNUM PIC S9(5) COMP.
01 OUTRECLEN PIC S9(5) COMP.
01 OUTRECORD.
   05 INDICATOR PIC 9.
   05 REGN PIC XXX.
   05 PRODUCT PIC X(8).
   05 QTY PIC S9(8) COMP.
   05 PRICE PIC S9(8) COMP.
01 OUTLEN PIC S9(5) COMP.
01 OUTDATA PIC XXXX.

PROCEDURE DIVISION USING
    OUTCODE, OUTSEQNUM, OUTRECLEN, OUTRECORD,
    OUTLEN, OUTDATA.
BEGIN.
MAIN.
    IF OUTCODE = 1 THEN
        OPEN OUTPUT SALES-DROPPED-FILE
        OPEN OUTPUT BAD-REGN-SALES-FILE
        GOBACK.
    IF OUTCODE = 2 THEN
        CLOSE SALES-DROPPED-FILE
        CLOSE BAD-REGN-SALES-FILE
        GOBACK.
    IF OUTCODE = 3 THEN
        PERFORM TYPE-3
        GOBACK.
    IF OUTCODE = 4 THEN
        GOBACK.
    IF OUTCODE = 5 THEN
        CLOSE SALES-DROPPED-FILE
        OPEN OUTPUT SALES-DROPPED-FILE
        CLOSE BAD-REGN-SALES-FILE
        OPEN OUTPUT BAD-REGN-SALES-FILE
        GOBACK.
    IF OUTCODE = 6 THEN
        OPEN OUTPUT SALES-DROPPED-FILE
        OPEN OUTPUT BAD-REGN-SALES-FILE
        GOBACK.
TYPE-3.
    IF QTY IN OUTRECORD * PRICE IN OUTRECORD < 100 THEN
        MOVE 0 TO OUTRECLEN
        WRITE DROPPED-TRANLOG FROM OUTRECORD
    ELSE
        PERFORM TEST-NULL-REGN.
TEST-NULL-REGN.
    IF REGN IN OUTRECORD = SPACES
        MOVE 999 TO REGN IN OUTRECORD
        WRITE BAD-REGN-OUTRECORD FROM OUTRECORD.


Chapter 9: Support Environment


The Teradata Utilities and the Support Environment
As seen in many of the Teradata utilities, the Support Environment (SE) is a valuable asset. It is an inherent part of the utilities and acts as a front-end to these newer utilities: FastExport, MultiLoad, and TPump. The purpose of the SE is to provide a feature-rich scripting tool.
As the newer load and extract functionalities were being proposed for use with the Teradata RDBMS, it became obvious that certain capabilities were going to be needed by all the utilities. Rather than writing these capabilities over and over again into multiple programs, they were written once into a single module/environment called the SE. This environment/module is included with the newer utilities.

The Support Environment Commands


Alphabetic Command List

Figure 9-1
The SE allows the writer of the script to perform housekeeping chores prior to calling the desired utility with a .BEGIN.
At a minimum, these chores include the specification of the restart log table and logging onto Teradata. Yet, it brings to
the party the ability to perform any Data Definition Language (DDL) and Data Control Language (DCL) command
available to the user as defined in the Data Dictionary. In addition, all Data Manipulation Language (DML) commands
except a SELECT are allowed within the SE.

Required Operational Command List


A few of the SE commands are mandatory. The rest of the commands are optional and only used when they satisfy a
need. The following section in this chapter elaborates on the required commands. The optional commands are
covered in later sections. Once the explanation and syntax is shown, an example of their use is shown in a script at
the end of this chapter.

Creating a Restart Log Table


The Restart Log table is a mandatory requirement to run a utility that may need to perform a restart. It is used by the
utility to monitor its own progress and provide the basis for restart from the point of a failure. This restart facility
becomes critical when processing millions of data rows. It is normally better to restart from where the error occurred rather than rerunning the job from the beginning (as BTEQ must).
The utilities use the restart log table to ascertain what type of restart, if any, is required as a result of the type of failure.
Failures can occur at a Teradata, network or client system level. The Restart log makes the process of restarting the
utility very much automatic once the problem causing the failure has been corrected.
The syntax for creating a log table:
.LOGTABLE [<database-name>.]<table-name>;
When the utility completes successfully with a return code of zero, the restart log table is automatically dropped.


Creating Teradata Sessions


Teradata will not perform any operation for a user who has not logged onto the system. It needs the user information
to determine whether or not the proper privileges exist before allowing the operation requested. Therefore, it is
necessary to require the user to provide authentication via a LOGON request.
As a matter of performance, the utilities that use the SE look at the number of AMP tasks to determine the number of sessions to establish. The number of sessions is configurable, but not as a part of the .LOGON. Instead, setting the number of sessions is covered in the .BEGIN command (next).
The syntax for logging onto Teradata:
.LOGON [<tdpid>/]<user-name>,<user-password>[,'acct-id'];
Notice that we are discussing the .LOGON after the .LOGTABLE command. Although a log table cannot be created
until after a session is established, the .LOGTABLE command is coded first. At the same time, the order isn't strictly
enforced and the logon can come first. However, you will see a warning message displayed from the SE if the
.LOGON command is issued first. So, it is best to make a habit of always specifying the .LOGTABLE command first.
Once a session is established, based on privileges, the user can perform any of the following:

DDL

DCL

Any DML (with the exception of SELECT)

Establish system variables

Accept parameter values from a file

Perform dynamic substitution of values including object names

Beginning a Utility
Once the script has connected to Teradata and established all needed environmental conditions, it is time to run the
desired utility. This is accomplished using the .BEGIN command. Beyond running the utility, it is used to define most of
the options used within the execution of the utility. As an example, setting the number of sessions is requested here.
See each of the individual utilities for the names, usage and any recommendations for the options specific to it.
The syntax for writing a .BEGIN command:
.BEGIN <utility-task> [<utility-options>];
The utility task is defined as one of the following:

Figure 9-2

Ending a Utility
Once the utility finishes its task, it needs to be ended. To request the termination, use the .END command.
The syntax for writing a .END command:
.END <utility-task>;
When the utility ends, control is returned to the SE. It can then check the return code (see Figure 9-4) status and verify
that the utility finished the task successfully. Based on the status value in the return code, the SE can be used to
determine what processing should occur next.

Terminating Teradata Sessions


Once the sessions are no longer needed, they should also be ended. To request their termination, use the .LOGOFF command.
The syntax for logging off of Teradata:
.LOGOFF [<return-code>];


Optionally, the user may request a specific return code be sent to the host computer that was used to start the utility.
This might include the job control language (JCL) on a mainframe, the shell script for a UNIX system, or bat file on
DOS. This value can then be checked by that system to determine conditional processing as a result of the completion
code specified.

Optional Command List


The following commands are available to add functionality to the SE. They allow for additional processing within the
preparation for the utility instead of requiring the user to access BTEQ or other external tools. As with the required
commands above, an example of their use is shown in a script at the end of this chapter.

Accepting a Parameter Value(s)


Allowing the use of parameter values within the SE is a very powerful tool. A parameter can be substituted into the
script much like the substitution of values within a Teradata macro. However, it is much more elaborate in that the
substitution includes the object names used in the SQL, not just data.
When accepting one or more parameter values, they must be in a single record. If multiple records are needed, they
can be read using multiple .ACCEPT commands from different files. Each record may contain one or more values
delimited by a space. Therefore, it is necessary to put character strings in single quotes. Once accepted by the script,
these values are examined and dynamically stored into parameters named within the script.
The syntax for writing a .ACCEPT command:
.ACCEPT <parameter-name> [..., <parameter-name>]
[FROM] {FILE <file-name> |
{ENVIRONMENT | ENV} {VARIABLE | VAR} <sys-variable>}
[IGNORE <character-position> [THRU <character-position>]];

The format of the accepted record is comprised of either character or numeric data. Character data must be enclosed in single quotes (') and numeric data does not need quotes. When multiple values are specified on a single record, a
space is used to delimit them from one another. The assignment of a value to a parameter is done sequentially as the
names appear in the .ACCEPT and the data appears on the record. The first value is assigned to the first parameter
and so forth until there are no more parameter names in which to assign values.
The system variables are defined later in this chapter. They are automatically set by the system to provide information
regarding the execution of the utility. For example, they include the date, time and return code, to name a few. Here
they can be used to establish the value for a user parameter instead of reading the data from a file.
Example of using a .ACCEPT command:
.ACCEPT char_parm, int_num_parm, dec_num_parm FILE parm-record;

Contents of parm-record:
'This is some character data enclosed in quotes with spaces in it' 123 35.67

Once accepted, this data is available for use within the script. Optionally, an IGNORE can be used to skip one or more of the specified values in the record. This makes it easy to provide one parameter record that is used by multiple job scripts, allowing each script to determine which and how many of the values it needs.
To not use the integer data, the above .ACCEPT would be written as:
.ACCEPT char_parm, dec_num_parm FILE parm-record IGNORE 39 THRU 42;


Note: if the system is a mainframe, the FILE is used to name the DD statement in the Job Control Language (JCL).
For example, for the above .ACCEPT, the following JCL would be required:
//PARM-RECORD DD DSN=<pds-member-name>, DISP=(old, keep)

Establishing the Default Date Format


Depending on the mode (Teradata or ANSI) defined within the DBC Control Record, the dates are displayed and read
according to that default format. When reading date data that does not match that format, it is rejected and stored in
an error table. This rejection includes a valid ANSI date when it is looking for a Teradata date.
To ease the writing of the code by eliminating the need to specifically define the format of incoming dates, the
.DATEFORM command is useful. It allows the user to declare an incoming date with the ANSI format (YYYY-MM-DD) or the Teradata format (YY/MM/DD).
The syntax for writing a .DATEFORM command:
.DATEFORM { ANSIDATE | INTEGERDATE } /* INTEGERDATE is the default */;

Since these are the only two pre-defined formats, any other format must be defined in the INSERT of the utility, as in
the following example for a MM/DD/YYYY date:
INSERT INTO my_date_table VALUES( :incoming_date (DATE, FORMAT 'mm/dd/yyyy'));

Displaying an Output Message


The .DISPLAY command is used to write a text message to a file named in the command. Normally, this technique is used to provide operational or informational messages to the user regarding one or more conditions encountered during the processing of the utility or SE. The default file is the system printer (SYSPRINT) on a mainframe and standard out (STDOUT) on other platforms.
The message is normally built using a literal character string. However, a user may request the output to consist of substituted variable or parameter data. This is accomplished using an ampersand (&) in front of the variable's name. See the section below on using a variable in a script for more details.
The syntax for writing a .DISPLAY command:
.DISPLAY '<message-text-here>' [TO] FILE <file-name>;

Note: If the system is a mainframe, the FILE portion of the command is used to name the DD statement in the JCL.
The JCL must also contain any names, space requirements, record and block size, or disposition information needed
by the system to create the file.

Comparing Variable Data


The .IF command is used to compare the contents of named variable data. Normally, a variable is compared to a
known literal value for control purposes. However, anything can be compared where it makes sense to do so.
.IF {<variable-name> | <literal>} <comparison> {<literal> | <variable-name>} [THEN];
<operation-to-perform> [... <operation-to-perform>]
[.ELSE;
<operation-to-perform> [... <operation-to-perform>]]
.ENDIF;

The comparison symbols are normally one of the following:


Figure 9-3

Routing Messages
The .ROUTE command is used to write messages to an output file. This is normally system information generated by
the SE during the execution of a utility. The default file is SYSPRINT on a mainframe and STDOUT on other platforms.
The syntax for writing a .ROUTE command:
.ROUTE MESSAGES [TO] FILE <file-name>
[[WITH] ECHO {OFF | [TO] FILE <file-name>}];

Note: If the system is a mainframe, the FILE is used to name the DD statement in the JCL. The JCL must also contain
any names, space requirements, record and block size, or disposition information needed by the system.

Running Commands from a File


The .RUN command is used to read and execute other commands from a file. This is a great tool for using pre-defined
and stored command files. This is especially a good way to secure your user id and password from being written into
the script.
In other words, you save your .LOGON in a secured file that only you can see. Then, use the .RUN to access it for
processing. In addition, more than one command can be put into the file. Therefore, it can add flexibility to the utility by
building commands into the file instead of into the script.
The syntax for writing a .RUN command:
.RUN FILE <file-name> [IGNORE <character-position> [THRU <character-position>]];

The IGNORE and THRU options work here the same as explained for the .ACCEPT command above.
Note: If the system is a mainframe, the FILE is used to name the DD statement in the JCL.

Setting Variables to Values


The .SET command is used to assign a new value or change an existing value within a variable. This is done to make the execution of the script more flexible and provide the user with more control of the processing.
The syntax for writing a .SET command:
.SET <variable-name> [ TO ] <expression>;

Note: The expression can be a literal value based on the data type of the variable or a mathematical operation for
numeric data. The math can use one or more variables and one or more literals.

Running a System Command


The .SYSTEM command is used as a hook to the operating system on which the utility is running. This is done to
communicate with the host computer and request an operation that the SE cannot do on its own. When using this
command, it is important to know which operating system is being used. This information can be obtained from one of
the system variables below.
The syntax for writing a .SYSTEM command:
.SYSTEM '<operating-system-specific-command>';

Note: There is a system variable that contains this data and can be found in the System Variable section of this
chapter.

Using a Variable in a Script


The SE dynamically establishes a memory area definition for a variable at the point it is first referenced. The data used
to initialize it also determines the data type it is to use. To distinguish the referenced name as a variable instead of a database object name, a special character is needed. The character used to identify the substitution of variable data into the SQL is the ampersand (&) in front of the variable name. However, the ampersand is not used when the value is being set.

The Support Environment System Variables


The following variables are available within the SE to help determine conditions and system data for processing of the
script.

Figure 9-4

Support Environment Example


/* build the restart log called MJL_Util_log in database WORKDB */
.LOGTABLE WORKDB.MJL_Util_log;
.DATEFORM ansidate;
/* get the logon from a file called logon-file */
.RUN FILE logon-file;
/* test the system day to see if it is Friday
notice that the character string is used in single quotes so that it is compared as
a character string. Contrast this below for the table name */
.IF '&SYSDAY' = 'FRI' THEN;
.DISPLAY '&SYSDATE(4) is a &SYSDAY' FILE outfl.txt;
.ELSE;
.DISPLAY '&SYSUSER, &SYSDATE(4) is not Friday' FILE outfl.txt;
.LOGOFF 16;
/* notice that the endif allows for more than one operation after the comparison */
.ENDIF;
/* the table name and two values are obtained from a file */
.ACCEPT tablename, parm_data1, parm_data2 FILE myparmfile;
/* establish and store data into a variable */
.SET variable1 TO &parm_data1 + 125;
/* the table name is not in quotes here because it is not character data. But the value
in parm_data2 is in quotes because it is character data. This is the power of it all !
*/


INSERT INTO &tablename VALUES (&variable1, '&parm_data2', &parm_data1);
.LOGOFF;
- - - - - - - - - - - - - - - -
Contents of logon-file:

.logon ProdSys/mikel,larkins1

Contents of myparmfile:

My_test_table 125 'Some character data'

- - - - - - - - - - - - - - - -
The following SYSOUT file is created from a run of the above script on a day other than Friday:
========================================================
=        FastExport Utility    Release FEXP.07.02.01   =
========================================================
**** 13:40:45 UTY2411 Processing start date: TUE AUG 13, 2002
========================================================
=                  Logon/Connection                    =
========================================================
0001 .LOGTABLE WORKDB.MJL_Util_log;
0002 .DATEFORM ansidate;
**** 13:40:45 UTY1200 DATEFORM has been set to ANSIDATE.
0003 .RUN FILE logonfile.txt;
0004 .logon cdw/mikel,;
**** 13:40:48 UTY8400 Maximum supported buffer size: 64K
**** 13:40:50 UTY8400 Default character set: ASCII
**** 13:40:52 UTY6211 A successful connect was made to the RDBMS.
**** 13:40:52 UTY6217 Logtable 'SQL00.MJL_Util_log' has been created.
========================================================
=           Processing Control Statements              =
========================================================
0005 /* test the system day to see if it is Friday
     notice that the character string is used in single quotes so that it is
     compared as a character string. Contrast this below for the table name */
     .IF '&SYSDAY' = 'FRI' THEN;
**** 13:40:52 UTY2402 Previous statement modified to:
0006 .IF 'TUE' = 'FRI' THEN;
0007 .DISPLAY '&SYSDATE(4) is a &SYSDAYday' FILE outfl.txt;
0008 .ELSE;
0009 .DISPLAY '&SYSUSER, &SYSDATE(4) is not Friday' FILE outfl.txt;
**** 13:40:52 UTY2402 Previous statement modified to:
0010 .DISPLAY 'Michael, 02/08/13(4) is not Friday' FILE outfl.txt;
0011 .LOGOFF 16;
========================================================
=                 Logoff/Disconnect                    =
========================================================
**** 13:40:55 UTY6212 A successful disconnect was made from the RDBMS.
**** 13:40:55 UTY6216 The restart log table has been dropped.
**** 13:40:55 UTY2410 Total processor time used = '10.906 Seconds'
     .       Start : 13:40:45 - TUE AUG 13, 2002
     .       End   : 13:40:55 - TUE AUG 13, 2002
     .       Highest return code encountered = '16'.

The following SYSOUT file is created from a run of the above script on a Friday:
========================================================
     FastExport Utility    Release FEXP.07.02.01
========================================================
**** 13:40:45 UTY2411 Processing start date: FRI AUG 16, 2002
========================================================
     Logon/Connection
========================================================
0001 .LOGTABLE WORKDB.MJL_Util_log;
0002 .DATEFORM ansidate;
**** 13:40:45 UTY1200 DATEFORM has been set to ANSIDATE.
0003 .RUN FILE logonfile.txt;
0004 .logon cdw/mikel,;
**** 13:40:48 UTY8400 Maximum supported buffer size: 64K
**** 13:40:50 UTY8400 Default character set: ASCII
**** 13:40:52 UTY6211 A successful connect was made to the RDBMS.
**** 13:40:52 UTY6217 Logtable 'SQL00.MJL_Util_log' has been created.
=========================================================
     Processing Control Statements
=========================================================
0005 /* test the system day to see if it is Friday
     notice that the character string is used in single quotes so that it is
     compared as a character string. Contrast this below for the table name */
     .IF '&SYSDAY' = 'FRI' THEN;
**** 13:40:52 UTY2402 Previous statement modified to:
0006 .IF 'FRI' = 'FRI' THEN;
0007 .DISPLAY '&SYSDATE(4) is a &SYSDAYday' FILE outfl.txt;
**** 13:40:52 UTY2402 Previous statement modified to:
0008 .DISPLAY '02/08/13(4) is a FRIday' FILE outfl.txt;
0009 .ELSE;
0010 .DISPLAY '&SYSUSER, &SYSDATE(4) is not Friday' FILE outfl.txt;
0011 .LOGOFF 16;
     /* notice that the endif allows for more than one operation after the
     comparison */
     .ENDIF;
0012 /* establish and store data into a variable */
     /* the table name and two values are obtained from a file */
     .ACCEPT tablename, parm_data1, parm_data2 FILE myparmfile.txt;
0013 /* the table name is not in quotes here because it is not character data */
     .SET variable1 TO &parm_data1 + 125;
**** 13:40:52 UTY2402 Previous statement modified to:
0014 .SET variable1 TO 123 + 125;
0015 INSERT INTO &tablename VALUES (&variable1, '&parm_data2', &parm_data1);
**** 13:40:52 UTY2402 Previous statement modified to:
0016 INSERT INTO My_test_table VALUES (248, 'some character data', 123);
**** 13:40:54 UTY1016 'INSERT' request successful.
0017 .LOGOFF;
=================================================================
     Logoff/Disconnect
=================================================================
**** 13:40:55 UTY6212 A successful disconnect was made from the RDBMS.
**** 13:40:55 UTY6216 The restart log table has been dropped.
**** 13:40:55 UTY2410 Total processor time used = '10.906 Seconds'
     .      Start : 13:40:45 - FRI AUG 16, 2002
     .      End   : 13:40:55 - FRI AUG 16, 2002
     .      Highest return code encountered = '0'.
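The substitution steps recorded in this log can be mimicked in a few lines of Python. This is purely an illustration of how FastExport resolves &variables (the dictionary below stands in for the values read by .ACCEPT from myparmfile.txt); it is not part of any Teradata tool.

```python
# Illustrative sketch of FastExport &variable substitution.
# The dict mirrors the values .ACCEPT reads from myparmfile.txt above.
params = {"tablename": "My_test_table",
          "parm_data1": "123",
          "parm_data2": "some character data"}

# .SET variable1 TO &parm_data1 + 125;  -> arithmetic on the substituted value
variable1 = int(params["parm_data1"]) + 125

# INSERT INTO &tablename VALUES (&variable1, '&parm_data2', &parm_data1);
stmt = (f"INSERT INTO {params['tablename']} "
        f"VALUES ({variable1}, '{params['parm_data2']}', {params['parm_data1']});")
print(stmt)
```

The printed statement matches statement 0016 in the log: the table name substitutes without quotes, the character value inside single quotes, and 123 + 125 yields 248.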


Appendix A: Mainframe Load Utility Examples


BTEQ MAINFRAME EXPORT EXAMPLE
//B09XXZD2 JOB (T,AA,XXZ),'BTEQ TEMPLATE',CLASS=S,MSGCLASS=0,
//         REGION=6M,NOTIFY=B09XXZ
//*
//* +JBS BIND TDP0.UP
//*
//*-------------------------------------------------------------
//* JOB INFORMATION AND COMMENTS
//*-------------------------------------------------------------
//JOBLIB   DD DSN=C309.B0SNCR.NM.R60.APPLOAD,DISP=SHR
//         DD DSN=C309.B0SNCR.NM.R60.TRLOAD,DISP=SHR
//*-------------------------------------------------------------
//BTEQ1    EXEC PGM=BTQMAIN
//LOGON    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(ILOGON),DISP=SHR
//IDBENV   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(IDBENV),DISP=SHR
//SYSIN    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(BTEQSCPT),DISP=SHR
//SYSPRINT DD SYSOUT=*

BTEQ MAINFRAME EXPORT SCRIPT EXAMPLE - DATA MODE


/*---------------------------------------------------------------*/
/*---------------- PROGRAM INFORMATION --------------------------*/
/*---------------------------------------------------------------*/
/* SCRIPT=XYYY9999                                               */
/* SCRIPT TYPE=TERADATA BTEQ                                     */
/* LANGUAGE=UTILITY COMMANDS AND SQL                             */
/* RUN MODE=BATCH                                                */
/*---------------------------------------------------------------*/
/*---------------- PROGRAM DESCRIPTION --------------------------*/
/*---------------------------------------------------------------*/
/* PURPOSE & FLOW:                                               */
/* SPECIAL OR UNUSUAL LOGIC:                                     */
/* PARM - NONE                                                   */
/* ABEND CODES:                                                  */
/*   XXXX -                                                      */
/*---------------------------------------------------------------*/

.SESSIONS 1
.RUN FILE=ILOGON;    /* JCL member ILOGON contains:  .LOGON CDW/SQL01,WHYNOT; */
.RUN FILE=IDBENV;    /* JCL member IDBENV contains:  DATABASE SQL_CLASS;      */
.EXPORT DATA DDNAME=REPORT
SELECT EMPLOYEE_NO
      ,LAST_NAME
      ,FIRST_NAME
      ,SALARY
      ,DEPT_NO
FROM EMPLOYEE_TABLE;
.IF ERRORCODE > 0 THEN .GOTO Done
.EXPORT RESET
.LABEL Done
.QUIT
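In DATA mode the exported file holds binary records rather than display text. As a rough illustration only, the sketch below unpacks one record matching this SELECT. The offsets are assumptions (little-endian byte order, a 2-byte length prefix on the VARCHAR, and DECIMAL(8,2) stored as a 4-byte scaled integer); verify them against your platform before relying on this, and note that `parse_employee` is a hypothetical helper, not a Teradata API.

```python
import struct

def parse_employee(buf):
    # Assumed record layout for: INTEGER, CHAR(20), VARCHAR(12),
    # DECIMAL(8,2), SMALLINT -- treat these offsets as illustrative.
    emp_no, last = struct.unpack_from("<i20s", buf, 0)
    off = 24
    (vlen,) = struct.unpack_from("<H", buf, off); off += 2   # VARCHAR length prefix
    first = buf[off:off + vlen].decode("ascii"); off += vlen
    sal_raw, dept = struct.unpack_from("<ih", buf, off)      # scaled decimal + smallint
    return emp_no, last.decode("ascii").rstrip(), first, sal_raw / 100, dept

# Build a fake record to exercise the parser (invented sample values).
rec = (struct.pack("<i20s", 1001, b"JONES".ljust(20))
       + struct.pack("<H", 5) + b"SUSIE"
       + struct.pack("<ih", 5000000, 100))
print(parse_employee(rec))
```

Running this prints the tuple (1001, 'JONES', 'SUSIE', 50000.0, 100), showing why DATA mode is compact but requires the consumer to know the exact record layout.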

BTEQ MAINFRAME IMPORT EXAMPLE


BTEQ MAINFRAME IMPORT EXAMPLE - JCL
//B09XXZD2 JOB (T,AA,XXZ),'BTEQ TEMPLATE',CLASS=S,MSGCLASS=0,
//         REGION=6M,NOTIFY=B09XXZ
//*
//* +JBS BIND TDP0.UP
//*
//*-------------------------------------------------------------
//* JOB INFORMATION AND COMMENTS
//*-------------------------------------------------------------
//JOBLIB   DD DSN=C309.B0SNCR.NM.R60.APPLOAD,DISP=SHR
//         DD DSN=C309.B0SNCR.NM.R60.TRLOAD,DISP=SHR
//*-------------------------------------------------------------
//BTEQ1    EXEC PGM=BTQMAIN
//LOGON    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(ILOGON),DISP=SHR
//IDBENV   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(IDBENV),DISP=SHR
//SYSIN    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(BTEQSCPT),DISP=SHR
//SYSPRINT DD SYSOUT=*
//

BTEQ MAINFRAME IMPORT SCRIPT EXAMPLE - DATA MODE


/*---------------------------------------------------------------------*/
/*---------------- PROGRAM INFORMATION --------------------------------*/
/*---------------------------------------------------------------------*/
/* SCRIPT=XYYY9999                                                     */
/* SCRIPT TYPE=TERADATA BTEQ                                           */
/* LANGUAGE=UTILITY COMMANDS AND SQL                                   */
/* RUN MODE=BATCH                                                      */
/*---------------------------------------------------------------------*/
/*---------------- PROGRAM DESCRIPTION --------------------------------*/
/*---------------------------------------------------------------------*/
/* PURPOSE & FLOW:                                                     */
/* SPECIAL OR UNUSUAL LOGIC:                                           */
/* PARM - NONE                                                         */
/* ABEND CODES:                                                        */
/*   XXXX -                                                            */
/*---------------------------------------------------------------------*/
.SESSIONS 1
.RUN FILE=ILOGON;    /* JCL member ILOGON contains:  .LOGON CDW/SQL01,WHYNOT; */
.RUN FILE=IDBENV;    /* JCL member IDBENV contains:  DATABASE SQL08;          */
.IMPORT DATA DDNAME=REPORT
.QUIET ON
.REPEAT *
USING EMPLOYEE_NO  (INTEGER),
      LAST_NAME    (CHAR(20)),
      FIRST_NAME   (VARCHAR(12)),
      SALARY       (DECIMAL(8,2)),
      DEPT_NO      (SMALLINT)
INSERT INTO EMPLOYEE_TABLE VALUES
      (:EMPLOYEE_NO,
       :LAST_NAME,
       :FIRST_NAME,
       :SALARY,
       :DEPT_NO);
.QUIT

FASTEXPORT MAINFRAME EXAMPLE


FASTEXPORT MAINFRAME EXAMPLE - JCL
//B09XXZXH JOB (T,B0,XXZ),'FAST EXPORT TEMPLATE',CLASS=S,MSGCLASS=0,
//         REGION=6M,NOTIFY=B09XXZ
//*
//* +JBS BIND TDP0.UP
//*
//*-------------------------------------------------------------
//JOBLIB   DD DSN=C309.B0SNCR.NM.R60.APPLOAD,DISP=SHR
//         DD DSN=C309.B0SNCR.NM.R60.TRLOAD,DISP=SHR
//*-------------------------------------------------------------
//FEXP1    EXEC PGM=XPORT
//ILOGON   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(ILOGON),DISP=SHR
//IDBENV   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(IDBENV),DISP=SHR
//SYSIN    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(FEXPSCPT),DISP=SHR
//OUTDATA  DD DSN=B09XXZ.OUTPUT_DATASET_NAME,
//         DISP=(NEW,CATLG,DELETE),
//         UNIT=SYSDA,SPACE=(CYL,(1,1),RLSE),
//         DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSTERM  DD SYSOUT=*
//SYSDEBUG DD DUMMY

FASTEXPORT MAINFRAME SCRIPT EXAMPLE-RECORD MODE


/*---------------------------------------------------------------------*/
/*---------------- PROGRAM INFORMATION --------------------------------*/
/*---------------------------------------------------------------------*/
/* SCRIPT=XYYY9999                                                     */
/* SCRIPT TYPE=TERADATA FAST EXPORT SCRIPT                             */
/* LANGUAGE=UTILITY COMMANDS AND SQL                                   */
/* RUN MODE=BATCH                                                      */
/*---------------------------------------------------------------------*/
/*---------------- PROGRAM DESCRIPTION --------------------------------*/
/*---------------------------------------------------------------------*/
/* PURPOSE & FLOW:                                                     */
/* SPECIAL OR UNUSUAL LOGIC:                                           */
/* PARM - NONE                                                         */
/* ABEND CODES:                                                        */
/*   XXXX -                                                            */
/*---------------------------------------------------------------------*/
/*---------------- PROGRAM MODIFICATION -------------------------------*/
/*---------------------------------------------------------------------*/
/* MAINTENANCE LOG - ADD LATEST CHANGE TO THE TOP                      */
/* MOD-DATE     AUTHOR     MOD DESCRIPTION                             */
/*---------------------------------------------------------------------*/
.LOGTABLE SQL08.SQL08_RESTART_LOG;
.RUN FILE ILOGON;    /* JCL member ILOGON contains:  .LOGON CDW/SQL01,WHYNOT; */
.RUN FILE IDBENV;    /* JCL member IDBENV contains:  DATABASE SQL_CLASS;      */
.BEGIN EXPORT SESSIONS 1;
.EXPORT OUTFILE OUTDATA MODE RECORD FORMAT TEXT;
SELECT STUDENT_ID   (CHAR(11)),
       LAST_NAME    (CHAR(20)),
       FIRST_NAME   (CHAR(14)),
       CLASS_CODE   (CHAR(2)),
       GRADE_PT     (CHAR(7))
FROM STUDENT_TABLE;
.END EXPORT;
.LOGOFF;
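Because every column in this SELECT is cast to CHAR, FORMAT TEXT produces fixed-width lines of 11+20+14+2+7 = 54 characters. A downstream consumer could slice such a line as sketched below; the sample row is invented for illustration, and the slicer is a generic helper rather than anything FastExport provides.

```python
# Field names and CHAR widths taken from the FastExport SELECT above.
FIELDS = [("STUDENT_ID", 11), ("LAST_NAME", 20), ("FIRST_NAME", 14),
          ("CLASS_CODE", 2), ("GRADE_PT", 7)]

def parse_line(line):
    # Slice one fixed-width text record into a dict of trimmed values.
    out, pos = {}, 0
    for name, width in FIELDS:
        out[name] = line[pos:pos + width].strip()
        pos += width
    return out

# Hypothetical sample row, padded to the exported widths.
sample = ("123250".ljust(11) + "Phillips".ljust(20)
          + "Martin".ljust(14) + "SR" + "   3.00")
row = parse_line(sample)
print(row["LAST_NAME"], row["GRADE_PT"])
```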

FASTLOAD MAINFRAME EXAMPLE


FASTLOAD MAINFRAME EXAMPLE - JCL
//B09XXZFX JOB (T,AA,XXZ),'FASTLOAD TEMPLATE',CLASS=S,MSGCLASS=0,
//         REGION=6M,NOTIFY=B09XXZ
//*
//* +JBS BIND TDP0.UP
//*
//*-------------------------------------------------------------
//* THIS JOB EXECUTES TERADATA FASTLOAD
//*-------------------------------------------------------------
//JS010    EXEC PGM=FASTLOAD
//STEPLIB  DD DSN=C009.B0SNCR.NM.R60.APPLOAD,DISP=SHR
//         DD DSN=C009.B0SNCR.NM.R60.TRLOAD,DISP=SHR
//*
//IDBENV   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(IDBENV),DISP=SHR
//*
//ILOGON   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(ILOGON),DISP=SHR
//*-------------------------------------------------------------
//* FAST LOAD INPUT FILE
//INFILE   DD DSN=B09XXZ.FASTLOAD.INPUT.FILE,DISP=SHR
//*-------------------------------------------------------------
//* FAST LOAD SCRIPT FILE
//SYSIN    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(FLODSCPT),DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SYSTERM  DD SYSOUT=*

FASTLOAD MAINFRAME SCRIPT EXAMPLE-TEXT MODE


/*----------------------------------------------------------------------*/
/*---------------- PROGRAM INFORMATION ---------------------------------*/
/*----------------------------------------------------------------------*/
/* SCRIPT=XXXXXXXX                                                      */
/* SCRIPT TYPE=TERADATA FASTLOAD                                        */
/* LANGUAGE=UTILITY COMMANDS AND SQL                                    */
/* RUN MODE=BATCH                                                       */
/*----------------------------------------------------------------------*/
/*---------------- PROGRAM DESCRIPTION ---------------------------------*/
/*----------------------------------------------------------------------*/
/* PURPOSE & FLOW:                                                      */
/* SPECIAL OR UNUSUAL LOGIC:                                            */
/* PARM - NONE                                                          */
/* ABEND CODES:                                                         */
/*   XXXX -                                                             */
/*----------------------------------------------------------------------*/
/*---------------- PROGRAM MODIFICATION --------------------------------*/
/*----------------------------------------------------------------------*/
/* MAINTENANCE LOG - ADD LATEST CHANGE TO THE TOP                       */
/* MOD-DATE     AUTHOR     MOD DESCRIPTION                              */
/*----------------------------------------------------------------------*/
SESSIONS 1;
LOGON TDP0/SQL08,SQL08;
DROP TABLE SQL08.ERROR_ET;
DROP TABLE SQL08.ERROR_UV;
DELETE FROM SQL08.EMPLOYEE_PROFILE;
DEFINE EMPLOYEE_NO   (INTEGER),
       LAST_NAME     (CHAR(20)),
       FIRST_NAME    (VARCHAR(12)),
       SALARY        (DECIMAL(8,2)),
       DEPT_NO       (SMALLINT)
FASTLOAD MAINFRAME SCRIPT EXAMPLE-CONTINUED


DDNAME=DATAIN;
BEGIN LOADING SQL08.EMPLOYEE_PROFILE
ERRORFILES SQL08.ERROR_ET, SQL08.ERROR_UV
CHECKPOINT 5;
INSERT INTO SQL08.EMPLOYEE_PROFILE VALUES
(:EMPLOYEE_NO,
:LAST_NAME,
:FIRST_NAME,
:SALARY,
:DEPT_NO);
END LOADING;
LOGOFF;


MULTILOAD MAINFRAME EXAMPLE


MULTILOAD MAINFRAME EXAMPLE-JCL
//B09XXZPT JOB (T,FA,XXZ),'MLOAD TEMPLATE',CLASS=S,MSGCLASS=0,
//         REGION=6M,NOTIFY=B09XXZ
//*
//* +JBS BIND TDP0.UP
//*-------------------------------------------------------------
//* JOB INFORMATION AND COMMENTS
//*-------------------------------------------------------------
//MLOAD    EXEC PGM=MLOAD
//STEPLIB  DD DSN=C009.B0SNCR.NM.R60.APPLOAD,DISP=SHR
//         DD DSN=C009.B0SNCR.NM.R60.TRLOAD,DISP=SHR
//*
//ILOGON   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(ILOGON),DISP=SHR
//IDBENV   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(IDBENV),DISP=SHR
//* SPECIFY THE MULTILOAD INPUT DATA FILE (LOAD FILE)
//INPTFILE DD DSN=XXXXXX.YYYYYYY.INPUT.FILENAME,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSTERM  DD SYSOUT=*
//SYSDEBUG DD DUMMY
//* SPECIFY THE MULTILOAD SCRIPT TO EXECUTE
//SYSIN    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(MLODSCPT),DISP=SHR

MULTILOAD MAINFRAME SCRIPT EXAMPLE-TEXT MODE


/*----------------------------------------------------------------------*/
/*---------------- PROGRAM INFORMATION ---------------------------------*/
/*----------------------------------------------------------------------*/
/* SCRIPT=XXXXXXXX                                                      */
/* SCRIPT TYPE=TERADATA MULTILOAD                                       */
/* LANGUAGE=UTILITY COMMANDS AND SQL                                    */
/* RUN MODE=BATCH                                                       */
/*----------------------------------------------------------------------*/
/*---------------- PROGRAM DESCRIPTION ---------------------------------*/
/*----------------------------------------------------------------------*/
/* PURPOSE & FLOW:                                                      */
/* SPECIAL OR UNUSUAL LOGIC:                                            */
/* PARM - NONE                                                          */
/* ABEND CODES:                                                         */
/*   XXXX -                                                             */
/*----------------------------------------------------------------------*/
/*---------------- PROGRAM MODIFICATION --------------------------------*/
/*----------------------------------------------------------------------*/
/* MAINTENANCE LOG - ADD LATEST CHANGE TO THE TOP                       */
/* MOD-DATE     AUTHOR     MOD DESCRIPTION                              */
/*----------------------------------------------------------------------*/
.LOGTABLE SQL08.UTIL_RESART_LOG;
.RUN FILE ILOGON;    /* JCL member ILOGON contains:  .LOGON CDW/SQL01,WHYNOT; */
.RUN FILE IDBENV;    /* JCL member IDBENV contains:  DATABASE SQL08;          */
.BEGIN MLOAD
  TABLES Student_Profile1
  ERRLIMIT 1
  SESSIONS 1;
.LAYOUT INPUT_FILE;
.FIELD STUDENT_ID   1 CHAR(11);
.FIELD LAST_NAME    * CHAR(20);
.FIELD FIRST_NAME   * CHAR(14);
.FIELD CLASS_CODE   * CHAR(2);
.FIELD GRADE_PT     * CHAR(7);
.FIELD FILLER       * CHAR(26);

MULTILOAD MAINFRAME SCRIPT EXAMPLE-CONTINUED


.DML LABEL INPUT_INSERT;
INSERT INTO Student_Profile1 VALUES
      (:STUDENT_ID   (INTEGER),
       :LAST_NAME    (CHAR(20)),
       :FIRST_NAME   (VARCHAR(12)),
       :CLASS_CODE   (CHAR(2)),
       :GRADE_PT     (DECIMAL(5,2)) );
.IMPORT INFILE INPTFILE
  LAYOUT INPUT_FILE
  APPLY INPUT_INSERT;
.END MLOAD;
.LOGOFF;

TPUMP MAINFRAME EXAMPLE


TPUMP MAINFRAME EXAMPLE-JCL
//B09XXZTP JOB (T,FA,XXZ),'TPUMP TEMPLATE',CLASS=S,MSGCLASS=0,
//         REGION=6M,NOTIFY=B09XXZ
//*
//* +JBS BIND TDP0.UP
//*
//*-------------------------------------------------------------
//* JOB INFORMATION AND COMMENTS
//*-------------------------------------------------------------
//TPUMP    EXEC PGM=TPUMP
//STEPLIB  DD DSN=C009.B0SNCR.NM.R60.APPLOAD,DISP=SHR
//         DD DSN=C009.B0SNCR.NM.R60.TRLOAD,DISP=SHR
//ILOGON   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(ILOGON),DISP=SHR
//IDBENV   DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(IDBENV),DISP=SHR
//* SPECIFY THE TPUMP INPUT DATA FILE (LOAD FILE)
//INPUT    DD DSN=B09XXZ.INPUT_FILENAME,DISP=SHR
//* SPECIFY TPUMP SCRIPT FILE NAME
//SYSIN    DD DSN=B09XXZ.APPLUTIL.CLASS.JCL(TPMPSCPT),DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSTERM  DD SYSOUT=*
//SYSDEBUG DD DUMMY

TPUMP MAINFRAME SCRIPT EXAMPLE-TEXT MODE


/*----------------------------------------------------------------------*/
/*---------------- PROGRAM INFORMATION ---------------------------------*/
/*----------------------------------------------------------------------*/
/* SCRIPT=XXXXXXXX                                                      */
/* SCRIPT TYPE=TERADATA TPUMP                                           */
/* LANGUAGE=UTILITY COMMANDS AND SQL                                    */
/* RUN MODE=BATCH                                                       */
/*----------------------------------------------------------------------*/
/*---------------- PROGRAM DESCRIPTION ---------------------------------*/
/*----------------------------------------------------------------------*/
/* PURPOSE & FLOW:                                                      */
/* SPECIAL OR UNUSUAL LOGIC:                                            */
/* PARM - NONE                                                          */
/* ABEND CODES:                                                         */
/*   XXXX -                                                             */
/*----------------------------------------------------------------------*/
/*---------------- PROGRAM MODIFICATION --------------------------------*/
/*----------------------------------------------------------------------*/
/* MAINTENANCE LOG - ADD LATEST CHANGE TO THE TOP                       */
/* MOD-DATE     AUTHOR     MOD DESCRIPTION                              */
/*----------------------------------------------------------------------*/
.LOGTABLE SQL08.TPUMP_RESTART;
.RUN FILE ILOGON;    /* JCL member ILOGON contains:  .LOGON CDW/SQL01,WHYNOT; */
.RUN FILE IDBENV;    /* JCL member IDBENV contains:  DATABASE SQL08;          */
DROP TABLE SQL08.TPUMP_UTIL_ET;
.BEGIN LOAD
  SESSIONS 1 TENACITY 2
  ERRORTABLE TPUMP_UTIL_ET
  ERRLIMIT 5
  CHECKPOINT 1
  PACK 40
  RATE 1000
  ROBUST OFF;

TPUMP MAINFRAME SCRIPT EXAMPLE-CONTINUED


.LAYOUT INPUT_LAYOUT;
.FIELD STUDENT_ID   1 CHAR(11);
.FIELD LAST_NAME    * CHAR(20);
.FIELD FIRST_NAME   * CHAR(14);
.FIELD CLASS_CODE   * CHAR(2);
.FIELD GRADE_PT     * CHAR(7);
.FIELD FILLER       * CHAR(26);
.DML LABEL INPUT_INSERT IGNORE DUPLICATE ROWS
                        IGNORE MISSING ROWS;
INSERT INTO Student_Profile4 VALUES
      (:STUDENT_ID   (INTEGER),
       :LAST_NAME    (CHAR(20)),
       :FIRST_NAME   (VARCHAR(12)),
       :CLASS_CODE   (CHAR(2)),
       :GRADE_PT     (DECIMAL(5,2)) );
.IMPORT INFILE INPUT
  LAYOUT INPUT_LAYOUT
  APPLY INPUT_INSERT;
.END LOAD;
.LOGOFF;
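The PACK and RATE settings in the .BEGIN LOAD above interact: RATE caps the number of statements TPump sends per minute, while PACK groups statements into a single request. A quick back-of-the-envelope check of how these two settings combine (plain Python arithmetic, not a Teradata API):

```python
# TPump throttling arithmetic for the .BEGIN LOAD settings above.
pack = 40     # PACK 40: statements packed into a single request
rate = 1000   # RATE 1000: maximum statements sent per minute

# Ceiling division: requests TPump needs per minute to sustain the full rate.
requests_per_minute = -(-rate // pack)
print(requests_per_minute)   # 25 requests/minute carry the 1000 statements
```

Raising PACK therefore reduces the number of network round trips needed to hit the same RATE, which is why the two parameters are usually tuned together.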

Teradata Parallel Transporter (TPT) utility in Informatica
Published by Jaydeep Patel
What is a TPT connection in Informatica? How is it useful for reducing the data load time?

TPT Connection with the Export operator:
A connection that uses the Export operator can be used only as a source connection.

Note: Export extracts data from Teradata. Select Export if the session uses a Teradata Parallel Transporter reader.

TPT connection using the Stream system operator:
The Stream operator uses macros to modify tables, so a macro database has to be specified in the session properties when using this type of connection.

Macro Database:
The name of the database that stores the macros Teradata PT API creates when you select the Stream system operator. The Stream system operator uses macros to modify tables. It creates the macros before Teradata PT API begins loading data and removes them from the database after Teradata PT API loads all rows to the target. If you do not specify a macro database, Teradata PT API stores the macros in the log database.

Restart in case of failure:

Restart for the Update, Export, or Stream system operators:
If a session fails while using one of the Update, Export, or Stream operators, the session can be rerun successfully once all the intermediate tables are dropped.

Restart for the Load system operator:
If a session fails while using the Load system operator, the following steps rerun the session successfully. There is no need to drop and recreate the target table for the Load operator; a simple checkout of the session and unchecking the options below in the session properties is sufficient to rerun the failed TPT job:
i) Drop Error/Work/Log Tables
ii) Truncate Table
There is also no need to drop the intermediate tables, and no manual intervention is required on the database side. Once the session reruns successfully after the failure, keep in mind that the session has to be 'Undo checkout' for future runs to succeed.