
Introduction to Teradata

Teradata Architecture

LEVEL LEARNER

Icons Used

Hands-on Exercise
Coding Standards
Reference
Lend A Hand
Questions
Summary
Points To Ponder
Test Your Understanding

Module 1: Teradata basics


Objectives:
After completing this chapter, you will be able to answer the following
questions:
What is Teradata?
What are the unique features of Teradata?
What are the Teradata components and their functions?
What is the Teradata architecture?

Introduction to Teradata Database

Teradata is a relational database management system that drives a
company's data warehouse
Compatible with industry standards (ANSI compliant)
The architecture supports both single-node Symmetric Multiprocessing
(SMP) systems and multinode Massively Parallel Processing (MPP) systems
It uses parallelism to manage terabytes of data
It is built on a parallel architecture
Its scalability ranges from 10 GB to 100+ TB of data
Teradata runs on UNIX MP-RAS and Windows 2000 Server platforms
It is capable of supporting many concurrent users from various platforms
over TCP/IP or an IBM channel connection

Unique Features of Teradata


Parallel Processing: Each AMP holds a portion of the data, and the AMPs
process their portions in parallel
Linear Scalability: Double the AMPs and you double the speed
Mature Optimizer: The PE contains a mature optimizer
Automatic Data Distribution: Each table has a Primary Index, which is
hashed to distribute the rows to the AMPs automatically
Shared Nothing Architecture: Each AMP has its own memory, CPU and disk,
hence the name Shared Nothing Architecture
Single Data Store: Teradata's scalability allows all data to be on one
system. This is a single data store

Teradata Parallel processing

The rows of a Teradata table are spread across the AMPs, so
each AMP can process its rows in parallel when a user queries
the table.

[Figure: Parsing Engine (PE) and BYNET]

Teradata Linear Scalability


Teradata systems can add AMPs for linear scalability.
Linear scalability means that if you double your AMPs and their
supporting nodes, the performance doubles.
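
As a rough illustration of the arithmetic behind linear scalability, the Python sketch below assumes rows are spread perfectly evenly and that per-AMP work is the only cost. The function names are illustrative and are not Teradata APIs.

```python
def rows_per_amp(total_rows, num_amps):
    """Rows each AMP must process when rows are spread evenly (idealized)."""
    return total_rows / num_amps


def relative_speedup(old_amps, new_amps):
    """Idealized speedup: work per AMP shrinks in proportion to the AMP count."""
    return new_amps / old_amps


# Doubling the AMPs halves the rows each AMP has to process.
before = rows_per_amp(1_000_000, 10)
after = rows_per_amp(1_000_000, 20)
```

Real systems only approach this ideal when the data is evenly distributed, which is why the choice of Primary Index matters.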

Teradata Architecture
Teradata Components
Parsing engine (PE)
BYNET (BanYan NETwork)
AMP
Disk

What is a Node?

Gateway and channel-driver software run as processes.
Users connecting via the mainframe access Teradata
through the channel, and all other users use the LAN
gateway.
The Parallel Database Extension (PDE) controls the Access
Module Processors (AMPs) and Parsing Engines (PEs), which
are referred to as Virtual Processors (vprocs) and
reside in the node's memory.
The operating system running the node is Linux.

Node
Each node is attached via a network to a disk farm.
Each Teradata AMP is assigned a virtual disk (Vdisk) on which it stores
its tables and their rows.
Only the AMP assigned to a virtual disk can read from or write
to that disk.
A node holds 40-50 AMPs.

Number of Nodes and Amps


Query to identify the nodes in a Teradata server (one row is returned per node):
SELECT NodeID FROM dbc.ResUsageSPma
GROUP BY 1;
Query to identify the AMPs in a Teradata server (one row is returned per AMP):
SELECT Vproc FROM dbc.diskspace
GROUP BY 1;

SMP Node

SMP stands for symmetric multiprocessing, which means
each CPU performs equally, and all CPUs share a
pool of memory and operate under one operating system.

MPP

Two or more SMP nodes connected via the BYNETs form one
Massively Parallel Processing (MPP) system.

Teradata Functional Overview


[Figure: LAN connections for a network-attached client]

[Figure: Mainframe connection to Teradata]

Parsing Engine
When a user logs into Teradata, a PE logs them in and is
responsible for their entire session.
The PE checks the SQL syntax.
The PE checks security, creates the EXPLAIN plan, and builds a
plan for the AMPs to follow. Hence the PE is also known as the
Optimizer.
The PE converts EBCDIC (from mainframe queries) to
ASCII on the way in, and the AMPs are responsible for
converting from ASCII to EBCDIC on the way out.
The PE always delivers the final answer set to the user.
The Parsing Engine's biggest responsibility is
building a parallel-aware, cost-based plan for the AMPs to follow
to retrieve the data.

Parsing Engine Components


Parsing Engine Element: Process
Session Control: Manages session activities, such as logon,
password validation, and logoff. Recovers sessions following client or server
failures.
Parser: Decomposes SQL into relational data
management processing steps.
Optimizer: Determines the most efficient path to access
data.
Dispatcher: Receives processing steps from the parser
and sends them to the appropriate AMPs via
the BYNET. Monitors the completion of steps and
handles errors encountered during
processing.

How does the PE build the best plan?

The PE uses collected statistics to build the best
plan (the least-cost plan).
Collected statistics determine the PE's confidence level when it
estimates how many rows it is going to access, how many unique
values a table has, how many null values it has, and so on. All of this
information is stored in the data dictionary. Once you submit a query in
Teradata, the Parsing Engine checks whether statistics are available for
the requested table. If statistics were collected earlier, the PE generates a
plan with "high confidence". In the absence of collected statistics, the
plan is generated with "low confidence".

BYNET

The BYNET connects the PEs and AMPs, passing instructions and
the corresponding outputs between them.
A Teradata system has two BYNETs, BYNET 0 and BYNET 1. If one BYNET
fails, the other one carries the instructions. Having two BYNETs also
speeds up communication and hence improves query performance.
Symmetric Multiprocessing (SMP) node: it has a boardless BYNET and no
physical BYNET.
Massively Parallel Processing (MPP) system: nodes are connected by the
two physical BYNET boards.
The BYNET is responsible for broadcast, multicast and point-to-point
communication between nodes and virtual processors.

AMP

AMPs are responsible for storing and retrieving rows from their
assigned disk (Vdisk).
AMPs lock the tables and rows.
AMPs sort rows and do all aggregation.
AMPs handle all space management and space accounting.
AMPs convert ASCII to EBCDIC when returning answer sets to the
mainframe.
In Teradata 13, the number of AMP Worker Tasks (AWTs) per AMP is increased for better
performance.
All Teradata tables are spread across all AMPs.

Disk Array

Each AMP vproc is assigned to a disk (Vdisk)
A Vdisk may contain 119 GB of disk space

Teradata Components

The maximum number of vprocs per node can be as high as
128
Each Parsing Engine (PE) can manage up to 120 individual
sessions
Each node holds up to 40-50 AMPs
The maximum number of vprocs that can be supported in a
single system is 16,384
Each BYNET supports up to 1024 nodes in a system
Questions


Test Your Understanding


Questions:
1. What is the Parsing Engine?
2. AMP stands for?
3. What is the function performed by the BYNET?
4. How many BYNET systems are there in Teradata? Explain
their functionalities.
5. What is TDP?

Summary
This chapter gives an overview of the following
processes in Teradata:
The PE checks the syntax of the query and the
security rights of the user submitting it.
The PE comes up with the best optimized plan for execution
of the query.
The PE passes this plan through the BYNET to the AMPs.
The AMPs follow the plan to retrieve data from their disks.
The AMPs pass the data to the PE through the BYNET.
The PE then passes the data to the user.


Module 2: RDBMS Overview


Objectives:
After completing this chapter you will be able to answer the
following questions
What is an RDBMS?
What is Logical/Relational Modeling?
What is the relationship between primary and
foreign keys?
What are the advantages of Relational Modeling?

Introduction to RDBMS

A database is a collection of permanently stored data that is:
Logically related: data relates to other data
Shared: many users may access the data
Protected: access to data is controlled
Managed: data has integrity and value
Based on the relational model

Logical/Relational Model
The Logical Model
Should be designed without regard to usage
It can accommodate a wide variety of front-end tools
It allows the database to be created more quickly
Should be the same regardless of data volume
Represents the real-world business in a tabular (relational) form
Includes all the data definitions within the scope of the
enterprise or application
Is generic: the logical model is the template for physical
implementation on any RDBMS platform
Teradata supports fully normalized logical models
Ability to perform 64-table joins
Ability to perform large aggregations

Logical/Relational Model
A relational database contains a set of logically related tables
A table is a two-dimensional representation of data consisting of
rows and columns
A column always contains like data
A row is one instance of all the columns in a table
In a relational database, a table is defined as a named collection of
one or more named columns that can have zero or many rows of
related information
Each row represents an occurrence of the entity defined by the table. An
entity is defined as a person, place, thing or event about which the
table carries information.
In relational math, the following hold true:
Table = a relation
Row = a tuple
Column = an attribute

Primary and Foreign keys


Primary Key rules:
A Primary Key is required for every table.
Only one Primary Key is allowed in a table.
A Primary Key may consist of one or more columns.
Primary Keys cannot have duplicate values (ND).
Primary Keys cannot be null (NN).
Primary Keys are considered non-changing values (NC).
Foreign Key rules:
Foreign Keys are optional.
More than one Foreign Key is allowed in a table.
A Foreign Key may consist of one or more columns.
Foreign Keys can have duplicate values.
Foreign Keys can be null.
Changes to Foreign Keys are allowed.
Each Foreign Key value must exist somewhere as a Primary Key value (referential integrity).
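
The key rules above can be checked mechanically. The Python sketch below is only an illustration: tables are modeled as lists of dictionaries, and the table names, column names and helper functions are all hypothetical.

```python
def check_primary_key(rows, pk_cols):
    """Report rows that break the no-nulls (NN) or no-duplicates (ND) rule."""
    problems = []
    seen = set()
    for row in rows:
        key = tuple(row[col] for col in pk_cols)
        if any(part is None for part in key):
            problems.append(("null", key))
        elif key in seen:
            problems.append(("duplicate", key))
        seen.add(key)
    return problems


def check_foreign_key(child_rows, fk_col, parent_rows, pk_col):
    """Return FK values that are non-null but missing from the parent's PK."""
    parent_keys = {row[pk_col] for row in parent_rows}
    return [row[fk_col] for row in child_rows
            if row[fk_col] is not None and row[fk_col] not in parent_keys]


departments = [{"dept_no": 10}, {"dept_no": 20}]
employees = [
    {"emp_no": 1, "dept_no": 10},
    {"emp_no": 2, "dept_no": 10},    # duplicate FK values are allowed
    {"emp_no": 3, "dept_no": None},  # a null FK is allowed
    {"emp_no": 4, "dept_no": 99},    # no such department: breaks referential integrity
]

pk_problems = check_primary_key(employees, ["emp_no"])
orphans = check_foreign_key(employees, "dept_no", departments, "dept_no")
```

Here the Primary Key check finds nothing wrong, while the Foreign Key check flags department 99 because it does not exist as a Primary Key in the parent table.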

Relational Advantage
Advantages of a relational database:
Ease of use: The representation of information as tables consisting of rows and columns is much easier to
understand.
Flexibility: Information from different tables can be linked and extracted easily using
operators such as project and join, and returned in whatever form is desired.
Security: Security control and authorization can be implemented more easily by moving sensitive
attributes of a given table into a separate relation with its own authorization controls. If authorization
requirements permit, a particular attribute can be joined back with the others to enable full information
retrieval.
Data independence: Data independence is achieved more easily with the normalized structure used in
a relational database than with the more complicated tree or network structures.
Data manipulation language: Responding to queries by means of a language based
on relational algebra and relational calculus, e.g. SQL, is easy in the relational database approach. For data
organized in other structures, the query language becomes either complex or extremely limited in its
capabilities.
Catering for future requirements: Because data is held in separate tables, it is simple to add records
that are not yet needed but may be in the future. For example, the city table could be expanded to
include every city and town in the country, even though no other records are using them all as yet. A
flat-file database cannot do this.

Module 3: Teradata Index


Objectives:
After completing this chapter, you will be able to answer the following
questions:
What is a Primary Index?
What is a Secondary Index?
How are data rows stored and retrieved?

Indexing
An index is a physical mechanism used to store and access data

Primary keys Vs. Primary Indexes


Indexes are conceptually different from keys
A PK is a relational modeling convention which allows each
row to be uniquely identified
A PI is a Teradata convention which determines how rows will
be stored and accessed

Primary Index

The Primary Index is defined when the table is created.
The Primary Index cannot be changed. Changing the PI
requires dropping and recreating the table.
It is the mechanism that assigns a row to an AMP.
When the Primary Index is not specified, Teradata defaults to
the first column in the table and defines it as non-unique.

Unique Primary Index (UPI)

If the column chosen for the index is unique, then it is a UPI.
A UPI results in an even distribution of the table's rows
across all AMPs.
Use the Primary Index column in your SQL WHERE clause
and only one AMP retrieves.
A UPI access is a one-AMP operation and returns one row.

Non-Unique Primary Index (NUPI)


If the column chosen for the index is not unique, then it is a NUPI.
A NUPI distributes the rows of the table evenly only in proportion to
the degree of uniqueness of the index.
A Non-Unique Primary Index (NUPI) has duplicates grouped together on
the same AMP, so the data is always skewed (uneven) to some degree. A
small amount of skew is reasonable.
Use the Primary Index column in your SQL WHERE clause
and only one AMP retrieves.
A NUPI access is a one-AMP operation and can return multiple rows.

Multi-Column Primary Index


A table can have only one Primary Index, but you can combine
up to 64 columns to form one Multi-Column
Primary Index.
Use all of the Primary Index columns in your SQL WHERE clause
and only one AMP retrieves.

NO Primary Index

A table that specifically states NO PRIMARY INDEX
receives no primary index. Its data is distributed evenly
but randomly, and it is often used as a staging table.
To retrieve a record, Teradata performs a full table scan because
there is no primary index.
NoPI is generally preferred when records need to be loaded
temporarily into a staging table.
Data can be loaded quickly from the source into the staging
table. From the staging table the data can be moved to the
production table using an INSERT/SELECT statement.

How Teradata distributes and retrieves data

The Teradata Parsing Engine takes the Primary Index value of a row and
runs a math calculation called the Hash Formula on that Primary Index
column value.
This produces a 32-bit Row Hash, which equates to an integer.
The Row Hash points to a bucket in the Hash Map, and the bucket assigns
the row to an AMP.
32-bit Row Hash example:
00000000 00000000 00000000 00001101 = 13

Every Teradata system has one Hash Map with about a million buckets. Inside the
buckets are AMP numbers.

Placing rows on AMP

In this example, Emp_No 1001 (the Primary Index value) was hashed and the
output was a Row Hash of 13. Teradata counted over to bucket 13 in the
Hash Map, which holds the number one (1). This means
that this row goes to AMP 1.
Emp_No 1002 (the Primary Index value) was hashed and the output was a
Row Hash of 5. Teradata counted over to bucket 5 in the Hash Map, which
holds the number two (2). This means that this row goes to AMP 2.
There is one Hashing Formula in Teradata, and it is consistent.

[Figure: Emp_No 1001 and Emp_No 1002 hashed through the Hash Map to their AMPs]

Review of Hashing process

Hash the Primary Index Value for a row with the Hash
Formula.
The output of the Hash Formula is a 32-bit Row Hash.
Take the Row Hash and find its corresponding bucket in the
Hash Map.
Send the row and its Row Hash to the AMP listed in the
Hash Map Bucket.
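
The hashing steps reviewed above can be sketched in Python. This is only an illustration under stated assumptions: `zlib.crc32` stands in for Teradata's own hash formula, the bucket is assumed to come from the high-order 20 bits of the Row Hash (giving about a million buckets), and the hash map is filled round-robin for a hypothetical 4-AMP system.

```python
import zlib

NUM_AMPS = 4                      # hypothetical system size
BUCKET_BITS = 20                  # 2**20 = 1,048,576 buckets
NUM_BUCKETS = 1 << BUCKET_BITS

# Hash map: every bucket holds an AMP number (round-robin fill is an assumption).
HASH_MAP = [bucket % NUM_AMPS for bucket in range(NUM_BUCKETS)]


def row_hash(pi_value):
    """Stand-in hash formula: a deterministic 32-bit integer for a PI value."""
    return zlib.crc32(str(pi_value).encode("utf-8")) & 0xFFFFFFFF


def bucket_of(rh):
    """Assumption: the bucket number is the high-order bits of the Row Hash."""
    return rh >> (32 - BUCKET_BITS)


def amp_for(pi_value):
    """Hash the PI value, find its bucket, and read the AMP number from the map."""
    return HASH_MAP[bucket_of(row_hash(pi_value))]


amp_1001 = amp_for(1001)
amp_1002 = amp_for(1002)
```

Because the hash is deterministic, the same Primary Index value always lands on the same AMP, which is what lets the PE find a row again with a single-AMP retrieve.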

Skew Factor

Skew refers to the row distribution across AMPs. If the data is highly
skewed, some AMPs hold many more rows than others,
i.e. the data is not evenly distributed. This in turn
results in poor performance. Indexes should be chosen
with the utmost care to avoid skew.

NULL values in the Primary Index are a main reason for skew. A
table with a Unique Primary Index can have only one NULL value,
but a NUPI table can have many NULL values, and each NULL
value hashes to the same AMP.
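
To make the effect of NULL-heavy Primary Index values concrete, here is a small Python sketch. The skew formula used, (1 - average rows per AMP / rows on the busiest AMP) x 100, is an assumed definition chosen for illustration, and `zlib.crc32` again stands in for the real hash formula.

```python
import zlib
from collections import Counter


def amp_for(pi_value, num_amps):
    """Stand-in for hash formula plus hash map: equal values share an AMP."""
    return zlib.crc32(repr(pi_value).encode("utf-8")) % num_amps


def rows_per_amp(pi_values, num_amps):
    """Count how many rows each AMP would receive."""
    counts = Counter(amp_for(value, num_amps) for value in pi_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


def skew_factor(counts):
    """Assumed definition: 0 when perfectly even, higher when one AMP dominates."""
    busiest = max(counts)
    if busiest == 0:
        return 0.0
    average = sum(counts) / len(counts)
    return (1 - average / busiest) * 100


# 100 rows whose NUPI value is NULL all hash to the same AMP.
null_heavy = rows_per_amp([None] * 100, num_amps=4)
```

With four AMPs, one AMP ends up holding all 100 rows while the other three hold none, which this formula reports as a skew factor of 75.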

Uniqueness Value

Each AMP places a Uniqueness Value after the Row Hash
to track duplicate values.
The Hash Formula is consistent, so every Smith has the
same Row Hash, and the same goes for each Jones and each
Patel. Therefore, duplicate values land on the same AMP.
The Row-ID equals the Row Hash of the Primary Index column
plus the Uniqueness Value.

Row ID
Unique Primary Index: The Uniqueness Value on
each Row-ID is 1.
Each AMP sorts its rows by
the Row-ID.
Non-Unique Primary Index: The Uniqueness Value increases
on all duplicate names.
Each AMP sorts its rows by
the Row-ID.

AMPs sort rows by Row-ID so that like data is grouped
together and binary searches can be used.
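
The way Row-IDs are built and sorted can be sketched as follows. This Python example is illustrative only: `zlib.crc32` stands in for the hash formula, and a Row-ID is modeled as a (Row Hash, Uniqueness Value) tuple.

```python
import zlib
from collections import defaultdict


def row_hash(value):
    """Stand-in for the hash formula: equal values give equal Row Hashes."""
    return zlib.crc32(str(value).encode("utf-8")) & 0xFFFFFFFF


def assign_row_ids(pi_values):
    """Give each row a (Row Hash, Uniqueness Value) Row-ID, then sort by Row-ID."""
    last_uniqueness = defaultdict(int)
    rows = []
    for value in pi_values:
        rh = row_hash(value)
        last_uniqueness[rh] += 1          # 1 for the first row, 2 for its duplicate, ...
        rows.append(((rh, last_uniqueness[rh]), value))
    rows.sort(key=lambda row: row[0])     # each AMP keeps its rows in Row-ID order
    return rows


rows = assign_row_ids(["Smith", "Jones", "Smith", "Patel", "Smith"])
smith_ids = [row_id for row_id, name in rows if name == "Smith"]
```

All three Smith rows share one Row Hash and differ only in the Uniqueness Value, so they sit next to each other once the rows are sorted.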

Example
SELECT * FROM Employee_table WHERE
last_name = 'Smith';
Plan:
1. The PE sees that last_name is the Primary Index
2. It hashes Smith and gets the Row Hash
3. Row Hash = 7
4. It counts over to bucket 7 in the Hash Map,
which says AMP 1
5. It passes a message to AMP 1 through the
BYNET to retrieve the rows with Row Hash 7
6. AMP 1 brings back all columns for Row Hash 7
(Smith)

Binary Search - Example


SELECT * FROM order_table WHERE
Order_Number = 50;
Plan:
1. The PE sees that Order_Number is the Primary Index
2. It hashes 50 and gets the Row Hash
3. Row Hash = 75
4. It counts over to bucket 75 in the Hash Map, which says AMP 1
5. It passes a message to AMP 1 through the BYNET to retrieve the row with Row Hash 75
6. AMP 1 performs a binary search
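
The binary search in the last step can be sketched with Python's standard `bisect` module. The Row-IDs below are made-up (Row Hash, Uniqueness Value) pairs held by one hypothetical AMP, already sorted as described earlier.

```python
from bisect import bisect_left


def find_by_row_hash(sorted_row_ids, target_hash):
    """Return every Row-ID on this AMP whose Row Hash equals target_hash."""
    # (target_hash, 0) sorts just before any real Row-ID with that hash,
    # because Uniqueness Values start at 1.
    start = bisect_left(sorted_row_ids, (target_hash, 0))
    matches = []
    for row_id in sorted_row_ids[start:]:
        if row_id[0] != target_hash:
            break
        matches.append(row_id)
    return matches


# Hypothetical Row-IDs on one AMP, kept in Row-ID order.
amp_rows = [(12, 1), (40, 1), (75, 1), (75, 2), (90, 1)]
hits = find_by_row_hash(amp_rows, 75)
```

Because the rows are sorted, the search jumps straight to the first Row-ID with Row Hash 75 and stops as soon as the hash changes, instead of reading every row on the AMP.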

Primary Index Example

A Unique Primary Index will spread the data perfectly evenly.

A Non-Unique Primary Index will NOT spread the data perfectly evenly.

A Multi-Column Primary Index is often used to fix a data skew problem.

In a No Primary Index table, all AMPs read all of their rows
(full table scan) because there is no Primary Index.

Secondary Index

Secondary Index can be created and dropped dynamically


Syntax

A Secondary Index requires a separate physical structure (the
subtable), but a Primary Index does NOT require a separate
physical structure.
A Unique Secondary Index (USI) subtable contains two
columns:
1. Emp_No (the USI column)
2. Row-ID of the real Primary Index of the base table

Primary Index Vs Secondary Index

How Parsing Engine uses the USI Subtable


Parsing Engine plan: it is a two-AMP operation

Emp_No is a USI.
The PE hashes 1004 and sees which AMP holds the row in the subtable (AMP 3).
The PE has the BYNET contact AMP 3, which retrieves the subtable row for 1004 (single AMP).
The AMP passes the real Row-ID of the base table row (1,4) back up to the PE.
The PE uses the Row-ID to find the base table row with another single-AMP retrieve.
A USI is a two-AMP operation.
The first AMP is assigned to read the subtable and the second to read the base table.
Two binary searches are performed in total, and one row is returned.
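
The two-step USI lookup can be mimicked in a few lines of Python. Everything here is a simplified model: `zlib.crc32` stands in for the hash formula, a modulo replaces the hash map, dictionaries replace the sorted subtable and base table, and last_name is assumed to be the Primary Index.

```python
import zlib

NUM_AMPS = 4  # hypothetical system size


def row_hash(value):
    """Stand-in for the Teradata hash formula."""
    return zlib.crc32(str(value).encode("utf-8")) & 0xFFFFFFFF


def amp_of(rh):
    """Simplified stand-in for the hash map lookup."""
    return rh % NUM_AMPS


base_table = [dict() for _ in range(NUM_AMPS)]    # per AMP: Row-ID -> row
usi_subtable = [dict() for _ in range(NUM_AMPS)]  # per AMP: Emp_No -> base Row-ID


def insert(row):
    """Store the row by its Primary Index (last_name) and index it by emp_no."""
    rh = row_hash(row["last_name"])
    amp_rows = base_table[amp_of(rh)]
    uniqueness = 1 + sum(1 for (h, _) in amp_rows if h == rh)
    row_id = (rh, uniqueness)
    amp_rows[row_id] = row
    usi_subtable[amp_of(row_hash(row["emp_no"]))][row["emp_no"]] = row_id


def select_by_usi(emp_no):
    """Return (row, AMP steps): one subtable read, then one base table read."""
    steps = [amp_of(row_hash(emp_no))]
    row_id = usi_subtable[steps[0]].get(emp_no)
    if row_id is None:
        return None, steps
    steps.append(amp_of(row_id[0]))
    return base_table[steps[1]][row_id], steps


for emp_no, name in [(1001, "Smith"), (1004, "Jones"), (1006, "Smith")]:
    insert({"emp_no": emp_no, "last_name": name})

row, steps = select_by_usi(1004)
```

The `steps` list records which AMP was visited for the subtable and which for the base table. The two steps may happen to land on the same AMP, but there are never more than two.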

Non Unique Secondary Index

Syntax

A Non-Unique Secondary Index (NUSI) subtable contains two
columns:
1. First_Name (the NUSI column)
2. Row-ID of the real Primary Index of the base table

The NUSI rows get their own Row-ID, but they are not
hashed to different AMPs and stay AMP-local.

NUSI are AMP -Local

Subtable rows index the base rows on the same
AMP, hence a NUSI is AMP-local.
A NUSI query always searches all AMPs, but the intent is not
to do a full table scan. If there are 50 AMPs, then a
minimum of 50 binary searches are done.

How Parsing Engine uses the NUSI Subtable


Parsing Engine plan: it is an all-AMP operation

First_Name is a NUSI.
The PE orders each AMP to check whether it has Kyle in its NUSI subtable.
Each AMP simultaneously performs a binary search on its NUSI subtable.
If an AMP has Kyle, it retrieves the matching base row.
If there are 50 AMPs, then all 50 AMPs perform a binary search simultaneously, and
those that find Kyle perform another binary search on the base table.

A NUSI is an all-AMP operation.
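
A minimal Python model of an AMP-local NUSI is sketched below. It is an illustration under assumptions: each AMP is a dictionary holding its own base rows and its own NUSI subtable, a dictionary lookup stands in for the binary search, and the Row-IDs and employee data are made up.

```python
NUM_AMPS = 4  # hypothetical system size

# Per AMP: base rows keyed by Row-ID, plus an AMP-local NUSI subtable.
amps = [{"base": {}, "nusi": {}} for _ in range(NUM_AMPS)]


def insert(amp_no, row_id, row):
    """Place a base row on an AMP and index first_name in that AMP's own subtable."""
    amp = amps[amp_no]
    amp["base"][row_id] = row
    amp["nusi"].setdefault(row["first_name"], []).append(row_id)


def select_by_nusi(first_name):
    """Every AMP searches its own subtable; only AMPs with a hit read base rows."""
    searched = 0
    found = []
    for amp in amps:
        searched += 1
        for row_id in amp["nusi"].get(first_name, []):
            found.append(amp["base"][row_id])
    return found, searched


insert(0, (7, 1), {"emp_no": 1001, "first_name": "Kyle"})
insert(2, (40, 1), {"emp_no": 1002, "first_name": "Kyle"})
insert(3, (9, 1), {"emp_no": 1003, "first_name": "Minal"})

kyles, amps_searched = select_by_nusi("Kyle")
```

Every AMP is asked, even those holding no Kyle, which is why a NUSI access is an all-AMP operation without being a full table scan.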

Primary Index vs. Secondary Index


Index Feature                  UPI     NUPI    USI        NUSI
Required?                      Yes*    Yes*    No         No
Single-AMP Retrieve            Yes     Yes     No         No
Number of Binary Searches      1       1       2          Many
Number per Table               1       1       0-32       0-32
Max Columns                    64      64      64         64
Unique                         Y       N       Y          N
Affects Row Distribution       Y       Y       N          N
Created/Dropped Dynamically    N       N       Y          Y
Improves Access                Y       Y       Y          Y
Can be multiple data types     Y       Y       Y          Y
Separate physical structure    N       N       Sub-table  Sub-table
Extra Processing Overhead      N       N       Y          Y
May be ordered by value        N       N       N          Y
May be partitioned             Y       Y       N          N
* Teradata has a NoPI table now in V13.10

Full- Table Scans

Teradata Database always uses a full-table scan to access
the data of a table if a query:
Accesses a NoPI table that does not have an index
defined on it
Does not specify a WHERE clause
Does not use the index columns
Uses an index in a non-equality test
Specifies a range of values for the primary index
A full-table scan is always an all-AMP operation and should
be avoided when possible.

Questions


Summary
An index is a physical mechanism used to store and access data
A PK is a relational modeling convention which allows each row to be
uniquely identified
The Primary Index is defined when the table is created.
A table can have only one Primary Index, but you can combine up to 64
columns to form one Multi-Column Primary Index.
The Primary Index value of a row is hashed with the Hash Formula.
The output of the Hash Formula is a 32-bit Row Hash.
The Row-ID equals the Row Hash of the Primary Index column plus the
Uniqueness Value.
A Secondary Index can be created and dropped dynamically
A Non-Unique Secondary Index (NUSI) subtable contains two columns:
First_Name (the NUSI column)
Row-ID of the real Primary Index of the base table
NUSIs are AMP-local

Test Your Understanding


1. How are both tables sorted?
2. What was the Row-ID when Minal was hashed?
3. Looking in the subtable, what is the Row-ID of the base row for employee
1006?
4. When 1006 was placed in the subtable, which bucket in the hash
map was chosen?
5. How many times is the Hash Map consulted on a query using a USI in
the WHERE clause?

Module 4: Space
Objectives:
After completing this chapter, you will be able to answer the
following questions:
What is a Teradata database and a Teradata user?
How is space allocated to Teradata objects?
What is the hierarchy of objects in a Teradata system?

Space
There are three types of space in Teradata:
Perm Space: Perm space houses permanent tables,
Secondary Indexes, Join Indexes and Permanent Journals
Temp Space: Temp space is used to store temporary tables
Spool Space: Spool space is used by each AMP to
build the answer set for the user.

A Teradata Database(Example)
A Teradata database is a logical repository for:
Tables (require perm space)
Views (use no perm space)
Macros (use no perm space)
When a system arrives, there is only one user, called DBC.
USER DBC
System user DBC contains all Teradata Database software components and all system
tables.
Syntax:
CREATE DATABASE new_db FROM existing_db
AS
PERMANENT = 20000000
,SPOOL = 50000000
,TEMPORARY = 20000000;
new_db is owned by existing_db
A database is empty until objects are created within it
A database with no PERM space can have views and macros but not tables

A Teradata User
A Teradata user is a database with an assigned password
A Teradata user may also own tables, views, macros and triggers, but a user with no
perm space may not own tables
A user may log on to Teradata and access objects within:
Itself
Other databases for which it has access rights
Syntax:
CREATE USER new_user FROM existing_user
AS
PERMANENT = 10000000
,PASSWORD = Acdmy
,SPOOL = 50000000
,TEMPORARY = 20000000;
new_user is owned by existing_user
A user is empty until objects are created within it

The Teradata Hierarchy

Initially DBC owns 10 TB of PERM space. DBC then creates
Spool_Reserve (4 TB), USER Retail (2 TB) and USER
Financial (2 TB), after which DBC has only 2 TB of PERM
space left.
USER Retail and USER Financial can in turn create the databases
and users they need.
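
The PERM space arithmetic in this hierarchy can be traced with a small Python sketch. The `Space` class is purely illustrative (it is not a Teradata API) and sizes are kept in whole terabytes.

```python
class Space:
    """A database or user that owns PERM space (sizes in TB)."""

    def __init__(self, name, perm_tb, owner=None):
        self.name = name
        self.perm_tb = perm_tb
        self.owner = owner

    def create(self, name, perm_tb):
        """Create a child whose PERM space is carved out of this owner's space."""
        if perm_tb > self.perm_tb:
            raise ValueError(f"{self.name} has only {self.perm_tb} TB of PERM space")
        self.perm_tb -= perm_tb
        return Space(name, perm_tb, owner=self)


dbc = Space("DBC", 10)
spool_reserve = dbc.create("Spool_Reserve", 4)
retail = dbc.create("Retail", 2)
financial = dbc.create("Financial", 2)
```

After the three creations DBC is left with 2 TB, and Retail can only hand out space from its own 2 TB when it creates databases or users beneath it.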

Difference between PERM and Spool space


Assume User A has 2 TB of permanent space and 10
GB of spool space, and has 1000 users under it.
User A can create and load up to 2 TB of table
data in its PERM space.
Each of the 1000 users under A (say A1, A2, A3, ...) can
run queries using up to 10 GB of spool space
simultaneously.

Test Your Understanding


What is the difference between a Teradata database and a Teradata user?

Module 5: Data Protection


Objectives
After completing this module, you will be able to answer:
How do locks prevent loss of data integrity?
What are the types of locking provided by Teradata?
What are FALLBACK tables?

Locks
There are four types of locks:
Exclusive Lock: This is placed only on a database or table (never on
rows) when the object is going through a structural change. It prevents
any other type of concurrent access to the database or table.
Write Lock: This is placed by an INSERT, DELETE, or UPDATE request. It
blocks other Read, Write and Exclusive locks.
Read Lock: This is placed in response to a SELECT request. It restricts
access by users who require Exclusive or Write locks. If you have a
multiuser environment with updates occurring and you need to keep data
consistent, you want a Read lock.
Access Lock (Dirty Read or Stale Read): An Access lock permits the
user to read an object that may already be locked for READ or
WRITE. An Access lock does not restrict access by another user except
when an Exclusive lock is required. It is placed in response to a
user-defined LOCKING FOR ACCESS phrase. A user requesting an Access lock
should not be concerned with data consistency.

Locks
Locks are applied at 3 levels:
1. Database: Applies to all tables/views in the database
2. Table/View: Applies to all rows in a table
3. Row Hash: Applies to all rows with the same Row Hash
Rule:
Lock requests are queued behind all outstanding incompatible
lock requests for the same object.
Row Hash Lock Syntax:
LOCKING ROW FOR ACCESS
SELECT * FROM TABLE_A;

Compatibility between Read Locks


Read locks are compatible with each other, but Write locks are not.
Assume that four SQL statements arrive for Employee_Table: the first two are SELECTs, the third is an
INSERT and the fourth is a SELECT.

Compatibility:
A Read lock is compatible with other Read locks and with Access locks
A Write lock is compatible only with Access locks
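
The compatibility rules and the queueing rule can be combined in a short Python sketch. It is a simplified model for one table: locks are never released, and a request is queued if it is incompatible with any lock held or waiting ahead of it. The dictionary and function names are illustrative.

```python
# Which other locks each lock type can coexist with, per the rules in this module.
COMPATIBLE = {
    "ACCESS": {"ACCESS", "READ", "WRITE"},
    "READ": {"ACCESS", "READ"},
    "WRITE": {"ACCESS"},
    "EXCLUSIVE": set(),
}

# Lock requested by each statement type.
LOCK_FOR = {"SELECT": "READ", "INSERT": "WRITE", "UPDATE": "WRITE", "DELETE": "WRITE"}


def schedule(statements):
    """Return 'granted' or 'queued' for each statement, in arrival order."""
    ahead = []      # locks held or waiting ahead of the new request
    outcome = []
    for statement in statements:
        lock = LOCK_FOR[statement]
        blocked = any(lock not in COMPATIBLE[other] for other in ahead)
        outcome.append("queued" if blocked else "granted")
        ahead.append(lock)
    return outcome


result = schedule(["SELECT", "SELECT", "INSERT", "SELECT"])
```

The two leading SELECTs share the table, the INSERT waits behind their Read locks, and the final SELECT waits behind the INSERT's Write lock request.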

Cliques

A clique is a defined set of nodes with failover capability
A clique protects against a node failure
All nodes in a clique must be able to access all Vdisks for all
AMPs in the clique
If a node fails, all of its AMPs migrate to the remaining nodes
in the clique
When a node fails:
1. Teradata resets
2. On the restart, the AMPs of the failed node (Node 1 in the example) migrate
3. The system is degraded but still able to function
4. The down node is fixed
5. Another reset is done and the AMPs return home

Each node can support 128 AMPs

[Figure: a four-node clique; Node 1 fails and its AMPs migrate to the other nodes]
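
AMP migration inside a clique can be pictured with the Python sketch below. The node names, the four AMPs per node and the round-robin redistribution are all assumptions made for the example; the source does not say how migrating AMPs are divided among the surviving nodes.

```python
def fail_node(clique, failed):
    """Return a new node -> AMP list mapping after `failed` goes down."""
    survivors = sorted(node for node in clique if node != failed)
    if not survivors:
        raise RuntimeError("no surviving node in the clique")
    after = {node: list(clique[node]) for node in survivors}
    # Assumption: migrating AMPs are dealt out round-robin to the survivors.
    for i, amp in enumerate(clique[failed]):
        after[survivors[i % len(survivors)]].append(amp)
    return after


clique = {
    "node1": [0, 1, 2, 3],
    "node2": [4, 5, 6, 7],
    "node3": [8, 9, 10, 11],
    "node4": [12, 13, 14, 15],
}
degraded = fail_node(clique, "node1")
```

All 16 AMPs are still running after the failure, but three nodes now carry the load of four, which is why the system is described as degraded.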

Fallback

Fallback protects against an AMP failure.
Fallback makes a duplicate copy of every row in a table and keeps that copy
on a different AMP.
If an AMP goes down, the system can still process the query because the
rows on the failed AMP are also held by another AMP.
Data changed while an AMP was offline is restored automatically.
It is critical for high-availability applications.
Cost of Fallback:
The table is twice as big and uses twice the
space.
Twice as many inserts, updates, and deletes are needed.
Table with Fallback:
CREATE TABLE Emp_Intl, FALLBACK
(Emp_No INTEGER
, Dept_No SMALLINT
, First_Name VARCHAR(12)
, Last_Name CHAR(20)
, Salary DECIMAL(10,2))
UNIQUE PRIMARY INDEX ( Emp_No );

Table with no Fallback:
CREATE TABLE Emp_Intl, NO FALLBACK
(Emp_No INTEGER
, Dept_No SMALLINT
, First_Name VARCHAR(12)
, Last_Name CHAR(20)
, Salary DECIMAL(10,2))
UNIQUE PRIMARY INDEX ( Emp_No );

Note: The default is NO FALLBACK.

Fallback Clusters

A cluster is a group of AMPs that act as a single fallback
unit.
Fallback rows for the AMPs in a cluster reside within that cluster.
Loss of one AMP in a cluster permits continued table access.
Loss of 2 AMPs in the same cluster causes the RDBMS to halt.
[Figure: 2 clusters with 2 AMPs each]

System performance can be adversely affected when any
AMP has a disproportionate burden.

Fallback Vs. Non-Fallback tables


Fallback tables
One AMP down: data fully available
Two or more AMPs down:
In different clusters: data fully available
In the same cluster: system halts

Non-Fallback tables
One AMP down: data partially available; queries avoiding the down AMP succeed
Two or more AMPs down:
In different clusters: data partially available; queries avoiding the down AMPs succeed
In the same cluster: system halts
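
The availability cases above can be expressed as one small decision function. This Python sketch simply encodes those cases; the cluster layout (two clusters of two AMPs) and all names are hypothetical.

```python
from collections import Counter


def availability(fallback, down_amps, cluster_of):
    """Classify table access given the set of down AMPs.

    cluster_of maps each AMP number to its fallback cluster number.
    """
    down_per_cluster = Counter(cluster_of[amp] for amp in down_amps)
    if any(count >= 2 for count in down_per_cluster.values()):
        return "system halts"
    if not down_amps:
        return "fully available"
    return "fully available" if fallback else "partially available"


# Two clusters with two AMPs each.
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1}

one_down_fallback = availability(True, {0}, cluster_of)
one_down_no_fallback = availability(False, {0}, cluster_of)
```

The function makes the key point visible: Fallback changes what happens when AMPs are down in different clusters, but neither kind of table survives two AMP failures in the same cluster.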

RAID
RAID: Redundant Array of Independent Disks
Two types of disk array protection:
RAID 1 (Mirroring):
RAID 1 provides each AMP with two disks for storing data and two disks
for mirroring.
A data disk and its mirror disk are called a mirrored pair.
RAID 1 costs 50% of the disk space, but it ensures 99% uptime for
customers.
If a single disk goes down, it is easily replaced and Teradata isn't
even affected.

RAID 5 (Parity):
For every 3 blocks of data, there is a parity block on a 4th disk.
If a disk fails, any missing block may be reconstructed using the
other three disks.
Reconstruction of a failed disk by the array controller takes longer than with RAID
1.
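
The RAID 5 reconstruction idea can be demonstrated with XOR parity, which is the usual way a parity block is computed. The Python sketch below is a toy with three 4-byte data blocks; the block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one parity block stored on a fourth disk.
data = [b"AMP1", b"AMP2", b"AMP3"]
parity = xor_blocks(data)

# The disk holding data[1] fails: rebuild it from the other three blocks.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Rebuilding a block means reading every surviving block in the stripe, which helps explain why RAID 5 performance drops during a disk failure while a RAID 1 mirror can simply be read directly.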

Summary:
RAID 1: Good Performance with disk failures. Higher cost in
terms of disk space
RAID 5: Reduced Performance with disk failures. Lower cost in
terms of disk space

Questions


Test Your Understanding


1. List the types of locks in Teradata.
2. What are compatible locks?
3. What is a Dirty Read lock?
4. How can a system be protected against node failure?
5. What is RAID?
6. Is it mandatory to have FALLBACK for all tables?

Summary

An Exclusive lock is placed only on a database or table when
the object is going through a structural change.
A Write lock is placed by an INSERT, DELETE, or UPDATE
request.
A Read lock is placed in response to a SELECT request.
An Access lock is also known as a Dirty Read or Stale Read lock.
A clique is a defined set of nodes with failover capability.
Fallback protects against an AMP failure.
RAID 1 gives good performance with disk failures.

Source

Tera Tom eBook
Teradata Database Design (PDF)
www.teradataforum.com
www.teradata.com

Disclaimer: Parts of the content of this course are based on the materials available from the
websites and books listed above. The materials that can be accessed from the linked sites
are not maintained by Cognizant Academy and we are not responsible for the contents
thereof. All trademarks, service marks, and trade names in this course are the marks of the
respective owner(s).

Change Log

Version Number: V1.0
Changes made: Initial Version

Version Number: V1.1
Slide No.: 1-86
Changed By: Bhuvanya.M (221634)
Effective Date: 05/05/2015
Changes Effected: Base line content

Introduction to Teradata
You have successfully completed the
session on Teradata Architecture