
White Paper Product Release Information

IBM Information Server


What’s New in IBM® WebSphere® DataStage® 8.0
IBM Information Server is the industry’s first comprehensive, unified foundation for enterprise information
architectures, capable of scaling to meet any information volume requirement. It combines the technologies within
the Information Integration Solutions portfolio (WebSphere DataStage, WebSphere QualityStage, WebSphere
Information Analyzer, and Information Integrator) into a single unified platform, allowing companies to easily
understand, cleanse, transform, and deliver trustworthy and context-rich information.

[Figure: IBM Information Server. Unified Deployment across four functions: Understand (discover, model, and govern information structure and content), Cleanse (standardize, merge, and correct information), Transform (combine and restructure information for new uses), and Deliver (synchronize, virtualize, and move information for in-line delivery), built on Unified Metadata Management, Parallel Processing, and Rich Connectivity to Applications, Data, and Content.]

This paper outlines what is new in the WebSphere DataStage 8.0 release. The release contains many
enhancements and new features for data integration and transformation, including many requested by
customers. New features that are specific to the parallel framework are noted as such in this document.
For more detailed information on these features, please consult the product documentation.
Note: This document will be updated prior to the General Availability of the DataStage 8.0 release with additional
information about the release.

©2006 IBM Corporation. All Rights Reserved. Page 1


What’s New in WebSphere DataStage 8.0

What are the key WebSphere DataStage 8.0 new features and
capabilities?
• New WebSphere Metadata Server Foundation to better integrate the products across IBM
Information Server and support the enterprise; with the new metadata services, DataStage provides
graphical impact analysis and job diff directly within the DataStage and QualityStage Designer
• Completely Integrated Data Quality to ensure the most accurate, complete information is made
available
• Significant ease-of-use enhancements to improve usability and productivity
• New and Expanded Transformation Functionality to aid DataStage users in job design
• Continued focus on Performance and Throughput Improvements while providing detailed
performance analysis and system resource estimation
• Connectivity improvements, including the next generation of connectors
• Common and Enhanced Installation, Configuration, Administration and Reporting across
IBM Information Server
• General Enhancements related to specific customer technical support requests
• Migration, Upgrade, and Platform Support

WebSphere Metadata Server Foundation

Architecture
IBM Information Server embraces the design concepts of Service-Oriented Architecture (SOA) to deliver
multiple discrete services that hide the complexities of distributed configurations, thus allowing services to
concentrate on their functionality. With this architecture, individual components within IBM Information
Server can be used to compose intricate tasks without custom programming.
A Service-Oriented Architecture enables the design of common components which are accessible and
shared by all the other elements of the platform. These common components allow all of the products in
IBM Information Server to operate in a uniform and well integrated manner. By eliminating duplication of
functions the architecture makes efficient use of hardware resources and reduces the amount of
development and administrative effort required to deploy a data integration platform.


[Figure: IBM Information Server architecture. Unified User Interface (Analysis Interface, Development Interface, Web Admin Interface); Common Services (Metadata Services, Unified Service Deployment, Security Services, Logging & Reporting Services); Unified Parallel Processing (Understand, Cleanse, Transform, Deliver); Unified Metadata (Design, Operational); Common Connectivity (Structured, Unstructured, Applications, Mainframe).]

The figure above shows the five top-level components of the IBM Information Server architecture. The
common repository and services are explained in the sections below. The common services layer is
deployed on a J2EE-compliant application server, which for the 8.0 release is IBM WebSphere
Application Server.

Common Metadata Repository


IBM Information Server introduces the next generation metadata repository, the WebSphere Metadata
Server, that is fully integrated and common across all products in the IBM Information Server, including
WebSphere Information Analyzer (the next generation data profiling & analysis technology), WebSphere
QualityStage, WebSphere DataStage, and WebSphere Business Glossary. This new repository resides
on an open RDBMS (DB2 UDB, Oracle, or SQL Server). If you do not want to use your own database
instance, IBM provides DB2 for use specifically with the 8.0 product which is integrated as part of the
installation process.


This new dynamic enterprise metadata foundation replaces the metadata prisons of the past and
transforms metadata from an “end” to a “means” to manage data and simplify integration. Because the
repository is common across all products, when data profiling is performed using WebSphere
Information Analyzer, for instance, the table definitions and the pertinent profiling information (such as
primary key information, foreign keys, and notes) are available to a DataStage or QualityStage user in
the DataStage and QualityStage Designer, with no export/import, as shown in the screen below.

WebSphere Metadata Server also provides a number of services, located on the server for performance
and scalability, that DataStage takes advantage of. These provide, for instance, impact analysis
presented in a DataStage context that the user can readily work with. More on this in Ease-of-Use
Enhancements below.

Information Services Framework


The IBM Information Server release brings the next generation of enterprise services, common to all products.
This simplifies administration, operation, licensing, and deployment. These services reside on an
application server; WebSphere Application Server is provided as part of the installation.

Logging
All products in IBM Information Server will log messages to a common place. Customers using multiple
products in the IBM Information Server will no longer have to look in multiple logs for problem
determination. For DataStage users, the log will still be available in the Director and through the existing
command line interfaces. Users can also view logs from the Web Console for IBM Information Server, a
browser-based interface. Administrators can assign users to a role that permits only log viewing. This
allows developers, for instance, to view logs from a browser to aid in problem determination in a
production environment without being able to do anything else, such as start jobs, change jobs, or
delete logs. This is critical for locked-down production environments.
Administrators can also create log views which allow users to only view specific entries in a log.

Security

See the Security section of Enhanced Installation, Configuration, and Administration below.

Integrated Data Quality


With the 8.0 release, IBM WebSphere QualityStage™, the best-in-class data quality product, unifies
data quality and data transformation capabilities through a combined framework and enhanced design
paradigm. QualityStage gains a new user interface based on visual drag-and-drop design, an improved
set of features that increase productivity, and innovations in data matching through a data-driven design
experience. All of this is enabled by a foundational framework of dynamic metadata, active integration
services, and a high-performance engine. Put simply, QualityStage is now a set of parallel stages, which
completely integrates DataStage and QualityStage.


The QualityStage stages include:


• Investigate: Analyze your information and re-use that knowledge in the match. This includes
field based and pattern based analysis.
• Standardize: Cleanse your data to deliver high quality information with packaged rules or
custom-built rules that meet your business needs.
• Match: Link matching records in your data. The new Match design interface is shown above.
• Survive: Roll up your information to create a single best-of-breed record.
Note: WebSphere QualityStage is a separately licensed product from WebSphere DataStage.

Ease-of-Use Enhancements
With the Metadata Server described above, DataStage is taking advantage of the services directly in the
DataStage and QualityStage Designer. These include:
• Impact Analysis
• Object Difference (job, table, and routine difference)
• Quick Find and Advanced Find
Note: These new capabilities are available for all DataStage (and QualityStage) users, regardless of job
type (server, parallel, and job sequences).

Impact Analysis
From the contextual menu of an object in the repository tree or in some cases the stages on the design
canvas, several new options are available, including Find Dependencies and Find Where Used.


The results show “What does this item depend on?” and “Where is this item used?”, respectively. The
results are shown in a tabular view and graphically. This gives the DataStage and QualityStage user
more information with which to assess, for instance, the impact of a change.
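
The two questions above amount to traversing a dependency graph in opposite directions. The following is a minimal Python sketch of that idea, not the product's implementation; the repository contents and the function names are hypothetical and chosen only to mirror the Find Dependencies and Find Where Used menu options.

```python
from collections import defaultdict

# Hypothetical repository: each object maps to the objects it uses directly.
USES = {
    "JobLoadCustomers": ["CustomerTableDef", "CleanseContainer"],
    "JobLoadOrders": ["OrderTableDef", "CleanseContainer"],
    "CleanseContainer": ["StandardizeRoutine"],
}

def find_dependencies(obj, uses=USES):
    """Return everything `obj` depends on, directly or indirectly."""
    seen, stack = set(), [obj]
    while stack:
        for dep in uses.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

def find_where_used(obj, uses=USES):
    """Return every object that uses `obj`, directly or indirectly."""
    # Invert the graph, then reuse the same traversal.
    used_by = defaultdict(list)
    for user, deps in uses.items():
        for dep in deps:
            used_by[dep].append(user)
    return find_dependencies(obj, used_by)
```

In this sketch, changing StandardizeRoutine would affect the shared container and both jobs that include it, which is the kind of answer a where-used query is meant to surface.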


The Impact Analysis also allows the selection of an object from the result list and then shows where and
how that object is used in a flow in the Impact Analysis – Path Viewer.

In this example, Job HVCustomerContainerStanFreq has a process stage, which has the Standardized
output link, containing 20 columns, which came from the BankDemoAccounts table.
The object editor can also be launched from this viewer.
The graphical view has navigation features including a bird’s eye view and zooming. Results can be
printed or saved to an XML file for additional processing, or remote user viewing, and can also be
published to the Reporting Console.


Object Difference
Object difference is now available for jobs, routines, and table definitions. A textual report in a DataStage
context is returned.

Hot links inside the report bring the user to the relevant editor in the Designer for the object selected.


Jobs and table definitions can also be compared across projects. The user is required to log into the
second project. The results are then shown as described above. This new feature will significantly
improve productivity for DataStage and QualityStage users.
The Job Report (first introduced with DataStage release 7.5) and the results of Impact Analysis, Job Diff,
and Advanced Find can be printed, and all of them can be published to the Reporting Console for viewing
from a web browser by anyone with access (see below).

Find
Customers have built up their DataStage repositories with literally tens of thousands of objects – jobs, job
sequences, shared containers, table definitions, and more. Finding those objects can, at times, be a
daunting task, even for the most organized and well documented repositories.
The 8.0 release adds new Quick and Advanced Find features to make it easier to locate objects and work
with them.
Available at the top of the Repository View and from the toolbar, the Quick Find allows users to locate
objects with the following capabilities:
• Find Name (full and partial)
• Wild card support
• Find next
• Filter on object type
• Include the objects’ descriptions in the search


The Advanced Find, available from an Object’s contextual menu and the
tool bar, allows the user to add more advanced filtering criteria to the
Find:
• Object type
• Creation
o Date/time
o User
• Last modification
o Date/time
o User
• Where used (What other objects use this object)
• Dependencies (What does this object use)
Now a user can, for example, find all the jobs changed within the past
week by Keith.
Advanced options include the ability to restrict case and match on “name
and description” or “name or description”.
Results from both the Find and Advanced Find are presented in the same
tabular view as Impact Analysis results. Quick Find is available anywhere
you browse the repository, for example from a stage editor when you are
browsing for a table definition, and in the new Export dialog.

New Repository Tree View


In the DataStage and QualityStage Designer, Folders have replaced
Categories. The restriction on where objects “live” in the folder structure has
also been removed. Jobs, table definitions, routines, etc. can all be in one
folder or split among many folders which the user can name. This allows the
user to configure the repository content in the way that suits their applications.

The new repository provides locking semantics that now allow more than one
user to have a job open. The first user in opens the job for write, and the
second user opens the job read-only (the second user is presented with a
dialog informing them who has the job locked). This is a long-standing
enhancement request from customers that provides increased collaboration.
In addition, the repository tree now has an “expanded view” to optionally show
more details of repository items. The properties visible are configurable and
each column in this expanded view is sortable.


No More Manager and Export Improvements


The DataStage Manager capabilities are now merged into the DataStage and QualityStage Designer.
The following are now directly available from within the Designer:
• DSX/XML Import/Export
• DataStage Enterprise Edition Configuration File Editor
• Message Handler Manager
• MetaBroker Import/Export
• Web Service Definition Import
• Import IMS definitions
• JCL templates editor
This means one less client tool and allows users to import metadata directly in the Designer.
A new Export dialog, launched from the Export pull-down menu, a contextual menu, or the results of
a Find, is easier to use and provides functional enhancements. The new Export dialog allows users to
export items of different types:
• A single item
• All items in a folder
• A project
• Several items of mixed type/folders
• Export based on the result of a Find
• Export of dependent items
The new GUI allows the original export list to be modified by adding objects, which users can locate
with the Find capability described above. Filtering options also exist to export job designs and/or
executables, and to filter read-only items.


Job Parameter Sets


Job parameters make it easy to parameterize a job at execution time. However, before this release job
parameters could not be shared between jobs, and adding a parameter meant adding it through the job
property window. There was also no mechanism to easily manage and change job parameters when
moving jobs through the product life cycle (Development, Test, System Test, Production).
Now with the 8.0 release, a new repository object called a Job Parameter Set contains the names and
values of job parameters. A Job Parameter Set can be added and used by one or more jobs. In addition
multiple parameter sets can be associated with a job.

Job Parameter Sets enable users to share job parameters and their associated values across multiple
jobs. This makes it easier to share common properties and also enables easier deployment of jobs
across machines. And, since Job Parameter Sets are objects in the repository, the user can perform
impact analysis to see which jobs use a particular Job Parameter Set.
A new parameter set dialog allows parameters to be added to the parameter set. Environment variables
can also be added to a parameter set.


The Values tab on the Parameter Set dialog is used to specify sets of values to be used for the
parameters in this set when executing a job. Each set of values is stored in a file of the given name when
the parameter set is saved. Parameter set files are stored in the DataStage directory at the same level
as the project folder. The Values file can be changed dynamically and the values will be picked up by the
job when it is run.

Once a parameter set is associated with a job, it can be used in the stages of the job by referencing
ParameterSet.Parameter.
When a job is executed either through the Director (see below) or the command line (dsjob...-param
KTParamSet=TestSystem), the user can specify which default values to use for the parameter set.
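
The way a ParameterSet.Parameter reference picks up the value set chosen at run time can be illustrated with a small Python sketch. This is only an illustration of the concept: the parameter names (DBName, BatchSize), the Production value set, the in-memory layout, and the fall-back to design-time defaults are all assumptions; only KTParamSet and TestSystem come from the example above.

```python
# Hypothetical parameter set with design-time defaults and two named value sets.
PARAM_SETS = {
    "KTParamSet": {
        "defaults": {"DBName": "devdb", "BatchSize": "1000"},
        "value_sets": {
            "TestSystem": {"DBName": "testdb"},
            "Production": {"DBName": "proddb", "BatchSize": "50000"},
        },
    }
}

def resolve(reference, selections, param_sets=PARAM_SETS):
    """Resolve 'ParameterSet.Parameter' using the value set chosen at run time."""
    set_name, param = reference.split(".", 1)
    pset = param_sets[set_name]
    chosen = pset["value_sets"].get(selections.get(set_name), {})
    # Assumption: fall back to the default when the value set has no override.
    return chosen.get(param, pset["defaults"][param])

# Selecting the TestSystem values for KTParamSet, as in the dsjob example above.
db_name = resolve("KTParamSet.DBName", {"KTParamSet": "TestSystem"})
```

Keeping the selection separate from the job design is what lets the same job move between environments without being edited.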


SQL Builder Enhancements


DataStage 7.5.1A added the SQL Builder, a graphical interface for building complex SQL. The SQL
Builders contain database specific grammar and parsers which allow users to take advantage of
database specific functionality.
The 8.0 release expands the SQL Builder support beyond DB2 UDB and Oracle to SQL Server 2000 &
2005, Teradata v2r6/TTU 8.x, and ODBC 3.52. Support has also been added for INSERT, UPDATE, and
DELETE SQL statements. This means the SQL Builder can be used for both the source and target ends
of DataStage jobs.
The SQL Builder works within DataStage server and parallel jobs, and with plug-ins, parallel stages, and
the new common connectors (see below).

Documentation Improvements

Error Message Manual


The DataStage parallel framework can generate a significant number of messages in the log. These
messages do not have a unique identifier and sometimes problem determination can be daunting.
The 8.0 release adds unique message identifiers to every message in the DataStage parallel framework.
A new Error Manual will begin to document each message. The message meaning and resolution will
also be documented in 8.0 and upcoming releases.

DSEE-TFRS-00013
The record_format variable must have a sub-property type. {0} was the returned value.

Explanation:
The record_format variable was used without a sub-property.

User response:
You must use either the implicit or varying sub-property when you use the record_format variable. Use varying to
specify one of the following blocked or spanned formats: V, VB, VS, or VBS. Data is imported by using the selected
format. If you use the implicit sub-property, data is imported or exported as a stream with no explicit record boundaries.

Parallel Job Tutorial


A parallel job tutorial is now included to aid new users in getting started with DataStage. This will also
benefit QualityStage users.

New and Expanded Transformation Capabilities


The enhancements described in this section are specific to the DataStage parallel environment.

Lookup Enhancement
The Lookup stage has been extended to support lookup on a range of values. It now allows a single or
multiple row result of “input field A is between table field B and table field C.” This is very useful for date
processing.
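
The range condition can be sketched in Python as follows. This is a conceptual illustration rather than the Lookup stage itself; the reference table, the column names, and the choice of inclusive bounds are assumptions.

```python
from datetime import date

# Hypothetical reference table: each row is valid for a date range.
RATE_TABLE = [
    {"rate": 0.05, "valid_from": date(2006, 1, 1), "valid_to": date(2006, 6, 30)},
    {"rate": 0.06, "valid_from": date(2006, 7, 1), "valid_to": date(2006, 12, 31)},
    {"rate": 0.07, "valid_from": date(2006, 6, 1), "valid_to": date(2006, 7, 31)},
]

def range_lookup(value, table, low="valid_from", high="valid_to", multiple=False):
    """Return rows where table[low] <= value <= table[high] (bounds inclusive)."""
    matches = [row for row in table if row[low] <= value <= row[high]]
    if multiple:
        return matches
    # Single-row mode: first match, or None when nothing matches.
    return matches[0] if matches else None
```

A transaction dated 15 March 2006 matches only the first row, while one dated 15 July 2006 matches two rows, which is where the single versus multiple row result setting matters.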


Surrogate Key Generation


Today, users can generate surrogate keys as required in DataStage jobs using the Surrogate Key
Generator. However, the user is required to manage the key generation across job runs through
parameters. With DataStage 8.0, the Surrogate Key Generator stage, the Transformer, and the new
SCD stage (see below) have been enhanced so that DataStage now manages the generation of
surrogate keys across job runs. The keys can be managed in a file or in a DBMS (DB2 and Oracle are
supported). With databases, the DBMS sequence functionality is used.
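
The file-based option can be pictured as a persisted high-water mark that each run reads and advances. The Python sketch below illustrates that idea only; the class name, file format, and file name are hypothetical and do not describe the actual state file DataStage uses.

```python
import os
import tempfile

class FileKeySource:
    """Hypothetical file-backed key state that survives across job runs."""

    def __init__(self, path):
        self.path = path

    def next_keys(self, count):
        """Reserve `count` keys and persist the new high-water mark."""
        last = 0
        if os.path.exists(self.path):
            with open(self.path) as f:
                last = int(f.read().strip() or 0)
        with open(self.path, "w") as f:
            f.write(str(last + count))
        return list(range(last + 1, last + count + 1))

# Two separate "runs" sharing one state file continue the same sequence.
state_file = os.path.join(tempfile.mkdtemp(), "customer_keys.state")
first_run = FileKeySource(state_file).next_keys(3)
second_run = FileKeySource(state_file).next_keys(2)
```

The sketch ignores concurrent access; a production mechanism would also need to coordinate key ranges across partitions running in parallel.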

Slowly Changing Dimension Stage


Many users use DataStage to build and populate star schema data warehouses, usually with Type 1
and Type 2 dimensions to maintain history. While DataStage provides rich capabilities to do this in
existing releases, a new stage is now available with the 8.0 release that encapsulates most of the work
for the user.
The new Slowly Changing Dimension (SCD) stage processes source data against a dimension table
within the context of a star schema database structure. Type 1, Type 2, and a hybrid of both are
supported. The SCD stage automatically performs the following actions:
• Prepare the data for loading. This means that the following process is performed for each
dimension in the star schema:
o Business key(s) from the source are used to look up a surrogate key in each dimension
table
o Typically the dimension row will be found. If not, a dimension row needs to be created.
If a dimension row is found but needs to be updated, the update is performed
o The source data is augmented by the inclusion of the surrogate key, and is reduced by
the elimination of non-fact data (i.e., data that is present in the input only for the case that
a dimension row would need to be created or updated)
• The record is written or loaded into the fact table (with all surrogate keys)
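
The lookup, create, and update steps listed above can be sketched for a single dimension in Python. This is a simplified illustration under stated assumptions, not the SCD stage's logic: the dimension is an in-memory list, `is_current` and `sk` are hypothetical column names, and the split into Type 1 and Type 2 columns is supplied by the caller.

```python
import itertools

def apply_scd(dimension, source_row, business_key, type1_cols, type2_cols, next_key):
    """Apply one source row to an in-memory dimension; return its surrogate key."""
    current = next(
        (r for r in dimension
         if r[business_key] == source_row[business_key] and r["is_current"]),
        None,
    )
    if current is None:
        # No dimension row yet: create one.
        row = {**source_row, "sk": next_key(), "is_current": True}
        dimension.append(row)
        return row["sk"]
    if any(current[c] != source_row[c] for c in type2_cols):
        # Type 2 change: expire the current row and add a new version.
        current["is_current"] = False
        row = {**source_row, "sk": next_key(), "is_current": True}
        dimension.append(row)
        return row["sk"]
    # Type 1 change (or no change): overwrite in place, keeping no history.
    for c in type1_cols:
        current[c] = source_row[c]
    return current["sk"]

counter = itertools.count(1)
dim = []
args = ("cust_id", ["name"], ["city"], lambda: next(counter))
k1 = apply_scd(dim, {"cust_id": "C1", "name": "Ann", "city": "Oslo"}, *args)
k2 = apply_scd(dim, {"cust_id": "C1", "name": "Anne", "city": "Oslo"}, *args)
k3 = apply_scd(dim, {"cust_id": "C1", "name": "Anne", "city": "Bergen"}, *args)
```

In the usage lines, the name correction is treated as a Type 1 overwrite and keeps the same surrogate key, while the city change is treated as a Type 2 change and produces a new dimension row with a new key.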
The SCD stage also introduces a new “Fast Path” concept for improved usability and faster
implementation. The fast path walks the user through the screens/tabs of the stage properties required
to process the stage. Help is available for each tab by hovering the mouse over the “I” in the lower left.

The first tab of the fast path in the dialog above defines the output link from the SCD stage.


The second step matches the source column with the dimension column to define the lookup. For
performance reasons, DataStage will only load the latest dimension records into memory for each
partition.

The next tab defines how to create the surrogate key information. As described above, DataStage now
handles surrogate key generation and management across job runs. In this example, a specific file is
used. A job parameter can also be used to specify a file name. Alternatively, a database (DB2 or
Oracle) can be used.


Step 4 defines how to detect changes to dimension records and what data to use when records are
created or updated. For Type 2 dimensions, for instance, the user defines the current record indicator
versus the history records of the dimension.

Finally, map the output columns coming out of the SCD stage. The next stage could be another
dimension, or any other DataStage stage.
The new SCD stage will greatly enhance productivity of users that are working with star schema data
warehouses.

Performance Improvements
The enhancements described in this section are specific to the DataStage parallel environment.

Job Startup Time and More


When DataStage parallel jobs start up, the framework performs job validation, sets up internal process
communication, copies transformer libraries to remote nodes, and more. The 8.0 release reduces this
startup time; the size of the improvement depends on the job and hardware environment.
Additional internal performance enhancements have been made in the areas of buffer optimizations and
the way the framework combines processes.

Job Monitoring Improvements


Job monitoring has been re-architected with the 8.0 release. Not only is performance improved, but time-
based monitoring can once again be utilized.
Job monitoring provides useful information for job execution and performance problem determination.
However, it should not interfere with the performance of a job executing. Adaptive Job Monitoring is a
new feature with the 8.0 release which detects when CPU utilization by the parallel framework’s
conductor reaches 80%. When this threshold is reached, the job monitoring data is throttled back and a
warning message is issued to the user. Internally, the conductor sends control messages to each player
to reduce their output rate.
When time-based monitoring is used, the monitor time interval on players is increased. When record
count-based monitoring is used, the record interval is increased until the conductor’s CPU utilization
becomes less than 80%.
Only monitor messages are throttled back; metadata and summary messages are not affected.
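
The throttling behavior described above can be sketched as a simple back-off loop. This Python sketch is illustrative only: the doubling factor and the function shape are assumptions, and only the 80% threshold comes from the text.

```python
CPU_THRESHOLD = 80.0  # percent, the threshold stated in the text

def throttle(interval, conductor_cpu_samples, factor=2):
    """Grow the monitor interval until conductor CPU drops below the threshold.

    `conductor_cpu_samples` is a sequence of successive CPU readings, starting
    with the current one; `factor` is a hypothetical back-off multiplier.
    Returns the final interval and whether a warning would have been issued.
    """
    warned = False
    for cpu in conductor_cpu_samples:
        if cpu < CPU_THRESHOLD:
            break
        warned = True
        interval *= factor
    return interval, warned
```

The same loop applies to both monitoring modes: `interval` stands for seconds under time-based monitoring and for a record count under record count-based monitoring.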

Resource Estimation
Predicting hardware resources needed to run DataStage jobs in order to meet your processing time
requirements can sometimes be more of an art than a science. With new sophisticated analytical
information and deep understanding of the parallel framework, IBM has added Resource Estimation to
DataStage (and QualityStage) 8.0.
With a job open, a new toolbar option called Resource Estimation is available. This option opens the
Resource Estimation dialog. Resource Estimation works by first creating a model of the DataStage job.
Two types of models can be created:
• Static. The static model does not actually run the job to create the model. CPU utilization
cannot be estimated, but disk space can be. The record size is always fixed. The “best case”
scenario is considered when the input data is propagated. The “worst case” scenario is
considered when computing record size.
• Dynamic. The Resource Estimation tool actually runs the job with a sample of the data. Both
CPU and disk space are estimated. This is a more predictable model to use for estimating.
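
The arithmetic behind a static disk space estimate can be illustrated with a short Python sketch. It is an interpretation, not the tool's model: it assumes that the "worst case" record size means every column at its maximum length and that "best case" propagation means no records are dropped between input and output. The function and column names are hypothetical.

```python
def static_disk_estimate(record_count, columns):
    """Estimate output bytes without running the job.

    `columns` is a list of (name, max_bytes) pairs; summing each column's
    maximum length gives the "worst case" fixed record size.
    """
    record_size = sum(max_bytes for _, max_bytes in columns)
    # Assumption: "best case" propagation means every input record reaches the output.
    return record_count * record_size

# One million records with an 8-byte key and a name of up to 50 bytes.
estimate = static_disk_estimate(1_000_000, [("id", 8), ("name", 50)])
```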


Resource Estimation is used to project the resources required to execute the job based on varying data
volumes for each input data source.

A projection is then executed using the model selected. The results show the total CPU needed, disk
space requirements, scratch space requirements, and more.


Different projections can be run with different data volumes and each can be saved. Graphical charts are
also available for analysis, which allow the user to drill into each stage and each partition. A report can
be generated or printed with these estimations.

This new feature will greatly assist users in estimating the time and machine resources needed for job
execution.


Performance Analysis
Isolating job performance bottlenecks during a job execution or even seeing what else was being
performed on the machine during the job run can be extremely difficult. DataStage 8.0 adds a new
capability called Performance Analysis. It is enabled through a job property on the Execution tab which
collects data at job execution time. Note: by default, this option is disabled. Once enabled and with a job
open, a new toolbar option is available called Performance Analysis.

This option opens a new dialog called Performance Analysis. The first screen asks the user which job
instance to perform the analysis on.
Detailed charts are then available for that specific job run, including:
• Job timeline
• Record Throughput
• CPU Utilization
• Job Timing
• Job Memory Utilization
• Physical Machine Utilization – which shows what else is happening overall on the machine, not
just DataStage
Each partition’s information is available in different tabs.


A report can be generated for each chart.


Using the information in these charts, a developer can, for instance, pinpoint performance bottlenecks
and redesign the job to improve performance.
In addition to job instance performance, overall machine statistics are available. When a job is running,
information about the machine is also collected and is available in the Performance Analysis tool,
including:
• Overall CPU utilization
• Memory utilization
• Disk utilization
Users can also correlate statistics between the machine information and the job performance.
Filtering capabilities exist to only display specific stages.
IBM understands that performance analysis can be a complex task. The information collected and shown
in the Performance Analysis tool can easily be sent to IBM for assistance in performance analysis when
requested through our Product Support group.

Connectivity Improvements

Next Generation “Rich” Common Connectors


The enhancements described in this section are specific to the DataStage parallel environment.
With the 8.0 release, new connectors will be available that are common for all products in IBM
Information Server. The new connectors are easier to use and extend functionality from the existing
connectors. The new connectivity architecture will also make it easier for IBM to release new connectors
and enhancements to them independent of a product (DataStage) release.
Note: The existing connectors will continue to be supported (see below). The new common connectors
for the IBM Information Server are:
• ODBC
o Embedded DataDirect v5.2 Connect for ODBC drivers
• WebSphere MQ
o Adds support for “client only” configuration
• Oracle 10g
• DB2 UDB
o DPF and non-DPF environments
• Teradata
o New support for Teradata Parallel Transport (TPT)
Note: Some of the new common connectors will be delivered after DataStage 8.0 initially becomes
Generally Available.

New GUI
A new common GUI is provided for each common connector. A navigator panel allows users to select
stages and links easily with Explorer-style navigation. Drag and drop connection objects make it easy to
configure a connection (see below). The SQL Builder is integrated to assist users to build SQL
statements. The source/target and properties are validated at design time, with warning indicators for
properties requiring user attention. Job parameters can be used/inserted for any property.

[Figure: common connector stage editor, with callouts for the Stage/Link Overview, the Stage/Link Properties in Explorer model, and the Built-in Connection Test.]

BLOB Support
With the new common connectors, DataStage has been extended to support BLOBs. BLOB support
allows BLOBs to be moved from a data source to a target without paying a huge performance penalty.
Because BLOBs typically are not manipulated as part of a data integration flow, they are referenced by a
location in the job rather than sent through the DataStage job itself. The BLOB is moved only when the
target is written. BLOB support will be added as the new common connectors are released
after the 8.0 release.
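
The pass-by-reference idea can be sketched in Python: rows carry a small locator through the flow, and the payload is fetched only when the target is written. This is a conceptual illustration, not how the connectors are implemented; the `BlobRef` class, the dictionary used as a source store, and the list used as a target are hypothetical.

```python
class BlobRef:
    """Hypothetical locator that stands in for a BLOB inside the job."""

    def __init__(self, store, key):
        self.store, self.key = store, key

    def fetch(self):
        return self.store[self.key]

def write_target(rows, target):
    """Dereference BLOB locators only at write time."""
    for row in rows:
        target.append({k: v.fetch() if isinstance(v, BlobRef) else v
                       for k, v in row.items()})

source_store = {"img1": b"\x00" * 10}
rows = [{"id": 1, "photo": BlobRef(source_store, "img1")}]
target = []
write_target(rows, target)
```

Intermediate stages that sort, filter, or join on `id` would handle only the locator, so the size of the BLOB does not affect them.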

Connection Objects
Connection objects are new objects that hold connection path information (username, password,
database name, etc.) to a particular source or target, which allows connection information to be saved
and reused. Connection objects are created manually, during metadata import, and from a stage editor.

Connection objects are used to save stage connection properties to be later used when building a job.
They can be dragged and dropped from the repository tree and also be used for metadata import from
that source or target. Drag and drop the table imported from that source or target onto the canvas to
create a pre-populated stage instance. Connection objects are used “by reference” at design time. The
stage editor displays the current state of the Data Connection, not the state when it was first loaded in the
stage instance.
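
As a rough illustration of "by reference" behavior, the following Python sketch uses two hypothetical
classes, ConnectionObject and Stage, that are not part of any DataStage API. It assumes only what the
text states: stages point at a shared connection object, so a later change to that object is what
every stage sees.

```python
from dataclasses import dataclass


@dataclass
class ConnectionObject:
    """Hypothetical stand-in for a saved connection object."""
    database: str
    username: str


@dataclass
class Stage:
    name: str
    connection: ConnectionObject  # held by reference, not copied

    def describe(self) -> str:
        # Always reads the current state of the shared connection object.
        return f"{self.name} -> {self.connection.username}@{self.connection.database}"


shared = ConnectionObject(database="SALES_DEV", username="etl_user")
extract = Stage("extract_orders", shared)
load = Stage("load_orders", shared)

# Editing the connection object once is reflected in every stage that uses it.
shared.database = "SALES_PROD"
```

Had each stage stored its own copy of the connection properties, the change to SALES_PROD would
need to be repeated per stage.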

Connection objects are tied to a particular stage. Connection objects are supported on the following
stages:
• New common connectors
• Parallel stages: DB2 UDB, Informix XPS, Oracle, Teradata
• DataStage Server: Database plug-ins (e.g., Oracle OCI, DRS, DB2 API, etc.) and built-in stages
including ODBC, Universe, and Unidata

Expanded Support for the Stored Procedure Stage


The Stored Procedure Stage is expanded to support SQL Server and Teradata databases. The support
for Teradata includes: stored procedures, macros, scalar user-defined functions, table user-defined
functions, and external stored procedures.

New & Enhanced Connectivity


New parallel stages exist to connect to Netezza. There are also new parallel iWay and WebSphere
Information Integration Federation and Classic Federation stages for easier access to distributed and
mainframe data. Support for Informix v10 has been added along with Oracle 10gR2, SQL Server 2005,
Sybase ASE 15, sftp, Teradata V2R6.1/TTU 8.1, and more.


Enhanced CFF Stage


The enhancements described in this section are specific to the DataStage parallel environment.
The CFF stage has been enhanced to make it easier to read in files, particularly mainframe files, that
have multiple formats for each record.

Enhanced Installation, Configuration, Administration & Reporting

Installation
The installation process has been completely rewritten, with all of the software of IBM Information Server
delivered in one platform installation process and media. Multiple CDs just for DataStage are a thing of
the past. Authorization codes are also gone. Licensing is done by IBM through a simple licensing file that
is read at installation time.

Security
Users, group assignments, and roles are now managed in the Web Console for IBM Information Server.
Integration with LDAP or Active Directory is also provided. All products in IBM Information Server,
including DataStage, authenticate using this new service. This provides one place for user ID
administration for all products.
Note: LAN Manager support is removed from the DataStage Client logon screen with the 8.0 release.
You will no longer see the “Omit” check box.


DataStage Administration
The DataStage Administrator client tool still exists for DataStage (and QualityStage) specific
administration tasks. It is used to set up DataStage and QualityStage projects, assign users and roles,
and perform other DataStage-specific tasks. Only authorized DataStage administrator-level users can
use the DataStage Administrator tool.
The DataStage user roles have been expanded with the DataStage 8.0 release.
• There is a new “DataStage Administrator” role at the IBM Information Server level for DataStage
and QualityStage use of the DataStage Administrator tool.
• There is a new “Super Operator” role for users who can run and view objects in the Designer, but
cannot change them.

Reporting Console
A new browser-based Reporting Console is provided with IBM Information Server. Reports are available
to users who have access. The products of Information Server, such as Information Analyzer, publish
reports to the Reporting Console. Information Analyzer will publish reports on the results of data profiling.
DataStage and QualityStage can publish reports such as the job report, results of Find & Impact Analysis,
and more.


New Source-to-Target and Target-to-Source Job and Database reports are available in the Reporting
Console. Users build a report by selecting a job and the columns to include. The report then traces
those columns and their transformations through the job, either forward (source-to-target) or in
reverse (target-to-source).
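
The forward and reverse traversal described above can be pictured as a walk over column-level
mappings. The Python sketch below is an illustration only. The MAPPINGS data and the build_graph and
trace functions are hypothetical, and the breadth-first walk is an assumed technique, not a description
of how the Reporting Console is implemented.

```python
from collections import defaultdict, deque

# Hypothetical column-level mappings inside one job: (source column, target column).
MAPPINGS = [
    ("orders.amount", "xfm.amount_usd"),
    ("orders.currency", "xfm.amount_usd"),
    ("xfm.amount_usd", "warehouse.revenue"),
    ("orders.id", "warehouse.order_id"),
]


def build_graph(mappings, reverse=False):
    # Adjacency list; flipping each edge gives the target-to-source direction.
    graph = defaultdict(list)
    for src, tgt in mappings:
        if reverse:
            src, tgt = tgt, src
        graph[src].append(tgt)
    return graph


def trace(column, mappings, reverse=False):
    # Breadth-first walk listing every column reachable from the start column.
    graph = build_graph(mappings, reverse)
    seen, order, queue = {column}, [], deque([column])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


forward = trace("orders.amount", MAPPINGS)
backward = trace("warehouse.revenue", MAPPINGS, reverse=True)
```

Here `forward` lists what a source column feeds, and `backward` lists what a target column is
derived from.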

General Enhancements
An expand FILLER capability has been added to the CFF stage in the WebSphere DataStage MVS
Edition. DataStage MVS Edition also gets the benefit of many of the services explained above including
“where used” impact analysis and Find.
DataStage Enterprise Edition has better handling of failed conversions in the transformer. For example,
when a string-to-decimal conversion fails, the failure was previously only reported in the log. The record
is now sent down the reject link if one exists for the transformer.
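
The reject behavior can be sketched in Python as follows. The convert_amounts function and the
"amount" column name are hypothetical, chosen only to show rows whose string-to-decimal conversion
fails being routed to a separate reject output instead of merely being logged.

```python
from decimal import Decimal, InvalidOperation


def convert_amounts(rows):
    """Split rows into converted output and rejects (illustrative only)."""
    output, rejects = [], []
    for row in rows:
        try:
            amount = Decimal(row["amount"])
        except InvalidOperation:
            # Conversion failed: route the untouched record to the reject output.
            rejects.append(row)
            continue
        if not amount.is_finite():
            # Strings such as "NaN" parse as Decimal but are not usable amounts.
            rejects.append(row)
            continue
        output.append({**row, "amount": amount})
    return output, rejects


rows = [
    {"id": 1, "amount": "12.50"},
    {"id": 2, "amount": "abc"},
    {"id": 3, "amount": "NaN"},
]
converted, rejected = convert_amounts(rows)
```

Keeping rejected records intact lets a downstream step inspect or repair them, which a log message
alone does not allow.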
In DataStage Enterprise Edition, the CFF stage has better support for scaled COMP types, such as
S9(16)V99. Previously DataStage would read such a field in as an integer, and the user would then
need to divide by 100 to get the right value. DataStage now handles this transparently.
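
To make the arithmetic concrete, here is a minimal Python sketch of the scaling step. It assumes the
field is stored as an 8-byte big-endian two's-complement integer, which is the usual mainframe layout
for an 18-digit COMP item but is not stated in this paper. The decode_scaled_comp function is a
hypothetical name.

```python
from decimal import Decimal


def decode_scaled_comp(raw: bytes, scale: int) -> Decimal:
    """Decode a binary COMP field with `scale` implied decimal places.

    Assumes big-endian two's-complement storage (an assumption of this sketch).
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # V99 means two implied decimal places, so shift the decimal point by 2.
    return Decimal(unscaled).scaleb(-scale)


# S9(16)V99: 18 digits in total, 2 of them after the implied decimal point.
raw = (1234567).to_bytes(8, byteorder="big", signed=True)
value = decode_scaled_comp(raw, scale=2)

negative_raw = (-150).to_bytes(8, byteorder="big", signed=True)
negative_value = decode_scaled_comp(negative_raw, scale=2)
```

Reading the same bytes without the scale yields the integer 1234567, which is the value users
previously had to divide by 100 themselves.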

Migration and Upgrades


Existing DataStage installations from 5.x through 7.5.1A can be upgraded into the 8.0 release and
repository with no changes to your job designs. For Unix Server users, DataStage 8.0 can be installed
alongside the existing DataStage Server installation. Migration can be performed from your existing
DataStage projects into 8.0 using export/import.
Notes:
• DataStage Version Control will not be supported in DataStage 8.0.
• The job release feature is no longer supported. Users should use the export functionality
provided.
• XML Pack v1 is not supported with DataStage 8.0.

What Platforms Are Supported?


The DataStage 8.0 release supports the following server platforms:
• Microsoft Windows Server 2003
• AIX 5.2, 5.3
• HP-UX 11i (11.11), 11iv2 (11.23) for PA-RISC
• Solaris 2.9, 2.10
• Red Hat Enterprise Linux AS 4.0
• SuSE Enterprise Linux 9, 10
DataStage Enterprise MVS Edition will also be available for IBM z/OS 1.1, OS/390 v2.6, 2.8, v2.10.
The DataStage and QualityStage Client platform support is:
• Microsoft Windows XP SP2

More Information
Contact your IBM representative or log on to www.ibm.com.
