To stop a running job, go to DataStage Director and click the stop button (or Job ->
Stop from the menu). If it doesn't help, go to Job -> Cleanup Resources, select a
process which holds a lock and click Logout.
If it still doesn't help, go to the DataStage shell and invoke the following command:
ds.tools
It will open an administration panel. Go to option 4 (Administer processes/locks), then
try invoking one of the clear-locks commands (options 7-10).
The command can be placed in a batch file and run in a system scheduler.
There are two ways to analyze a hashed file. Both should be invoked from the
datastage command shell. These are:
• FILE.STAT command
• ANALYZE.FILE command
Yes, even though different versions of DataStage use different system DLL libraries.
To switch between DataStage versions dynamically, install and run the DataStage
Multi-Client Manager. That application can unregister and register the system
libraries used by DataStage.
There are a few possible methods of sending SMS messages from DataStage. However,
there is no easy way to do this directly from DataStage and all the methods described
below will require some effort.
The easiest way of doing that from the DataStage standpoint is to configure an
SMTP (email) server as a mobile phone gateway. In that case, a Notification
Activity can be used to send a message with a job log and any desired details.
The DSSendMail before-job or after-job subroutine can also be used to send SMS
messages.
If configured properly, the recipient's email address will have the following format:
600123456@oursmsgateway.com
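With that address format, composing the recipient address can be sketched in Python; the gateway domain below is just the document's example, and a real gateway may expect a different local-part format:

```python
def sms_address(phone_number: str, gateway_domain: str) -> str:
    """Build the e-mail address a mail-to-SMS gateway expects."""
    # Keep only the digits, so "600 123 456" becomes "600123456".
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    return digits + "@" + gateway_domain

address = sms_address("600 123 456", "oursmsgateway.com")
print(address)  # 600123456@oursmsgateway.com
```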
If there is no possibility of configuring a mail server to send text messages, you can
work around it by using an external application run directly from the operating
system. There are plenty of Unix scripts and applications for sending SMS
messages.
In that solution, you will need to create a batch script which will take care of sending
the messages, and invoke it from DataStage using the ExecDOS or ExecSH subroutines,
passing the required parameters (like the phone number and message body).
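As a rough sketch of assembling such a call (the script path and argument order below are assumptions for illustration, not a real product):

```python
import shlex

# Hypothetical sender script -- replace with whatever your system provides.
SEND_SMS = "/usr/local/bin/send_sms.sh"

def sms_command(phone: str, message: str) -> str:
    # Quote the arguments so spaces in the message body survive the shell,
    # mirroring what an ExecSH call would have to do.
    return SEND_SMS + " " + shlex.quote(phone) + " " + shlex.quote(message)

cmd = sms_command("600123456", "Job finished with warnings")
print(cmd)
# subprocess.run(cmd, shell=True, check=True)  # enable where the script exists
```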
Please keep in mind that all these solutions may require contacting the local
cellphone provider first and, depending on the country, it may not be free of charge
and in some cases the provider may not support the capability at all.
2.1. Error in Link collector - Stage does not support in-process active-to-active inputs or outputs
To get rid of the error, go to Job Properties -> Performance and select Enable row
buffer. Then select Inter process, which will let the Link Collector run correctly.
A buffer size of 128 KB should be fine; however, it's a good idea to increase the
timeout.
2.3. What is the difference between logging text and final text message in the
Terminator stage
Every stage has a 'Logging Text' area on its General tab which logs an
informational message when the stage is triggered or started.
The error appears in the Stored Procedure (STP) stage when there are no links going
out of that stage.
To get rid of it, go to 'stage properties' -> 'Procedure type' and select Transform.
2.5. How to invoke an Oracle PLSQL stored procedure from a server job
To run a PL/SQL procedure from DataStage, a Stored Procedure (STP) stage can be
used.
However, it needs a flow of at least one record to run. A minimal job design looks like this:
• A source ODBC stage which fetches one record from the database and maps it to
one column - for example: select sysdate from dual
• A transformer which passes that record through. If required, add the PL/SQL
procedure parameters as columns on the right-hand side of the transformer's
mapping
• A Stored Procedure (STP) stage as the destination. Fill in the connection
parameters, type in the procedure name and select Transform as the procedure
type. On the input tab select 'execute procedure for each row' (it will be run
once).
The error appears in the Stored Procedure (STP) stage when the 'Procedure name' field
is empty. It occurs even if the procedure call syntax is filled in correctly.
To get rid of the error, fill in the 'Procedure name' field.
Note! work_dir and file1 are parameters passed to the routine.
* open file1
OPENSEQ work_dir : '\' : file1 TO H.FILE1 THEN
   CALL DSLogInfo("******************** File " : file1 : " opened successfully", "JobControl")
END ELSE
   CALL DSLogInfo("Unable to open file", "JobControl")
   ABORT
END
2.9. Datastage routine which reads the first line from a text file
Note! work_dir and file1 are parameters passed to the routine.
* open file1
OPENSEQ work_dir : '\' : file1 TO H.FILE1 THEN
   CALL DSLogInfo("******************** File " : file1 : " opened successfully", "JobControl")
END ELSE
   CALL DSLogInfo("Unable to open file", "JobControl")
   ABORT
END
* read the first line and return it in Ans
READSEQ FirstLine FROM H.FILE1 THEN
   Ans = FirstLine
END ELSE
   CALL DSLogInfo("Unable to read from file", "JobControl")
   Ans = ''
END
CLOSESEQ H.FILE1
2.11. When should hashed files be used? What are the benefits of using
them?
Hashed files are the best way to store data for lookups. They're very fast when
looking up key-value pairs.
Hashed files are especially useful for storing dictionary-like reference data
(customer details, countries, exchange rates). Stored this way, the data can be
shared across the project and accessed from different jobs.
Most of the DataStage variable types map very well to Oracle types. The biggest
problem is mapping the Oracle NUMBER(x,y) format correctly.
There are no problems with string mappings: Oracle VARCHAR2 maps to DataStage
Varchar, and Oracle CHAR to DataStage Char.
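To illustrate the NUMBER(x,y) issue: NUMBER(8,2) means up to 8 digits of precision with 2 after the decimal point, so the matching DataStage column is typically Decimal with length 8 and scale 2. A quick Python sketch of the scale handling (the rounding mode here is an assumption for illustration):

```python
from decimal import Decimal, ROUND_HALF_UP

# A value with three decimal places does not fit a scale-2 column and
# must be rounded or rejected -- the usual source of mapping surprises.
value = Decimal("12345.678")
scaled = value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(scaled)  # 12345.68
```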
2.14. How to adjust commit interval when loading data to the database?
In earlier versions of DataStage the commit interval could be set in:
General -> Transaction size (in version 7.x it's obsolete).
Starting from DataStage 7.x it can be set in the properties of the ODBC or ORACLE stage,
under Transaction handling -> Rows per transaction.
If set to 0, the commit will be issued at the end of a successful transaction.
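The effect of Rows per transaction can be sketched outside DataStage; this Python example uses SQLite purely for illustration and commits after every N inserted rows:

```python
import sqlite3

ROWS_PER_TRANSACTION = 3   # analogue of the stage's "Rows per transaction"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER)")

pending = 0
for i in range(10):
    conn.execute("INSERT INTO target VALUES (?)", (i,))
    pending += 1
    if pending == ROWS_PER_TRANSACTION:
        conn.commit()      # commit after every N rows
        pending = 0
conn.commit()              # final commit (the 0 setting: one commit at the end)
count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)  # 10
```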
These variables can be used to generate sequences, primary keys, IDs and row
numbers, and also for debugging and error tracing.
They play a similar role to sequences in Oracle.
2.16. Datastage trim function cuts out more characters than expected
To get "a b c d" as a result, use the Trim function in the following way:
Trim(" a b c d ", " ", "B")
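As a rough Python analogue of the two behaviours (DataStage's default Trim also squeezes runs of inner blanks, which is what "cuts out more characters than expected"):

```python
import re

s = " a  b c d "      # note the double space after "a"

# Default Trim: strips leading/trailing blanks AND collapses runs of
# inner spaces to a single space -- the surprising part.
default_trim = re.sub(r" +", " ", s).strip()
print(default_trim)   # a b c d

# Trim(s, " ", "B"): the "B" option removes the character from Both
# ends only, leaving the inside of the string untouched.
b_trim = s.strip(" ")
print(b_trim)         # a  b c d
```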
The destination table can be updated using various update actions in the Oracle stage.
Be aware that it's crucial to select the key columns properly, as they determine
which columns will appear in the WHERE part of the SQL statement.
Available actions:
• Clear the table then insert rows - deletes the contents of the table
(DELETE statement) and adds new rows (INSERT).
• Truncate the table then insert rows - deletes the contents of the table
(TRUNCATE statement) and adds new rows (INSERT).
• Insert rows without clearing - only adds new rows (INSERT statement).
• Delete existing rows only - deletes matched rows (issues only the DELETE
statement).
• Replace existing rows completely - deletes the existing rows (DELETE
statement), then adds new rows (INSERT).
• Update existing rows only - updates existing rows (UPDATE statement).
• Update existing rows or insert new rows - updates existing data rows
(UPDATE) or adds new rows (INSERT). An UPDATE is issued first and if it
succeeds the INSERT is omitted.
• Insert new rows or update existing rows - adds new rows (INSERT) or
updates existing rows (UPDATE). An INSERT is issued first and if it succeeds the
UPDATE is omitted.
• User-defined SQL - the data is written using a user-defined SQL statement.
• User-defined SQL file - the data is written using a user-defined SQL
statement from a file.
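The "Update existing rows or insert new rows" action can be sketched as an update-then-insert fallback. This Python/SQLite example only illustrates the logic; it is not what the Oracle stage executes internally:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'old name')")

def upsert(conn, key, name):
    # UPDATE first; fall back to INSERT when no row matched the key.
    cur = conn.execute("UPDATE customers SET name = ? WHERE id = ?", (name, key))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO customers VALUES (?, ?)", (key, name))

upsert(conn, 1, "updated")   # existing key -> UPDATE succeeds
upsert(conn, 2, "inserted")  # new key      -> falls through to INSERT
rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(rows)  # [(1, 'updated'), (2, 'inserted')]
```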
ICONV and OCONV functions are quite often used to handle data in Datastage.
ICONV converts a string to an internal storage format and OCONV converts an
expression to an output format.
Syntax:
Iconv(string, conversion code)
Oconv(expression, conversion code)
Iconv and oconv can be combined in one expression to reformat date format easily:
Oconv(Iconv("10/14/06", "D2/"),"D-E") = "14-10-2006"
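The same reformatting can be mirrored in Python, which may help clarify what the two conversion codes do: parse the external format into an internal value, then re-emit it in the target format.

```python
from datetime import datetime

# Like Iconv with "D2/": parse month/day/two-digit-year into a date value.
internal = datetime.strptime("10/14/06", "%m/%d/%y")

# Like Oconv with "D-E": emit day-month-year separated by dashes.
formatted = internal.strftime("%d-%m-%Y")
print(formatted)  # 14-10-2006
```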
Error message:
The problem appears when a job sequence is used and it contains many stages
(usually more than 10), and very often when the network connection is slow.
Basically, the cause of the problem is a communication failure between the DataStage
client and the server.
The command will produce a brief error description which probably will not be helpful
in resolving the issue, but can be a good starting point for further analysis.
There may be several reasons for the error and thus solutions to get rid of it.
The error usually appears when using Link Collector, Link Partitioner and
Interprocess (IPC) stages. It may also appear when doing a lookup with the use of a
hash file or if a job is very complex, with the use of many transformers.
Error message:
The problem appears when a project is moved from one environment to another (for
example when deploying a project from a development environment to production).
The problem appears when running DataStage Designer under Windows XP after installing
patches or Service Pack 2 for Windows.
After opening a job sequence and navigating to the job activity properties window
the application freezes and the only way to close it is from the Windows Task
Manager.
The solution to the problem is very simple: just download and install the “XP SP2
patch” for the DataStage client.
It can be found on the IBM client support site (need to log in):
https://www.ascential.com/eservice/public/welcome.do
Go to the software updates section and select an appropriate patch from the
Recommended DataStage patches section.
Sometimes users face problems when trying to log in (for example when the license
doesn't cover IBM Active Support); in that case it may be necessary to contact IBM
support, which can be reached at WDISupport@us.ibm.com
costs – PL/SQL comes along with the standard Oracle licence, and if an Oracle database
is installed, PL/SQL can be used straight away at no additional cost. No additional
hardware is needed. However, implementing an ETL process in PL/SQL is a much more
time- and labour-consuming process. This applies both to the implementation phase and
later to production support and enhancements.
The designers need in-depth knowledge of Oracle and it may take months to
become an expert in PL/SQL. An ETL tool like DataStage or Informatica can be learned
during a few days' training, and in fact the designers don't need to know much about
programming or scripting and can do their job without low-level IT knowledge.
On the other hand, a DataStage or Informatica expert may be far more expensive and
less available than a PL/SQL consultant.
workload and time – a factor directly related to costs. An ETL tool comes with
a whole framework that simplifies the design of the process: usually an
administration panel, a GUI frontend, global options, a documentation module,
day-to-day operation management, failover capabilities, a logging and reporting
module, user management, connectors to different data sources, plugins, etc.
In PL/SQL most of those modules must be programmed manually, which may significantly
increase the implementation time.
flexibility – an ETL tool comes with a set of the most commonly used components and it
is rather difficult to extend its capabilities. For example, when data needs to be
extracted to a non-typical format (for example EDI or EPIC files) or processed in a
non-standard way, PL/SQL should be a lot more helpful.
efficiency – if the company uses Oracle databases, then ETL processing in the
native database language, PL/SQL, will be a lot more efficient. There is no better
way to process data than well-optimized queries run on the database's internal engine,
and apart from that an ETL tool very often operates on a different server than the
database, which creates the need to transfer data across the network.
integration – commercial ETL tools provide functionality to integrate with
different systems, including connections to multiple data sources, operating system
integration, FTP support, plugins to ERP systems, etc. PL/SQL doesn't have those
features and they have to be implemented externally.
The conclusion is that there is no easy answer to the question of which approach is
better. The most important thing is to review the company's needs, calculate costs,
estimate results, and then make a choice between buying an ETL tool and using PL/SQL
processing.