InfoSphere Information Server Pack For Salesforce

IBM InfoSphere Information Server Pack for Salesforce.com
Version 1.5
Integration Guide for IBM InfoSphere Information Server Pack for Salesforce.com
SC19-3875-00
Note
Before using this information and the product that it supports, read the information in Notices and trademarks on page
25.
Copyright IBM Corporation 2007, 2013.

US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
IBM InfoSphere Information Server Pack
for Salesforce.com . . . . . . . . . . 1
Configuration prerequisites . . . . . . . . . 2
Installing and uninstalling the Pack. . . . . . 2
Salesforce.com object requirements for delta
extractions . . . . . . . . . . . . . . 3
Salesforce.com rules and guidelines. . . . . . 3
Connections and sessions . . . . . . . . . 4
Setting environment variables . . . . . . . 4
Connecting to Salesforce.com . . . . . . . . . 5
Saving design time connection properties . . . . 6
Loading connection properties at design time . . 7
Selecting metadata for load and extraction operations 7
Loading data into Salesforce.com . . . . . . . 8
Designing a load job . . . . . . . . . . . 8
Metadata display for selected write operations . 10
Saving rejected data and error information . . . 10
Querying bulk load status . . . . . . . . 11
Designing queries to extract data from
Salesforce.com . . . . . . . . . . . . . 14
Designing a data extraction job . . . . . 14
Adding conditions and sorting to a query . 15
Saving job parameters . . . . . . . . . 16
Appendix A. Product accessibility . . . 19

Appendix B. Contacting IBM . . . . . 21
Appendix C. Accessing and providing
feedback on the product
documentation . . . . . . . . . . . 23
Notices and trademarks . . . . . . . 25
Index . . . . . . . . . . . . . . . 29
IBM InfoSphere Information Server Pack for Salesforce.com

The InfoSphere Information Server Pack for Salesforce.com lets you use
InfoSphere DataStage products to extract data from and load data into your
Salesforce.com organization. Salesforce.com is a Web-based customer relationship
management (CRM) platform that provides database services, applications, and
application programming interfaces (APIs).
The Pack for Salesforce.com connects IBM InfoSphere Information Server to
Salesforce.com, and provides a graphical interface you can use to design and run
jobs that:
- Extract all or some of the data from your Salesforce.com organization
- Extract only the data that was changed or deleted since the last time you
extracted the data, or within the time frame you specify
- Load new data or change data in your Salesforce.com organization
You configure the Pack's single stage to connect to Salesforce.com and either load
data into or extract data from your Salesforce.com organization.
- When designing a job to extract data, you can browse and select the metadata in
your Salesforce.com organization to automatically generate the data selection
statement used at run time to extract a Salesforce.com object. You can edit the
selection statement to add conditions and sorting.
- When designing a job to load data, you can browse and select the Salesforce.com
object you want to create or change. You can design a load job with a reject link
that captures any data rejected by Salesforce.com when you run the load job.
When designing either a load job or an extraction job, you can:
- Save and load connection properties, allowing you to reuse them across multiple
jobs
- Save variables as job parameters, allowing you to design reusable jobs
- Adjust the batch size to ensure that the job processing is completed within the
limits set by Salesforce.com
- Load or extract Unicode data
The Pack for Salesforce.com stage supports a single output link when you use it in
an extraction job. The Pack does not support running a single extraction job on
parallel computing nodes. To avoid duplicate data, you must configure extraction
jobs to run sequentially.
When you use the Pack to design a data load job, the Pack supports an input link
and a reject link. A load job can run sequentially or in parallel. You can set up the
Pack to load or update data by using either a real-time load or a bulk load.
Real-time load
This type of load depends on the web service interfaces and is designed
for jobs that load a small or moderate number of data records. With this
approach, the Pack bundles the data records into multiple batches (up to
200 rows per batch). The Pack sends one batch at a time to Salesforce.com
by using a web service call and then waits for the status of the operation
before sending the next batch.
Bulk load
This type of load uses the asynchronous bulk API and is suited for jobs
that load or update a large number of data records. With this approach, the
Pack organizes the data into a comma-separated values (.csv) file format,
and then uses the HTTPS POST method to send the comma-separated
values files to Salesforce.com. Each HTTPS post can send at most 10,000 rows
or 10 MB of data. When you have hundreds of thousands or
millions of input records, you can use a bulk load to improve load
performance by reducing the number of round trips between InfoSphere
DataStage and Salesforce.com.
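As a rough illustration of why a bulk load reduces round trips, the following Python sketch counts the batches needed under the two row limits quoted above (200 rows per real-time batch, 10,000 rows per bulk post). The function name is hypothetical, and the sketch ignores the 10 MB size cap, which can force additional bulk batches when rows are wide.

```python
import math

REALTIME_MAX_ROWS = 200   # rows per web service call (limit quoted above)
BULK_MAX_ROWS = 10_000    # rows per HTTPS post (limit quoted above)

def batches_needed(total_rows: int, rows_per_batch: int) -> int:
    """Return the minimum number of batches needed to send total_rows."""
    if total_rows < 0 or rows_per_batch <= 0:
        raise ValueError("total_rows must be >= 0 and rows_per_batch must be > 0")
    return math.ceil(total_rows / rows_per_batch)

rows = 1_000_000
print(batches_needed(rows, REALTIME_MAX_ROWS))  # 5000 real-time round trips
print(batches_needed(rows, BULK_MAX_ROWS))      # 100 bulk posts
```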
Important: For the latest information about known limitations, problems, and
workarounds, refer to the release notes for the Pack.
Configuration prerequisites
Before you can use the IBM InfoSphere Information Server Pack for Salesforce.com,
you must install or upgrade the Pack, and enable the metadata objects in your
Salesforce.com organization to support load and extraction operations by the Pack.
Installing and uninstalling the Pack

Before you can use the InfoSphere Information Server Pack for Salesforce.com to
design and run a job, you must install the required software. If you have a
previous version of the Pack installed, you might need to perform additional steps
before installing the Pack.
Before you begin

- The Pack for Salesforce.com uses the HTTPS protocol to connect to
Salesforce.com on the Web. Install the Pack on computers that have Internet
connections.
- Many enterprises have security policies that require all computers to access the
Internet through proxy servers. Work with your system or network
administrators to make sure that the computers where you install the Pack
can access the Internet through the proxy servers. The Pack supports
proxy servers that use the basic HTTP authentication mechanism specified in
RFC 2617.
About this task

Previous versions of the Pack for Salesforce.com require customers to download
and install many third-party jar files in the Salesforce folder. Those jar files might
affect the operations of the Version 1.5 Pack for Salesforce.com. If you have a
previous version of the Pack for Salesforce.com installed and want to upgrade to
Version 1.5, you need to remove the older version before installing Version 1.5.
Note: Version 1.5 bundles all the required third-party jar files and IBM jar files in
the installation. You do not need to download and install any third-party jar files
manually.
Procedure
1. If you have a previous version of the Pack for Salesforce.com installed, perform
the following steps before installing Version 1.5:
a. Back up the existing Salesforce folder in case you want to revert to the
previous version of the Pack: IS_InstallDir/ASBNode/lib/java/salesforce,
where IS_InstallDir is the directory where InfoSphere Information Server is
installed. You must back up this folder on both the InfoSphere DataStage
client and InfoSphere DataStage server.
b. Remove the folder IS_InstallDir/ASBNode/lib/java/salesforce. You must
remove this folder from both the InfoSphere DataStage client and
InfoSphere DataStage server.
2. To install or uninstall the Pack for Salesforce.com, refer to the detailed
instructions in the release notes for the Pack.
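Steps 1a and 1b can be scripted. The following Python sketch is one possible way to do it, not an IBM-provided tool: the function name and the backup folder name (salesforce_backup) are hypothetical, and you would run it separately on the InfoSphere DataStage client and server computers.

```python
import shutil
from pathlib import Path

def backup_and_remove_salesforce_dir(is_install_dir: str, backup_root: str) -> Path:
    """Copy IS_InstallDir/ASBNode/lib/java/salesforce to backup_root, then delete it."""
    source = Path(is_install_dir) / "ASBNode" / "lib" / "java" / "salesforce"
    if not source.is_dir():
        raise FileNotFoundError(f"No previous Pack folder found at {source}")
    target = Path(backup_root) / "salesforce_backup"  # hypothetical backup name
    shutil.copytree(source, target)  # raises FileExistsError if target exists
    shutil.rmtree(source)            # remove only after the copy succeeded
    return target
```

Copying before deleting means a failed backup leaves the original folder untouched.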
Salesforce.com object requirements for delta extractions

To use the Pack for Salesforce.com to extract updated or deleted data, you need to
add a custom object to your Salesforce.com organization that includes the fields
required by the Pack.
To support delta extraction (getUpdated or getDeleted operations), create a custom
object called DataStage in your Salesforce.com organization. Include the following
fields in the DataStage object:
extractId
This field identifies a particular job that performs a delta extraction. The
Pack for Salesforce.com uses this field to track the last time this ID
performed a delta extraction. The time of the last delta extraction
determines which data the query returns to a particular ID.
For example, a Salesforce.com organization might have an
AccountManager user who runs an extraction job weekly and a SalesRep
user who runs another job daily. Both jobs might extract data from the
same Salesforce object. Setting up a unique extractId for each job lets the
Pack for Salesforce.com return the data that has changed in the last week
to the AccountManager user, and the data that has changed in the last day
to the SalesRep.
The extractId can be any text string. The field is limited to 80 characters.
You enter the extractId at design time. Optionally, you can save the
extractId as a job parameter. This field has the type of Text. This field is
Unique Case Insensitive, and it is an External ID field.
lastextracttime
This field stores the date and time when a user, identified by a unique
extractId, last ran a particular delta extraction job. This field has the type
of datetime.
object
This field stores the name of the Salesforce object for the delta extraction.
This field has the type of Text, with a length of 40 characters.
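To make the role of these fields concrete, the following Python sketch models the tracking in memory. It is an illustration only, not the Pack's implementation: the dictionary stands in for rows of the DataStage custom object, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

# Stand-in for the DataStage custom object: extractId -> (object, lastextracttime).
tracking: Dict[str, Tuple[str, datetime]] = {}

def next_delta_window(extract_id: str, sf_object: str,
                      now: datetime) -> Tuple[Optional[datetime], datetime]:
    """Return (start, end) for this run and record `now` as lastextracttime."""
    if len(extract_id) > 80:
        raise ValueError("extractId is limited to 80 characters")
    key = extract_id.lower()                    # the field is Unique Case Insensitive
    previous = tracking.get(key)
    start = previous[1] if previous else None   # None: no earlier extraction
    tracking[key] = (sf_object, now)
    return start, now

first_run = datetime(2013, 1, 1, tzinfo=timezone.utc)
second_run = datetime(2013, 1, 8, tzinfo=timezone.utc)
next_delta_window("AccountManager", "Account", first_run)
print(next_delta_window("AccountManager", "Account", second_run))
```

Because each extractId keeps its own timestamp, a weekly job and a daily job against the same object each receive only the changes made since their own last run.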
Salesforce.com rules and guidelines

Salesforce.com imposes many rules and guidelines for different data operations.
Those rules might affect your IBM InfoSphere DataStage jobs. When you design an
InfoSphere DataStage job to load or extract data from Salesforce.com, you need to
be aware of the Salesforce rules and guidelines.
To learn more about rules and guidelines for the Web Service API and the bulk
REST API, see the Salesforce.com Web site at http://www.salesforce.com/us/
developer/docs/api/index.htm.
Data operations and requirements

The Pack for Salesforce.com can perform different types of data operations on a
Salesforce object and its fields. The object and the affected fields must have the
proper permissions. The operations might also require you to specify a unique
external ID or the Salesforce internal ID as a key field.
The Pack supports all the data operations through the real-time web service
interface. However, with the bulk load interface, you can only invoke the Update,
Upsert, or Create load operations by using the Pack.
The following table shows the permissions that are required and the rules that
apply to the different types of operations that you can perform by using the Pack.
Table 1. Permissions and rules for operations performed by the Pack for Salesforce.com

Operation           Supported in    Object-level              Field-level               Other rules
                    real-time or    permission required       permission required
                    bulk load
------------------  --------------  ------------------------  ------------------------  ----------------------------------------
Query or Query All  Real-time only  Queryable
Upsert              Both            Updateable and Creatable  Updateable and Creatable  Upsert requires a unique external ID
                                                                                        field to be selected as the key column.
Update              Both            Updateable                Updateable                Update requires the Salesforce internal
                                                                                        ID field to be selected as the key column.
Create              Both            Creatable                 Creatable
Delete              Real-time only  Deletable                                           Delete requires the Salesforce internal
                                                                                        ID field to be selected as the key column.
GetUpdated          Real-time only  Replicatable
GetDeleted          Real-time only  Replicatable
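The rules in Table 1 can be captured as a small lookup. The Python sketch below transcribes only what the table states; the dictionary names, the function name, and the "real-time" and "bulk" labels are hypothetical and are not identifiers used by the Pack.

```python
from typing import Optional

# Access methods that support each operation, transcribed from Table 1.
SUPPORTED_ACCESS = {
    "Query": {"real-time"},
    "QueryAll": {"real-time"},
    "Upsert": {"real-time", "bulk"},
    "Update": {"real-time", "bulk"},
    "Create": {"real-time", "bulk"},
    "Delete": {"real-time"},
    "GetUpdated": {"real-time"},
    "GetDeleted": {"real-time"},
}

# Key column that Table 1 requires for an operation, where it names one.
REQUIRED_KEY = {
    "Upsert": "unique external ID",
    "Update": "Salesforce internal ID",
    "Delete": "Salesforce internal ID",
}

def check_operation(operation: str, access_method: str) -> Optional[str]:
    """Validate the combination and return the required key column, if any."""
    if operation not in SUPPORTED_ACCESS:
        raise ValueError(f"unknown operation: {operation}")
    if access_method not in SUPPORTED_ACCESS[operation]:
        raise ValueError(f"{operation} is not supported with {access_method} load")
    return REQUIRED_KEY.get(operation)
```

For example, check_operation("Upsert", "bulk") returns the key requirement, while check_operation("Delete", "bulk") raises an error because Delete is available through the real-time interface only.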
Connections and sessions

Each time the Pack connects to Salesforce.com, Salesforce returns a session ID that
represents the ongoing session between the Pack and Salesforce. The session
timeout is set in each Salesforce.com user account, and applies to design-time
sessions and runtime sessions.
The Pack automatically attempts to reconnect to Salesforce.com up to three times
before the session times out. If the Pack cannot reconnect to Salesforce.com for any
reason after three attempts, the job ends.
When a session times out during design time, you can reconnect to Salesforce.com
and start a new session by clicking Browse Objects.
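The reconnect behavior described above follows a bounded-retry pattern. The Python sketch below shows that pattern in general terms; the Pack's actual reconnect logic is not documented here, and the connect callable, which is assumed to return a session ID or raise ConnectionError, is hypothetical.

```python
MAX_RECONNECT_ATTEMPTS = 3  # limit stated in the text above

def reconnect(connect, attempts: int = MAX_RECONNECT_ATTEMPTS):
    """Call connect() until it returns a session ID or the attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return connect()              # success: hand back the new session ID
        except ConnectionError as exc:
            last_error = exc              # remember the failure and try again
    raise RuntimeError(f"could not reconnect after {attempts} attempts") from last_error
```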
Setting environment variables

You can specify settings for environment variables that control the JVM heap size,
the stage logging level, and the temporary directory for bulk loads. All
environment variables are case sensitive.
About this task

To set environment variables for Pack for Salesforce.com jobs:

Procedure
1. Start the IBM InfoSphere DataStage and QualityStage Administrator client.
2. Click the Projects tab.
3. Select your project and click Properties.
4. In the General tab, click Environment.
5. In the Categories field, click User Defined.
6. In the Details field, add the following environment variables, as necessary.
Table 2. Environment variable settings for Salesforce.com

Environment variable  Description                         Default value                   Example value
--------------------  ----------------------------------  ------------------------------  ---------------
CC_JVM_OPTIONS        Specifies the JVM heap size.        256 MB                          -Xmx1G (1 GB)
CC_MSG_LEVEL          Specifies the logging level. The    4 (warning)                     3 (information)
                      following values are valid:
                      - 1 (trace)
                      - 2 (debug)
                      - 3 (information)
                      - 4 (warning)
                      - 5 (error)
                      - 6 (fatal)
SF_BULKLOAD_TEMPDIR   Specifies the temporary directory   The current project directory   /tmp
                      for bulk loads.                     in which InfoSphere DataStage
                                                          jobs run.
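Before you enter a CC_MSG_LEVEL value in the Administrator client, you can check it against the valid values listed for that variable. The following Python sketch does only that; the function name is hypothetical, and the sketch does not read or set any InfoSphere DataStage configuration.

```python
# Valid CC_MSG_LEVEL values and their names, as listed for the variable.
MSG_LEVELS = {1: "trace", 2: "debug", 3: "information",
              4: "warning", 5: "error", 6: "fatal"}

def describe_msg_level(value: str) -> str:
    """Return the level name for a CC_MSG_LEVEL setting, or raise ValueError."""
    try:
        level = int(value)
    except ValueError:
        raise ValueError(f"CC_MSG_LEVEL must be an integer, got {value!r}") from None
    if level not in MSG_LEVELS:
        raise ValueError(f"CC_MSG_LEVEL must be between 1 and 6, got {level}")
    return MSG_LEVELS[level]
```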
Connecting to Salesforce.com
At design time, you use an Internet connection to Salesforce.com to browse for and
select the metadata from your Salesforce.com organization to be used in your job.
At run time, you use the Internet connection to Salesforce.com to load data into or
extract data from your Salesforce.com organization.
Before you begin

You need to obtain a user name and password from Salesforce.com. To gain access
to Salesforce.com by using the Pack, you might also need to add a security token
to the end of your password. The security token is an automatically generated key
from Salesforce. For example, if your password is mypass and your security
token is xxxx, then use mypassxxxx as your password to gain access to
Salesforce.com by using the Pack.
Salesforce enforces security controls based on several mechanisms, including user
profiles, IP address restrictions, and security tokens. You need to contact your
Salesforce administrator to understand these security mechanisms and have the
security tokens ready before connecting to Salesforce.com.
About this task

You define the Internet connection to Salesforce.com in the job Properties tab at job
design time.
To set up your connection to Salesforce.com:
Procedure
1. Create a Pack for Salesforce.com extraction job or load job and double-click the
stage to display the stage properties.
2. Enter your Salesforce.com user name in the Username field. The format of the
user name is username@domainname.
3. Enter your Salesforce.com password in the Password field.
4. Enter the Salesforce.com URL, for example, https://www.salesforce.com/
services/Soap/u/18.0, in the URL field. The Pack supports the Salesforce API
version 18.0.
5. Optional: To use a proxy server to connect to Salesforce.com, in the Proxy
Server field, click Yes, and enter the proxy server host name, port, user name,
and password.
6. Press Enter to set your credentials.
Note: Press Enter after making any change to your login credentials or to the
URL to reset the connection options.
7. To start the connection to Salesforce.com:
a. At design time, click Browse Objects to connect to Salesforce.com.
b. At run time, the Pack for Salesforce.com connects to Salesforce.com
automatically.
Saving design time connection properties

You can save your job connection properties in the IBM InfoSphere DataStage
repository and load them each time you design a new job. The connection properties
are the user name, password, and URL that connect your job to Salesforce.com.
About this task

To save your design-time connection properties:
Procedure
1. Define your connection properties and connect to Salesforce.com to test the
connection.
2. Click Save.
3. In the Data Connection window General tab, type a name for the connection in
the Data Connection name field. The name must begin with an alphabetic
character and can contain only alphabetic, numeric, and underscore characters.
4. Optionally, add a short description and long description for the connection.
5. Click OK.
6. In the Save Data Connection As window, navigate to and open the Job folder in
the InfoSphere DataStage repository where you want to save the connection.
7. Click Save.
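The naming rule in step 3 can be expressed as a regular expression. The Python sketch below assumes that "alphabetic" and "numeric" mean ASCII letters and digits, which the source does not state; the function name is hypothetical.

```python
import re

# A letter first, then any mix of letters, digits, and underscores.
# Assumption: ASCII characters only.
_CONNECTION_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_valid_connection_name(name: str) -> bool:
    """Return True if name satisfies the Data Connection naming rule."""
    return _CONNECTION_NAME.fullmatch(name) is not None
```

Under this reading, a name such as SFDC_Prod_1 is accepted, while 1_SFDC (starts with a digit) and SFDC-Prod (contains a hyphen) are rejected.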
Loading connection properties at design time

After you save your login credentials and the URL that connect your job to
Salesforce.com, you can load the saved connection properties each time you design a
new job.
Before you begin

At design time, define and save the user name, password, and URL that connect
your job to Salesforce.com.
About this task

To load connection properties that you have saved:
Procedure
1. In the IBM InfoSphere DataStage and QualityStage Designer client, create a
Pack for Salesforce.com load or extraction job.
2. Click Load.
3. Open the Job folder where you saved the connection.
4. Select the connection, and click Open.
Selecting metadata for load and extraction operations

When you design a job to load or extract data, you can use the Metadata Explorer
to browse for metadata in your Salesforce.com organization. You can select the
Salesforce.com object and its fields to automatically generate the stage properties
and the IBM InfoSphere DataStage table definitions for a load or extraction
operation.
About this task

When you design a load job, selecting a Salesforce.com metadata object with the
Metadata Explorer adds the object to the Business Object field as the argument of
the Create, Update, or Upsert call of the load operation.
When you design a job to extract changed or deleted data, selecting a
Salesforce.com metadata object adds the object type to a getUpdated or getDeleted
call. When you design a job to query some or all data, selecting a Salesforce.com
metadata object adds the object type and any selected field types to the query
string SOQL SELECT statement.
Important: The Pack for Salesforce.com provides no support for the following data
types: BLOB, CLOB, LONGVARBINARY, and LONGVARCHAR.
To connect to Salesforce.com and browse and select metadata objects:
Procedure
1. Create a Pack for Salesforce.com load job or extraction job.
2. Enter or load your Salesforce.com login credentials and URL.
3. Select the type of operation to use:
a. For a load job, select the Write operation and then select the Access Method.
b. For an extraction job, select the Read operation. The Access Method is
disabled for the Read Operation.
4. Click Browse Objects.
5. In the Metadata Explorer window, expand the metadata objects to display the
fields and, for extract operations, the child fields of the object.
6. When you select an object, all of its fields are also selected. To select only a
subset of fields, deselect the object, and then select only the fields and child
fields you want to load or extract.
7. Click Import.
- For a Get updated delta operation, Get deleted delta operation, or a load
operation, the selected object is added to the Business Object field.
- For a Query operation or a QueryAll operation, if the SOQL Query to
Salesforce field is empty, the metadata that you selected creates a SELECT
statement in this field. If the SOQL Query to Salesforce field contains a
SELECT statement already, selecting a new object through the Metadata
Explorer does not change the existing SELECT statement. To replace an
existing SELECT statement in the SOQL Query to Salesforce field, clear the
field before you click Browse Objects to select the metadata object.
Results
To display the selected metadata object fields, click the Columns tab. Note that
Salesforce.com indicates a parent-child or child-parent relationship with a period (.)
between the object name and the field name. The Pack for Salesforce.com uses an
underscore ( _ ) to separate the object name from the field name. The Pack also
uses the description field in the column definitions to track the child-to-parent
relationship.
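The following Python sketch illustrates the two ideas above: building a SOQL SELECT statement from a selected object and its fields, and mapping a Salesforce.com field reference that uses a period to a column name that uses an underscore. The exact statement and column names that the Pack generates are not shown in this guide, so treat both functions, and their names, as hypothetical approximations.

```python
from typing import List

def build_select(sf_object: str, fields: List[str]) -> str:
    """Build a SOQL SELECT statement for the chosen object and fields."""
    if not fields:
        raise ValueError("select at least one field")
    return f"SELECT {', '.join(fields)} FROM {sf_object}"

def to_column_name(soql_field: str) -> str:
    """Replace the relationship period with the underscore used in column names."""
    return soql_field.replace(".", "_")

fields = ["Id", "LastName", "Account.Name"]
print(build_select("Contact", fields))
print([to_column_name(f) for f in fields])
```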
Loading data into Salesforce.com

You can use the Pack for Salesforce.com to load data into your Salesforce.com
organization. The Pack for Salesforce.com uses either real-time load or bulk load to
create, update, and upsert Salesforce objects.
For details about Salesforce.com core calls and Sforce Object Query Language
(SOQL), go to http://www.salesforce.com/us/developer/docs/api/index.htm.
Designing a load job

You can design load jobs to update your Salesforce.com organization.
About this task

The Pack supports real-time load using the web service interfaces, and bulk load
using the REST interfaces.
Real-time load is a synchronous operation. After the Pack sends a batch to
Salesforce.com, the Pack waits for Salesforce to complete the operation for the
batch before sending the next batch. Real-time load supports create, update, upsert,
and delete operations.
Bulk load is an asynchronous operation. The Pack posts the bulk data as multiple
batches to Salesforce.com. The batches are queued at the Salesforce backend
platform. Salesforce performs the data loads for the posted data batches
asynchronously on its backend computing platforms. After the Pack posts all the
batches, it waits up to 30 seconds for Salesforce to report the loading status of
the data records. If Salesforce does not finish loading all data records within
those 30 seconds, the Pack finishes its work, and the loading status of the data
records is unknown. You can design a separate IBM InfoSphere DataStage job to
query the bulk load status.
To design a load job:
Procedure
1. Create a Pack for Salesforce.com load job:
a. In the IBM InfoSphere DataStage and QualityStage Designer client, create a
new parallel job.
b. In the Designer client palette, click Packs and drag the Pack for
Salesforce.com icon to the parallel canvas.
c. In the Designer client palette, click File and drag a sequential file to the
parallel canvas.
d. To define the job as a load job, draw a link from the sequential file to the
Salesforce.com icon. Right-click and drag the cursor to draw the link.
2. Double-click the Pack for Salesforce.com stage to open the link properties.
3. Specify your Salesforce.com login credentials, or load them from a saved
connection file.
4. In the Write Operation field, select a write operation.
5. In the Access Method field, select either Real-time load or Bulk load.
6. If Bulk load is selected as the Access Method, set up the following properties:
a. In the Job ID In File field, specify whether to save the Salesforce job ID
to a file. For each bulk load operation, Salesforce returns a unique job ID.
Select YES if the job ID needs to be saved in a file.
b. In the Job ID file name field, specify the absolute file path for saving the
Salesforce bulk load job ID.
c. In the Salesforce.com Concurrency Mode field, select a concurrency mode.
The value of this property controls whether Salesforce loads the batches
asynchronously on its backend server in either parallel or sequential mode.
(The Pack uses "sequential mode" as a synonym for the Salesforce.com
term "serial mode.") The default value is Parallel.
d. Select Keep Temporary Files to keep the temporary files that are generated
on the disk during the processing of bulk load operations.
7. Click Browse Objects to connect to Salesforce.com and select the object that
you want to load, or type the name of the object in the field.
8. Expand the object to display the fields it contains. Select the fields you want
to load.
9. Configure the batch size:
- For a real-time load, the batch size is 1-200 and the default is 200.
- For a bulk load job, the batch size is 1-10,000. Salesforce also has the
restriction that each bulk load batch cannot have more than 10 MB of data.
Click the input field to use the arrow keys to increase or decrease the batch
size by 1, or type the value in the field.
10. To save any value as a job parameter, double-click the Use Job Parameter
column to the right of the input field.
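The two bulk load limits in step 9 (rows per batch and 10 MB per batch) interact: a batch closes when either limit is reached. The Python sketch below shows one way to split rows into CSV batches under both limits. It is not the Pack's implementation. The function name is hypothetical, and the sketch assumes that 10 MB means 10 * 1024 * 1024 bytes of UTF-8 text, that the header row does not count toward the row limit, and that a single row larger than the size limit is sent alone.

```python
import csv
import io

def split_into_csv_batches(header, rows, max_rows=10_000,
                           max_bytes=10 * 1024 * 1024):
    """Yield CSV strings, each with the header, within both batch limits."""
    def encode(row):
        buf = io.StringIO()
        csv.writer(buf).writerow(row)   # handles quoting and escaping
        return buf.getvalue()

    head = encode(header)
    head_size = len(head.encode("utf-8"))
    batch, size, count = [head], head_size, 0
    for row in rows:
        line = encode(row)
        line_size = len(line.encode("utf-8"))
        # Close the current batch if adding this row would break a limit.
        if count and (count == max_rows or size + line_size > max_bytes):
            yield "".join(batch)
            batch, size, count = [head], head_size, 0
        batch.append(line)
        size += line_size
        count += 1
    if count:
        yield "".join(batch)
```

Lowering max_rows corresponds to choosing a smaller batch size in the stage, which is useful when rows are wide enough that the size limit would otherwise be reached first.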
Metadata display for selected write operations

The Metadata Explorer window displays the Salesforce.com objects that can be
loaded. The objects that can be loaded depend on the selected write operation and
the object's attributes in Salesforce.com.
Objects that do not have the required attributes for the selected write operation are
not displayed by the Metadata Explorer. This means that the Metadata Explorer
displays different lists of objects, depending on the write operation that is selected
when you click Browse Objects.
For more information, see the related topic about Salesforce.com rules and
guidelines.
Saving rejected data and error information

You can design a load job with a reject link to save the data records that were
rejected by Salesforce.com, as well as error codes and error messages. You can
analyze the rejected data and correct any deficiencies before running the job again.
Before you begin

Create a load job by drawing a link from an input file to the Pack for
Salesforce.com stage. After creating a load job, add a reject link to the job by
drawing a link from the Pack for Salesforce.com stage to an output file.
Important: Bulk load jobs use the asynchronous REST interface. A bulk load job
might complete before Salesforce completes its data load operations at its backend
platforms. If this happens, the data records that are rejected by the Salesforce
backend are not available on the reject link.
About this task

To configure the reject link:
Procedure
1. Open the Pack for Salesforce.com stage.
2. On the Reject tab, select the reject link.
3. In the Filter rejected rows based on selected conditions field, select Salesforce
Write Error Row Rejected. If you do not select this option, no data is written to
the reject file.
4. Optional: You can choose to include the error information from Salesforce.com
in the reject file. Select ERRORCODE to include Salesforce.com return codes in
the reject file. Select ERRORTEXT to include Salesforce.com error messages in
the reject file.
To display details about the optional error information you selected, such as
data type and length, click the Columns tab.
5. Specify the error threshold. The job ends when it reaches this threshold. You
can specify the error threshold as a raw number of input rows or as a
percentage of input rows. When you specify a percentage, you specify how
many rows of input data must be processed before the error threshold is
reached and the job ends.
Querying bulk load status

When you use the Pack to load large amounts of data in bulk to Salesforce.com, it
is possible that the data might not be completely processed by Salesforce.com
before the IBM InfoSphere DataStage job has finished running. For example,
batches of data might be in several states when the job completes: Queued,
InProgress, Completed, or even Failed. To ensure that the data has been completely
processed by Salesforce.com and is ready to use, you can use the Pack to query the
status of the bulk load job.
You can create queries that return the following details:
- SFDC job ID: When Salesforce.com receives data in bulk, it assigns a Job ID to
the entire set of data. For example: 75070000000PCpk. Job IDs can be obtained in
one of the following ways:
  - From the Job ID File specified in a Pack bulk load job.
  - From the Salesforce.com portal by selecting Setup > Administration Setup >
  Monitoring > Bulk Data Load Jobs.
- SFDC batch ID: Data within an SFDC job can be split into multiple batches.
Salesforce.com assigns an ID to each batch. For example: 75170000000PIHFAA4.
- SFDC record ID: Salesforce.com assigns an ID to each record it processes. For
example: a0370000006dZjmAAE.
- SFDC job status: When Salesforce.com processes a job, it assigns one of the
following statuses to that job: Open, Closed, Aborted, or Failed.
- SFDC batch status: When Salesforce.com processes a batch, it assigns one of the
following statuses to that batch: Queued, Completed, InProgress, Failed, or Not
Processed.
- SFDC record status: When Salesforce.com processes a batch, it assigns one of
the following statuses to the records in the batch: Success or Failed.
You can create the following types of status queries:

Job status query
Returns the SFDC job ID and status of a Salesforce.com job. For example,
this type of query might return something like the following result:
"75070000000PCau","Closed"
Batch status query

For each batch in an SFDC job, returns the SFDC job ID, the SFDC batch
ID, and the SFDC batch status. For example, this type of query might
return something like the following result:
"75070000000PCau","75170000000PIHFAA4","InProgress"
Record status query

For each record in an SFDC job, returns the SFDC job ID, the SFDC batch
ID, the SFDC record ID, and the SFDC record status. For example, this
type of query might return something like the following result:
"75070000000PCaz","75170000000PIHKAA4","a0370000006dZjmAAE","Success"
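If you write the status query results to a sequential file in the quoted, comma-separated form shown in the examples above, a downstream script can tell the three result types apart by their field count. The following Python sketch assumes that layout; the StatusRow type and the function name are hypothetical.

```python
import csv
from typing import NamedTuple, Optional

class StatusRow(NamedTuple):
    job_id: str
    batch_id: Optional[str]
    record_id: Optional[str]
    status: str

def parse_status_line(line: str) -> StatusRow:
    """Parse one quoted, comma-separated status result into a StatusRow."""
    fields = next(csv.reader([line]))
    if len(fields) == 2:    # job status query: job ID, status
        return StatusRow(fields[0], None, None, fields[1])
    if len(fields) == 3:    # batch status query: job ID, batch ID, status
        return StatusRow(fields[0], fields[1], None, fields[2])
    if len(fields) == 4:    # record status query: job, batch, record, status
        return StatusRow(*fields)
    raise ValueError(f"expected 2 to 4 fields, got {len(fields)}")

print(parse_status_line('"75070000000PCau","75170000000PIHFAA4","InProgress"'))
```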
The following examples of status query jobs are provided in the
BulkLoadStatusCheckingExample.dsx project file that is located in the examples
directory on the Microsoft Windows installation image for the Pack for
Salesforce.com.
Table 3. Example status query jobs in BulkLoadStatusCheckingExample.dsx

Example status query job  Description
------------------------  ----------------------------------------------------
example1                  Demonstrates a basic job status query.
example2                  Demonstrates a basic batch status query.
example3                  Demonstrates a basic record status query.
example4                  Demonstrates a basic bulk load job.
example5                  Demonstrates the use of a sequence job for bulk load
                          and status query.
To query bulk load status, you can either use an imported example job, or design a
new job.
Tip: In general, you should consider placing your bulk load jobs and associated
queries within an InfoSphere DataStage sequence job. This allows you to easily
manage the workflow between the bulk load job and the status query. Specifically,
it makes it easy for you to have the bulk load job place an SFDC job ID in a
specific file, and then provide that file directly to the status query. (Of course, other
facilities are available in sequence jobs for error handling, conditional branching,
variable substitution, and so on.) For a basic example, see example5.
Importing examples to query bulk load status

You can quickly create queries that return bulk load status by importing and using
the examples provided in the BulkLoadStatusCheckingExample.dsx project file. The
file is located in the examples directory on the Microsoft Windows installation
image for the Pack for Salesforce.com.
About this task

To use imported examples to query bulk load status:
Procedure
1. Import the example for the type of query that you want to use:
- For a job status query, import example1.
- For a batch status query, import example2.
- For a record status query, import example3.
2. Double-click the Pack for Salesforce.com stage to open the stage editor.
3. Select the Properties tab.
4. Specify your Salesforce.com login credentials, or load them from a saved
connection file.
5. Optional: To use a proxy server to connect to Salesforce.com, in the Proxy
Server field, click Yes, and enter the proxy server host name, port, user name,
and password.
6. If the SFDC job ID will be provided in a file, select Yes in the Job ID in file
field, and provide the file name in the Job ID field. Otherwise, select No in
the Job ID in file field, and provide the SFDC job ID in the Job ID field.
7. In the Sleep field, specify how frequently the Pack should check for status.
The value is specified in seconds. For example, a value of 60 will cause the
Pack to poll for status every 60 seconds.
8. In the Tenacity field, specify how long the Pack should continue to check for
status. The value is specified in seconds. For example, a value of 1800 will cause
the Pack to stop polling for status after 1800 seconds have elapsed.
9. Close the stage editor.
10. Either modify the properties of the sequential file, as necessary, or replace the
sequential file with any other suitable stage and configure it.
11. Compile and execute the job.
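The interaction between the Sleep and Tenacity values can be pictured with a minimal Python sketch. It is not the Pack's implementation: the function name `poll_status`, the `check` callback, and the assumption that polling ends as soon as a final status is returned are all hypothetical and used only for illustration.

```python
import time
from typing import Callable, Optional


def poll_status(
    check: Callable[[], Optional[str]],
    sleep_s: float,
    tenacity_s: float,
    clock: Callable[[], float] = time.monotonic,
    wait: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call check() every sleep_s seconds until it returns a status
    or until tenacity_s seconds have elapsed (hypothetical helper)."""
    deadline = clock() + tenacity_s
    while True:
        status = check()          # one status request
        if status is not None:    # assumed: a non-None value is a final status
            return status
        if clock() + sleep_s > deadline:
            return None           # the next poll would exceed the tenacity window
        wait(sleep_s)             # Sleep: the interval between polls
```

With a Sleep value of 60 and a Tenacity value of 1800, this sketch issues a status request at most once a minute and gives up after 30 minutes.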
Designing a new job to query bulk load status

You can design a new job that queries bulk load status.
About this task
To design a new job to query bulk load status:
Procedure
1. Export the column schemas in the Pack stages contained in example1,
example2, and example3:
a. Open an example job.
b. Double-click the Pack stage to open the editor.
c. Select the Columns tab.
d. Click Save and follow the instructions.
e. Close the stage editor.
f. Close the example job.

2. Create a Pack for Salesforce.com bulk load status query job:
new parallel job.
c. Drag a target stage to the parallel canvas.
d. Right-click the Pack and draw a link to the target stage.
3. Double-click the Pack for Salesforce.com stage to open the stage editor.
4. Select the Properties tab.
5. Specify your Salesforce.com login credentials, or load them from a saved
connection file.
6. In the Read Operation field, select Get the Bulk Load Status.
7. If the SFDC job ID will be provided in a file, select Yes in the Job ID in file
field, and provide the file name in the Job ID field. Otherwise, select No in
the Job ID in file field, and provide the SFDC job ID in the Job ID field.
8. Select the Columns tab.
9. Specify the type of query:
- For a job status query, load the schema from example1.
- For a batch status query, load the schema from example2.
- For a record status query, load the schema from example3.
10. In the Sleep field, specify how frequently the Pack should check for status.
The value is specified in seconds. For example, a value of 60 will cause the
Pack to poll for status every 60 seconds.
11. In the Tenacity field, specify how long the Pack should continue to check for
status. The value is specified in seconds. For example, a value of 1800 will cause
the Pack to stop polling for status after 1800 seconds have elapsed.
12. Close the stage editor.
13. Double-click the target stage to open its stage editor and configure the stage.
14. Compile and execute the job.
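The two ways of supplying the SFDC job ID (directly, or through a file) can be sketched in Python as follows. This is only an illustration of the choice made with the Job ID in file field; the function name `resolve_job_id` and the assumption that the file contains nothing but the job ID are hypothetical.

```python
from pathlib import Path


def resolve_job_id(job_id_field: str, job_id_in_file: bool) -> str:
    """Return the SFDC job ID, reading it from a file when requested.

    job_id_field   -- contents of the Job ID field (an ID or a file name)
    job_id_in_file -- True when the Job ID in file field is set to Yes
    """
    if job_id_in_file:
        # Assumed file layout: the job ID alone, possibly followed by a newline.
        job_id = Path(job_id_field).read_text(encoding="utf-8").strip()
    else:
        job_id = job_id_field.strip()
    if not job_id:
        raise ValueError("no SFDC job ID supplied")
    return job_id
```

Reading the ID from a file is what lets a sequence job pass the ID written by a bulk load job straight to the status query.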
Designing queries to extract data from Salesforce.com

You can create and customize queries to extract data from Salesforce.com.
The Pack for Salesforce.com uses core calls to the Apex Web Services API and the
Sforce Object Query Language (SOQL) SELECT statement to query and extract
data from Salesforce.com. You can browse and select metadata to automatically
generate a SELECT statement, and then edit the command to add WHERE and
ORDER BY expressions, and other conditions supported by SOQL.
For details about Salesforce.com core calls and Sforce Object Query Language
(SOQL), go to http://www.salesforce.com/us/developer/docs/api/index.htm.
You can design an extraction program that:
- Extracts all data from the objects in your Salesforce organization.
- Extracts only a subset of fields from the objects in your Salesforce organization.
- Extracts only the data that has changed since the last time a particular extraction
ID queried the data.
- Extracts only the data that has been deleted since the last time a particular
extraction ID queried the data.
Designing queries
You use the Query or QueryAll utility calls to extract all or some of the data from
your Salesforce.com organization. You can browse for and select Salesforce.com
metadata to automatically generate the SELECT statement used in the query. You
can add filtering and sorting to the query by editing the SELECT statement.
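As a rough picture of what generating a SELECT statement from selected metadata amounts to, the following Python sketch joins a list of field names into a SOQL statement. The function name `build_select` is hypothetical, and the Account object with its Id and Name fields is used only as a familiar illustration; the statement that the Pack actually generates may be formatted differently.

```python
from typing import Sequence


def build_select(object_name: str, fields: Sequence[str]) -> str:
    """Build a basic SOQL SELECT statement from selected field names."""
    if not fields:
        # The sketch insists on an explicit field list.
        raise ValueError("select at least one field")
    return f"SELECT {', '.join(fields)} FROM {object_name}"


# Example: two fields selected on the Account object.
statement = build_select("Account", ["Id", "Name"])
print(statement)
```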
Designing delta extractions

You use the getUpdated and getDeleted utility calls to extract only the data that
has changed or been deleted in a specified time frame. The default time frame is
the period since the last extraction by the specified extraction ID, up to 30 days.
You can browse for and select the Salesforce.com object that you want to extract.
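One way to read the default time frame is sketched below in Python: the window starts at the last extraction time, but never more than 30 days before the current time. The function name `default_delta_window` is hypothetical, and the behavior when no earlier extraction exists is an assumption made for the sketch, not documented behavior of the Pack.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

MAX_WINDOW = timedelta(days=30)  # limit stated for delta extractions


def default_delta_window(
    last_extract: Optional[datetime], now: datetime
) -> Tuple[datetime, datetime]:
    """Return (start, end) for a delta extraction that uses default values."""
    earliest = now - MAX_WINDOW
    if last_extract is None:
        # Assumption: with no earlier extraction, use the full 30-day window.
        start = earliest
    else:
        # Never reach back further than 30 days.
        start = max(last_extract, earliest)
    return start, now
```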
Designing a data extraction job

Use the Pack to create a Salesforce.com extraction job. Use the link properties to
specify the objects that the job is to extract, the filtering that determines which data
meet the search criteria, and the sorting that is to be applied to the data returned
by the extraction.
About this task
To design a job to extract data from Salesforce.com:
Procedure
1. Create a Pack for Salesforce.com extraction job:
new parallel job.
c. In the Designer client palette, click File and drag a sequential file to the
parallel canvas.
d. To define the job as an extraction job, draw an output link from the
Salesforce.com icon to the sequential file. Right-click and drag the cursor to
draw the link.
2. Double-click on the Pack for Salesforce.com stage to open the link properties.
3. Specify your Salesforce.com login credentials and URL, or load them from a
saved connection file.
4. In the Read Operation field, select a read operation.
5. Click Browse Objects to select a business object for a delta extraction or to
generate an SOQL SELECT statement for a query.
6. Specify the job details:
- For delta extractions, specify an extract ID. Enter the delta start and end
times, or accept the default values.
- For queries, optionally edit the SOQL SELECT statement to add a condition
expression (WHERE clause) and sorting (ORDER BY clause).
7. Configure the job to run sequentially. Extraction jobs must run sequentially to
avoid duplicate data.
a. Click the Advanced tab.
b. In the Execution mode field, click Sequential.
8. Optional: To extract Unicode data:
a. Click the Columns tab.
b. To specify that a field contains Unicode data, click the input field under the
Extended column and select Unicode.
9. Optional: To save any value as a job parameter, double-click the Use Job
Parameters column to the right of the input field.
Adding conditions and sorting to a query

You can add conditions and sorting to the Sforce Object Query Language (SOQL)
SELECT statement of a query. Conditions filter the rows and values that are
returned by the query. Sorting controls the order in which the query results are
returned.
Before you begin

For details about SOQL and the SELECT statement, go to http://
www.salesforce.com/us/developer/docs/api/index.htm.
About this task

Add conditions and sorting to the SELECT statement in the SOQL Query to
Salesforce field. Use the SOQL WHERE clause to add conditions that filter the
data returned by the query. Use the SOQL ORDER BY clause to sort the data
returned by the query.
To edit the SELECT statement of a query:
Procedure
1. Connect to Salesforce.com and select the metadata objects for the query.
2. Click the SELECT statement in the SOQL Query to Salesforce field to display
the browse button.
3. Click the browse button to open the window in which to edit the query.
4. Add any conditions and sorting to the SELECT statement, and click OK to save
your changes.
The Pack for Salesforce.com does not parse the statements in the field. Ensure
that your changes are syntactically valid.
Results
When you run the job, the Pack for Salesforce.com sends the SELECT statement in
the SOQL Query to Salesforce field to Salesforce.com. The Program Generated
Reference SOQL Query field is intended for your reference during design time,
and is not sent to Salesforce.com.
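Because the Pack does not parse the edited statement, it can help to see the expected shape of the result. The Python sketch below appends a WHERE clause and an ORDER BY clause to a generated SELECT statement; in SOQL the WHERE clause precedes ORDER BY. The function name `add_filter_and_sort` and the sample condition on the Account object are hypothetical and serve only as an illustration.

```python
from typing import Optional


def add_filter_and_sort(
    select_stmt: str,
    where: Optional[str] = None,
    order_by: Optional[str] = None,
) -> str:
    """Append optional WHERE and ORDER BY clauses, in that order."""
    parts = [select_stmt.strip()]
    if where:
        parts.append(f"WHERE {where}")        # filters the returned rows
    if order_by:
        parts.append(f"ORDER BY {order_by}")  # controls the result order
    return " ".join(parts)


edited = add_filter_and_sort(
    "SELECT Id, Name FROM Account",
    where="Name LIKE 'A%'",
    order_by="Name ASC",
)
print(edited)
```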
Saving job parameters

When you use the Pack for Salesforce.com to design load and extraction jobs, you
can save the contents of any stage property as a job parameter. A job parameter is a
processing variable. When you run or schedule the job, you can supply a value to
be used in place of the parameter, so that you can reuse the job with different
inputs. The Pack for Salesforce.com does not support design-time job parameters.
About this task
To save the contents of a stage property as a job parameter:
Procedure
1. Create a Pack for Salesforce.com extraction job or load job.
2. The Use Job Parameter column is to the right of the input field that you want
to save. Double-click the column.
3. Click New Parameter.
4. In the New Parameter window:
a. In the Parameter Name field, accept the default parameter name or type a
different name. When you run the job, the parameter name is replaced by a
value that you supply or by the default parameter value, if you define one.
b. In the Prompt field, type the text that is to be the label of the field in which
you supply the parameter when you run the job.
c. The Type field displays the default data type of this parameter. When you
run the job, the IBM InfoSphere DataStage and QualityStage Designer client
or the InfoSphere DataStage and QualityStage Director client uses the data
type to validate the parameter. Accept the default data type or select
another data type from the drop-down menu.
d. Optional: In the Default Value field, type the value that you want Designer
client or Director client to use if you do not supply a different parameter
value when you run the job.
e. Optional: In the Help Text field, type the text that you want Designer client
or Director client to display when you click Property Help in the Job
Options window when you run the job.
5. Click OK to save the parameter.
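The way a run-time value or a default value takes the place of a parameter can be sketched in Python as follows. InfoSphere DataStage conventionally writes job parameter references between number signs (#name#); the substitution logic, the function name `resolve_parameters`, and the parameter name ObjectName are hypothetical and only illustrate the idea, not how the product resolves parameters internally.

```python
import re
from typing import Mapping

PARAM_REF = re.compile(r"#(\w+)#")  # matches references such as #ObjectName#


def resolve_parameters(
    value: str, supplied: Mapping[str, str], defaults: Mapping[str, str]
) -> str:
    """Replace each #name# reference with the supplied value, falling
    back to the default value when none is supplied."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in supplied:
            return supplied[name]
        if name in defaults:
            return defaults[name]
        raise KeyError(f"no value for job parameter {name!r}")

    return PARAM_REF.sub(substitute, value)
```

For example, a property saved as `SELECT Id FROM #ObjectName#` can be reused for different objects by supplying a different value for the parameter on each run.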
Appendix A. Product accessibility

You can get information about the accessibility status of IBM products.
The IBM InfoSphere Information Server product modules and user interfaces are
not fully accessible. The installation program installs the following product
modules and components:
- IBM InfoSphere Business Glossary
- IBM InfoSphere Business Glossary Anywhere
- IBM InfoSphere DataStage
- IBM InfoSphere FastTrack
- IBM InfoSphere Information Analyzer
- IBM InfoSphere Information Services Director
- IBM InfoSphere Metadata Workbench
- IBM InfoSphere QualityStage
For information about the accessibility status of IBM products, see the IBM product
accessibility information at http://www.ibm.com/able/product_accessibility/
index.html.
Accessible documentation
Accessible documentation for InfoSphere Information Server products is provided
in an information center. The information center presents the documentation in
XHTML 1.0 format, which is viewable in most Web browsers. XHTML allows you
to set display preferences in your browser. It also allows you to use screen readers
and other assistive technologies to access the documentation.
The documentation that is in the information center is also provided in PDF files,
which are not fully accessible.
IBM and accessibility

See the IBM Human Ability and Accessibility Center for more information about
the commitment that IBM has to accessibility.
Appendix B. Contacting IBM

You can contact IBM for customer support, software services, product information,
and general information. You also can provide feedback to IBM about products
and documentation.
The following table lists resources for customer support, software services, training,
and product and solutions information.
Table 4. IBM resources
Resource | Description and location
IBM Support Portal | You can customize support information by choosing the products and the topics that interest you at www.ibm.com/support/entry/portal/Software/Information_Management/InfoSphere_Information_Server
Software services | You can find information about software, IT, and business consulting services, on the solutions site at www.ibm.com/businesssolutions/
My IBM | You can manage links to IBM Web sites and information that meet your specific technical support needs by creating an account on the My IBM site at www.ibm.com/account/
Training and certification | You can learn about technical training and education services designed for individuals, companies, and public organizations to acquire, maintain, and optimize their IT skills at http://www.ibm.com/software/swtraining/
IBM representatives | You can contact an IBM representative to learn about solutions at www.ibm.com/connect/ibm/us/en/
Appendix C. Accessing and providing feedback on the product documentation

Documentation is provided in a variety of locations and formats, including in help
that is opened directly from the product client interfaces, in a suite-wide
information center, and in PDF file books.
The information center is installed as a common service with IBM InfoSphere
Information Server. The information center contains help for most of the product
interfaces, as well as complete documentation for all the product modules in the
suite. You can open the information center from the installed product or from a
Web browser.
Accessing the information center

You can use the following methods to open the installed information center.
- Click the Help link in the upper right of the client interface.
Note: From IBM InfoSphere FastTrack and IBM InfoSphere Information Server
Manager, the main Help item opens a local help system. Choose Help > Open
Info Center to open the full suite information center.
- Press the F1 key. The F1 key typically opens the topic that describes the current
context of the client interface.
Note: The F1 key does not work in Web clients.
- Use a Web browser to access the installed information center even when you are
not logged in to the product. Enter the following address in a Web browser:
http://host_name:port_number/infocenter/topic/com.ibm.swg.im.iis.productization.iisinfsv.home.doc/ic-homepage.html.
The host_name is the name of the services tier computer where the information
center is installed, and port_number is the port number for InfoSphere
Information Server. The default port number is 9080. For example, on a
Microsoft Windows Server computer named iisdocs2, the Web address is in
the following format:
http://iisdocs2:9080/infocenter/topic/com.ibm.swg.im.iis.productization.iisinfsv.nav.doc/dochome/iisinfsrv_home.html.
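The address template can be filled in mechanically, as in the following Python sketch. The function name `info_center_url` is hypothetical; the topic path and the default port are taken from the address template given in the list above.

```python
DEFAULT_PORT = 9080  # default InfoSphere Information Server port
HOME_TOPIC = "com.ibm.swg.im.iis.productization.iisinfsv.home.doc/ic-homepage.html"


def info_center_url(host_name: str, port_number: int = DEFAULT_PORT) -> str:
    """Build the information center address for a services tier host."""
    return f"http://{host_name}:{port_number}/infocenter/topic/{HOME_TOPIC}"


# Example: a services tier computer named iisdocs2 on the default port.
print(info_center_url("iisdocs2"))
```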
A subset of the information center is also available on the IBM Web site and
periodically refreshed at
http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/index.jsp.
Obtaining PDF and hardcopy documentation

- A subset of the PDF file books are available through the InfoSphere Information
Server software installer and the distribution media. The other PDF file books
are available online and can be accessed from this support document:
https://www.ibm.com/support/docview.wss?uid=swg27008803&wv=1.
- You can also order IBM publications in hardcopy format online or through your
local IBM representative. To order publications online, go to the IBM
Publications Center at
http://www.ibm.com/e-business/linkweb/publications/servlet/pbi.wss.
Providing comments on the documentation

Your feedback helps IBM to provide quality information. You can use any of the
following methods to provide comments:
- To comment on the information center, click the Feedback link on the top right
side of any topic in the information center.
- Send your comments by using the online readers' comment form at
www.ibm.com/software/awdtools/rcf/.
- Send your comments by e-mail to comments@us.ibm.com. Include the name of
the product, the version number of the product, and the name and part number
of the information (if applicable). If you are commenting on specific text, include
the location of the text (for example, a title, a table number, or a page number).
- You can provide general product feedback through the Consumability Survey at
www.ibm.com/software/data/info/consumability-survey
Notices and trademarks

This information was developed for products and services offered in the U.S.A.
Notices
IBM may not offer the products, services, or features discussed in this document in
other countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not grant you
any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte character set (DBCS) information,
contact the IBM Intellectual Property Department in your country or send
inquiries, in writing, to:
Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan
The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or
implied warranties in certain transactions, therefore, this statement may not apply
to you.
This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be
incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:
IBM Corporation
J46A/G4
555 Bailey Avenue
San Jose, CA 95141-1003 U.S.A.
Such information may be available, subject to appropriate terms and conditions,
including in some cases, payment of a fee.
The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement or any equivalent agreement
between us.
Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environments may
vary significantly. Some measurements may have been made on development-level
systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been
estimated through extrapolation. Actual results may vary. Users of this document
should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements or other publicly available sources.
IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.
This information is for planning purposes only. The information herein is subject to
change before the products described become available.
This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which
illustrate programming techniques on various operating platforms. You may copy,
modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating
platform for which the sample programs are written. These examples have not
been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or
imply reliability, serviceability, or function of these programs. The sample
programs are provided "AS IS", without warranty of any kind. IBM shall not be
liable for any damages arising out of your use of the sample programs.
Each copy or any portion of these sample programs or any derivative work, must
include a copyright notice as follows:
(your company name) (year). Portions of this code are derived from IBM Corp.
Sample Programs. Copyright IBM Corp. _enter the year or years_. All rights
reserved.
If you are viewing this information softcopy, the photographs and color
illustrations may not appear.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of
International Business Machines Corp., registered in many jurisdictions worldwide.
Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at
www.ibm.com/legal/copytrade.shtml.
The following terms are trademarks or registered trademarks of other companies:
Adobe is a registered trademark of Adobe Systems Incorporated in the United
States, and/or other countries.
Intel and Itanium are trademarks or registered trademarks of Intel Corporation or
its subsidiaries in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other
countries, or both.
Microsoft, Windows and Windows NT are trademarks of Microsoft Corporation in
the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Oracle and/or its affiliates.
The United States Postal Service owns the following trademarks: CASS, CASS
Certified, DPV, LACSLink, ZIP, ZIP + 4, ZIP Code, Post Office, Postal Service, USPS
and United States Postal Service. IBM Corporation is a non-exclusive DPV and
LACSLink licensee of the United States Postal Service.
Other company, product or service names may be trademarks or service marks of
others.