
QS-AVI Address Cleansing as a Web Service for IBM InfoSphere Identity Insight


Author: Bhaveshkumar R Patel (bhavesh.patel@in.ibm.com)

Address cleansing, sometimes referred to as address hygiene or standardization, is a
process used with the Identity Insight pipeline to help you correct and standardize address
information for optimal entity resolution processing. This new IBM InfoSphere Identity
Insight feature enables the use of an industry-standard address data standardization solution that
includes:

AddressDoctor
IBM InfoSphere Information Server
IBM InfoSphere DataStage
IBM InfoSphere WebSphere QualityStage

Enabling support for an address standardization module provided by AddressDoctor eliminates
the dependencies and limitations often associated with other standardization databases such as
Worldwide Address Verification and Enhancement System (WAVES). The AddressDoctor
address standardization module can be used for Identity Insight entity resolution by using the
DataStage and QualityStage Address Verification interface. This process is generally referred to
as QS-AVI in this document.
This techdoc describes how to create and apply an address data-cleansing job that standardizes
address data for use by IBM Identity Insight. The job is defined in DataStage and uses QS-AVI
Data Quality stages. Note that the steps are described and illustrated in a Windows client
environment.
The basic steps for implementing this address cleansing job as a Web service are:

STEP 1: Verify prerequisite software
STEP 2: Define a QS-AVI DataStage job to cleanse the address data
STEP 3: Enable the DataStage job for Information Services
STEP 4: Use the Information Server Console to define a DataStage job as a service
STEP 5: Use the Information Server Console to deploy this new job as a service
STEP 6: Examine the WSDL file
STEP 7: Test the service

STEP 1: Verify prerequisite software

You should have the following software installed:

DataStage (Information Server) Version 8.0.1
QS-AVI Data Quality stages
AddressDoctor database (required country database)

STEP 2: Define a QS-AVI DataStage job to cleanse the address data

1. Open DataStage Designer:
Start -> All Programs -> IBM Information Server -> IBM WebSphere DataStage and
QualityStage Designer
a. In the Attach to Project window, enter ibmpassw0rd as the password to
connect to the Project. Click OK.

Figure 1 - Attach to Project window

b. Close the New window.

2. In the Palette pane, open the Data Quality folder to browse through the
available stages. Make sure you are able to find the QS-AVI Address Verification
stage as shown in figure 2.

Figure 2 - Address Verification in the Data Quality folder

3. Copy the AddressValidateWS.dsx file from the QS-AVI package to your local hard
drive (C:\).
4. Import the AddressValidateWS.dsx file to DataStage. This is a predefined
address cleansing job and has been designed for IBM Identity Insight and QS-AVI
integration.
5. In the Repository pane, select the job AddressValidateWS in the Jobs folder,
and open it by selecting Edit.

Figure 3 - DataStage Designer Repository pane

6. Open the Address_Verification_8 stage by selecting Properties.
In this window you can examine and modify the stage properties. (See figure 4.)
7. Update the stage properties as follows:
a. Update Reference database path with the AddressDoctor database
installation location.
b. Update Full Preload with the required country database.

Figure 4 - AddressVerification stage window.

STEP 3: Enable the DataStage job for Information Services


One more step must be performed before the new job is enabled for Information Services.
You must change the properties of the job and specify that multiple instances of the job can
be run, and that the job can be made available as a Web service.

1. In the Repository pane, open the job properties by selecting the Edit menu and
then Job Properties.
2. On the General page of the job properties, check the following 3 boxes:

Enable hashed file cache sharing
Allow Multiple Instances
Enabled for Information Services

Figure 5 - Job Properties window


3. Click OK to save the job properties.
4. Save the job by selecting Save from the File menu.
5. Compile the job by selecting Compile from the File menu, or press F7.

Step 4: Define a DataStage job as a service using the Information Server Console
The Console for IBM Information Server allows you to define a data transformation or
cleansing job (DataStage job) as a service. The job must have been set in DataStage with
the property Enabled for Information Services. The tool includes a wizard to guide you
through the task.
The wizard walks you through the following task steps:
1. Name and describe the new service.
2. Choose one service interface binding (such as SOAP over HTTP or EJB).
3. Select the DataStage job to expose as the first operation of the new service.
4. Set up the request and response messages for the operation.
5. Set run time parameters.

Before you begin

Before starting the IBM Information Server Console, you must have an application server up
and running. The example configuration in this document uses WebSphere Application
Server, which was installed as part of the IBM Information Server install.
Verify that the IBM WebSphere Application Server service is already started. The service
name is IBM WebSphere Application Server V6 bhapatelNode02. If it is not started, start
the service.
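
If you prefer to check from a script that the application server is reachable, a small
TCP probe is one option. The following is a minimal sketch, not part of the product: the
function name is_port_open is hypothetical, and it only confirms that something accepts
connections on the given host and port, not that WebSphere Application Server itself is
healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_port_open("localhost", 9080) probes port 9080, which is the port that
appears in the WSDL URL later in this document. Your installation may use a different
host or port.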
1. Open the Console for IBM Information Server. From the Windows Start button,
select All Programs -> IBM Information Server -> IBM Information Server Console.
a. When you are prompted for user name and password, enter:
user name: IBM_XXXX
password: XXXXXX

2. Click New Project to create a new project.
a. Type (Select): Information Services
b. Name: AddressValidateProject

Figure 6 - New Project window.


3. Open the AddressValidateProject by selecting File -> Open Project.
In the Open Project window, select AddressValidateProject, and click
Open.
4. Open and customize the Information Services Application, AddressValidateApp.
a. Open the Information Services Application window.
b. Click the Develop icon, and click Tasks -> New (right side panel).
c. On the Overview page, Application name, enter AddressValidateApp.

Figure 7 - Open AddressValidateApp.


d. You can use the wizard to help you create and deploy the new service. The
wizard lets you fill in information about the general properties of the new
service, the binding used by the service, and the operation that the service
invokes.
e. On the Overview page, for Service Name, enter
AddressValidateService. In the Description field, describe the service (for
example, QS-AVI-WISD Web Service). This information is useful when users look
at the Services Directory to find out what services exist, which function each
service performs, and whom to contact for help.

Figure 8 - Select Information Services Application.

f. On the bindings page (under NewService1 in the Services folder), click
the Attach Bindings menu button (bottom right), and select SOAP over
HTTP as binding.
Note that currently the system offers you a choice between SOAP over HTTP
and EJB as binding.
g. Specify the operation performed by the service. At the bottom of the
Select a View portion of the window, click New, and select Operation from
the menu.
h. A new window is displayed, which lets you specify the operation to invoke.

Figure 9 - a new operation.


i. Change the name of the operation to addressValidateOps. The name of
the operation must start with a lower case letter or you will not be able to
successfully save your service definition.
j. Click Select to choose the information provider for this operation. In the
Information Provider window, select DataStage and QualityStage, as type of
information provider.

k. Navigate through the folders to find the job named AddressValidateWS,
which was enabled for Information Services earlier when you set up the job
in DataStage and QualityStage Designer. Select the job located in the Job
folder: IaaS_Proj.
If the job name is not listed, it is likely that you did not compile the
DataStage job. If so, go back to the DataStage and QualityStage Designer
and compile the job.

Figure 10 - Select the IaaS_Proj job in the Job folder.


l. Click OK.

Figure 11 - new operation detail pane.


You can browse through the Inputs, Outputs and Provider Properties tabs to
review input and output parameters for the service. Remember that this
DataStage job is enabled for Information Services and includes a WISD_Input
and a WISD_Output stage. During the definition of these stages, you should
have identified the columns that would be used as input and output.
m. The Provider Properties tab contains important runtime parameter
settings. These parameters control the number of job instances allowed, the
load balancing delay, and how requests will be handled in the pipeline.
n. Click Save Application to complete the definition of the service. You
could now deploy the service. However, the service as defined will not return
multiple rows of data; to return multiple rows, you must go back to DataStage
Designer and slightly modify the original job (see the request-handling
parameters mentioned in step m).
o. Click Close Application. You are now returned to the Application window,
which should look like this:

Figure 12 - a defined service.


You have now completed the registration of the service and can deploy the
job as a service. The deployment is also performed using the Console for IBM
Information Server.
Step 5: Use the Information Server Console to deploy this new job as a service
Deploying an application will install an Enterprise Application on an application server. This
enables the services to be invoked by other applications or services.
1. In the Information Services Application window, select the AddressValidateApp.
2. Click Deploy.
3. The window with the Service Objects to deploy is displayed. Deploy the service
object named AddressValidateService.

Figure 13 - Deploying the application.


4. You can browse the Manage Providers section. For this example, keep all of the
default options.
5. Click Deploy (located at the bottom of the window). Note that deploying an
application can take a very long time, especially if your system does not have 3GB
or more of system memory.
6. The bottom of the screen has an activity status window, which you can expand by
selecting Details.
7. Once the deployment completes, the deployment status window shows a change
in status from Executing to Completed.
8. Close the Activity Status window. The application is now successfully deployed.

Step 6: Examine the WSDL file


Verify the deployment by generating the Web Services Description Language (WSDL) document
for the new service. The WSDL document contains all the necessary descriptions (metadata)
that a client application needs to invoke the service.
WISD generates the WSDL on the fly. If your application was not deployed successfully,
you will not be able to generate the definition.
1. Open the Deployed Information Services Application window.
On the WISD Navigation bar, click the OPERATE icon and select Deployed
Information Services Application from the menu.

Figure 14 - Deployed Information Services Application window.


2. In the Deployed Applications window, you should see the name of the application
AddressValidateApp that you just deployed. Expand the AddressValidateApp
folder.
3. The AddressValidateApp folder contains the name of the service(s) defined in
the application; for each service, you also can display the operation being called by
the service.
4. Select the name of the service: AddressValidateService.

Figure 15 - AddressValidateService details.


5. Select View Service in Catalog to open the Information Server Administrator
Web Client with the Information Services Catalog view displayed.

Figure 16 - View Service in Catalog results.


6. The above window contains the general properties of the service. You can browse
through the various pages to see the information related to bindings, attributes and
operations. To find the WSDL document, open the Bindings page, and expand the
SOAP over HTTP box.

Figure 17 - Bindings view.


7. Click the link Open WSDL Document to generate the WSDL file for the
AddressValidateService service. The file is displayed in a new browser
window.

Figure 18 - Generated WSDL file.

8. Save the WSDL file in the folder C:\SOADEMO\Results.


9. Note the URL associated with the WSDL file:
http://bhapatel:9080/wisd/AddressValidateApp/AddressValidateService/wsdl/AddressValidateService.wsdl
10. Close the window displaying the WSDL file and the window labeled Header
Microsoft Internet Explorer, and exit the Console for IBM Information Server.
11. You can now test the service.
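
The WSDL can also be inspected from a script. The sketch below rebuilds the URL pattern
shown in step 9 and lists the operation names declared in a WSDL document. It assumes a
WSDL 1.1 document (namespace http://schemas.xmlsoap.org/wsdl/); the function names and
the trimmed sample document are hypothetical and exist only to exercise the parser. The
document that WISD generates contains many more elements.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # assumption: WSDL 1.1


def wsdl_url(host: str, app: str, service: str, port: int = 9080) -> str:
    """Build a WSDL URL following the pattern of the URL shown in step 9."""
    return f"http://{host}:{port}/wisd/{app}/{service}/wsdl/{service}.wsdl"


def operation_names(wsdl_text: str) -> list[str]:
    """Return the distinct operation names declared under portType elements."""
    root = ET.fromstring(wsdl_text)
    names = []
    for op in root.iterfind(f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation"):
        name = op.get("name")
        if name and name not in names:
            names.append(name)
    return names


# Hypothetical, heavily trimmed WSDL used only for illustration.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="AddressValidateService">
  <portType name="AddressValidateService">
    <operation name="addressValidateOps"/>
  </portType>
</definitions>"""

print(wsdl_url("bhapatel", "AddressValidateApp", "AddressValidateService"))
print(operation_names(SAMPLE_WSDL))  # ['addressValidateOps']
```

To run this against the deployed service, the content downloaded from the URL (for
example with urllib.request.urlopen(url).read()) can be passed to operation_names in
place of SAMPLE_WSDL.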

Step 7: Test the Service


The WebSphere Integration Developer environment provides a way to easily verify
that a service is working properly, without having to write an application.
1. Open WebSphere Integration Developer:
Start -> All Programs -> IBM WebSphere -> Integration Developer v6 ->
WebSphere Integration Developer v6
2. Accept the workspace as displayed: SOAiis.
3. Select Run then Launch to open the Web services Explorer.

4. In the Web Browser pane, select the icon representing WSDL Page (upper right-hand corner).

5. Click WSDL Main in the Navigator.


6. Enter the WSDL URL:
http://bhapatel:9080/wisd/AddressValidateApp/AddressValidateService/wsdl/AddressValidateService.wsdl

Figure 19 - Open the WSDL URL


This is the address that is associated with the service. You generated that address by
opening the WSDL document from the View Service in Catalog in the Information
Server Administrator Web Client at the end of the previous section.
a. Click Go to get the operation name associated with the service.
7. The next screen displays the operation name(s) associated with the service. Click
the operation named addressValidateOps.

Figure 20 - displaying the operation names associated with a service.

8. You must specify the input values that this operation requires. Let's assume that
you want to standardize the address of this customer:
addr1: 4100 bohanon Dr
city: Menlo Park
state: CA
country: USA

Figure 21 - Invoking a WSDL operation.


9. Click Go. The service is invoked.
In the Status window, you should see the result containing the standardized
address for the customer entered as input. In the Status window, switch from the
Source view to the Form view to get a nicely formatted response document.
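
For readers who want to see what such a request looks like outside the Web services
Explorer, the following sketch builds a SOAP envelope for the addressValidateOps
operation with the input values from step 8. Several things here are assumptions rather
than facts about the generated service: the SOAP 1.1 envelope namespace, the placeholder
SERVICE_NS (the real value is the targetNamespace in your WSDL), and the layout in which
each input column is a namespace-qualified child element of the operation element.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # assumption: SOAP 1.1
# Placeholder: replace with the targetNamespace found in the generated WSDL.
SERVICE_NS = "urn:example:AddressValidateService"


def build_request(operation: str, fields: dict[str, str]) -> bytes:
    """Serialize a SOAP envelope whose body calls `operation` with `fields`."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in fields.items():
        # Assumed layout: one qualified child element per input column.
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


request = build_request(
    "addressValidateOps",
    {"addr1": "4100 bohanon Dr", "city": "Menlo Park", "state": "CA", "country": "USA"},
)
print(request.decode("utf-8"))
```

Comparing this output with the Source view of the request in the Web services Explorer
is a quick way to find out which of the assumptions need adjusting before posting the
envelope to the service endpoint listed in the WSDL.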

Figure 22 - a form view of a response document.


Figure 22 shows that the service has been successfully invoked and that you have
successfully enabled a data-cleansing job as a service.
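
If you capture the response document from the Source view, a short script can flatten it
into name and value pairs, similar to what the Form view displays. This is a minimal
sketch that assumes a SOAP 1.1 envelope. The sample response below is hypothetical: its
element names and values are invented for illustration and are not actual output of the
service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # assumption: SOAP 1.1


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def response_fields(soap_text: str) -> dict[str, str]:
    """Collect every leaf element under the SOAP Body as name -> text."""
    body = ET.fromstring(soap_text).find(f"{{{SOAP_ENV}}}Body")
    if body is None:
        raise ValueError("no SOAP Body element found")
    fields = {}
    for element in body.iter():
        # A leaf element has no children; skip the Body element itself.
        if len(element) == 0 and element is not body:
            fields[local_name(element.tag)] = (element.text or "").strip()
    return fields


# Hypothetical response used only to exercise the parser.
SAMPLE_RESPONSE = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ns:addressValidateOpsResponse xmlns:ns="urn:example:AddressValidateService">
      <ns:city>Menlo Park</ns:city>
      <ns:state>CA</ns:state>
    </ns:addressValidateOpsResponse>
  </soapenv:Body>
</soapenv:Envelope>"""

print(response_fields(SAMPLE_RESPONSE))  # {'city': 'Menlo Park', 'state': 'CA'}
```

Because the result is a plain dictionary, a response that repeats the same element names
across several rows keeps only the last value for each name; a list of dictionaries
would be needed to preserve multiple rows.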
