
OWB Tips and Tricks


Jean-Pierre Dijcks – Senior Manager
OWB Product Management
Topics

• File capabilities

• Advanced SQL capabilities

• Multi-configuration

• Match/Merge Capabilities
File Capabilities
Binary file loading

• How do I load a binary file into Oracle?

• Easy, just use Warehouse Builder and its capabilities

• How? Let's see…


Binary file loading (SQL*Loader)
1. Data file
2. Sample definition
3. Configure byte order
4. Create and run mapping
5. Compare data
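
Why the byte order setting matters can be shown with a small sketch. It is not OWB or SQL*Loader code; it only uses Python's standard struct module, and the four sample bytes are made up for illustration.

```python
import struct

# Four raw bytes as they might appear in a binary data file (hypothetical sample).
raw = bytes([0x00, 0x00, 0x01, 0x02])

# The same bytes decode to different integers depending on the byte order.
big_endian = struct.unpack(">i", raw)[0]     # most significant byte first
little_endian = struct.unpack("<i", raw)[0]  # least significant byte first

print(big_endian)     # 258
print(little_endian)  # 33619968
```

If the byte order configured for the load does not match the platform that wrote the file, the loaded numbers will be wrong in this way, which is what the compare step is meant to catch.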
Binary file loading (external table)
Multi-file loading

• What is it?
• Using built-in SQL*Loader functionality to reuse a mapping
• Use loops in Process Flow to facilitate

• Available in OWB 10gR2


• Adds an execution-time variable to change the file name
• Lets process flows accept and pass variables and loop over the required files
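
The looping idea can be sketched as follows. This is an illustration in Python, not OWB process flow code: run_mapping is a hypothetical callable standing in for one execution of the mapping, with the file name passed in as the execution-time variable.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def run_mapping_for_files(files: Iterable[Path],
                          run_mapping: Callable[[str], None]) -> List[str]:
    """Run the same mapping once per file, passing the file name as a variable."""
    loaded = []
    for path in sorted(files):
        run_mapping(path.name)   # hypothetical: one mapping execution per file
        loaded.append(path.name)
    return loaded


def fake_mapping(file_name: str) -> None:
    # Stand-in for the real load; it only reports which file it was given.
    print(f"loading {file_name}")


result = run_mapping_for_files([Path("sales_02.dat"), Path("sales_01.dat")],
                               fake_mapping)
```

The mapping itself stays unchanged; only the variable value differs between iterations.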
Multi-file loading

More information: http://blogs.oracle.com/warehousebuilder/newsItems/viewFullItem$149


New in OWB 10gR2
Complex Document Support

• The OWB 10gR2 release introduced:


• Extended data type and object type support
• XMLTYPE, object types, collections etc.
• Import of complex object models
• ETL operators to support complex types
• Expand, Iterate, Construct
• Any expression (EXTRACT/XMLFOREST etc.)
• Pluggable Mapping Components
• Reusable ETL
• Experts
• Macro-like accelerator framework
• Automation of complex tasks
OWB and XML Data

[Diagram: XML data sources (file, table, web service, for example), each XML document an instance of an XSD, mapped by OWB into the database]
Information Extraction
• Leverage XDB and SQL/XML
• The example generates SQL/XML extract/extractValue calls
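
As a rough illustration of what extraction from an XML document means, here is a sketch in Python. It uses the standard xml.etree.ElementTree module as a stand-in for the SQL/XML functions that OWB generates; the orders document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are assumptions.
doc = """<orders>
  <order id="1"><customer>Smith</customer><amount>10.50</amount></order>
  <order id="2"><customer>Jones</customer><amount>7.25</amount></order>
</orders>"""


def extract_orders(xml_text):
    """Iterate over the repeating <order> elements and extract scalar values."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("order"):
        rows.append((int(order.get("id")),
                     order.findtext("customer"),
                     float(order.findtext("amount"))))
    return rows


rows = extract_orders(doc)
```

Each tuple corresponds to one relational row, which is the shape a mapping needs downstream.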


Encapsulate Common Logic
Pluggable Mappings

XMLSequence iterate extract


Generate Patterns
Example – Components from XSD
Generated Component
Example – Transaction837

Operator attributes for Element Attributes


Operator attributes for child associations X<assoc>
Advanced SQL Capabilities
Generate mappings from SQL

• What is it?
• Did you ever wish you could generate a mapping out of those
20 SQL statements you have lying around?

• How is it done?
• We created an expert that allows you to parse SQL and then
generate mappings
• Note that the expert is not a product, but it will help you
• Take it and use it to create more and more cool stuff in OWB
Generate views from mapping

• What is it?
• Turn the mapping editor into a graphical view builder
• Allows you to choose between
• Federation
• Consolidation

• Available in OWB 10gR2


• Get lineage and impact on all your views
• Allows for change propagation of data type changes

• Also consider linking existing views to their source tables for better lineage
DML error logging

• What is it?
• No more restarting jobs
• No more worrying about that one faulty record that trips your
load

• Available in Oracle DB 10gR2 and OWB 10.2.0.3


• Configurable per mapping
• Single or separate error tables
• High performance without the cost of restarting loads
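
The behavior can be sketched outside Oracle. In the database, the feature is the LOG ERRORS INTO … REJECT LIMIT clause on DML statements, with error tables created through DBMS_ERRLOG.CREATE_ERROR_LOG. The Python sketch below only imitates the effect with SQLite and a row-by-row loop, so it says nothing about performance; the table names target and target_err are assumptions.

```python
import sqlite3


def load_with_error_log(conn, rows):
    """Insert rows; divert failing rows to an error table instead of aborting."""
    conn.execute("CREATE TABLE IF NOT EXISTS target "
                 "(id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
    conn.execute("CREATE TABLE IF NOT EXISTS target_err "
                 "(id, amount, error_message TEXT)")
    loaded = rejected = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO target (id, amount) VALUES (?, ?)", row)
            loaded += 1
        except sqlite3.IntegrityError as exc:
            # The faulty record is logged with its error; the load continues.
            conn.execute("INSERT INTO target_err VALUES (?, ?, ?)",
                         (row[0], row[1], str(exc)))
            rejected += 1
    conn.commit()
    return loaded, rejected


# Hypothetical data: one NULL amount and one duplicate key.
sample_rows = [(1, 10.0), (2, None), (1, 5.0), (3, 7.5)]
counts = load_with_error_log(sqlite3.connect(":memory:"), sample_rows)
print(counts)  # (2, 2)
```

The two bad rows end up in the error table with their messages, and the two good rows are loaded without a restart.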
Advanced Aggregation

• Oracle introduced new aggregation capabilities (GROUP BY extensions)
• CUBE
• ROLLUP

• With OWB 10g Release 2 and 11g you can leverage these in your ETL environment
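
What the two extensions compute can be sketched in Python. ROLLUP over a column list aggregates over each prefix of the list down to the grand total, and CUBE aggregates over every subset of the columns. The sales rows and column names below are made up for illustration; in the database the work is done by GROUP BY ROLLUP(...) or GROUP BY CUBE(...).

```python
from collections import defaultdict
from itertools import combinations


def rollup_sets(columns):
    """Grouping sets of ROLLUP(columns): each prefix, down to the grand total."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube_sets(columns):
    """Grouping sets of CUBE(columns): every subset of the columns."""
    return [combo
            for n in range(len(columns), -1, -1)
            for combo in combinations(columns, n)]


def aggregate(rows, grouping_sets, measure):
    """Sum the measure once per grouping set; keys are (column, value) pairs."""
    totals = defaultdict(float)
    for group in grouping_sets:
        for row in rows:
            key = tuple((col, row[col]) for col in group)
            totals[key] += row[measure]
    return dict(totals)


# Hypothetical sample data.
sales = [
    {"region": "EU", "product": "A", "amount": 10.0},
    {"region": "EU", "product": "B", "amount": 5.0},
    {"region": "US", "product": "A", "amount": 7.0},
]
dims = ["region", "product"]
rollup_totals = aggregate(sales, rollup_sets(dims), "amount")
cube_totals = aggregate(sales, cube_sets(dims), "amount")
```

Here the empty key () holds the grand total. The cube result additionally contains per-product subtotals, which the rollup result does not.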

Demonstration
Multi-Configuration
Multi-Configuration

• What is it?
• A way to stripe your physical information over multiple
environments
• Manage (part of) your dev – test – prod cycles

• Available in OWB 10gR2


• Much simpler to keep track of your environment
• Much simpler to keep the production system in a consistent
state
• A little bit hidden from your view
Multi-Configuration Targets

• Single design repository
• Handle multiple target DB versions
• Transparently optimize code for each version
• No recoding required
• Handle multiple OS target environments
• Handle multiple security settings per target transparently

[Diagram: a single OWB design repository deploying to Dev, Test, QA, and Production targets; database versions shown include 11g, 10g, and 10g RAC]
Multi-Configuration

• But also:

• Use direct deployment from dev / test to their targets
• But ensure the settings for qa / prod are set in the development repository to ensure correct settings upon export

• Configurations are ALL exported
• Set the default in the receiving repository to immediately pick up that configuration
Multi-Configuration

• Things to think about:

• Create the appropriate locations:


• For DEV / TEST / PROD in the repository
• Create the same control centers in the repository
• Assign the right locations to the right control center
• Set security on the locations
• This prevents data viewing on production
• Don’t give out the wrong passwords…
Matching and Merging
Where do I worry about DQ?

[Diagram: staging data layer, operational data layer, and performance data layer, with the callout "Handle DQ issues here…"]


Data Quality Firewall

[Diagram: Cleanse, Protect, and Report around the operational data layer]

Cleanse:
• De-duplicate incoming data
• Fix data issues
• Name and address
• String comparisons

Protect:
• Enforce referential integrity
• Enforce data rules
• Enforce data types and conversions

Report:
• Data issues
• Quality levels
• Quality trends
Match/Merge Capabilities

Sources:
• Source A: Packaged Apps Customer Table
• Source B: In-house App Customer Table
• Source C: Legacy App Customer File

Matching Rules:
1. SSN: edit-distance match
2. Name: soundex match
3. SAP_CUST_ID: partial match
4. XYZ_CUST_ID: exact match
5. ABC_CUST_ID: exact match

Merging Rules:
1. SSN: most common
2. NAME_L: most common
3. NAME_M: longest
4. NAME_F: most common
5. SAP_CUST_ID: most common 7 in length
6. A_CUST_SEQ: same as rec with the SSN

Expected Result:

A_CUST_SEQ  SAP_CUST_ID  NAME_F    NAME_M  NAME_L  SSN
4805        KI17038      Jonathan  Martin  Smith   915-21-1234


Three Customer Sources

• Table A: Customers from ERP Application


• Table B: Customers from in-house database application XYZ
• Table C: Customers from legacy application ABC through flat file
Step 1: Data Standardization

• Table B: Name parsing to separate First/Middle Name


• Table C: Create external table to access flat file
• All: Combine all data into single data stream (union)
Step 2: Cross Table Matching
Matching Rules:
• SSN: If SSN not null, not 999-99-9999, use edit distance matching
• SAP ID: If SAP_CUST_ID not null, use partial matching
• ABC ID: If ABC_CUST_ID not null, use exact matching
• Name: If “NAME_F” and “NAME_L” not null, use soundex matching
• XYZ ID: If XYZ_CUST_ID not null, use exact matching
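
The matching rules above can be sketched in Python. This is not the OWB Match/Merge operator. It assumes that two records match when any one rule matches, that "partial" means one ID is contained in the other, and that an SSN edit distance of at most 1 counts as a match. The soundex function is the standard American Soundex, which may differ from OWB's implementation.

```python
def soundex(name):
    """Standard American Soundex code, for example 'Smith' -> 'S530'."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    chars = [c for c in name.upper() if c.isalpha()]
    if not chars:
        return ""
    result, prev = chars[0], codes.get(chars[0], "")
    for ch in chars[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":   # H and W do not separate equal codes; vowels do
            prev = code
    return (result + "000")[:4]


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def usable_ssn(ssn):
    return ssn is not None and ssn != "999-99-9999"


def records_match(a, b, max_ssn_distance=1):   # threshold is an assumption
    if usable_ssn(a.get("SSN")) and usable_ssn(b.get("SSN")):
        if edit_distance(a["SSN"], b["SSN"]) <= max_ssn_distance:
            return True
    sap_a, sap_b = a.get("SAP_CUST_ID"), b.get("SAP_CUST_ID")
    if sap_a and sap_b and (sap_a in sap_b or sap_b in sap_a):
        return True
    for key in ("ABC_CUST_ID", "XYZ_CUST_ID"):
        if a.get(key) is not None and a.get(key) == b.get(key):
            return True
    if all(r.get(k) for r in (a, b) for k in ("NAME_F", "NAME_L")):
        return (soundex(a["NAME_F"]) == soundex(b["NAME_F"])
                and soundex(a["NAME_L"]) == soundex(b["NAME_L"]))
    return False


# Hypothetical records for illustration.
rec_a = {"SSN": "915-21-1234", "NAME_F": "Jonathan", "NAME_L": "Smith"}
rec_b = {"SSN": "999-99-9999", "NAME_F": "Jonathon", "NAME_L": "Smyth"}
rec_c = {"SSN": "123-45-6789", "NAME_F": "Mary", "NAME_L": "Jones"}
```

In this sketch rec_a and rec_b match on the name rule even though the placeholder SSN is ignored, while rec_c matches neither.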

Step 3: Cross Table Merging

Merge Rules:
• SSN: Use the most common not-null SSN
• Name_M: Use the longest not-null middle name
• Name_F: Use the longest not-null first name from Table A
• SAP_CUST_ID: Use the most common SAP_CUST_ID with 7 digits
• A_CUST_SEQ: Use the CUST_SEQ from the record with merged SSN
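
The merge rules above can be sketched in Python for one group of matched records. This is an illustration, not the OWB operator. The SOURCE key marking the originating table and the sample records are assumptions, NAME_L follows the "most common" rule from the overview slide, and "7 digits" is read as a length of 7, in line with the expected SAP_CUST_ID KI17038.

```python
from collections import Counter


def most_common(values):
    """Most frequent non-null value, or None if there is none."""
    values = [v for v in values if v is not None]
    return Counter(values).most_common(1)[0][0] if values else None


def longest(values):
    """Longest non-null string, or None if there is none."""
    values = [v for v in values if v is not None]
    return max(values, key=len) if values else None


def merge(records):
    """Merge one group of matched records into a single customer record."""
    ssn = most_common(r.get("SSN") for r in records)
    sap_ids = (r.get("SAP_CUST_ID") for r in records)
    return {
        "SSN": ssn,
        "NAME_L": most_common(r.get("NAME_L") for r in records),
        "NAME_M": longest(r.get("NAME_M") for r in records),
        # First name: longest non-null value among Table A records only.
        "NAME_F": longest(r.get("NAME_F") for r in records
                          if r.get("SOURCE") == "A"),
        "SAP_CUST_ID": most_common(v for v in sap_ids
                                   if v is not None and len(v) == 7),
        # Sequence number comes from a record carrying the merged SSN.
        "A_CUST_SEQ": next((r["A_CUST_SEQ"] for r in records
                            if ssn is not None and r.get("SSN") == ssn
                            and r.get("A_CUST_SEQ") is not None), None),
    }


# Hypothetical matched group, one record per source.
group = [
    {"SOURCE": "A", "A_CUST_SEQ": 4805, "SAP_CUST_ID": "KI17038",
     "NAME_F": "Jonathan", "NAME_M": "M", "NAME_L": "Smith",
     "SSN": "915-21-1234"},
    {"SOURCE": "B", "SAP_CUST_ID": None, "NAME_F": "Jon",
     "NAME_M": "Martin", "NAME_L": "Smith", "SSN": "915-21-1234"},
    {"SOURCE": "C", "SAP_CUST_ID": "KI1703", "NAME_F": "Jonathon",
     "NAME_M": None, "NAME_L": "Smyth", "SSN": "915-21-1243"},
]
merged = merge(group)
```

With this sample group the merged record lines up with the expected result shown earlier.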

Demonstration
For More Information

OWB on OTN:
http://www.oracle.com/technology/products/warehouse/index.html

Blog:
http://blogs.oracle.com/warehousebuilder/

Utility Exchange:
http://www.oracle.com/technology/products/warehouse/htdocs/OWBexchange.html
