
IBM Netezza TwinFin: Leader in Data Warehouse Appliances

Silvio Ferrari IBM Netezza Systems Engineer slferrari@br.ibm.com


August 22, 2013 2011 IBM Corporation

Netezza, IM, and BAO

(Diagram labels)
- Transactional & Collaborative Applications
- Integrate
- Master Data
- Analyze
- Big Data
- Business Analytics Applications
- Manage
- External Information Sources
- Data Warehouses
- www
- Structured Data
- Data Warehouse Appliances
- Data, Content, Streaming Information
- Streams
- Governance
- Quality
- Lifecycle Management
- Security & Privacy

True Appliances
- Specialized devices
- Optimized for a single purpose
- Complete solution
- Fast installation
- Very simple operation
- Industry-standard interfaces
- Low cost

Netezza announced its server in 2002 and has been in the best quadrant of the Gartner Magic Quadrant since 2008.
2008 Data Warehouse Database Management Systems Magic Quadrant report released on December 23, 2008

The Simplicity of an Appliance

Netezza

Loading Data into the IBM Netezza Appliance: Data Integration

Data integration tools:
- Ab Initio
- Business Objects/SAP
- Composite Software
- Expressor Software
- GoldenGate Software (Oracle)
- Informatica
- IBM Information Server
- Sunopsis (Oracle)
- WisdomForce
- ... and others

Interfaces for inserting data: SQL over ODBC, JDBC, or OLE-DB

Querying the IBM Netezza Appliance: Reporting and Analysis

Reporting and analysis tools:
- Actuate
- Business Objects/SAP
- Cognos (IBM)
- Information Builders
- Kalido
- KXEN
- MicroStrategy
- Oracle OBIEE
- QlikTech
- Quest Software
- SAS
- SPSS (IBM)
- Unica (IBM)
- ... and others

Interfaces for extracting data: SQL over ODBC, JDBC, or OLE-DB

The IBM Netezza AMPP Architecture (Hardware)

(Diagram: components of the Netezza appliance)
- Applications: Advanced Analytics, BI, ETL, Loader
- Hosts
- Internal network
- S-Blades: each with FPGA, CPU, and memory
- Disks

Blade Servers
Memory, CPUs

IBM Netezza Database Accelerator
Memory, CPUs, FPGA

Our secret:

select DISTRICT, PRODUCTGRP, sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'
group by DISTRICT, PRODUCTGRP

Processing pipeline:
FPGA
- Input: slice of table MTHLY_RX_TERR_DATA (compressed)
- Decompress
- Project (drop unused columns): select DISTRICT, PRODUCTGRP, sum(NRX)
- Restrict, Visibility: where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'
CPU
- Complex operations (joins, aggregations, etc.): sum(NRX)
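The order of operations on this slide can be sketched in Python. This is an illustration only, not Netezza code: zlib and JSON stand in for the appliance's storage format, the function names `store_slice`, `fpga_stage`, and `cpu_stage` are hypothetical, and the sample rows are invented for the example.

```python
import json
import zlib
from collections import defaultdict

# Columns the query actually needs after filtering (the "project" step).
NEEDED = ("DISTRICT", "PRODUCTGRP", "NRX")


def store_slice(rows):
    """Stand-in for a compressed slice of MTHLY_RX_TERR_DATA on disk."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def fpga_stage(compressed_slice, columns, predicate):
    """Decompress, restrict rows, and project columns while streaming.

    In this simplified sketch the row filter runs before the unused
    columns are dropped, so the predicate can still see MONTH, MARKET,
    and SPECIALTY.
    """
    rows = json.loads(zlib.decompress(compressed_slice).decode("utf-8"))
    for row in rows:
        if predicate(row):
            yield {name: row[name] for name in columns}


def cpu_stage(rows):
    """Complex operations left to the CPU: here, sum(NRX) per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["DISTRICT"], row["PRODUCTGRP"])] += row["NRX"]
    return dict(totals)


def where_clause(row):
    return (
        row["MONTH"] == "20091201"
        and row["MARKET"] == 509123
        and row["SPECIALTY"] == "GASTRO"
    )


def make_row(district, group, nrx, month="20091201", specialty="GASTRO"):
    return {
        "DISTRICT": district, "PRODUCTGRP": group, "NRX": nrx,
        "MONTH": month, "MARKET": 509123, "SPECIALTY": specialty,
    }


sample = [
    make_row("D1", "P1", 10),
    make_row("D1", "P1", 5),
    make_row("D2", "P9", 7),
    make_row("D2", "P9", 100, month="20091101"),   # wrong month
    make_row("D3", "P1", 3, specialty="CARDIO"),   # wrong specialty
]

result = cpu_stage(fpga_stage(store_slice(sample), NEEDED, where_clause))
print(result)
```

The point of the split is that only the rows and columns that survive the first stage ever reach the aggregation step.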


Simplicity of the IBM Netezza Appliance (Software)

No indexes or tuning
Storage administration is unnecessary:
- dbspace/tablespace: no sizing or configuration
- redo/physical/logical log: no sizing or configuration
- table page/block: no sizing or configuration
- table extents: no sizing or configuration
- temp space: no allocation or monitoring
- dbspaces: no RAID-level decisions
- logical volume: no file creation
- OS kernel: no changes
- OS kernel: no required patch levels
- JAD sessions to configure host/network/storage: not required

DBAs become data managers rather than database administrators

No software installation
Installation steps:
- connect the power
- run tests (8 h)
- hand the server over to the customer

Complexity versus IBM Netezza Simplicity: Creating a Database

Traditional stack:
0. CREATE DATABASE TEST
     LOGFILE 'E:\OraData\TEST\LOG1TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG2TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG3TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG4TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG5TEST.ORA' SIZE 2M
     EXTENT MANAGEMENT LOCAL
     MAXDATAFILES 100
     DATAFILE 'E:\OraData\TEST\SYS1TEST.ORA' SIZE 50M
     DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:\OraData\TEST\TEMP.ORA' SIZE 50M
     UNDO TABLESPACE undo DATAFILE 'E:\OraData\TEST\UNDO.ORA' SIZE 50M
     NOARCHIVELOG
     CHARACTER SET WE8ISO8859P1;
1. Oracle* table and indexes
2. Oracle tablespace
3. Oracle datafile
4. Veritas file
5. Veritas file system
6. Veritas striped logical volume
7. Veritas mirror/plex
8. Veritas sub-disk
9. SunOS raw device
10. Brocade SAN switch
11. EMC Symmetrix volume
12. EMC Symmetrix striped meta-volume
13. EMC Symmetrix hyper-volume
14. EMC Symmetrix remote volume (replication)
15. Days/weeks of planning meetings

IBM Netezza: ZERO parameters:
CREATE DATABASE my_db;

Netezza Simplicity: Creating a Table

ORACLE
CREATE TABLE "MRDWDDM"."RDWF_DDM_ROOMS_SOLD" (
  "ID_PROPERTY" NUMBER(5, 0) NOT NULL ENABLE,
  "ID_DATE_STAY" NUMBER(5, 0) NOT NULL ENABLE,
  "CD_ROOM_POOL" CHAR(4) NOT NULL ENABLE,
  "CD_RATE_PGM" CHAR(4) NOT NULL ENABLE,
  "CD_RATE_TYPE" CHAR(1) NOT NULL ENABLE,
  "CD_MARKET_SEGMENT" CHAR(2) NOT NULL ENABLE,
  "ID_CONFO_NUM_ORIG" NUMBER(9, 0) NOT NULL ENABLE,
  "ID_CONFO_NUM_CUR" NUMBER(9, 0) NOT NULL ENABLE,
  "ID_DATE_CREATE" NUMBER(5, 0) NOT NULL ENABLE,
  "ID_DATE_ARRIVAL" NUMBER(5, 0) NOT NULL ENABLE,
  "ID_DATE_DEPART" NUMBER(5, 0) NOT NULL ENABLE,
  "QY_ROOMS" NUMBER(5, 0) NOT NULL ENABLE,
  "CU_REV_PROJ_NET_LOCAL" NUMBER(21, 3) NOT NULL ENABLE,
  "CU_REV_PROJ_NET_USD" NUMBER(21, 3) NOT NULL ENABLE,
  "QY_DAYS_STAY_CUR" NUMBER(3, 0) NOT NULL ENABLE,
  "CD_BOOK_SOURCE" CHAR(1) NOT NULL ENABLE)
  PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
  STORAGE(FREELISTS 6) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING
  PARTITION BY RANGE ("ID_PROPERTY") (
    PARTITION "PART1" VALUES LESS THAN (600)
      PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
      STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1)
      TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS,
    -- PART2 through PART6 repeat the same clauses, with
    -- VALUES LESS THAN (1200), (1800), (2400), (3000), and (MAXVALUE)
  );

ORACLE Indexes
CREATE INDEX "MRDWDDM"."RDWF_DDM_ROOMS_SOLD_IDX1" ON "RDWF_DDM_ROOMS_SOLD"
  ("ID_PROPERTY", "ID_DATE_STAY", "CD_ROOM_POOL", "CD_RATE_PGM", "CD_RATE_TYPE", "CD_MARKET_SEGMENT")
  PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(FREELISTS 10)
  TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING PARALLEL (DEGREE 4 INSTANCES 1)
  LOCAL (
    PARTITION "PART1" PCTFREE 10 INITRANS 6 MAXTRANS 255
      STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0
              FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
      TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
    -- PART2 through PART6 repeat the same clauses
  );

ORACLE Bitmap index
CREATE BITMAP INDEX "CRDBO"."SNAPSHOT_MONTH_IDX13" ON "SNAPSHOT_OPPTY_MONTH_HIST" ("SNAPSHOT_YEAR")
  PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 4194304 NEXT 4194304 MINEXTENTS 2 MAXEXTENTS 2147483645 PCTINCREASE 0
          FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "SFA_DATAMART_INDEX" NOLOGGING;

ORACLE Table Clusters
CREATE CLUSTER "MRDW"."CT_INTRMDRY_CAL"
  ("ID_YEAR_CAL" NUMBER(4, 0), "ID_MONTH_CAL" NUMBER(2, 0), "ID_PROPERTY" NUMBER(5, 0))
  SIZE 16384 PCTFREE 10 PCTUSED 90 INITRANS 3 MAXTRANS 255
  STORAGE(INITIAL 83886080 NEXT 41943040 MINEXTENTS 1 MAXEXTENTS 1017 PCTINCREASE 0
          FREELISTS 4 FREELIST GROUPS 1 BUFFER_POOL RECYCLE)
  TABLESPACE "TSS_FACT";

Netezza
CREATE TABLE MRDWDDM.RDWF_DDM_ROOMS_SOLD (
  ID_PROPERTY numeric(5, 0) NOT NULL,
  ID_DATE_STAY integer NOT NULL,
  CD_ROOM_POOL CHAR(4) NOT NULL,
  CD_RATE_PGM CHAR(4) NOT NULL,
  CD_RATE_TYPE CHAR(1) NOT NULL,
  CD_MARKET_SEGMENT CHAR(2) NOT NULL,
  ID_CONFO_NUM_ORIG integer NOT NULL,
  ID_CONFO_NUM_CUR integer NOT NULL,
  ID_DATE_CREATE integer NOT NULL,
  ID_DATE_ARRIVAL integer NOT NULL,
  ID_DATE_DEPART integer NOT NULL,
  QY_ROOMS integer NOT NULL,
  CU_REV_PROJ_NET_LOCAL numeric(21, 3) NOT NULL,
  CU_REV_PROJ_NET_USD numeric(21, 3) NOT NULL,
  QY_DAYS_STAY_CUR smallint NOT NULL,
  CD_BOOK_SOURCE CHAR(1) NOT NULL)
distribute on random;

- No indexes
- No administration or tuning
- Distribute the data randomly, or by columns

Traditional Complexity versus Netezza Simplicity (RDBMS 101)

Oracle: 34,500 KB of DDL
- CREATE TABLE EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT, with the same columns as the Netezza table below declared as NUMBER, DATE, VARCHAR2(n BYTE), and CHAR(1 BYTE), TABLESPACE AT_EDW_REXMIN, PARTITION BY RANGE (RPT_PERIOD_DIM_ID)
- 516 base table partitions: RP0000 VALUES LESS THAN (0), RP0001 VALUES LESS THAN (2), RP0002 VALUES LESS THAN (3), plus DDL for 513 more partitions
- Every partition repeats its own physical clauses, for example: NOLOGGING NOCOMPRESS TABLESPACE AT_EDW_REXMIN PCTFREE 10 INITRANS 1 MAXTRANS 255 STORAGE (INITIAL 96K NEXT 96K MINEXTENTS 1 MAXEXTENTS UNLIMITED PCTINCREASE 0 BUFFER_POOL DEFAULT)
- CREATE INDEX EDW_PROD.REXMIN_SOURCE_ID_I ON EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT (SOURCE_ID), TABLESPACE AI_EDW_REXMIN, LOCAL on 515 partitions
- CREATE BITMAP INDEX EDW_PROD.REXMIN_LLOC_FK_BI (LSTN_LOC_DIM_ID), LOCAL on 515 partitions
- CREATE BITMAP INDEX EDW_PROD.REXMIN_REHH_FK_BI (RESPD_HHLD_DIM_ID), LOCAL on 515 partitions
- CREATE BITMAP INDEX EDW_PROD.REXMIN_SMS_FK_BI (SRVC_MKT_SEG_DIM_ID), LOCAL on 515 partitions
- CREATE BITMAP INDEX EDW_PROD.REXMIN_SRWK_FK_BI (SRVY_WEEK_DIM_ID), LOCAL on 515 partitions
- CREATE BITMAP INDEX EDW_PROD.REXMIN_DATE_FK_BI (DATE_DIM_ID), LOCAL on 515 partitions
- CREATE BITMAP INDEX EDW_PROD.REXMIN_MEDO_FK_BI (MDOTLT_DIM_ID), LOCAL on 515 partitions
- Index REXMIN_RP_FK_BI on 515 partitions
- Each index partition repeats: TABLESPACE AI_EDW_REXMIN NOLOGGING PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (INITIAL 96K NEXT 96K MINEXTENTS 1 MAXEXTENTS UNLIMITED PCTINCREASE 0 BUFFER_POOL DEFAULT)
- Plus DDL for the tablespaces

Netezza: 250 KB of DDL
CREATE TABLE EDW_PROD.EDW_RESPD_EXPSR_MIN_FACT (
  RPT_PERIOD_DIM_ID INTEGER NOT NULL,
  SRVY_WEEK_DIM_ID INTEGER NOT NULL,
  DATE_DIM_ID INTEGER NOT NULL,
  SRVC_MKT_SEG_DIM_ID INTEGER NOT NULL,
  RESPD_HHLD_DIM_ID INTEGER NOT NULL,
  MDOTLT_DIM_ID INTEGER NOT NULL,
  LSTN_LOC_DIM_ID INTEGER NOT NULL,
  EXPSR_MIN_CNT NUMERIC(9,2) NOT NULL,
  RESPD_WGHT_NMBR NUMERIC(9,2),
  PRELIM_DAILY_WGHT_NMBR NUMERIC(9,2),
  FINAL_DAILY_WGHT_NMBR NUMERIC(9,2),
  TIMESHIFT_SECOND_CNT INTEGER,
  BGN_EXPSR_UTC_TS TIMESTAMP,
  END_EXPSR_UTC_TS TIMESTAMP,
  BGN_EXPSR_LOCAL_TS TIMESTAMP,
  END_EXPSR_LOCAL_TS TIMESTAMP,
  BGN_BCST_UTC_TS TIMESTAMP,
  END_BCST_UTC_TS TIMESTAMP,
  BGN_BCST_LOCAL_TS TIMESTAMP,
  END_BCST_LOCAL_TS TIMESTAMP,
  SOURCE_ID VARCHAR(50),
  ACTIVE_IND CHAR(1) DEFAULT 'Y' NOT NULL,
  INSERT_TS TIMESTAMP NOT NULL,
  UPDATE_TS TIMESTAMP NOT NULL,
  METADATA_ID INTEGER,
  MEDIA_CODE VARCHAR(10),
  MDOTLT_HIER_DIM_ID INTEGER,
  OUT_OF_MKT_IND CHAR(1)
)
distribute on random;

Comparison of Network Requirements (Internal and External)

Exadata (full rack):
- 22 IP addresses for the InfiniBand network
- 68 IP addresses for Ethernet (for a single cluster)
- 10 network drops minimum (with 50+ reported as being typical)
- Total: 90 IP addresses

TwinFin-12 (full rack):
- 5 IP addresses
- 4 network drops
- Total: 9 IP addresses

Monitoring Data Distribution with NzAdmin

A bad distribution: the user chose the wrong column(s) for distributing the data. Note: in this case, the user chose the first column of the table as the distribution column, which was the wrong decision.

A Good Distribution: 2.2 Trillion Records

Monitoring: Even Distribution of Data Across the System

Skew analysis relative to the system

The utilization load should be equivalent across the SPUs.
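A small Python sketch can show why the choice of distribution column matters. Everything here is an assumption made for illustration: the CRC32 hash stands in for whatever hash the appliance uses, round-robin placement stands in for random distribution, and the skew metric (max minus min, divided by the mean) is just one reasonable way to express imbalance.

```python
import zlib


def slice_for(value, n_slices):
    """Map a distribution-key value to a data slice using a stable hash."""
    return zlib.crc32(str(value).encode("utf-8")) % n_slices


def distribute_on_key(rows, key, n_slices):
    """Count rows per slice when distributing on a single column."""
    counts = [0] * n_slices
    for row in rows:
        counts[slice_for(row[key], n_slices)] += 1
    return counts


def distribute_round_robin(rows, n_slices):
    """Count rows per slice when rows are dealt out in turn."""
    counts = [0] * n_slices
    for i, _row in enumerate(rows):
        counts[i % n_slices] += 1
    return counts


def skew(counts):
    """Spread between the fullest and emptiest slice, relative to the mean."""
    mean = sum(counts) / len(counts)
    return (max(counts) - min(counts)) / mean if mean else 0.0


# A column with only two distinct values is a poor distribution key.
rows = [{"order_id": i, "status": "OPEN" if i % 2 else "CLOSED"} for i in range(8000)]

bad = distribute_on_key(rows, "status", 8)
good = distribute_round_robin(rows, 8)

print("distributed on status:", bad, "skew:", skew(bad))
print("round robin:          ", good, "skew:", skew(good))
```

With a two-valued key, at most two of the eight slices receive any rows, so most of the hardware sits idle while one or two units do all the work.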


Backup and Restore
Integration and certification with market-leading tools:
- Simplifies integration with the main backup and restore tools
- Support for the X/Open Backup Services API (XBSA)
- IBM Tivoli Storage Manager (TSM) certification
- Symantec Veritas NetBackup certification

Incremental Backup and Restore
- Significantly shortens backup times compared with a full backup
- Available in the NZBACKUP utility
- Full or partial restores

Example weekly schedule:
Sun   Mon   Tue   Wed         Thu   Fri   Sat
Full  Diff  Diff  Cumulative  Diff  Diff  Diff
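The weekly schedule on this slide implies a restore chain for each day. The Python sketch below works it out under two assumptions that the slide does not state: a differential backup holds the changes since the previous backup of any type, and a cumulative backup holds all changes since the last full backup. The function name `restore_chain` is hypothetical and is not part of any Netezza tool.

```python
# Weekly schedule as read from the slide (Sunday through Saturday).
SCHEDULE = [
    ("Sun", "full"),
    ("Mon", "differential"),
    ("Tue", "differential"),
    ("Wed", "cumulative"),
    ("Thu", "differential"),
    ("Fri", "differential"),
    ("Sat", "differential"),
]


def restore_chain(schedule, target_day):
    """Return the backups to apply, oldest first, to restore target_day."""
    days = [day for day, _kind in schedule]
    i = days.index(target_day)
    chain = []
    skip_to_full = False
    while i >= 0:
        day, kind = schedule[i]
        if kind == "full":
            chain.append(day)
            break
        if not skip_to_full:
            chain.append(day)
            if kind == "cumulative":
                # A cumulative backup replaces every increment since the full.
                skip_to_full = True
        i -= 1
    return list(reversed(chain))


for day, _kind in SCHEDULE:
    print(day, "->", restore_chain(SCHEDULE, day))
```

Under these assumptions, the midweek cumulative backup keeps late-week restores short: a Friday restore needs four backup sets instead of six.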


The IBM Netezza TwinFin: Expansion

In case of expansion:
- a complete new system is shipped
- data is migrated ONLINE
- IPs are redirected
- the original server is shut down and returned

i-Class: Analytics Without Constraints

Big Data: analyze wider and deeper data
- Additional dimensions
- Richer history

Big Math: increase computational intensity
- More complex models
- Faster execution for results

Advanced Analytics with TwinFin i-Class

(Diagram labels)
- Tools and languages: SAS, SPSS, R, S+, SQL
- Example workloads: Demand Forecasting, Fraud Detection

Simple to Install and Operate

Operations:
- Simply load and go: it's an appliance!
- Installation in ~2 days!
- Easy to evaluate, and works as advertised!

More agile BI developers and DBAs:
- No configuration or physical modeling
- No indexes or tuning: immediate performance
- Data-model agnostic
- Data architects and DBAs focus on the business, not on physical modeling

ETL developers:
- No aggregate tables needed: simpler ETL logic
- Faster loads and transformations

Business analysts:
- Train-of-thought analysis: 10 to 100x faster
- Ad hoc queries without tuning or indexes
- Complex queries against large datasets
- Lower latency: simultaneous loads and queries
- OnStream processing across hundreds of nodes

A Family of Appliances for the Entire Management Lifecycle:

- Skimmer: development and test systems, 1 TB to 10 TB
- TwinFin: high-performance analytic data warehouse, 1 TB to 1.5 PB
- Cruiser: SQL-accessible archiving and backup/DR, 100 TB to 10 PB

Speed

- 15,000 users running 800,000+ queries per day
- 50X faster than before

"when something took 24 hours I could only do so much with it, but when something takes 10 seconds, I may be able to completely rethink the business process"
- SVP Application Development, Nielsen

Source: http://www.youtube.com/watch?v=yOwnX14nLrE&feature=player_embedded

Simplicity

- Up and running 6 months before having any training
- 200X faster than Oracle system
- ROI in less than 3 months

"Allowing the business users access to the Netezza box was what sold it."
- Steve Taff, Executive Dir. of IT

(Graphic labels: DAYS, WEEKS, MONTHS)

Services


Scalability

- 1 PB on Netezza
- 7 years of historical data
- 100-200% annual data growth

"NYSE has replaced an Oracle IO relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data."
- ComputerWeekly.com

Source: http://www.computerweekly.com/Articles/2008/04/14/230265/NYSE-improves-data-management-with-datawarehousing.htm

Smart

- Predicts what shoppers are likely to buy in future visits
- Coupon redemption rates as high as 25%

"Because of (Netezza's) in-database technology, we believe we'll be able to do 600 predictive models per year (10X as many as before) with the same staff."
- Eric Williams, CIO and executive VP

Everyone promises, but... we prove it!

- We prove that we are simple
- We prove that we deliver performance
- We prove it inside your environment
- We prove that we integrate with your tools
- We prove that we are easy to do business with
- We prove that we have the lowest TCO
- We prove business value

PoC success rate: 86%

"One of the five most important M&A deals of 2010"
- Wall Street Journal

Digital Media

Financial Services

Government

Health & Life Sciences

Retail / Consumer Products

Telecom

Other

Thank you! (backup slides)

Oracle Exadata compared with Netezza TwinFin

Architecture
- Oracle Exadata: two layers, a clustered SMP database layer (RAC, shared disk) and an MPP storage layer. Results in: compromised performance.
- Netezza TwinFin: true MPP with FPGA acceleration of processing in each MPP node.
- Netezza's competitive advantage: best architecture for DW and advanced analytics due to minimization of contention/bottlenecks.

Speed
- Oracle Exadata: tuned for OLTP (e.g. FlashCache); RAC unfit for DW workloads. Results in: poor DW performance.
- Netezza TwinFin: appliance tuned for DW and advanced analytics.
- Netezza's competitive advantage: highest DW performance.

Simplicity
- Oracle Exadata: complexity of Oracle Real Application Clusters (RAC); constant tuning for performance. Results in: complex administration.
- Netezza TwinFin: true appliance with HW/SW created to provide high performance for DW; no tuning.
- Netezza's competitive advantage: operational simplicity; more time spent delivering business value rather than tuning for acceptable performance.

Smart
- Oracle Exadata: very limited push-down of analytics; RAC bottleneck for analytic performance. Results in: poor analytic performance.
- Netezza TwinFin: push-down of many diverse analytics (SAS, R, Gnu, etc.) through iClass.
- Netezza's competitive advantage: ability to accelerate the analytics used by many prospects.

Costs
- Oracle Exadata: acquisition cost can exceed $7M per rack (hardware $1M, software more than $6M!); high maintenance and software subscription; continuing high admin costs. Results in: high total cost of ownership.
- Netezza TwinFin: low, transparent initial cost; simple install requires no additional professional services; standard maintenance includes hw/sw support and sw upgrades.
- Netezza's competitive advantage: easily understood, predictable costs; minimal extra services, so easier to budget for Netezza.

Analysis Summary: Oracle Exadata Database Machine

Exadata is limited in the processing it does. It won't handle:
- Complex joins
- Distinct aggregation
- Analytical functions

Most work is still done on the Oracle database server:
- Lots of movement of data
- Loss of performance

Oracle says Exadata can do OLTP or DW, or both at the same time:
- Vastly different workloads requiring vastly different tuning
- Netezza customers report that Exadata is poor at DW and analytics

Query Throughput vs. Scan Rate

- Oracle Exadata throws together very fast hardware and hopes it produces fast results.
- Exadata offers very fast scan rates, but that just means it can get data off the disks quickly.
- Overall query throughput also relies on the speed of all the other components, including the software.
- Oracle Exadata can be very fast for simple queries but gets slower with increasing complexity.
- Netezza is designed for balance: it works fast for all query types.

Netezza's Advantages over Oracle

Oracle RAC is still Oracle RAC. It is still:
- Complex: needs to be tuned
- Temperamental: needs retuning for different configurations
- Difficult: needs specialized skills and constant maintenance

Netezza is much easier. With hardware and software optimized for data warehouse applications, there is:
- No need for labor-intensive tuning
- No requirement for partitioning, indexing, or building cubes

Database Machine is a resource hog. For a full rack Oracle Exadata Database Machine, you will need to supply at least 90 IP addresses (22 IP addresses for the InfiniBand network, 68 IP addresses for Ethernet, assuming a single cluster), and a minimum of 10 network drops (with 50+ reported as being typical). In contrast, a Netezza TwinFin-12 requires 5 IP addresses and 4 network drops. The core Netezza theme of simplicity is reflected in installation as in operation.

TwinFin 24 Specification

Storage:
- 16 (8*2) disk enclosures
- 192 (96*2) 1 TB SAS drives (8 hot spares)
- RAID 1 mirroring

2 hosts (active-passive):
- 24 cores (quad-core Intel 2.6 GHz)
- 96 GB memory
- 4x146 GB SAS drives
- Red Hat Linux 5 64-bit

10G internal network

24 Netezza S-Blades:
- 192 cores (Intel quad-core 2.5 GHz)
- 192 FPGAs (125 MHz)
- 384 GB DDR2 RAM (1+ TB compressed)
- Linux 64-bit kernel

User data capacity: 250 TB
Data scan speed: 290 TB/hr
Load speed (per system): 2.0 TB/hr
Power/rack: 7,400 watts
Cooling/rack: 25,500 BTU/hour

Compress Engine in Action

On data load:
- Rows are separated (burst) into columnar streams
- Each stream is compiled independently
- Field instructions are applied to block headers
- Compressed data maintains its row-based structure: compressed storage retains all structural properties of row-wise uncompressed storage

On data scan/query:
- The FPGA executes the field instructions to decompile at wire speed, recovering full-sized values
- Values are reassembled into full-sized, uncompressed rows and passed on to the remaining FAST Engines for processing
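The load and scan steps can be sketched as a round trip in Python. The slide does not describe the actual field instructions, so this sketch assumes a toy instruction set with a single instruction, "emit this value n times" (run-length encoding); all function names are hypothetical.

```python
def burst(rows):
    """Separate rows into one stream of values per column."""
    return [list(column) for column in zip(*rows)]


def compile_stream(values):
    """Compile one column stream into (value, repeat_count) instructions."""
    program = []
    for value in values:
        if program and program[-1][0] == value:
            program[-1] = (value, program[-1][1] + 1)
        else:
            program.append((value, 1))
    return program


def execute(program):
    """Execute the instructions to recover the full-sized values."""
    values = []
    for value, count in program:
        values.extend([value] * count)
    return values


def compress(rows):
    """On load: burst rows into columns and compile each independently."""
    return [compile_stream(stream) for stream in burst(rows)]


def decompress(streams):
    """On scan: execute each program and reassemble the original rows."""
    columns = [execute(program) for program in streams]
    return [tuple(row) for row in zip(*columns)]


rows = [
    ("20091201", "GASTRO", 10),
    ("20091201", "GASTRO", 10),
    ("20091201", "CARDIO", 7),
]

compressed = compress(rows)
print(compressed)
print(decompress(compressed) == rows)
```

Because each column is compiled on its own, repeated values in one column compress well even when the neighboring columns vary, and the row order is preserved so rows can be rebuilt by position.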


Default Workload Management: Short Query Bias

Short Query Bias (SQB)
- Short queries are prioritized ahead of longer-running queries
- Real-time responses to users performing short queries
- Invaluable feature for large mixed-workload environments

(Figure: supermarket checkout lanes labeled "8 Items or Less" and "Full Carts Here")
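The express-lane idea can be sketched as a two-lane priority queue in Python. This is a simplified model and not the appliance's scheduler: the class name, the estimated-runtime input, and the 2-second threshold are all assumptions chosen for the example, and a real workload manager would reserve resources rather than strictly reorder a queue.

```python
import heapq
from itertools import count


class ShortQueryBiasQueue:
    """Serve queries estimated as short before longer ones, FIFO within a lane."""

    def __init__(self, short_threshold_s=2.0):
        self.short_threshold_s = short_threshold_s  # hypothetical cutoff
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def submit(self, name, estimated_s):
        # Lane 0 is the "8 items or less" lane; lane 1 is "full carts".
        lane = 0 if estimated_s <= self.short_threshold_s else 1
        heapq.heappush(self._heap, (lane, next(self._arrival), name))

    def next_query(self):
        """Return the next query to run, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = ShortQueryBiasQueue(short_threshold_s=2.0)
queue.submit("monthly_rollup", estimated_s=600.0)
queue.submit("lookup_customer", estimated_s=0.5)
queue.submit("quarterly_join", estimated_s=30.0)
queue.submit("dashboard_tile", estimated_s=1.0)

order = [queue.next_query() for _ in range(4)]
print(order)
```

Short queries jump ahead of long ones that arrived earlier, while queries within the same lane keep their arrival order.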


GRA Test: Fidelity to User Settings

(Chart: actual resource share in percent, on an axis from 10 to 60, for the series rsg1_actual_%, rsg2_actual_%, and rsg3_actual_%)