nzload is the utility used by Netezza to load data into tables. Here are some working examples.

nzload options:

Option   Information
-u       Username to access the database.
-pw      Password for the username supplied, to get into the database.
-db      Name of the database into which you want to load the data.
-t       Table name into which the data is to be loaded.
-df      The data file containing the data to be loaded.
-cf      The file which contains the formatting and options for nzload. Useful if you are going to repeat the command often.
-delim   The delimiter for the data in the file being loaded.
nzload alone is powerful enough to load most required data into Netezza. For example, to load into database prod, using user fred, password barney and table wilma from a tab-delimited file called loadme.data, you would use the following:

nzload -u fred -pw barney -db prod -t wilma -delim '\t' -df loadme.data

If your file is delimited by a | symbol, you could use the following to load table pipedtable from file datafile.dat (note the quotes, which stop the shell treating | as a pipe):

nzload -u admin -pw password -db thedatabase -t pipedtable -delim '|' -df datafile.dat
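As a quick sketch of the command-line examples above, the snippet below builds a small tab-delimited file and shows the matching nzload invocations. The nzload lines are commented out because they assume a reachable Netezza host with the prod database and wilma table from the example; only the file preparation runs anywhere.

```shell
# Build a small tab-delimited sample file to load (three rows, two columns).
printf '1\tfred\n2\tbarney\n3\twilma\n' > loadme.data

# On a Netezza host (assumed, not runnable here) the load would be:
# nzload -u fred -pw barney -db prod -t wilma -delim '\t' -df loadme.data

# For pipe-delimited data the delimiter must be quoted, otherwise the
# shell treats | as a pipe between commands:
# nzload -u admin -pw password -db thedatabase -t pipedtable -delim '|' -df datafile.dat

wc -l < loadme.data   # three data rows ready to load
```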
Using Named Pipes

If you are loading a lot of data and do not have space to keep the data on the system, you can feed it to a named pipe and run the load command against that. The load will not exit until the end-of-file indicator is given. First, create the pipe file.
mkfifo pipefile
Now you can pipe the data from a supplied file and place it on the pipe.
Now you specify the pipe file with the -df option to load the data in.
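The whole named-pipe workflow can be sketched as below. The nzload line is a comment because it needs a Netezza host; to show the mechanics, a plain cat stands in as the reader that drains the pipe, and a printf stands in for the data source (with real data this writer might be something like gzip -dc loadme.data.gz > pipefile &).

```shell
# Create the named pipe.
mkfifo pipefile

# Writer: stream the data onto the pipe in the background. The redirection
# blocks until a reader opens the other end of the pipe.
printf '1|one\n2|two\n' > pipefile &

# Reader: this is where nzload would sit, e.g. (Netezza host assumed):
#   nzload -u fred -pw barney -db prod -t wilma -delim '|' -df pipefile
# It reads until the writer closes the pipe (end of file). Here we just
# drain the pipe with cat to demonstrate the flow.
cat pipefile > received.out
wait                # collect the background writer

# The pipe file can be removed once the load is finished.
rm pipefile
```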
Datafile loadme.dat
{
  Database mydatabase
  TableName datatable
}

Now run nzload with the -cf option pointing at that control file:
Load session of table DATATABLE completed successfully

When you use the nzload command, note that you cannot specify both the -cf and -df options in the same command. You can load from a specified data file or from a control file, but not both, because the control file itself contains the data file definition, so you cannot also specify the file outside of the control file. The following control file options define two data sets to load. Note that the options can vary for each data set.

Datafile /home/operation/data/customer.dat
{
  Database dev
  TableName customer
  Delimiter |
  Logfile operation.log
  Badfile customer.bad
}
Datafile /home/imports/data/inventory.dat
{
  Database dev
  TableName inventory
  Delimiter #
  Logfile importload.log
  Badfile inventory.bad
}
If you save these control file contents as a text file (named import_def.txt in this example) you can specify it using the nzload command as follows:

nzload -cf /home/nz/sample/import_def.txt
Load session of table CUSTOMER completed successfully
Load session of table INVENTORY completed successfully
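Since control files are plain text, they are easy to generate from a script. This sketch writes the two-data-set example above out as import_def.txt with a heredoc; the paths and table names are the illustrative ones from the example, and the nzload line is commented out as it needs a Netezza host.

```shell
# Generate a two-data-set control file for nzload.
cat > import_def.txt <<'EOF'
Datafile /home/operation/data/customer.dat
{
  Database dev
  TableName customer
  Delimiter |
  Logfile operation.log
  Badfile customer.bad
}
Datafile /home/imports/data/inventory.dat
{
  Database dev
  TableName inventory
  Delimiter #
  Logfile importload.log
  Badfile inventory.bad
}
EOF

# On a Netezza host you would then run (not executable here):
# nzload -cf import_def.txt

grep -c '^Datafile' import_def.txt   # two data sets defined
```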
    353290 |         0 | 515101002 |  3 |
    353284 |    353288 | 515101000 |  1 |
(3 rows)

But updates do the same, don't they? An updated row is logically updated, but physically it is deleted and recreated.

LABDB(ADMIN)=> select createxid, deletexid, rowid, * from bod;
 CREATEXID | DELETEXID | ROWID | ID | JUNK
-----------+-----------+-------+----+------
(0 rows)

LABDB(ADMIN)=> insert into bod values (1, null);
INSERT 0 1
LABDB(ADMIN)=> insert into bod values (2, null);
INSERT 0 1
LABDB(ADMIN)=> update bod set junk = 'TWO' where id = 2;
UPDATE 1
LABDB(ADMIN)=> select createxid, deletexid, rowid, * from bod order by createxid;
 CREATEXID | DELETEXID |   ROWID   | ID | JUNK
-----------+-----------+-----------+----+------
    353362 |         0 | 515102002 |  1 |
    353364 |    353366 | 515102003 |  2 |
    353366 |         0 | 515102003 |  2 | TWO
(3 rows)

This shows that the row has NOT been updated in place: a new version was inserted and the old record was marked as deleted. There is a gotcha here: show_deleted_records does not work in exactly the way that you would expect.

LABDB(ADMIN)=> select createxid, deletexid, rowid, * from bod order by createxid, rowid;
 CREATEXID | DELETEXID |   ROWID   | ID | JUNK
-----------+-----------+-----------+----+------
    353284 |    353288 | 515101000 |  1 |
    353286 |         0 | 515101001 |  2 |
    353290 |         0 | 515101002 |  3 |
(3 rows)

This is fine. Let's now update the table.

LABDB(ADMIN)=> update bod set junk = 'ONE' where id = 1;
ERROR:  056408 : Concurrent update or delete of same row

Well, it would be silly if you could actually update a deleted row in a table.
LABDB(ADMIN)=> update bod set junk = 'TWO' where id = 2;
UPDATE 1
LABDB(ADMIN)=> select createxid, deletexid, rowid, * from bod order by createxid, rowid;
 CREATEXID | DELETEXID |   ROWID   | ID | JUNK
-----------+-----------+-----------+----+------
    353284 |    353288 | 515101000 |  1 |
    353286 |    353312 | 515101001 |  2 |
    353290 |         0 | 515101002 |  3 |
    353310 |         1 | 515101000 |  1 | ONE
    353312 |         0 | 515101001 |  2 | TWO
(5 rows)

So you can see two interesting things. Firstly, the deleted row (id 1) was actually updated, despite the error. Secondly, id 2 was marked as deleted at XID 353312 and a new version was created at the same XID. Let's just try that again.

LABDB(ADMIN)=> set show_deleted_records=false;
SET VARIABLE
LABDB(ADMIN)=> insert into bod values (4, null);
INSERT 0 1
LABDB(ADMIN)=> update bod set junk = 'FOUR' where id = 4;
UPDATE 1
LABDB(ADMIN)=> select * from bod where id = 4;
 ID | JUNK
----+------
  4 | FOUR
(1 row)
LABDB(ADMIN)=> delete from bod where id = 4;
DELETE 1
LABDB(ADMIN)=> select * from bod where id = 4;
 ID | JUNK
----+------
(0 rows)
LABDB(ADMIN)=> set show_deleted_records=true;
SET VARIABLE
LABDB(ADMIN)=> select createxid, deletexid, rowid, * from bod where id = 4;
 CREATEXID | DELETEXID |   ROWID   | ID | JUNK
-----------+-----------+-----------+----+------
    353330 |    353332 | 515102000 |  4 |
    353332 |    353336 | 515102000 |  4 | FOUR
(2 rows)

That's what we expect to happen. Let's repeat without the setting.
LABDB(ADMIN)=> set show_deleted_records=false;
SET VARIABLE
LABDB(ADMIN)=> insert into bod values (5, null);
INSERT 0 1
LABDB(ADMIN)=> delete from bod where id = 5;
DELETE 1
LABDB(ADMIN)=> update bod set junk = 'FIVE' where id = 5;
UPDATE 0
LABDB(ADMIN)=> set show_deleted_records=true;
SET VARIABLE
LABDB(ADMIN)=> select createxid, deletexid, rowid, * from bod where id = 5;
 CREATEXID | DELETEXID |   ROWID   | ID | JUNK
-----------+-----------+-----------+----+------
    353342 |    353344 | 515102001 |  5 |
(1 row)

Which shows that if you set show_deleted_records to true, you can not only view the deleted data, you can update it too. This should never be set on a Netezza system unless requested by support.
SYSTEM(ADMIN)=> create database gc2;
CREATE DATABASE
SYSTEM(ADMIN)=> create user gary2 with password 'gary2';
CREATE USER

Currently we cannot use this database as the new user.

LABDB(ADMIN)=> \c gc2 gary2 gary2
FATAL 1:  database connection refused
Previous connection kept

SYSTEM(ADMIN)=> grant list on gc2 to gary2;
GRANT

Now we can connect.

LABDB(ADMIN)=> \c gc2 gary2 gary2
You are now connected to database gc2 as user gary2.
GC2(GARY2)=> select count(*) from gc.gary.t1;
ERROR:  Permission denied on T1.

The following allows access.

GC2(ADMIN)=> \c gc admin password
You are now connected to database gc as user admin.
GC(ADMIN)=> grant select on t1 to gary2;
GRANT

Which is the same as:

GC(ADMIN)=> grant select on gary.t1 to gary2;
GRANT

Now we can access across the database.

GC2(GARY2)=> select count(*) from gc.gary.t1;
   COUNT
-----------
 134217728
(1 row)

We still cannot create a table in the database though.

GC2(GARY2)=> create table t1 as select * from gc.gary.t1 limit 10;
ERROR:  CREATE TABLE: permission denied.
GC(ADMIN)=> \c gc2 admin password
You are now connected to database gc2 as user admin.
GC2(ADMIN)=> grant create table to gary2;
GRANT

You will note that there is no requirement to reconnect to the database to pick up these permissions.

GC2(GARY2)=> create table t1 as select * from gc.gary.t1 limit 10;
INSERT 0 10

GC(ADMIN)=> \c gc gary gary
You are now connected to database gc as user gary.
GC(GARY)=> create table t4 as select * from t1 limit 0;
INSERT 0 0
GC(GARY)=> \c gc admin password
You are now connected to database gc as user admin.
GC(ADMIN)=> grant insert on t4 to gary2;
GRANT

GC2(GARY2)=> insert into gc.gary.t4 select * from t1;
ERROR:  Cross Database Access not supported for this type of command

Bearing in mind that you cannot have two tables with the same name in a single database, there is no reason to include the schema name of the table in the command. Therefore the following are identical.

GC2(GARY2)=> insert into t1 select * from gc.gary.t1 limit 10;
INSERT 0 10
GC2(GARY2)=> insert into t1 select * from gc..t1 limit 10;
INSERT 0 10
GC2(GARY2)=> insert into t1 select * from gc.t1 limit 10;
INSERT 0 10

But there is another option which can be set.

GC2(GARY2)=> show enable_schema_dbo_check;
NOTICE:  ENABLE_SCHEMA_DBO_CHECK is 0
SHOW VARIABLE

This changes the behaviour of using the schema name when referencing a table: 0 raises no message, 1 produces a warning, and 2 denies access.

GC(ADMIN)=> \c gc2 gary2 gary2
You are now connected to database gc2 as user gary2.
GC2(GARY2)=> insert into t1 select * from gc.gary.t1 limit 10;
INSERT 0 10
GC2(GARY2)=> set enable_schema_dbo_check = 1;
SET VARIABLE
GC2(GARY2)=> insert into t1 select * from gc.gary.t1 limit 10;
NOTICE:  Schema GARY does not exist
INSERT 0 10
GC2(GARY2)=> set enable_schema_dbo_check = 2;
SET VARIABLE
GC2(GARY2)=> insert into t1 select * from gc.gary.t1 limit 10;
ERROR:  Schema GARY does not exist

The setting can be made permanent.

[nz@netezza data]$ pwd
/nz/data
[nz@netezza data]$ more postgresql.conf | grep schema
#
# Cross Database Access Settings
#
# enable_schema_dbo_check = 0

So it is initially commented out and not read when the database starts. Change it in the configuration file.

#
# Cross Database Access Settings
#
enable_schema_dbo_check = 2

Now connect and try the same.

GC2(GARY2)=> insert into t1 select * from gc.gary.t1 limit 10;
INSERT 0 10

Nothing. So the default is still in place. Restart the system to enable the change.

[nz@netezza data]$ nzstop
[nz@netezza data]$ nzstart

Now it takes effect.

[nz@netezza data]$ nzsql gc2 gary2 gary2
Welcome to nzsql, the Netezza SQL interactive terminal.
Type:  \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

GC2(GARY2)=> show enable_schema_dbo_check;
NOTICE:  ENABLE_SCHEMA_DBO_CHECK is 2
SHOW VARIABLE
GC2(GARY2)=> insert into t1 select * from gc.t1 limit 10;
ERROR:  Schema GC does not exist
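The configuration-file edit above can be scripted. This is a sketch only: it works on a local copy of the relevant postgresql.conf stanza rather than the real file, which on an appliance lives in /nz/data, and the nzstop/nzstart restart that makes the change take effect is left as a comment.

```shell
# Work on a local copy of the stock stanza; the real file is /nz/data/postgresql.conf.
cat > postgresql.conf <<'EOF'
#
# Cross Database Access Settings
#
# enable_schema_dbo_check = 0
EOF

# Uncomment the setting and raise it to 2 (deny access on schema references).
sed 's/^# enable_schema_dbo_check = 0/enable_schema_dbo_check = 2/' \
    postgresql.conf > postgresql.conf.new
mv postgresql.conf.new postgresql.conf

grep '^enable_schema_dbo_check' postgresql.conf

# On the appliance the change is only read at startup, so follow with:
#   nzstop && nzstart
```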
Remember to run nzdumpschema on the second database and perform the same changes, in case you want to re-import the statistics back into the database after testing.