
------------------------------------------------------------------------------------Objective:

- Dump the required data from MSSQL Server into a flat file, then load this data into MySQL.
- Do this the fastest available way.
- Understand and consider the ETL methodology (ETL: Extract, Transform, Load).
Topics:
1. MSSQL Server Bulk Copy Program (command: bcp), to extract data (E of Etl)
2. MSSQL date handling for queries (data transformation/formatting, T of eTl)
3. Loading the dump file in MySQL (data loading, L of etL)
-------------------------------------------------------------------------------------

------------------------------------------------------------------------------------Topic 1: MSSQL Server Bulk Copy Program (Command: bcp), to extract data (E of Etl)
------------------------------------------------------------------------------------Command Example:
bcp "SELECT * FROM databaseName.Schema.TableName" queryout/out "Path to file\output_file_name.ext" -n/-c -t"|f|" -r"|r|" -SServerName[\instanceName] -UuserName -Ppassword
Explanation of parameters used above:
queryout/out : "queryout" means we are dumping the result of a query. To dump a
    complete table, use databaseName.Schema.TableName instead of a query and use "out"
    instead of "queryout". A sketch of the "out" form is shown below.
-n or -c : if -n is used, data will be dumped in native format (we don't want it,
    because the data comes out in a binary format that is not human-readable).
    If -c is used, all data will be dumped in ASCII character format (we need to use this).
-t: field terminator. Can be one or more characters. In the above example the 3 characters
    "|f|" are used. If omitted, the tab character will be used as default.
-r: row terminator. Can be one or more characters. In the above example the 3 characters
    "|r|" are used. If omitted, the CrLf character combination will be used as default.
    ** Using 3 characters as field/row separator will increase the dump file size, but
    this combination makes us more certain whether a field or a row is actually terminated,
    because table fields may themselves contain a Tab or NewLine character as data.
-S: ServerName[\instanceName]
    ServerName: Get it by running this query - "select @@servername"
    [\instanceName]: We may not need this if a single instance is running on
    the server. Get it by running this query - "select @@servicename"
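For example, to dump the complete example table used later in this document instead of a query result (a sketch only: the output file path is hypothetical, the server name and credentials are the ones from the example further below):
bcp testDb.dbo.testTab out "C:\out_dir\testTab_dump.dat" -c -t"|f|" -r"|r|" -SPX-LAP -Usa -Pporosh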
**
1. Command parameters are cAsE sensitive.
2. Use the fully qualified table name, i.e. databaseName.Schema.TableName
3. Use the fully qualified path for dump files, e.g. "c:\out_dir\dumpfile.dat"
4. SQL queries and file paths need to be enclosed in quotation marks (e.g. "select * from ...",
   "C:\myfile.csv" etc.)
5. There is no space/gap between a parameter switch and its value.
   e.g. This is wrong:   -U userName
        This is correct: -UuserName
** How to enable bcp:
Probably already installed on your server, because it comes with SQL Server Management Tools.
You may need to update the PATH environment variable of the OS.
bcp utility location on my PC: C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\
If needed, it can be downloaded from this link - http://www.microsoft.com/en-us/download/details.aspx?id=36433
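For example (a sketch, assuming the default installation path mentioned above), the PATH can be extended for the current CMD session like this:
set PATH=%PATH%;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\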
Example Table, Query and output dump file:
Database table: testDb.dbo.testTab
Structure:
+----------+---------------+
| Field    | Type          |
+----------+---------------+
| logtime  | datetime      |
| emp_name | nvarchar(250) |
| salary   | float         |
+----------+---------------+
Data (4 rows; the value of the second field in the first row contains a CrLf
character, hence the line break):
logtime                  emp_name  salary
-----------------------  --------  -------
2015-05-04 03:30:30.000  hulla
                         mia       1000.78
2015-05-04 15:30:00.000  rahman    300.7
2015-05-05 15:30:00.000  jillur    15.6
2014-05-05 15:30:00.000  naim      55.9
bcp command: bcp "SELECT * FROM testDb.dbo.testTab" queryout "C:\output_file.csv" -c -t"|f|" -r"|r|" -SPX-LAP -Usa -Pporosh
Here, the server name is PX-LAP, the user name is 'sa', and the password is 'porosh'.
Content of dump file C:\output_file.csv:
2015-05-04 03:30:30.000|f|hulla
mia|f|1000.78|r|2015-05-04 15:30:00.000|f|rahman|f|300.69999999999999|r|2015-05-05 15:30:00.000|f|jillur|f|15.6|r|2014-05-05 15:30:00.000|f|naim|f|55.899999999999999|r|
------------------------------------------------------------------------------------Topic 2: MSSQL date handling for queries (data transformation/formatting, T of eTl)
-------------------------------------------------------------------------------------
Mostly we need to dump data filtered by date, so some useful date functions are
demonstrated below.
-- today
select getdate();
output> 2015-05-06 23:15:58.833
-- tomorrow
select getdate()+1;
output> 2015-05-07 23:16:24.740
-- yesterday
select getdate()-1;
output> 2015-05-05 23:17:32.467
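-- yesterday, written with DATEADD (an equivalent alternative, not in the original notes; more explicit about the unit being subtracted)
select dateadd(day, -1, getdate());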
-- only date, in mysql/canonical format
select convert(date, getdate());
output> 2015-05-06
select convert(date,getdate()-1);
output> 2015-05-05
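-- datetime formatted as text with style 20 (ODBC canonical, yyyy-mm-dd hh:mi:ss) - the style used by the bcp query below, which MySQL accepts directly for datetime columns; the output shown assumes the same moment as the getdate() example above
select convert(varchar(20), getdate(), 20);
output> 2015-05-06 23:15:58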
-- a sample query on the above table emphasizing date
SELECT convert(date,logtime) as logtime, emp_name, salary
FROM testDb.dbo.testTab
where
convert(date,logtime) = convert(date,getdate()-1)
and emp_name like '%i%';
-- a query in a bcp command, emphasizing date
bcp "SELECT convert(varchar(20),logtime, 20) as logtime, emp_name, salary FROM testDb.dbo.testTab where convert(date,logtime) = convert(date,getdate()-1) and emp_name like '%i%'" queryout "C:\output_file.csv" -c -t"|f|" -r"|r|" -SPX-LAP -Usa -Pporosh
-- the content of the data dump file (c:\output_file.csv) generated from the above bcp execution will look like below:
2015-05-05 15:30:00|f|jillur|f|15.6|r|2015-05-05 15:30:00|f|naim|f|55.899999999999999|r|
------------------------------------------------------------------------------------Topic 3: Loading the dump file in MySQL (data loading, L of etL)
------------------------------------------------------------------------------------We will load the dump file data into testDb.testTab, which is a MySQL table having
the exact same columns as the dumped data.
+----------+--------------+
| Field    | Type         |
+----------+--------------+
| logtime  | datetime     |
| emp_name | varchar(250) |
| salary   | double       |
+----------+--------------+

This can be done from the mysql prompt as below:
mysql> load data infile "c:\\output_file.csv" into table testDb.testTab fields terminated by '|f|' optionally enclosed by '"' lines terminated by '|r|';
Or, for automation we can use the below procedure (on Windows):
Let's create a file (say c:\config.txt), containing the following command:
load data infile "c:\\output_file.csv" into table testDb.testTab fields terminated by '|f|' optionally enclosed by '"' lines terminated by '|r|';
Now from CMD execute the below:
mysql -uUserName -pPassword <"c:\\config.txt"
Or, put the above in a bat file (whichever is necessary).
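A minimal sketch of such a bat file (the file name is hypothetical; the LIKE filter from the earlier bcp example is left out here because a literal % would need to be doubled inside a bat file):
rem dump_and_load.bat - dump yesterday's rows from MSSQL, then load them into MySQL
bcp "SELECT convert(varchar(20),logtime,20) as logtime, emp_name, salary FROM testDb.dbo.testTab where convert(date,logtime) = convert(date,getdate()-1)" queryout "C:\output_file.csv" -c -t"|f|" -r"|r|" -SPX-LAP -Usa -Pporosh
mysql -uUserName -pPassword <"c:\\config.txt"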
Let's check if data is loaded in MySQL Table ;)
mysql> select * from testdb.testtab;
+---------------------+----------+--------+
| logtime             | emp_name | salary |
+---------------------+----------+--------+
| 2015-05-05 15:30:00 | jillur   |   15.6 |
| 2015-05-05 15:30:00 | naim     |   55.9 |
+---------------------+----------+--------+
2 rows in set (0.00 sec)
Voila!!!!!
PS: See that automatic number rounding in the salary field? (second row above; original
MSSQL value: 55.9, dumped value in the file: 55.899999999999999, MySQL value: 55.9)
That can be a headache if we don't do the T part of eTl in the right/preferred way :D
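One possible fix for the T step (a sketch, not part of the run above): cast the float to a fixed-precision decimal inside the bcp query, so the dump file already contains the rounded value and MySQL loads exactly what MSSQL shows:
bcp "SELECT convert(varchar(20),logtime,20) as logtime, emp_name, convert(decimal(10,2),salary) as salary FROM testDb.dbo.testTab" queryout "C:\output_file.csv" -c -t"|f|" -r"|r|" -SPX-LAP -Usa -Pporosh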
