
Employee Number: 135379, 117560
Employee Name: Ashish Ranjan, Hossein Sadiq
Name of the Project: CDW-MAC VA
Location: SJM Towers, Bangalore
Designation: ASE, AST
Contact Number: 020 3 6660 3 646456136
White paper Topic: DWH-ETL Testing Approach
E-mail id: ashish.ranjan@tcs.com, hossein.sadiq@tcs.com

DWH-ETL Testing Approach


Ashish Ranjan, ashish.ranjan@tcs.com, Tata Consultancy Services Ltd.
Hossein Sadiq, hossein.sadiq@tcs.com, Tata Consultancy Services Ltd.

Abstract

This paper describes how Data Warehouse (DWH) ETL testing best practices, both manual and automated, give Tata Consultancy Services (TCS) an edge in the competitive IT services industry. Its focal point is streamlining manual DWH-ETL testing to increase test coverage, reduce time to market, and pave the way for incremental automation. The DWH-ETL automation framework at the end of the paper shows how the incremental automation approach can ultimately fit into an automation framework for end-to-end testing of DWH-ETL applications. The incremental approach rests on the philosophy that the ROI of any automation solution should start accruing right after the first component gets built. To ensure this, components should be capable of running independently as well as flexible enough to fit into the overall framework. This reduces operational costs and manages the consumption of technology resources so as to maximize business value. The paper also explains the value addition and cost efficiencies that these best practices and this automation approach bring to customers; relevant case studies from TCS are included to illustrate the concepts.

1.0 Introduction
Many organizations today are challenged to do more with fewer resources and to make cost reduction a strategic priority. Most testing teams constantly face the challenge of innovating solutions that add value to their customers, who in turn are looking to reduce costs and increase coverage. DWH-ETL testing is one such area, and its maturity level, with respect to testing methodology, is generally very low.

Typically a team should first move towards standardisation of its manual testing processes. Once the manual testing processes are standardized, the team should move towards automation.

2.0 What is a Data Warehouse?


A data warehouse is the main repository of an organization's historical data, its corporate memory. For example, a credit card company would use the information stored in its data warehouse to find out in which months of the year its customers have a particularly high rate of defaulting on their credit card payments, or the spending habits of different segments of society and age groups. In other words, the data warehouse contains the raw material for management's decision support system.

3.0 What is DWH-ETL?


Extract, transform, and load (ETL) is a process in data warehousing that involves extracting data from outside sources, transforming it to fit business needs, and ultimately loading it into the data warehouse.

ETL can in fact refer more generally to any process that loads a database.
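As a minimal illustration of the three stages, the sketch below uses an in-memory sqlite3 database as a stand-in for the warehouse; the feed, table, and field names are hypothetical, not from the paper.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an external feed file.
SOURCE_CSV = "id,amount\n1,10.50\n2,3.25\n"

def extract(text):
    """Extract: read raw records from the delimited source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: apply a business rule (store amounts in cents)."""
    return [(int(r["id"]), round(float(r["amount"]) * 100)) for r in rows]

def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS fact_payment (id INTEGER, cents INTEGER)")
    conn.executemany("INSERT INTO fact_payment VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
print(conn.execute("SELECT id, cents FROM fact_payment ORDER BY id").fetchall())
# -> [(1, 1050), (2, 325)]
```

The same shape holds whether the target is a warehouse or, as noted above, any database.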

4.0 DWH-ETL Testing


There is a need for more focused testing of DWH-ETL processes. When testing ETL, it should be validated that all the specified data gets extracted, and tests should include checks that the transformation and cleansing processes are working correctly.

The data loaded into each target field, and its counts, should be tested to ensure correct execution of business rules such as valid-value constraints, cleansing, and calculations.
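A hedged sketch of such load checks follows; the field names, rule set, and counts are illustrative assumptions, not taken from the paper.

```python
# Hypothetical loaded target rows and the rules they must satisfy.
target_rows = [
    {"acct": "A1", "status": "OPEN", "balance": 120.0},
    {"acct": "A2", "status": "CLOSED", "balance": 0.0},
]
VALID_STATUS = {"OPEN", "CLOSED", "FROZEN"}   # valid-value constraint
source_count = 2                              # rows extracted upstream

def check_load(rows, expected_count):
    """Verify counts and valid-value/business rules on the loaded target."""
    failures = []
    if len(rows) != expected_count:
        failures.append(f"count mismatch: {len(rows)} != {expected_count}")
    for r in rows:
        if r["status"] not in VALID_STATUS:
            failures.append(f"{r['acct']}: bad status {r['status']!r}")
        if r["balance"] < 0:
            failures.append(f"{r['acct']}: negative balance")
    return failures

print(check_load(target_rows, source_count))  # -> [] when every rule holds
```

Each failed rule is reported rather than stopping at the first, so one run surfaces every discrepancy in the load.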

If there are different jobs in an application, the job dependencies should also be verified.

ETL applications can be language based or tool based. If the capability of the ETL tool is fully utilised, it reduces some of the testing effort otherwise required in a language-based application.

4.1 Manual Testing Strategy for DWH-ETL

In this section, the manual System Testing and Regression Testing strategies are discussed in detail. Most of the processes mentioned below have been streamlined to make the transition to automation as smooth as possible.

For System Testing:


From a very high level, any DWH-ETL application has inputs which undergo a transformation and result in an output. Hence for system testing, the general steps to follow for success and coverage are:
1) Scenarios based on the documents
2) Test data prepared based on the scenarios
3) Expected results based on the transformation and the test data
4) Run the application using the test data
5) Capture the actual output
6) Analyse the results after comparing the expected to the actual

Pre-requisites:
1) Business Requirement document
2) Functional Requirement document
3) Technical Requirement document
4) Source to Target Mapping document
5) Code diff
6) Traceability Matrix
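Steps 1-3 above can be captured as plain data. A minimal sketch, with hypothetical scenario fields and values, shows how Input and Expected Output views derive from the scenarios:

```python
# Hypothetical scenario rows: unique source values paired with the expected
# target values after transformation, covering positive and negative cases.
scenarios = [
    {"id": "S01", "case": "positive: amount converted to cents",
     "src_field": "amount", "src_value": "12.34",
     "tgt_field": "amount_cents", "tgt_value": "1234"},
    {"id": "S02", "case": "negative: blank amount rejected",
     "src_field": "amount", "src_value": "",
     "tgt_field": "amount_cents", "tgt_value": "REJECT"},
]

# Derive the Input and Expected Output worksheets from the scenario sheet.
input_sheet = [{"scenario": s["id"], s["src_field"]: s["src_value"]}
               for s in scenarios]
expected_sheet = [{"scenario": s["id"], s["tgt_field"]: s["tgt_value"]}
                  for s in scenarios]
print(input_sheet[0])  # -> {'scenario': 'S01', 'amount': '12.34'}
```

Keeping the scenario identifier on every derived row is what later lets the comparison step report mismatches per scenario.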

How to come up with a Scenario Sheet?

Based on the Functional Requirement document, Technical Requirement document and Source to Target Mapping document, scenarios should be created to check the new functionality going in. Scenarios should cover both positive and negative cases. Typically scenarios can be created in an Excel sheet, with one worksheet describing them in plain English. If it is a complex application where intermediate files are being created, the application should be broken into sub-applications. A sub-application is any set of jobs clubbed together that has a physical input and a physical output. Scenario sheets should be created at the sub-application level and then merged into a final application-level scenario sheet.

Each scenario should have its corresponding source file/table, source field/column names and target file/table, target field/column names, even at the sub-application level. The values put in the source fields/columns can be hypothetical ones, but they should be unique and should have the correct corresponding values (after transformation) put in the target fields/columns. These scenarios should be discussed with the Business Analyst, Development Manager and Test Lead with respect to coverage, duplication and relevancy. Once agreed upon by all stakeholders, two new worksheets should be created in the same Excel sheet: Input and Expected Output. The Input worksheet should have real-world values, with field-level details, which can be processed through the application. The Expected Output worksheet should have the expected target values that result if the Input worksheet values are passed through the application.

How to create Test Data?

If the inputs used by the application are files, then based on the Input worksheet of the Excel sheet and the file layout, the input files used by the application should be mocked up.
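For a fixed-width layout, mocking up the input file can be sketched as below; the layout, field names, and widths are assumptions for illustration, not from the paper.

```python
# Hypothetical fixed-width layout for the input file: field name -> width.
LAYOUT = [("acct", 6), ("amount", 10), ("status", 6)]

def mock_input_file(path, rows):
    """Write Input-worksheet rows as a fixed-width flat file per LAYOUT."""
    with open(path, "w") as f:
        for row in rows:
            # Pad each value to its column width, truncating if too long.
            f.write("".join(str(row[name]).ljust(width)[:width]
                            for name, width in LAYOUT) + "\n")

mock_input_file("test_input.dat",
                [{"acct": "A00001", "amount": "12.34", "status": "OPEN"}])
```

The same row dictionaries can drive a delimited-file writer instead when the application's layout is not fixed width.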

If the input used by the application comes from a table, then again based on the Input worksheet of the Excel sheet, the corresponding values should be inserted into the columns of the concerned tables.

How to ensure the mocked-up test data file gets picked up when the application runs?

At a very high level there are three ways to ensure this, depending on the application:
1. Renaming the file
2. Changing the header and trailer
3. Getting the name from the Control DB and naming the file based on that

How to capture the actual output for analysis?

The best way is to import the data from the output file or table into a new Excel sheet. Excel has built-in functionality for importing delimited or fixed-column-width flat text files, and most DB querying tools, such as Toad, SQL Navigator and SQL Developer, can save their results to Excel. Storing the output this way simplifies the comparison between expected and actual, and keeps the results in a readable format for future reference.

How to analyse the results?

A field-to-field comparison should be done between the expected and the actual, and any discrepancies should be highlighted. First, the scenario showing a difference should be double-checked against its expected value; once that is confirmed, a defect should be raised for that scenario.
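The field-to-field comparison can be sketched as follows; the row key and field names are illustrative assumptions.

```python
# Field-level comparison: rows keyed by scenario id, columns compared one by
# one; returns (scenario, field, expected, actual) for each mismatch.
def field_diff(expected_rows, actual_rows, key="scenario"):
    actual_by_key = {r[key]: r for r in actual_rows}
    diffs = []
    for exp in expected_rows:
        act = actual_by_key.get(exp[key], {})
        for field, want in exp.items():
            if field != key and act.get(field) != want:
                diffs.append((exp[key], field, want, act.get(field)))
    return diffs

expected = [{"scenario": "S01", "amount_cents": "1234", "status": "OPEN"}]
actual   = [{"scenario": "S01", "amount_cents": "1234", "status": "FROZEN"}]
print(field_diff(expected, actual))
# -> [('S01', 'status', 'OPEN', 'FROZEN')]
```

Because each discrepancy carries its scenario id, a defect can be raised directly against the failing scenario, as described above.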

For Regression Testing:


Regression testing of DWH-ETL applications generally seems a daunting and time-consuming task. Typically, due to resource/time constraints:
o Random sampling of unchanged, highly critical functionality is targeted
o System testing of the randomly sampled functionality is done

This leaves plenty of code going into production with a very low confidence level.

Most of the unchanged medium- and low-criticality functionality never gets tested release after release. Sometimes, due to time constraints and the high number of critical functions in the application, even some of the critical functions never get verified.

Regression testing is done to ensure that the new code did not affect anything other than what it was supposed to affect. This can be achieved by taking a snapshot of the application before the code gets implemented and another snapshot after. A difference of the two snapshots will show what changed, and anything that was not supposed to change will also be highlighted. The important thing to remember is that this only works if the input remains constant; thus, for regression testing, the test file should remain constant. There are two ways a regression input file can be created:
1) By taking a sample of a production input file and desensitising it.
2) By incrementally adding the System Test input data file to a regression test file with each release. Progressively, with each release, the regression test file will get better in its scope and coverage.
Once this is achieved, to regression test a release, the same file should be run through the application before and after the code is implemented. The output should be captured for the before and after runs and stored, preferably in an Excel sheet in different worksheets. A simple comparison will highlight the differences between the two worksheets. There will be differences either because 1) they are supposed to be different (the new change for this release) or 2) the new code broke something it was not supposed to affect. Segregating the two and analysing the second category will pinpoint what the new code broke, which is exactly what regression testing aims for.
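The before/after segregation above can be sketched as follows; the record keys, row values, and the set of intended changes are hypothetical.

```python
# Hypothetical before/after snapshots keyed by record id; each value is the
# full output row for that record.
before = {"A1": ("OPEN", 1050), "A2": ("CLOSED", 0), "A3": ("OPEN", 75)}
after  = {"A1": ("OPEN", 1050), "A2": ("FROZEN", 0), "A3": ("OPEN", 75)}
intended = {"A2"}   # records the release was supposed to change

def regression_diff(before, after, intended):
    """Return (expected_changes, suspect_changes) between the two runs."""
    changed = {k for k in before if before.get(k) != after.get(k)}
    # Records added or dropped between runs also count as changes.
    changed |= (set(after) - set(before)) | (set(before) - set(after))
    return sorted(changed & intended), sorted(changed - intended)

print(regression_diff(before, after, intended))  # -> (['A2'], [])
```

Anything in the second list is a candidate for "the new code broke something it was not supposed to affect".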

4.2 Incremental Automation Test Strategy for DWH-ETL

Most of the manual activities mentioned above can be effectively converted into independent automated components. To start the automation initiative, a testing project need not wait for all the components to be developed: as time and resource considerations permit, they can be developed incrementally, and the benefits start trickling in right after the first component is built. Later in the paper a complete automation framework is discussed; the same framework, with minor modifications, can be used to automate most backend application testing. Below are the activities which can be automated:

For System Testing:


How to create Test Data?

If the input file used by the application is a serial file with a specific format, a tool can be built which reads from the Excel worksheet and creates a serial file according to that file format. If the input file is an MFS file, the flat file created as above can be FTPed to the Unix box and another ETL component can create it.

How to ensure the mocked-up test data file gets picked up when the application runs?

A component can be built which, based on the application requirement, can:
1. FTP the serial test data file from Windows to the application's Unix directory
2. Rename the file so that it gets picked up. For Control DB based applications, the component can query the DB for the file name value and rename the file accordingly.

How to capture the actual output for analysis?

Two components can be built:
1. One converts the different file formats to a serial comma-delimited file.
2. The other connects to the DB, queries for the table updates done by the application, and extracts them into an Excel sheet.

How to analyse the results?

A simple comparison tool can be built which takes the Expected Result worksheet and the final output as inputs and does a comparison to show the differences at field level.
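The Control DB driven rename step described above can be sketched as below. Here sqlite3 stands in for the application's Control DB, and the table, column, and file names are invented for the sketch.

```python
import os
import sqlite3

# sqlite3 stands in for the Control DB; table and column names are invented.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE control (app TEXT, expected_file TEXT)")
db.execute("INSERT INTO control VALUES ('PAYMENTS', 'payments_in.dat')")

def stage_test_file(db, app, local_path):
    """Rename the mocked-up file to the name the Control DB says the
    application will look for, so the next run picks it up."""
    (target_name,) = db.execute(
        "SELECT expected_file FROM control WHERE app = ?", (app,)).fetchone()
    target = os.path.join(os.path.dirname(local_path) or ".", target_name)
    os.replace(local_path, target)
    return target

with open("mocked.dat", "w") as f:
    f.write("A0000112.34     OPEN  \n")
print(stage_test_file(db, "PAYMENTS", "mocked.dat"))
```

In the full component the rename would happen on the Unix box after the FTP step; the query-then-rename logic is the same.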

This tool can be used for the regression testing too.

For Regression Testing:


It was mentioned earlier that for regression testing, "The output should be captured for the before and after runs and stored, preferably in an Excel sheet in different worksheets. A simple comparison will highlight the differences between the two worksheets." The comparison tool built for System Testing can be used equally well here; small modifications may be required based on the project requirements.

4.3 DWH-ETL Automation Framework Development


Framework development plays a key role in giving the tool the flexibility to grow its capabilities for future use. Careful thought has to be put into the development of the framework to arrive at the generic components that constitute it. The component model for the framework under discussion is shown in the schematic below. It is specific to Ab Initio/Tivoli, but all the components are flexible and can be combined to work effectively based on the system under test and the testing requirements. The framework also supports scalability to incorporate other similar modules and applications. The components explained below are described at a very high level; components or functionality can be added, removed or modified to customize the framework based on the application and the test automation requirements.

Framework component model

Component 1 - Converts an Excel tab sheet to ASCII files
o The user-defined Excel file should be pulled from a user-defined location.
o It should be able to select any user-defined tab in any user-defined worksheet and convert the information in the cells of a specified column and row range into an ASCII text file.
o The ASCII file should be named according to a user-defined name.
o The ASCII file should be placed in a user-defined location.

Component 2 - FTPs files from the Windows platform to the Unix box
o The component should pick the user-defined file.
o The component should be able to FTP the picked file to the user-defined Unix box (different applications reside on different servers).

Component 3 - Changes ASCII to MFS, SAS or EBCDIC depending on the application
o Sub-components can be defined which are each responsible for only one type of conversion.
o The component should be able to correctly map the fields from one format to another rather than just convert the format (DML dependency should be considered).
o The kind of format change and the DML should be user defined.
o The component should also place the converted file in the user-defined landing directories of the applications so it can be picked up.
o The file should be renamed to user-defined names.

Component 4 - Updates the Control DB to pick our files, using the Framework DB
o The Control DB should be updated with the correct file name so that the application can pick it up when it is running.
o It should validate whether the values in the Control DB for the application are correct.
o If they are not correct, it should raise an alert and stop.

Component 5 - Runs jobs one after another in the order mentioned in the Framework DB
o It should be able to mock the way Tivoli runs.
o If any job fails, it should raise an alert and stop.

Component 6 - Changes MFS, SAS or EBCDIC to ASCII depending on the application
o Sub-components can be defined which are each responsible for only one type of conversion.
o The component should be able to correctly map the fields from one format to another rather than just convert the format (DML dependency should be considered).
o The kind of format change and the DML should be user defined.
o The component should be able to pick up the file for conversion from the user-defined output directories of the applications.

Component 7 - FTPs output files from the Unix box
o The component should pick the user-defined file from the user-defined location and user-defined Unix box.
o The component should be able to FTP the picked file to the user-defined Windows location and box.

Component 8 - Extracts table loads in application tables to an ASCII file
o It should be able to extract values from a user-defined table and DB using a user-defined query.
o It should extract and place the file in CSV format in a Windows directory.

Component 9 - Compares Expected Output to Actual Output
o It should be able to locate the output ASCII file and the Expected Output file among all the files and pick them for comparison.
o The limitation of 256 columns needs to be addressed.

Component 10 - Updates Pass or Fail in QC
o Based on pass or fail, QC/TD should be updated.
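The run-and-stop-on-failure behaviour of Component 5 can be sketched as a plain sequential runner; the component names below are placeholders for the real components, and the failure is injected deliberately.

```python
# Minimal orchestrator sketch: each component is an independent callable; the
# runner executes them in the configured order, alerting and stopping at the
# first failure, mirroring Component 5's Tivoli-like behaviour.
def run_pipeline(components, alert=print):
    done = []
    for name, step in components:
        try:
            step()
        except Exception as exc:
            alert(f"ALERT: component {name!r} failed: {exc}")
            return done, name
        done.append(name)
    return done, None

steps = [
    ("excel_to_ascii", lambda: None),   # Component 1 stand-in
    ("ftp_to_unix", lambda: None),      # Component 2 stand-in
    ("bad_step", lambda: 1 / 0),        # injected failure
    ("compare_outputs", lambda: None),  # never reached
]
print(run_pipeline(steps))  # stops at 'bad_step' after raising the alert
```

Because each step is just a callable, a project can plug in only the components it has built so far, which is what makes the incremental approach workable.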

5.0 Success Stories


Case Studies:

1. Problem description: For a leading investment bank, the overall objective was to identify areas for process improvement to increase overall test efficiency. The challenges were: limited test coverage (2%-5%), since due to time and resource constraints only a few records per interface were validated; a tedious and time-consuming process; and proneness to human error.

Solution: The TCS team provided an optimal solution, with the above framework in mind, in the form of a tool which connects to the different databases of the investment bank's various data warehousing applications, retrieves the baseline data and the input data, compares the two data sets against each other, and reports any differences.

2. Problem description:

A bank's DWH testing division was looking for ways to improve manual testing efficiency and leverage automation. Their confidence in any complete automation solution was very low; they wanted to try automating a few things, but without a huge investment of effort and time.

Solution: The TCS team provided the optimal solution by suggesting the above incremental automation approach. It was readily adopted, and with each component developed, the cost-saving realizations started trickling in. TCS then suggested the DWH automation framework, and that too has been adopted.

6.0 Lessons Learnt


Establish clear and reasonable expectations
o Establish what percentage of your tests are good candidates for automation
o Eliminate overly complex or one-of-a-kind tests as candidates
o Get a clear understanding of the automation testing requirements
o Have technical personnel to develop and use the tool
o An effective manual testing process must exist before automation is possible; "ad hoc" testing cannot be automated. You should have:
o Detailed, repeatable test cases, which contain exact expected results
o A standalone test environment with a restorable database

Managing Resistance to Change


The tool does not replace the testers. It helps them by:
o Performing the boring, repeatable tasks performed while testing
o Freeing up some of their time so that they can create better, more effective test cases
Specific application changes will still be tested manually; some of these tests may be automated afterwards for regression testing.
o Not everyone needs to be trained to code; all they have to learn is a different method of testing.

Staffing Requirements

One area that organizations desiring to automate testing seem to consistently miss is the staffing issue. Automated test tools use "scripts" which automatically execute test cases. As mentioned earlier in this paper, these "test scripts" are programs. They are written in whatever scripting language the tool uses, and since they are programs, they must be managed in the same way that application code is managed.

7.0 Conclusion

The DWH manual testing process can be streamlined, making it easier to test the key areas and increase coverage. It will also make testing backend applications easier. With respect to automation, an incremental automation based test strategy can be easy, cost effective and efficient to implement. A framework that has the flexibility of scaling up, modularity and data dependency can deliver great benefits to the team and the organization.

References
o Totally Data-Driven Automated Testing - a white paper by Keith Zambelich
o What is data warehouse testing? by Rob Levy
o Wikipedia

Authors
Ashish Ranjan, TCS-JPMC Relationship. Email: ashish.ranjan@tcs.com
Hossein Sadiq, TCS-JPMC Relationship. Email: hossein.sadiq@tcs.com
