6.0 INTRODUCTION
6.1 GENERAL INTRODUCTION
IBM first published a technical article on information warehouse strategy in 1988 (Ballard, Chuck. 1996). This is a strategy for satisfying business needs for complex queries and insightful information with a managed database. In 1990, William Inmon (Inmon, W. H. 1997) coined the phrase "data warehouse". The ultimate goal of data warehousing is the creation of a single, logical view of data, which may reside in many physically disparate databases (Butler Group. 1996). Traditional database systems are good at recording and reporting what happened. A data warehouse shows why (Fisher, Lawrence. 1996).
Data warehouses represent the latest great paradigm of database management. The earliest data management systems were hierarchical, ran on massive mainframes, and were used primarily for archival purposes. The first big change came in the early 1980s, with the adoption of relational database systems, which have primarily operational applications. These systems, typically run on minicomputers, are used for online transaction processing (OLTP), for example to operate networks of automated teller machines. Now come data warehouses, commonly run on client/server networks of personal computers and more powerful server machines. These latest systems are used for online analytical processing (OLAP), an essentially strategic application.
Data warehouses organize and store data from the operational environment over a long historical time perspective. Consequently, they provide data found in the operational environment. A data warehouse allows users to recognize the data they want and, using simple query tools, to create their own queries based on a solid repository of integrated, historical data. The concept of a data warehouse is that it is a place where data extracted from production systems in the enterprise is stored (Warner, Tim. 1995).
The University of Dar es Salaam, as a big organization, has operational systems such as admission systems, an accommodation system, an examination record system, a master timetable, etc. All of these systems generate data that are vital to the University's decision makers. A data warehouse is required to organize all of these data so that they are readily accessible and meaningful to the Chief Academic Office in support of its decision making.
This study is divided into two main parts. The first part will involve a literature study and documentation of the architecture, planning and design methods, implementation techniques and layout options for a data warehouse. This part of the research will be carried out and documented for future reference. The second part of the research will be laboratory work. This will involve the actual development of a prototype data warehouse within the context of Relational Online Analytical Processing (ROLAP).
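As a rough illustration of the ROLAP idea mentioned above, the following minimal sketch stores a small star schema in an ordinary relational database and answers an analytical question with a plain SQL aggregation. The table names (dim_programme, dim_year, fact_exam), the columns and all figures are hypothetical and invented for illustration only; they are not taken from any University system.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory star schema with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_programme (
            programme_id INTEGER PRIMARY KEY,
            faculty      TEXT,
            programme    TEXT
        );
        CREATE TABLE dim_year (
            year_id       INTEGER PRIMARY KEY,
            academic_year TEXT
        );
        CREATE TABLE fact_exam (
            programme_id INTEGER REFERENCES dim_programme(programme_id),
            year_id      INTEGER REFERENCES dim_year(year_id),
            candidates   INTEGER,
            passes       INTEGER
        );
        """
    )
    # Hypothetical dimension rows.
    conn.executemany(
        "INSERT INTO dim_programme VALUES (?, ?, ?)",
        [
            (1, "Science", "BSc Computer Science"),
            (2, "Science", "BSc General"),
            (3, "Engineering", "BSc Civil"),
        ],
    )
    conn.executemany(
        "INSERT INTO dim_year VALUES (?, ?)",
        [(1, "1998/99"), (2, "1999/2000")],
    )
    # Hypothetical facts: one row per programme per academic year.
    conn.executemany(
        "INSERT INTO fact_exam VALUES (?, ?, ?, ?)",
        [
            (1, 1, 40, 36), (2, 1, 60, 45), (3, 1, 50, 40),
            (1, 2, 50, 45), (2, 2, 70, 56), (3, 2, 55, 44),
        ],
    )
    return conn


def pass_rate_by_faculty(conn):
    """Roll the fact table up to (faculty, academic year) and add a pass rate."""
    rows = conn.execute(
        """
        SELECT p.faculty, y.academic_year,
               SUM(f.candidates), SUM(f.passes)
        FROM fact_exam AS f
        JOIN dim_programme AS p ON p.programme_id = f.programme_id
        JOIN dim_year AS y ON y.year_id = f.year_id
        GROUP BY p.faculty, y.academic_year
        ORDER BY p.faculty, y.academic_year
        """
    ).fetchall()
    return [
        (faculty, year, candidates, passes, round(100.0 * passes / candidates, 1))
        for faculty, year, candidates, passes in rows
    ]


if __name__ == "__main__":
    for row in pass_rate_by_faculty(build_demo_warehouse()):
        print(row)
```

The point of the sketch is only that the analytical view (pass rate per faculty per year) is computed by relational joins and GROUP BY over fact and dimension tables, which is the essence of ROLAP. A real prototype would differ in schema, data volume and database engine.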
that data is organized into functional silos, from which it is hard to extricate what you want in other, related functions (Jack D. Doyle. 1997).
At the University of Dar es Salaam, despite the availability of more and more powerful computers on everyone's desk and of communication networks, a large number of executives and decision makers can't get their hands on critical information that already exists in the University. One of the executives of the University is the Chief Academic Officer. As an educational institution, the University creates data every day about students, supporting programmes, staff, etc. These data are important in supporting the daily work of the Chief Academic Office, but for the most part they are locked up in a myriad of manual and computer systems and are exceedingly difficult for the Chief Academic Officer to get at.
We intend to conduct a study to analyse, design and implement a data warehouse that will greatly improve information access for the Chief Academic Office. According to Michael Haisten (1998), the most powerful justifications for a data warehouse investment, here in the Chief Academic Office, are quality goals, since its typical objectives are improving information access, bringing users in touch with their data, enhancing the quality of their decisions, and providing cross-function integration of operational systems within the organisation.
The results obtained will then be useful for the future development of a successful data warehouse for the Chief Academic Office of the University of Dar es Salaam.
understood the data and manipulate them while making decisions for the UDSM.
maintain. Other functions within the organization have to do with planning, forecasting and managing the organization. These are the knowledge-based functions, which form the information system of the organization. Information systems have to do with analyzing data and making decisions, often major decisions about how the enterprise will operate, now and in the future. Informational data needs often span a number of different areas and require large amounts of different operational data in summary form. A data warehouse provides information to the knowledge-based functions (Decision Support Systems) within the organization. The operational systems generate data that have to be loaded into and organized in the data warehouse (Vince Desio). Consider Fig.1 below.
Fig.1: The concept of data warehouse.
(Source: http://www.datawarehouseconsulting.com/img2.gif)
A data warehouse can be physically centralized, logically centralized but physically distributed, or simply distributed. With today's powerful Local Area Network based database servers, a data warehouse can also take advantage of the benefits of distributed computing.
Building a data warehouse is essentially a complex integration effort. Literally hundreds of system components must be brought together to work as an integrated application (Vince Desio. 1998). The graphic below represents only a high-level view of the basic components that comprise a data warehouse.
Fig.2: Data Warehouse Components (labels in the figure: Administration; Internal & External Operational Data; Sourcing; Load; Metadata; Middleware; Performance Management; Abstractions; User Object Preparations; Information Access; Desktop Tools)
made up of a number of interconnected parts (The Ken Orr Institute; revised edition, 2000):
Operational Data Base / External Data Base Layer
Information Access Layer
Data Access Layer
Data Directory (Metadata) Layer
Process Management Layer
Application Messaging Layer
Data Warehouse Layer
Data Staging Layer

Operational Data Base / External Data Base Layer
The goal of data warehousing is to free the information that is locked up in the operational databases and to mix it with information from other, often external, sources of data. Increasingly, large organizations are acquiring additional data from outside databases. This information includes demographic, econometric, competitive and purchasing trends. The so-called "information superhighway" is providing access to more data resources every day.

Information Access Layer
The Information Access layer of the data warehouse architecture is the layer that the end-user deals with directly. In particular, it represents the tools that the end-user normally uses day to day, e.g. Excel, Word, Access, PowerPoint, SAS, etc. This layer also includes the hardware and software involved in displaying and printing reports, spreadsheets, graphs and charts for analysis and presentation.

Data Access Layer
The Data Access layer of the data warehouse architecture allows the Information Access layer to talk to the Operational layer. In the network world today, the common data language that has emerged is SQL. The Data Access layer is therefore responsible for interfacing between Information Access tools and operational databases.

Data Directory (Metadata) Layer
In order to provide universal data access, it is absolutely necessary to maintain some form of data directory or repository of metadata. Metadata is the data about data within the enterprise. A fully functional warehouse needs a variety of metadata: data about the end-user views of data and data about the operational databases.

Process Management Layer
The Process Management layer is involved in scheduling the various tasks that must be accomplished to build and maintain the data warehouse and the data directory information. It can be thought of as the scheduler or the high-level job control for the many processes (procedures) that must occur to keep the data warehouse up to date.

Application Messaging Layer
The Application Messaging layer has to do with transporting information around the enterprise computing network.

Data Warehouse (Physical) Layer
The (core) data warehouse is where the actual data used primarily for informational purposes resides.

Data Staging Layer
Data staging is also called copy management or replication management, but in fact it includes all of the processes necessary to select, edit, summarize, combine and load data warehouse and information access data from operational and/or external databases.
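The data staging steps named above (select, edit, summarize, combine, load) can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions: the two "operational extracts", their field names (reg_no, faculty, year, hall) and the dictionary standing in for the warehouse are all hypothetical, and the sketch combines the sources before summarizing, which is one possible ordering among several.

```python
# Hypothetical extracts from two operational systems (invented data).
ADMISSIONS = [
    {"reg_no": "A1", "faculty": " science ", "year": 1999},
    {"reg_no": "A2", "faculty": "Science", "year": 1999},
    {"reg_no": "A3", "faculty": "ENGINEERING", "year": 1999},
    {"reg_no": "A4", "faculty": "Science", "year": 1998},
]
ACCOMMODATION = [
    {"reg_no": "A1", "hall": "Hall 1"},
    {"reg_no": "A3", "hall": "Hall 2"},
]


def select(rows, year):
    """Select: keep only the rows for the academic year being staged."""
    return [r for r in rows if r["year"] == year]


def edit(rows):
    """Edit: clean inconsistent spellings of the faculty name."""
    return [{**r, "faculty": r["faculty"].strip().title()} for r in rows]


def combine(students, rooms):
    """Combine: attach accommodation data to each admitted student."""
    hall_by_reg_no = {r["reg_no"]: r["hall"] for r in rooms}
    return [{**s, "hall": hall_by_reg_no.get(s["reg_no"])} for s in students]


def summarize(rows):
    """Summarize: count students and residents per faculty."""
    counts = {}
    for r in rows:
        entry = counts.setdefault(r["faculty"], {"students": 0, "resident": 0})
        entry["students"] += 1
        if r["hall"] is not None:
            entry["resident"] += 1
    return counts


def load(warehouse, year, summary):
    """Load: store the summary under its year (a dict stands in for the warehouse)."""
    warehouse[year] = summary


def stage(warehouse, admissions, accommodation, year):
    """Run the staging steps in sequence for one academic year."""
    summary = summarize(combine(edit(select(admissions, year)), accommodation))
    load(warehouse, year, summary)
    return summary


if __name__ == "__main__":
    warehouse = {}
    stage(warehouse, ADMISSIONS, ACCOMMODATION, 1999)
    print(warehouse)
```

In a real warehouse each step would read from and write to database tables, and the Process Management layer would schedule the run; the function boundaries here only mirror the vocabulary of the staging layer.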
Data warehousing is a new field in Tanzania, and currently there is no known data warehouse in the country. This research will therefore create awareness among Tanzanian IT professionals, and society in general, of the power of data warehouses, especially at higher learning institutions such as universities, where all the necessary facilities for building data warehouses are present.
Development Studies; Institute of Kiswahili Research; Institute of Marine Sciences; Institute of Production Innovation; Institute of Resource Assessment; the University College of Lands and Architectural Studies and the Muhimbili University College of Health Sciences. The University also operates a Computing Centre, a Library and four bureaus: the Economic Research Bureau in the Faculty of Arts and Social Sciences; the Bureau for Educational Research and Evaluation in the Faculty of Education; the Bureau for Industrial Cooperation in the Faculty of Engineering; and the University Consultancy Bureau. The University is situated on the west side of the city of Dar es Salaam, occupying 1,625 acres on Observation Hill, 13 km from the city centre.
For purposes of maintaining East African inter-university academic cooperation and communication, an Inter-University Council for East Africa was set up in 1970. The Council has established an Inter-University Exchange Programme, through which the University admits students from other East African countries, mainly Kenya and Uganda. The University also admits students from several other countries the world over through established links, exchange programmes or individual applications. Most of these students receive their bursaries from their respective governments. Students from other countries are considered for admission to both undergraduate and postgraduate studies, subject to the availability of vacancies.
7.2 Methodology
A short visit will be made to the Chief Academic Office. This visit is intended to familiarize the researcher with the stakeholders, and will also enable an initial study of how information flows in and out of the CACO's office.
Give the possibility to generate theories from practice (as a preparation stage for developing the model of the data warehouse);
Allow an understanding of the nature and complexity of the processes taking place in a data warehouse;
Research an area in which few previous studies have been carried out;
Research an area in which it is necessary to measure variables, but there is no a priori knowledge of what the variables of interest will be. In this case the variables are aspects whose role it is necessary to determine and estimate.
7.3 EXPECTED RESULTS OF THE RESEARCH
Theoretical Results
The main theoretical result of the research will be the model, which supports the design and implementation of a data warehouse. The model should comply with the ongoing Information Plan Policy (IPP) at the University of Dar es Salaam. The model could include methods, techniques and/or instrumentation able to support the design and implementation of data warehouses in Tanzania.
Practical results
The main practical result of the research should be the realization of the design and implementation of the Chief Academic Office data warehouse at the University of Dar es Salaam. The success of this part of the research depends on the full support and willingness of the technical staff and management of the already installed systems, and on their realizing that this research will help meet their daily information needs.
8.0 REFERENCE/BIBLIOGRAPHY:
1. Jack D. Doyle. (1997). Informed Decision Making Through Data warehousing. http://dhrinfo.hr.state.or.us/intranet/tands/Dwpap/DWWHITEP.htm
2. Vince Desio. Data warehouse Components. http://www.datawarehouseconsulting.com/page3.html
3. Ken Orr. (1996). Data warehousing Technology. The Ken Orr Institute; revised edition, 2000.
4. Roger Burlton. (1998). Data warehousing in the Knowledge Management Cycle. http//datawarehouse.dci.com/articles.
5. Ralph Kimball. The Data warehouse Life Cycle Toolkit.
6. William H. Inmon. Building the Data warehouse.
7. Christopher Adamson, Michael Venerable. Data warehouse Design Solutions.
8. Michael Abbey, Ian Abramson, Larry Barner, Ben Taub, Michael J. Corey. SQL Server 7 Data warehousing.
11. Dorian Pyle. Data Preparation for Data Mining.
12. Mark Humphries, Michael W. Hawkins, Michelle C. Dy. Data warehousing: Architecture and Implementation.
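University costs:
DESCRIPTION: YEAR 1; SUBSEQUENT YEAR; SPONSOR
Tuition fees: Year 1 950,000/=; subsequent year 950,000/=; sponsor UDSM
Application fee: Year 1 10,000/=; sponsor -do-
Registration fee: Year 1 20,000/=; sponsor -do-
Thesis Supervision: Year 1 200,000/=; subsequent year 200,000/=; sponsor -do-
Medical capitation fee: Year 1 100,000/=; subsequent year 100,000/=; sponsor -do-
Special Faculty Requirement: Year 1 100,000/=; subsequent year 100,000/=; sponsor -do-
Research Field Cost: Year 1 750,000/=; sponsor -do-
TOTAL: Year 1 2,130,000/=; subsequent year 1,350,000/=; sponsor -do-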
(b).
Student costs:
YEAR 1 SUBSEQUENT YEAR SPONSOR
DESCRIPTION
UDSM -do-do-do-
Thesis Production: subsequent year 150,000/=; sponsor -do-
Stipend (based on 130,000/= per month): Year 1 1,560,000/=; subsequent year 1,560,000/=; sponsor -do-
TOTAL: Year 1 1,913,200/=; subsequent year 2,061,200/=; sponsor -do-
9.1.2: RESEARCH/FIELD AND MATERIAL COSTS (Computer Lab.)
Up-keep allowance and transport: 530,000/=
Processing fee: 120,000/=
Electrical and electronics components: 100,000/=
Subtotal: 750,000/=
9.1.3: RESEARCH PROPOSAL PRODUCTION:
Paper, 5 reams @ 5,000/=: 25,000/=
Secretarial services, 30 pages @ 600/=: 18,000/=
Photocopy, Department level, 30 pages @ 40/=, 20 copies: 24,000/=
Photocopy, Faculty level, 30 pages @ 40/=, 20 copies: 24,000/=
Photocopy, Senate level, 30 pages @ 40/=, 20 copies: 24,000/=
Subtotal: 115,000/=

9.1.4: THESIS PRODUCTION
Paper, 5 reams @ 5,000/=: 15,000/=
Secretarial services, 250 pages @ 600/=: 15,000/=
Diskettes, 3 boxes @ 5,000/=: 15,000/=
Photocopy, 250 pages @ 40/=, 4 copies: 40,000/=
Loose bound, 4 copies @ 5,000/=: 20,000/=
Final binding, 4 copies @ 6,000/=: 24,000/=
Subtotal: 264,000/=
TOTAL: 1,129,000/=
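The arithmetic behind the research proposal production subtotal and the overall TOTAL can be checked with a short script. This is only a sketch: the 9.1.3 line items are recomputed from the quantities and unit prices given above, while the 9.1.2 and 9.1.4 subtotals are taken as stated rather than recomputed. The variable and function names are illustrative.

```python
# 9.1.3 line items, recomputed from quantity x unit price (Tanzanian shillings).
proposal_items = {
    "Paper, 5 reams @ 5,000/=": 5 * 5_000,
    "Secretarial services, 30 pages @ 600/=": 30 * 600,
    "Photocopy, Department level, 30 pages @ 40/=, 20 copies": 30 * 40 * 20,
    "Photocopy, Faculty level, 30 pages @ 40/=, 20 copies": 30 * 40 * 20,
    "Photocopy, Senate level, 30 pages @ 40/=, 20 copies": 30 * 40 * 20,
}
proposal_subtotal = sum(proposal_items.values())

# Subtotals per budget section; 9.1.2 and 9.1.4 are used as stated in the budget.
section_subtotals = {
    "9.1.2": 750_000,
    "9.1.3": proposal_subtotal,
    "9.1.4": 264_000,
}
total = sum(section_subtotals.values())


def shillings(amount):
    """Format an amount in the budget's own notation, e.g. 115,000/=."""
    return f"{amount:,}/="


if __name__ == "__main__":
    for description, amount in proposal_items.items():
        print(f"{description}: {shillings(amount)}")
    print(f"Subtotal 9.1.3: {shillings(proposal_subtotal)}")
    print(f"TOTAL: {shillings(total)}")
```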
9.2:
ACTIVITY
Registration, literature review, Research Proposal.
Data warehouse planning, Analysis and Design.
Data warehouse Implementation and Testing.
Thesis write-up, production & submission.
Months: Nov., Dec., Jan., Feb., Mar., Apr., May
9.3: COMMENTS
Date:............................... Signature:...........................
Name: LUNGO, J. H. (Reg.No: HD/TP. 1/2000) (candidate)

Supervisor's Comments.
...................................................................................................................................
...................................................................................................................................
...................................................................................................................................
Date:............................... Signature:........................... (Supervisor)

Head of Department's Comments
...................................................................................................................................
...................................................................................................................................
...................................................................................................................................
...................................................................................................................................
Date:............................... Signature:...........................
Name: Dr. H. Twaakyondo
The Head, Department of Computer Science.