
Question 1

Update Blake's and Clark's salaries and check that the total company salary does not exceed 20000. If the
total salary > 20000, then roll back.

Answer 1

set serveroutput on;

DECLARE
    total_sal NUMBER(9);
BEGIN
    UPDATE EMP SET SAL = 3000 WHERE ENAME = 'BLAKE';
    SAVEPOINT blake_sal;                 -- marks the state after Blake's update

    UPDATE EMP SET SAL = 2500 WHERE ENAME = 'CLARK';
    SAVEPOINT clark_sal;                 -- marks the state after Clark's update

    SELECT SUM(SAL) INTO total_sal FROM EMP;

    IF total_sal > 20000 THEN
        -- undo Clark's update by returning to the savepoint taken after Blake's update
        ROLLBACK TO SAVEPOINT blake_sal;
    END IF;

    COMMIT;
END;
/

Question 2

Transaction | Data items locked by transaction | Data items transaction is waiting for
T1 | X2 | X1
T2 | X3, X10 | X7, X8
T3 | X8 | X4, X5
T4 | X7 | X1
T5 | X1, X5 | X3
T6 | X4, X9 | X6
T7 | X6 | X5

Determine whether a deadlock exists or not (draw the wait-for graph).

Answer 2

Yes, a deadlock exists. In the wait-for graph an edge Ti → Tj means Ti is waiting for a data item locked by Tj. The edges are: T1 → T5, T2 → T3, T2 → T4, T3 → T5, T3 → T6, T4 → T5, T5 → T2, T6 → T7 and T7 → T5. The graph contains the cycle T2 → T4 → T5 → T2 (and also T2 → T3 → T5 → T2), so the transactions on these cycles are deadlocked.
Question 3

Time | Transaction 1 | Transaction 2 | Amount
T1 | Begin Transaction | | 500
T2 | Read Amount | Begin Transaction | 500
T3 | Amount = Amount + 200 | Read Amount | 500
T4 | Write Amount | Amount = Amount + 50 | 700
T5 | Commit | Write Amount | 550
T6 | | Commit | 550

Find out the problem between two transactions.

Answer 3

Lost Update Anomaly: Transaction 1's write of 700 at T4 is overwritten at T5 by Transaction 2's write of 550 (computed from the value 500 it read earlier), so the +200 update is lost. Locking the data item prevents this, as shown below:

Time | Transaction 1 | Transaction 2 | Amount
T1 | Begin Transaction | | 500
T2 | write_lock(Amount) | Begin Transaction | 500
T3 | Read Amount | write_lock(Amount) | 500
T4 | Amount = Amount + 200 | Wait | 500
T5 | Write Amount | Wait | 700
T6 | Commit / unlock(Amount) | Wait | 700
T7 | | Read Amount | 700
T8 | | Amount = Amount + 50 | 700
T9 | | Write Amount | 750
T10 | | Commit / unlock(Amount) | 750
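
In Oracle SQL the write lock used in the schedule above can be taken explicitly with SELECT ... FOR UPDATE. A minimal sketch, assuming a hypothetical accounts(acc_no, amount) table holding the value 500 (the account number 101 is purely illustrative):

DECLARE
    v_amount NUMBER;
BEGIN
    -- Lock the row before reading, so a concurrent transaction must wait
    SELECT amount INTO v_amount
    FROM   accounts
    WHERE  acc_no = 101
    FOR UPDATE;

    UPDATE accounts SET amount = v_amount + 200 WHERE acc_no = 101;

    COMMIT;  -- releases the lock; the waiting transaction now reads 700
END;
/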

Question 4

Following are the schemas of the database:

Sailors(Sid,Sname,rating,age)

Boat(Bid,Bname,color)

Reserve(Sid,Bid,day)

a) Find the age of sailors whose name begins with 'Am'.


b) Find the Sid of sailors whose name begins and ends with 'Z'.
c) Write SQL commands to create the sailor table
Sid: number
Sname: varchar2
rating: number
age: number
Define a unique and not null constraint on Sid, with values in the range 10 to 20.
d) Create 2 triggers:
Before insert (counter): to initialise a counter.
After insert (Sailor_count): to count the number of sailors whose rating is less than 5.

Answer 4

a) select Sname,age from Sailors where Sname LIKE 'Am%';


b) select Sname, Sid from Sailors where Sname LIKE 'Z%Z';
c) create table Sailors
(
Sid number,
Sname varchar2(15),
rating number,
age number,
Primary Key(Sid),
constraint chk check(Sid BETWEEN 10 AND 20)
);
d) 1. Create or replace trigger counter
Before insert on Reserve
For each row
Declare
    counter int;
Begin
    select count(*) into counter from reserve;
    counter := counter + 1;
    dbms_output.put_line('Next counter is ' || counter);
End;
/

2. Create or replace trigger Sailor_count
After insert on Sailors
For each row
Declare
    sailorcount int;
Begin
    -- Querying Sailors inside a row-level trigger on Sailors works for single-row inserts;
    -- for multi-row inserts a statement-level trigger (without FOR EACH ROW) avoids the
    -- mutating-table error.
    Select count(*) into sailorcount from Sailors where rating < 5;
    dbms_output.put_line('The number of sailors whose rating is less than 5 is ' || sailorcount);
End;
/
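
A quick way to exercise both triggers (the values below are purely illustrative):

set serveroutput on;
insert into Sailors values (12, 'Amit', 4, 25);      -- fires Sailor_count
insert into Reserve values (12, 101, SYSDATE);       -- fires counter
commit;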

Question 5

A company has its headquarters in Delhi and major operations in Bangalore and Hyderabad.
The company wants to design a distributed relational database consisting of 3 tables: A, B
and C. Table A consists of 1 lakh (100,000) records and is frequently required in all cities. Table B
consists of 75,000 records; records 1 to 30,000 are most frequently used in Delhi and records
30,001 to 75,000 are most frequently used in Bangalore. Table C consists of 20,000 records and is
used exclusively in Delhi.

a) Assess and justify 4 advantages the company is expected to achieve by operating a
distributed database instead of using a centralized database.

Answer:-

In the distributed database system, databases are geographically separated across the sites
that share no physical components and are separately administered and are connected
with one another through various communication media such as high-speed networks or
telephone lines.

Also, in a distributed database system, there may be heterogeneity of hardware and
operating systems at each site.

Advantages of Distributed Database over Centralized Database:-

1. Sharing Data: The primary advantage of DDBMS is the ability to share and access data in an
efficient manner.
DDBMS provides an environment where users at a given site are able to access data
stored at other sites

For example, consider an organization with many branches: each branch stores data
related to that branch, yet a manager of a particular branch can access information about
any branch.

2. Improved availability and reliability:

Distributed systems provide greater availability and reliability. Availability is the
probability that the system is running continuously throughout a specified period,
whereas reliability is the probability that the system is running (not down) at any given
point of time.

In distributed database systems, since data are distributed across several sites, the failure of a
single site does not halt the entire system. The other sites can continue to operate if one
site fails.

Only the data that exist at the failed site cannot be accessed. This means some of the data
may be inaccessible, but other parts of the database may still be accessible to the users.

Furthermore, if the data are replicated at one or more sites, availability and reliability can
be improved further.

3. Autonomy: The possibility of local autonomy is the major advantage of distributed database
system.

Local autonomy implies that all operations at a given site are controlled by that site. It
should not depend on any other site. Further, the security, integrity and representation of
local data are under the control of the local site.

4. Easier expansion: Distributed systems are more modular and hence can be expanded more
easily than centralized systems, where upgrading the system with changes in hardware and
software affects the entire database.

In a distributed system, the size of the database can be increased, and more processors or
sites can be added as needed with little effort.

b) Illustrate 3 different options available for placing the table A data.


Answer:-

In distributed DBMS, the relations are stored across several sites using two methods,
namely, fragmentation and replication.

Fragmentation of a relation R consists of breaking it into a number of fragments R1,
R2, ..., Rn such that it is always possible to reconstruct the original relation R from the
fragments R1, R2, ..., Rn.
The fragmentation can be horizontal, vertical or mixed.

Horizontal fragmentation: Horizontal fragmentation breaks a relation R by assigning
each tuple of R to one or more fragments. In horizontal fragmentation, each fragment is a
subset of the tuples of the original relation. Since each tuple of R belongs to at
least one of the fragments, the original relation can be reconstructed. Generally, a horizontal
fragment is defined as a selection on the relation R.
For example, the relation BOOK can be divided into several fragments on the basis of the
attribute Category as shown:

BOOK1 = σ_category='Novel'(BOOK)
BOOK2 = σ_category='Textbook'(BOOK)
BOOK3 = σ_category='LanguageBook'(BOOK)

Here each fragment consists of the tuples of BOOK that belong to a particular
category. Further, the fragments are disjoint.

Vertical fragmentation: Vertical fragmentation breaks a relation by decomposing the
schema of the relation R. In vertical fragmentation each fragment is a subset of the attributes
of the original relation. A vertical fragment is defined as a projection on the relation R:

Ri = π_Ai(R), where Ai is a subset of the attributes of R (each fragment should include the
primary key so that the join below can reconstruct R without loss).

The relation should be fragmented in such a way that the original relation can be
reconstructed by applying the natural join on the fragments. That is,

R = R1 ⋈ R2 ⋈ ... ⋈ Rn
Mixed Fragmentation: It is the combination of horizontal and vertical fragmentation.
The fragmentation obtained by horizontally fragmenting a relation can be further
partitioned vertically or vice versa. The original relation is obtained by the combination
of join and union operation.

Replication: Replication means storing a copy (or replica) of a relation or of relation
fragments at two or more sites.

The replication can be full or partial.

Storing a copy of an entire relation at every site is known as full replication.
When only some fragments of a relation are replicated, it is called partial replication.
In partial replication the number of copies of each fragment can range from one to the
total number of sites in the system.
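
Since table A is required frequently in all cities, one natural placement option is to replicate it at each site. A minimal sketch of how such a replica could be maintained in Oracle, assuming a hypothetical database link delhi_link to the site holding the master copy (all names are illustrative):

-- Create a read-only replica of table A at a remote site, refreshed on demand
CREATE MATERIALIZED VIEW table_a_replica
    REFRESH COMPLETE ON DEMAND
    AS SELECT * FROM table_a@delhi_link;

-- Later, refresh the replica to pick up changes made at the master site
BEGIN
    DBMS_MVIEW.REFRESH('TABLE_A_REPLICA');
END;
/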

c) Assess and justify which type of fragmentation you would choose for fragmenting the table B
data.

Answer:-

Fragmentation is a technique in which a relation is divided into several fragments and each
fragment can be stored at sites where they are more often accessed. Fragmentation can be of
three types namely Horizontal, Vertical and Mixed.

1. Horizontal Fragmentation: This type of fragmentation is concerned with dividing
the relation based on its tuples (rows). The technique breaks a relation R by assigning
each tuple of R to one or more fragments, so each fragment is a subset of the tuples in
the original relation. Since each tuple of R belongs to at least one of the
fragments, the original relation can be reconstructed.
2. Vertical Fragmentation: This type of fragmentation is basically related with dividing
the relation based on its attributes (columns). This type of fragmentation breaks a relation
by decomposing schema of relation R. In vertical fragmentation each fragment is a subset
of the attributes of the original relation.
3. Mixed Fragmentation: It is a combination of horizontal and vertical fragmentation.
The fragments obtained by horizontally fragmenting a relation can be further partitioned
vertically or vice versa. The original relation is obtained by the combination of join and
union operations.

Based on the three fragmentation techniques and the given scenario, we would use the
horizontal fragmentation technique, because the first 30,000 records of table B are used mainly
in Delhi and the remaining records are used mainly in Bangalore, while all the attributes are
needed at both sites; the split must therefore be on tuples rather than on attributes. Horizontal
fragmentation is used to split the 75,000 records of table B into two fragments (see the sketch
below). To reconstruct the original records, the union of the two fragments is taken, which
rebuilds the original relation.
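
A minimal sketch of this horizontal fragmentation in SQL, assuming table B has a numeric key column rec_id (the table and column names are illustrative):

-- Fragment kept at the Delhi site
CREATE TABLE b_delhi AS
    SELECT * FROM table_b WHERE rec_id <= 30000;

-- Fragment kept at the Bangalore site
CREATE TABLE b_bangalore AS
    SELECT * FROM table_b WHERE rec_id > 30000;

-- The original relation is the union of the two fragments:
-- SELECT * FROM b_delhi UNION ALL SELECT * FROM b_bangalore;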

d) In case the company does not intend to implement full replication during the first phase, which table
should not be replicated and why?

Answer:-

Table C should not be replicated, because it is used exclusively in Delhi; replicating it at Bangalore
or Hyderabad would add storage and update-propagation overhead without improving access for any site.

Question 6
Briefly describe Recursive Triggers in the context of triggers and, using an appropriate example, show
how triggers can cause recursion.

Answer:-

Recursion occurs when the same code is executed again and again. In a trigger it can lead to an
infinite loop, can exhaust system resources or hit platform limits, and can produce unexpected output
or errors. It is quite common to cause recursion accidentally in a trigger, so trigger code should be
written in such a way that it does not fire itself repeatedly.

Example:-

Create or replace trigger Update_Airport
After update on tbAirport
Begin
    -- The trigger body updates the same table the trigger is defined on,
    -- so every execution fires the trigger again.
    update tbAirport set AirportName = AirportName;
End;
/

In the above example, the trigger fires after any update on the table tbAirport. But the trigger body
itself issues an update on the same table, which fires the trigger again and results in an infinite
loop.
In short, creating an AFTER UPDATE statement trigger on a table that itself issues an UPDATE
statement on the same table causes the trigger to fire recursively until the maximum recursion depth
is exceeded (or memory is exhausted) and an error is raised.
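
One common way to break such recursion is to guard the trigger body with a package-level flag. A hedged sketch (all names are illustrative; exception handling is omitted for brevity):

CREATE OR REPLACE PACKAGE trg_state AS
    in_progress BOOLEAN := FALSE;   -- session-level flag shared by the trigger firings
END trg_state;
/

CREATE OR REPLACE TRIGGER Update_Airport
AFTER UPDATE ON tbAirport
BEGIN
    IF NOT trg_state.in_progress THEN
        trg_state.in_progress := TRUE;
        -- This update re-fires the trigger, but the flag stops the second invocation
        UPDATE tbAirport SET AirportName = AirportName;
        trg_state.in_progress := FALSE;
    END IF;
END;
/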

Question 7

Following is the relational schema---

book table(isbn, book title, category, price, copyright date, year, page count, pid)

publisher table(pid, pname, address, state, phone, emailid)

author table(a_id, a_name, city, state, zip, phone, url)

author_book table(a_id, isbn)

review table(review_id, isbn, rating)

write algebra queries---

a) retrieve city, phone and url of the author whose name is lewis.

Answer:- select city, phone, url from author where a_name = 'lewis';


b) retrieve the name, address and phone of all publishers located in new york state.

Answer:- select pname, address, phone from publisher where state = 'New York';

c) Retrieve pid, name, address and phone of publishers publishing novels.

Answer:- select p.pid, p.pname, p.address, p.phone from publisher p, book b where
p.pid = b.pid and b.category = 'novel';
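
Since the question asks for relational algebra, the equivalent algebra expressions for the three SQL answers above would roughly be (a sketch, using σ for selection, π for projection and ⋈ for natural join):

a) π_city,phone,url(σ_a_name='lewis'(author))

b) π_pname,address,phone(σ_state='New York'(publisher))

c) π_pid,pname,address,phone(publisher ⋈ σ_category='novel'(book))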

Question 8

Following is the relational schema--- (employee+minit)

Employee(fname,minit,lname,essn,bdate,address,salary,superssn,dno,sex)

Department(dname,dno,mgrssn,mgrstart_date)

Department_location(dno,dlocation)

Works_on(essn,pno,hours)

Project(pname,pno,plocation,dno)

Dependent(essn,dependent_name,bdate,sex,relationship)

--- write triggers for the following condition

a) Whenever an employee's project assignments are changed, check whether the total hours per week spent on
the employee's projects are less than 30 or greater than 40. If so, notify the employee's supervisor.

Answer:-

We assume that a procedure TELL_SUPERVISOR(ARGSSN) has been created. This procedure
looks for the employee whose SSN matches the procedure's ARGSSN argument and notifies that
employee's supervisor.

Create or replace trigger Inform_Supervisor_About_Hours
After update on Works_On
For each row
Declare
    total_hours number;
Begin
    -- Note: reading Works_On inside a row-level trigger on Works_On can raise the
    -- mutating-table error (ORA-04091); see the compound-trigger sketch below.
    Select sum(hours) into total_hours from works_on where essn = :new.essn;
    If total_hours < 30 or total_hours > 40 then
        Tell_Supervisor(:new.essn);
    End if;
End;
/
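
If the row-level version hits the mutating-table restriction, the same check can be written as a compound trigger that records the affected SSNs row by row and queries the table only after the statement completes. A hedged sketch (Oracle 11g+ syntax), reusing the assumed Tell_Supervisor procedure:

CREATE OR REPLACE TRIGGER Inform_Supervisor_Hours_CT
FOR INSERT OR UPDATE OR DELETE ON Works_On
COMPOUND TRIGGER
    TYPE t_ssn_tab IS TABLE OF Works_On.essn%TYPE INDEX BY PLS_INTEGER;
    g_ssns t_ssn_tab;

    AFTER EACH ROW IS
    BEGIN
        -- Remember which employee's assignments changed
        g_ssns(g_ssns.COUNT + 1) := NVL(:NEW.essn, :OLD.essn);
    END AFTER EACH ROW;

    AFTER STATEMENT IS
        v_total NUMBER;
    BEGIN
        FOR i IN 1 .. g_ssns.COUNT LOOP
            SELECT SUM(hours) INTO v_total FROM Works_On WHERE essn = g_ssns(i);
            IF v_total < 30 OR v_total > 40 THEN
                Tell_Supervisor(g_ssns(i));
            END IF;
        END LOOP;
    END AFTER STATEMENT;
END;
/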

b) Whenever an employee is deleted, delete the project (works_on) tuples and dependent tuples related to that
employee, and if the employee is managing a department or supervising any employees, set the
mgrssn for that department to null and set the superssn for those employees to null.
Answer:-

Create or replace trigger delete_cascade
After delete on Employee
For each row
Begin
    Delete from works_on  where essn = :old.essn;
    Delete from dependent where essn = :old.essn;
    -- Employees who were supervised by the deleted employee lose their supervisor
    Update employee set superssn = null where superssn = :old.essn;
    -- A department managed by the deleted employee loses its manager
    Update department set mgrssn = null where mgrssn = :old.essn;
End;
/

Note: in Oracle, updating Employee from within its own row-level trigger can raise the mutating-table
error; a statement-level trigger avoids this.

Question 9

consider the following relational schema---

customer(custno,cname,city)

order(orderno,orderdate,custno,amount)

order_item(orderno,itemno,qty)

item(itemno,unitprice)

--- on the basis of relational schema write the relational algebra query

a) retrieve the number and date of orders placed by customers residing in the city Delhi.

Answer:-

Select o.orderno, o.orderdate from order o, customer c where o.custno = c.custno and
c.city = 'Delhi';

b) retrieve the number and unit price of items for which an order of quantity greater than 50
is placed.

Answer:-

Select i.itemno, i.unitprice from item i, order_item o where i.itemno = o.itemno and
o.qty > 50;
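
The corresponding relational algebra expressions would roughly be (a sketch, in the same σ/π/⋈ notation as above):

a) π_orderno,orderdate(σ_city='Delhi'(customer) ⋈ order)

b) π_itemno,unitprice(item ⋈ σ_qty>50(order_item))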

Question 10
Question 11

Give the serial schedule and non-serial schedule of below transaction:

Time | Transaction 1 | Transaction 2
T1 | Begin | Begin
T2 | Read(X) | Read(X)
T3 | X := X - 10 | Read(Y)
T4 | Write(X) | X := X * 1.02
T5 | Read(Y) | Y := Y * 1.02
T6 | Y := Y + 10 | Write(X)
T7 | Write(Y) | Write(Y)
Answer:-

Serial Schedule:-

Time | Trans-1 | Trans-2
T1 | Begin transaction |
T2 | Read(X) |
T3 | X := X - 10 |
T4 | Write(X) |
T5 | Read(Y) |
T6 | Y := Y + 10 |
T7 | Write(Y) |
T8 | | Read(X)
T9 | | X := X * 1.02
T10 | | Y := Y * 1.02
T11 | | Write(X)
T12 | | Write(Y)
Non-Serial Schedule:-

Time | Trans-1 | Trans-2
T1 | Begin transaction |
T2 | Read(X) |
T3 | X := X - 10 |
T4 | Write(X) |
T5 | | Read(X)
T6 | | X := X * 1.02
T7 | Read(Y) |
T8 | Y := Y + 10 |
T9 | Write(Y) |
T10 | | Y := Y * 1.02
T11 | | Write(X)
T12 | | Write(Y)

Question 12

Differentiate between a data administrator and a database administrator according to
activities/characteristics. Describe any 6 activities that the DA and the DBA have.

Answer:-

The differences between the DA and the DBA include:

DA (Data Administrator) | DBA (Database Administrator)
Strategic planning | Control and supervision
Sets long-term goals | Executes plans to reach the goals
Sets policies and standards | Enforces policies and procedures; enforces programming standards
Broad scope | Narrow scope
Long-term focus | Short-term focus on daily operations
Managerial orientation | Technical orientation
DBMS-independent | DBMS-specific

ACTIVITIES OF DA AND DBA

DA | DBA
Perform business requirements gathering | Define required parameters for database definition
Analyze requirements | Analyze data volume and space requirements
Model business based on requirements (conceptual and logical) | Perform database tuning and parameter enhancements
Define and enforce standards and conventions (definition, naming, abbreviation) | Execute database backups and recoveries
Conduct data definition sessions with users | Monitor database space requirements
Manage and administer the metadata repository and Data Administration CASE (modeling) tools | Verify integrity of data in databases
Assist Database Administration in creating physical tables from logical models | Coordinate the transformation of logical structures to properly performing physical structures

Question 13

A consultant offers different types of services to customers. The consultants are looking into the
possibility of investing in data mining applications but are unsure of the potential benefits of such
applications. So, the consultant has requested your expertise as an IT consultant to shed some light
on this area and address the following aspects:

a) Clarify 4 reasons why the consultant needs data mining applications.


Answer:-

Simply storing information in a data warehouse does not provide the benefits the
consultant is seeking.

To realize the value of a data warehouse, it is necessary to extract the knowledge
hidden within the warehouse.

The amount and complexity of data in the data warehouse grows daily.

It is difficult for a business analyst to identify business trends and relationships in the
data using simple SQL queries and report-generating tools.

Data mining is one of the best ways to discover information within a data warehouse
that queries and reports cannot effectively reveal.

b) Discuss any 3 applications of data mining which would benefit the consultant.

Answer:-

To improve client acquisition, the consulting firm can implement prediction
techniques on its website and on partner websites to offer potential clients the
services they are most likely to need. These offers can show up on a particular
website in the form of recommendations, special promotions or discounts, and they
change dynamically according to the profile a particular user fits. This enables the
consulting firm to target specific users with specific offers that they are most likely
to accept.

Using data mining techniques, the consulting firm can create classes or clusters of its
clients according to one or more characteristics, such as the type of industry the
client is from or the type of service requested. After creating these homogeneous
clusters of clients, the firm can analyze its current client data and predict what kind
of service a particular client group is most likely to request. This allows the firm to
provide customized and specialized service to each client group.

With the help of data mining, the consulting firm will also be able to target each
client group with special promotions and discounts on the services that group is most
likely to request. This customized service helps to retain existing clients and also
serves as a means to attract new clients with the offer of customized and specialized
services.

c) Describe, using appropriate examples, 4 problems that the consultant may face with data mining.

Answer:-
d) Describe, using appropriate examples, how OLAP operations would benefit the consultant.

Answer:-

Question 14

Why is the relational database management system still widely used despite the emergence of
object-oriented databases? Provide 4 reasons.

Answer:-

There are several reasons as to why the relational model has gained its popularity until
now.

1. The model is well supported by mathematical concepts, which makes it simple and
easy to understand.

2. The ability to perform complex queries using SQL, a query language that fits the
relational model well, shows that it is still relevant for organizations that incorporate
IT solutions in their business operations to maximize ROI.

3. Relational DBMSs are currently the dominant database technology, and businesses
have invested so much money and so many resources in their development that change
is prohibitive. Moreover, the relational model is easy to use and simple to understand.

4. Relational databases are mature and extensively tested, while the object-oriented
model is newer and there is a general shortage of experienced, quality programmers.

5. In the relational model, programmers know how to optimize for high-speed
retrieval, whereas object-oriented databases still raise performance concerns.

Question 15

Star and snowflake schemas offer important advantages in a data warehouse. Illustrate any 5 advantages
of the star and snowflake schemas of a data warehouse.

Answer:-

STAR SCHEMA

The star schema is the simplest data warehouse schema, which consists of a fact table
with a single table for each dimension.

The centre of the star schema consists of a large fact table and the points of the star are
the dimension tables.
ADVANTAGES

Simple structure of the data: it is easy to understand how the elements are connected,
which simplifies the reporting of information.

Most common: easy to integrate with other tools.

More effective queries: queries in these systems are usually simpler, since the data
does not follow strict rules of normalization; another reason is the smaller number
of tables to join.

Performance enhancements: performance gains substantially due to the denormalized
form of the data.

Optimized for large data sets: because of the good performance of the system and its
queries, the star schema is efficient for data warehouses or data marts with huge data sets.

Rapid aggregation: tasks like sum, average and count are performed quickly on these systems.

Good for OLAP. (A small example star schema is sketched below.)
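
A minimal sketch of a star schema in SQL, using hypothetical sales tables (one fact table at the centre, referencing the dimension tables):

CREATE TABLE dim_date    (date_id    NUMBER PRIMARY KEY, cal_date DATE, cal_month VARCHAR2(10), cal_year NUMBER);
CREATE TABLE dim_product (product_id NUMBER PRIMARY KEY, pname VARCHAR2(30), category VARCHAR2(20));
CREATE TABLE dim_store   (store_id   NUMBER PRIMARY KEY, city  VARCHAR2(30), region   VARCHAR2(20));

-- The fact table sits at the centre of the star and references every dimension
CREATE TABLE fact_sales (
    date_id    NUMBER REFERENCES dim_date(date_id),
    product_id NUMBER REFERENCES dim_product(product_id),
    store_id   NUMBER REFERENCES dim_store(store_id),
    qty_sold   NUMBER,
    amount     NUMBER
);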

SNOWFLAKE SCHEMA

The snowflake schema is a variation of the star schema which may have multiple levels of
dimension tables.

ADVANTAGES

Better data quality: the information stored in the dimensions usually has far fewer
anomalies than in a star schema.

Less storage used: due to the normalization of the dimension tables, a lot of storage
space is saved through the significant decrease in data replication.

Better performance for specific queries: specific views are optimized by this structure,
since it is built to support them.

Optimized tools: there are several tools built to work with this kind of data
organization.

More structured data: the information is much more organized than in non-normalized
structures.

Question 16
List the selection criteria for choosing a suitable DBMS. Describe any 5 parameters to support your
answer.

Answer:-

The following points must be considered while selecting a suitable DBMS:-

Ease of use

Ease of administration

Reliability

Cost

Security

Compatibility

Minimum requirements

Familiarity with database

Support costs and availability

Question 17

Critically analyze how a data mart differs from a data warehouse and identify the main reasons for
implementing data marts.

Answer:-

Data Mart: - A data mart is one piece of a data warehouse where all the information is related to
a specific business area. Therefore it is considered a subset of all the data stored in that particular
database, since all data marts together create a data warehouse.

Data Warehouse: - A data warehouse (DW) is a repository of suitable operational data (data that
document the everyday operations of an organization) gathered from multiple sources, stored
under a unified schema, at a single site. It can successfully answer any ad hoc, complex,
statistical or analytical queries.

Difference between Data Warehouse and Data Mart

Following comparison is divided on several key points that explain the differences between
them:-

Data Scope
The first, and most obvious, difference is the scope of the information each one stores. On one hand,
data warehouses hold all kinds of data related to the system. On the other hand, data marts store only
information on a specific subject, making them much more focused.

Size

Based on the definitions we can say that a data warehouse is usually much bigger than data
marts, because it keeps a lot more data.

Integration

As you may know, a data warehouse usually integrates several sources of data in order to feed its
database and the system's needs. In contrast, a data mart has a lot less integration to do, since its
data is very specific.

Creation

Creating a data warehouse is far more difficult and time consuming than building a data mart.
Building all the structural relationships between data is a long and very important step, and you
also need to think through and analyze how you will integrate all of your information sources. Since
data marts are smaller and subject-oriented, these actions tend to be much simpler.

However, a well-built data warehouse can support large systems in the long run, whereas a good data
mart is limited to its own activity area.

Management

Like creation, the management of a data warehouse is far more complex than that of a data mart. For
the same reasons stated above, when there are far more data, relationships and processes to manage,
it becomes a harder task.

Cost

Overall, in terms of cost, data marts are cheaper than a data warehouse. To build and maintain a
data warehouse you need significantly more physical resources such as servers, disk space, memory
and CPU. Due to its lower complexity, a data mart also requires less time to build and operate;
since time is money, this further reduces its cost.

Performance

The performance of a system always depends on how it is built, the infrastructure that supports
it, the processes, the number of users, etc. However, based on the previous points, it is safe to
say that a data mart usually performs better than a data warehouse because it avoids the
warehouse's inherent complexity.

The main reasons for implementing data marts can be as follows:-

Because we need to create a static copy of a few fact tables (with their corresponding
dimensions), which does not change every day like the warehouse, for analysis purposes.

Because a mart is needed for data mining. Processing a data mining model (training,
predictive analysis) creates a heavy workload and we don't want it to affect the
performance of the central/core warehouse.

Because users want to change the data to simulate some business scenarios. They can't
change the data in the core warehouse (it would affect everybody!), so we provide a data
mart for them to run their scenarios in.

Because the data warehouse is in normalized format, so we need to build a dimensional
mart for the analytics/OLAP tools to get the data from.

To ease the query workload on the warehouse. Say 50% of the reports are querying one
particular fact table (and its dimensions). To lighten the burden on the warehouse we
create a copy of that fact table (and its dimensions) in a separate database located on a
different server, and point some of the reports that way. This mart is refreshed/updated
every day.

Question 18

Consider the following SQL queries. Which of these queries is faster to execute and why? Draw the
query tree also.

a) select category, count(*) from book where category != 'novel' group by category;

b) select category, count(*) from book group by category having category != 'novel';

Answer:-

Query (a) is generally faster. In (a) the selection category != 'novel' is applied before grouping, so
fewer tuples reach the GROUP BY/aggregation step; in (b) all categories are grouped and counted first,
and the unwanted group is discarded only afterwards by the HAVING clause. In the query tree for (a) the
selection node sits below the grouping/aggregation node, whereas in (b) it sits above it; heuristic
optimization would push the selection down, effectively turning tree (b) into tree (a).

Question 19

a) Transform these SQL queries into relational algebra expressions.

b) Draw the initial query tree for these expressions.

c) Then derive their optimised query trees after applying heuristics to them.

-- select p.pid, pname, address, phone from book b, publisher p where p.pid = b.pid and
category = 'language book'
-- select isbn, book_title, year, page_count, price from book b, author a, author_book ab where
b.isbn = ab.isbn and ab.a_id = a.a_id and a_name = 'sumit'

Answer:-
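
A sketch of part (a), in the same σ/π/⋈ notation used above. The initial query trees would apply the whole WHERE condition above a join (or Cartesian product) of the tables; heuristic optimisation pushes the selections (and projections) down, giving roughly:

1) π_pid,pname,address,phone(σ_category='language book'(book) ⋈ publisher)

2) π_isbn,book_title,year,page_count,price((book ⋈ author_book) ⋈ σ_a_name='sumit'(author))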

Question 20

Explain, using a schematic representation, the architecture of a data warehouse, giving a description of
each component of the data warehouse.

Answer:-

A data warehouse basically consists of three components, namely:

the data sources,
the ETL process, and
the metadata repository.

Data sources: Large companies have various data sources, which include operational
databases (databases of the organization at various sites) and external sources such as the
Web, purchased data, etc.

These data sources may have been constructed independently by different groups and are
likely to have different schemas.

If the company wants to use such diverse data for making business decisions, it needs
to gather these data under a unified schema for efficient execution of queries.

ETL process: After the schema is designed, the warehouse must acquire data so that it
can fulfill the required objectives.
Acquisition of data for the warehouse involves the following steps:

1. The data are extracted from multiple, heterogeneous data sources.

2. The data sources may contain minor errors or inconsistencies.
For example, names are often misspelled; street, area or city names in the
addresses are misspelled; or zip codes are entered incorrectly. These incorrect data
must therefore be cleaned to minimize the errors and to fill in missing information where
possible. The task of correcting and preprocessing the data is called data cleansing.
These errors can be corrected to a reasonable level by looking up a database
containing street names and zip codes in each city; the approximate matching of
data required for this task is referred to as fuzzy lookup.
In some cases, the data managers in the organization want to upgrade their data with the
cleaned data; this process is known as backflushing. The data are then transformed to
accommodate semantic mismatches.

3. The cleaned and transformed data are finally loaded into the warehouse.
Data are partitioned, and indexes or other access paths are built for fast and efficient
retrieval of data. Loading is a slow process due to the large volume of data;
for instance, loading a terabyte of data sequentially can take weeks and a gigabyte can
take hours, so parallelism is important for loading warehouses. The raw data
generated by a transaction-processing system may be too large to store in the data
warehouse, so some data can be stored in summarized form.
Thus, additional preprocessing such as sorting and generation of summarized data is
performed at this stage. This entire process of getting data into the data warehouse is
called the extract, transform and load (ETL) process. Once the data are loaded into the
warehouse, they must be periodically refreshed to reflect updates on the relations at the
data sources, and old data must be periodically purged.
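
A tiny, illustrative ETL load step in SQL (table and column names are hypothetical): rows are extracted from a staging table, lightly cleansed, and loaded into a warehouse table.

INSERT INTO dw_customer_dim (custno, cname, city)
SELECT custno,
       INITCAP(TRIM(cname)),            -- simple cleansing: trim spaces, normalise case
       NVL(city, 'UNKNOWN')             -- fill in a default for missing city values
FROM   stg_customer
WHERE  custno IS NOT NULL;              -- reject rows without a key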

Metadata repository: This is the most important component of the data warehouse.

It keeps track of the currently stored data and contains the description of the data, including its
schema definition.
The metadata repository includes both technical and business metadata. The technical
metadata covers the technical details of the warehouse, including storage structures, data
descriptions, warehouse operations, etc.
The business metadata, on the other hand, includes the relevant business rules of the
organization.
