
Data warehouse or Data Mart?

Data Warehouse:
Holds multiple subject areas
Holds very detailed information
Works to integrate all data sources
Does not necessarily use a dimensional model but feeds dimensional models.

Data Mart:
Often holds only one subject area, for example Finance or Sales
May hold more summarised data (although many hold full detail)
Concentrates on integrating information from a given subject area or set of source
systems
Is built around a dimensional model using a star schema.
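
To make the star schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows (dim_product, dim_date, fact_sales) are assumptions made up for illustration, not taken from any particular data mart. It shows one fact table in the middle, dimension tables around it, and a summary query that joins them.

```python
import sqlite3

# All table and column names below are hypothetical, chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    segment TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year INTEGER,
    month INTEGER
);
-- The fact table holds the measures plus one key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    quantity INTEGER,
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO fact_sales VALUES
    (1, 20240101, 10, 100.0),
    (1, 20240201, 5, 50.0),
    (2, 20240101, 2, 30.0);
""")


def sales_by_segment(connection):
    """Summarise the fact table by an attribute of the product dimension."""
    return connection.execute("""
        SELECT p.segment, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.segment
        ORDER BY p.segment
    """).fetchall()


print(sales_by_segment(conn))
```

The detail stays in the fact table, and the dimensions supply the labels people actually want to group by.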

More Info about Data mart models


Data mart Modelling 101
Data mart example
Dimension tables and Fact Tables

More Detail regarding Data Warehouse vs Data Mart, and Inmon vs Kimball
As the concepts of decisional systems, data warehouses and data
marts evolved, two major points of view came into existence, each
associated with a giant of the field: Bill Inmon and Ralph Kimball.
Some argue that the best approach is to start with data marts,
department by department, then merge them together to form a data
warehouse- this is more in line with Kimball's approach.
Bill Inmon, by contrast, is an advocate of the data warehouse. One of his
articles contains the following quote, which makes it clear what he
thinks about the idea:
"You can catch all the minnows in the ocean and stack them together
and they still do not make a whale," Bill Inmon, January 8, 1998.
Ralph Kimball, on the other hand, advocates what he calls a bus
architecture data warehouse. His methodology specifies conformed
dimensions, where multiple fact tables share common dimensional
tables. For me, each of these fact tables represents a data mart. The
row of dimensional tables that all the fact tables plug into is the bus,
and because, for example, the finance and the sales data marts both
use the same product dimension table there is integration between
departments.
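
Here is a rough sketch of what a conformed dimension buys you, in plain Python. The data and names (dim_product, fact_sales, fact_costs, margin_by_segment) are invented for illustration and are not Kimball's own example. Two fact tables, one per data mart, use the same product keys, so their results can be lined up segment by segment.

```python
from collections import defaultdict

# One conformed product dimension shared by both data marts (hypothetical data).
dim_product = {
    1: {"name": "Widget", "segment": "Hardware"},
    2: {"name": "Manual", "segment": "Books"},
}

# Sales data mart: fact rows keyed by the shared product_id.
fact_sales = [
    {"product_id": 1, "revenue": 100.0},
    {"product_id": 2, "revenue": 30.0},
    {"product_id": 1, "revenue": 50.0},
]

# Finance data mart: a different fact table using the same product_id keys.
fact_costs = [
    {"product_id": 1, "cost": 90.0},
    {"product_id": 2, "cost": 10.0},
]


def total_by_segment(fact_rows, measure):
    """Sum one measure of a fact table, grouped by the shared dimension's segment."""
    totals = defaultdict(float)
    for row in fact_rows:
        segment = dim_product[row["product_id"]]["segment"]
        totals[segment] += row[measure]
    return dict(totals)


def margin_by_segment():
    """Combine both marts; only possible because they share dim_product."""
    revenue = total_by_segment(fact_sales, "revenue")
    cost = total_by_segment(fact_costs, "cost")
    segments = sorted(set(revenue) | set(cost))
    return {s: revenue.get(s, 0.0) - cost.get(s, 0.0) for s in segments}


print(margin_by_segment())
```

If finance and sales each kept their own product table with their own segment definitions, the last step would need a mapping exercise first, which is exactly the integration work the bus is meant to settle up front.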
The more extreme data mart strategy is that of the completely
standalone data mart, the concept being that it's fast, easy, cheap, and
delivers value immediately. I'm a supporter of this at the desktop level-
that's the point of the Datamartist tool after all. But I don't buy this for
server based architectures- what is really fast, easy and cheap when you
have to buy servers, create a project and form a committee? In my mind
if you've decided you need a central server solution, some level of
integration is needed, and don't pretend it's going to be magic.

The interesting thing about these approaches is that the harder you
work on really conforming your dimensions, the more your data marts
look like the data warehouse that Inmon advocates. (Data modellers in
the know will be jumping up and down right now shouting NO they don't-
but this is a high level conversation...) But the reality is, even in a data
warehouse, issues will arise that require compromise- things that just
don't map or conform, and budget, schedule and business reality will
mean that nothing is ever perfect, and in the end the world is full of data
warehouses that are less conformed than some data mart clusters. It's
just not simple.
Generally, it is probably true that data warehouses provide a solution
that is closer to the "single version of the truth", but they do take a HUGE
amount of effort, and an ability to coordinate across the entire
organisation. If you have not already built at least half a dozen data
marts, don't think you can estimate how much effort a data warehouse
will take. You can't. And bring your cheque book.
Whereas data marts might deliver some value early, if built without
sufficient effort on cross functional mapping and data cleanup they are
just more silo systems and have their own set of costs and issues. Don't
measure payback on data marts in years- nothing is the same in a few
years, and you'll be back to the drawing board shortly.
It's a real dilemma. So which one? Data warehouse? Data mart? In my
view, the right answer is "it depends" and "yes". However, never launch a
data warehouse project as your first shot. A successful strategy will
balance fast solutions that address pain points with a medium and
long term plan, and investment in infrastructure and competencies to
build the technology, processes and culture that a company needs to
manage information. Depending on your industry and how successful you
are, a massive data warehouse might be in your future. But sorry, no
magic bullet.

Build a multi-level data strategy


Level 1- Get the data to the people
Level 2- Build Departmental Data Marts
Level 3- Plan long term infrastructure and architecture

Don't do these things in order- this isn't step 1, 2, 3- actively work on all
three levels at once and ensure the plans at each level are coordinated.
Data to the People
People are building spreadsheets, and spending money on database
development now- you know they are. Give them better tools, help them
make better use of the spreadsheets, and formalize the way they work.
Do-it-yourself already exists- but it doesn't have to be completely informal.
The Datamartist tool adds another capability that can accelerate the
process- letting you move more quickly, prototyping and analysing to
determine which areas are ready for additional analysis capability.
In some cases Datamartist itself might simply be the best choice for
certain types of analysis, cutting costs dramatically. In addition, if your
end users are building their own data marts, when it comes time to build
server based data marts they know the concept, they understand the
structure, and can even provide concrete examples of the dimensions
they need.
But whatever you do- don't make the mistake of thinking this is all you
need. Work on all levels at once.

Build Departmental Data marts


If a whole department is flying blind, and big money is on the table, then
don't launch a three year data warehouse project- create some
departmental data marts. These projects should be designed to be 3-6
months long, and be sold to management honestly and clearly as being
for short term gains, and as part of a broader discovery process. The
hardware and software licenses will be reusable- but be clear that the
data marts will have a limited lifespan- they always do.
And trust me, when you build these data marts you will discover all sorts
of things about your data, your organisation, and your definitions and
business processes. You will discover that the sales organisation needs
to analyse product segments in a way that is fundamentally inconsistent
with how finance has been reporting it for years- and neither group is
willing to budge (and they may both be right). You will discover that 80%
of your sales orders have errors on them or are incomplete. These
discoveries will help you build the next data mart, and assess if a data
warehouse is possible. They should also send you back to your
transactional system and business processes to work to clean up the
problems.
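
A quick way to put a number on that kind of discovery is to profile the records before you load them. The sketch below is only an illustration: the required field names and the sample orders are assumptions, and a real check would follow your own business rules. It reports the share of orders that are missing at least one required field.

```python
# Hypothetical list of fields every sales order is expected to carry.
REQUIRED_FIELDS = ("order_id", "customer_id", "product_id", "amount")


def is_complete(order):
    """True when every required field is present and neither None nor empty."""
    return all(order.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def incomplete_share(orders):
    """Fraction of orders missing at least one required field (0.0 if no orders)."""
    if not orders:
        return 0.0
    incomplete = sum(1 for order in orders if not is_complete(order))
    return incomplete / len(orders)


# Made-up sample: one clean order and three with different kinds of gaps.
orders = [
    {"order_id": 1, "customer_id": "C1", "product_id": 1, "amount": 100.0},
    {"order_id": 2, "customer_id": "", "product_id": 2, "amount": 30.0},
    {"order_id": 3, "customer_id": "C2", "product_id": None, "amount": 50.0},
    {"order_id": 4, "customer_id": "C3", "product_id": 1},
]

print(incomplete_share(orders))
```

Tracking this figure from one data mart project to the next gives you something concrete to take back to the owners of the transactional system.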

Build the infrastructure and deal with the foundation
There are lots of pitfalls in creating a decisional architecture- this short
list from Gartner resonates with me- I've seen and battled these issues
on project after project.
Set standards in terms of tools, project management etc. Buy
infrastructure for multiple projects, not on a project by project basis- don't
have multiple servers when one with multiple data marts is radically
cheaper.
And probably most important, I honestly think that you can build anything
you want, any way you want, but it will not succeed if you don't have
your definitions, both data and information, under control. (See
Gartner issue #8) In the end, it's not what language you speak, it's if you
have a dictionary or not, and if everyone is using the same dictionary
across your organisation.
You should have common reference data sets that are used by all levels.
Datamartist can load in and use reference data that is coordinated with
departmental data marts and the eventual warehouse. Make these data
sets available to everyone- you'll be amazed that if they are easy to get
and use, people will put them in their spreadsheets, and things might
actually start matching up.
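
As a small illustration of the shared dictionary idea, the sketch below checks the codes used in two departmental extracts against one common reference set. Everything in it is hypothetical: the product codes, the sheet contents and the function name unknown_codes are invented for the example, and this is not how Datamartist itself does it.

```python
# Shared reference data set, available to every level (hypothetical codes).
REFERENCE_PRODUCTS = {"P-001": "Widget", "P-002": "Manual"}


def unknown_codes(rows, reference=REFERENCE_PRODUCTS):
    """Return the sorted product codes used in rows that are not in the reference set."""
    used = {row["product_code"] for row in rows}
    return sorted(used - set(reference))


# Made-up extracts from two departmental spreadsheets.
sales_sheet = [{"product_code": "P-001"}, {"product_code": "P-9"}]
finance_sheet = [{"product_code": "P-002"}, {"product_code": "P-001"}]

print(unknown_codes(sales_sheet))
print(unknown_codes(finance_sheet))
```

An empty result means the sheet speaks the common language; anything else is a definition to sort out before the numbers are compared.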

Difference between Data Warehousing and Data Mart


It is important to note that there are major differences between these two tools,
though they may serve the same purpose. Firstly, a data mart comprises the programs,
data, software and hardware of a specific department of a company. There can
be separate data marts for finance, sales, production or marketing. These
data marts differ from one department to another, but they can be coordinated.
Although a data mart is indexed, it is not suitable for a huge database, as it is
designed to meet the requirements of a particular department.
Data warehousing is not limited to a particular department; it represents the
database of a complete organization. The data stored in a data warehouse is more
detailed, though indexing is light as it has to store huge amounts of information.
It is also difficult to manage and takes a long time to process. By comparison,
data marts are quick and easy to use, as they work with small amounts
of data. Data warehousing is also more expensive, because of the volume of data
it has to handle.
