
Explain the concepts and capabilities of Business Intelligence.

Business Intelligence (BI) helps to manage data by applying a range of skills and technologies while addressing security and data-quality risks. It also leads to a better understanding of the data. Business intelligence can be regarded as an organization's collective information: it supports predictions about business operations using data gathered in a warehouse. BI applications handle sales, financial, production and other business data. They support better decision making, so BI can also be considered a decision support system.

Explain the concepts and capabilities of Business Intelligence.


Business Intelligence is all about the processes, skills, technologies, practices and applications used to support decision making. Business Intelligence applications are centrally initiated by business needs and include decision support systems, query and reporting, OLAP, data mining and forecasting.

Explain the Dashboard in the business intelligence.

A dashboard in business intelligence allows large volumes of data and reports to be read in a single graphical interface. Dashboards help in making faster decisions by relying on measurable data seen at a glance. They can also be used to drill into the details of this data to analyze the root cause of any business performance issue. A dashboard represents the business data and business state at a high level. Dashboards can also be used for cost control. Example of the need for a dashboard: banks run thousands of ATMs. They need to know how much cash is deposited, how much is left, etc.

Explain the Dashboard in the business intelligence.


A dashboard in business intelligence is used for rapid prototyping, cloning and deployment across all databases, operational applications or spreadsheets in an organization. A dashboard in BI shows an enterprise's current status or position, and where it is heading, using graphs, maps and charts. The drill-down and roll-over capabilities allow information to be organized without revealing sensitive details. It is fully customizable, including free-form design options. A dashboard consolidates the vital statistics of a business into an easy-to-read page.

SAS Business Intelligence.

SAS business intelligence has analytical capabilities such as statistics, reporting, data mining, prediction, forecasting and optimization. These help in getting data in the desired format and in improving data quality.

SAS Business Intelligence.

SAS BI provides information about an enterprise when it is needed, in a customized format. SAS BI integrates data across the enterprise and delivers self-service reporting and analysis. This reduces the time taken to respond to requests and for business users to view the information. SAS BI also offers an integrated, flexible and robust presentation layer for the full breadth of SAS Analytics. All of this is integrated within the context of the business for better and faster decision making.

What are fact tables and dimension tables?

Data in a warehouse comes from transactions. A fact table in a data warehouse consists of facts and/or measures. The data in a fact table is usually numerical. A dimension table, on the other hand, contains fields used to describe the data in fact tables. A dimension table can provide additional, descriptive information (a dimension) about a field of a fact table. E.g. if I want to know the number of resources used for a task, my fact table will store the actual measure (of resources) while my dimension table will store the task and resource details. Hence, the relationship between a dimension table and a fact table is one to many.

What are fact tables and dimension tables?


Fact tables persist business facts (measures) together with foreign keys, which refer to candidate keys in the dimension tables. Fact tables usually provide additive values, which act as independent variables against which dimensional attributes are analyzed. Dimension tables persist the attributes used to constrain and group data in data warehousing queries.
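
The relationship between the two kinds of table can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (dim_task, fact_resource_usage), the columns and the sample rows are assumptions made up for the example, loosely following the task/resource example above.

```python
import sqlite3

def build_star_schema():
    """Create a tiny in-memory schema: one dimension table, one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension table: descriptive attributes (hypothetical names).
        CREATE TABLE dim_task (
            task_key  INTEGER PRIMARY KEY,
            task_name TEXT
        );
        -- Fact table: a numeric measure plus a foreign key to the dimension.
        CREATE TABLE fact_resource_usage (
            task_key       INTEGER REFERENCES dim_task(task_key),
            resources_used INTEGER
        );
        INSERT INTO dim_task VALUES (1, 'Design'), (2, 'Testing');
        INSERT INTO fact_resource_usage VALUES (1, 3), (1, 2), (2, 4);
    """)
    return conn

def resources_per_task(conn):
    """Sum the additive measure, grouped by a dimension attribute."""
    rows = conn.execute("""
        SELECT d.task_name, SUM(f.resources_used)
        FROM fact_resource_usage AS f
        JOIN dim_task AS d ON d.task_key = f.task_key
        GROUP BY d.task_name
        ORDER BY d.task_name
    """).fetchall()
    return dict(rows)
```

One dimension row (a task) is referenced by many fact rows, which is the one-to-many relationship between dimension and fact tables.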

What is ETL process in data warehousing?

ETL is Extract, Transform, Load. It is the process of fetching data from different sources, converting the data into a consistent and clean form, and loading it into the data warehouse. Different tools are available in the market to perform ETL jobs.

What is ETL process in data warehousing?

ETL stands for Extraction, Transformation and Loading. It means extracting data from different sources such as flat files, databases or XML data, transforming this data according to the application's needs, and loading it into the data warehouse.
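
As a rough illustration of the three steps, the sketch below reads rows from CSV text, cleans them, and appends them to an in-memory list that stands in for the warehouse. The function names, the column names (name, amount) and the cleaning rules are assumptions chosen for the example, not part of any particular ETL tool.

```python
import csv
import io

def extract(csv_text):
    """Extract: read raw rows from a flat-file source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: normalize names, convert amounts, drop rows with no name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        if not name:
            continue  # an unusable record is skipped
        cleaned.append({"name": name, "amount": round(float(row["amount"]), 2)})
    return cleaned

def load(rows, warehouse):
    """Load: append the cleaned rows to the target (a list standing in for a table)."""
    warehouse.extend(rows)
    return len(rows)
```

A run would chain the steps, for example load(transform(extract(raw_text)), warehouse).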

Explain the difference between data mining and data warehousing.

Data warehousing is merely extracting data from different sources, cleaning the data and storing it in the warehouse, whereas data mining aims to examine or explore the data using queries. These queries can be fired on the data warehouse. Exploring the data in data mining helps in reporting, planning strategies, finding meaningful patterns, etc. E.g. a company's data warehouse stores all the relevant information about projects and employees. Using data mining, one can use this data to generate different reports, such as profits generated.

Explain the difference between data mining and data warehousing.

Data mining is a method for comparing large amounts of data in order to find patterns. It is normally used for modeling and forecasting. Data mining is the process of finding correlations and patterns by sifting through large data repositories using pattern recognition techniques. A data warehouse is the central repository for the data of several business systems in an enterprise. Data from various sources is extracted and organized selectively in the data warehouse for analysis and accessibility.

What is an OLTP system and OLAP system?

OLTP: Online Transaction Processing supports and manages transaction-based applications involving high volumes of data. Typical examples of transactions are found in banking, air ticketing, etc. Because OLTP uses a client-server architecture, it supports transactions that run across a network.

OLAP: Online Analytical Processing performs analysis of business data and provides the ability to perform complex calculations on usually low volumes of data. OLAP helps the user gain insight into data coming from different sources (multidimensional).

What is an OLTP system and OLAP system?

OLTP stands for OnLine Transaction Processing. An OLTP system supports and manages applications whose transactions involve high volumes of data. OLTP is based on a client-server architecture and supports transactions across networks.

OLAP stands for OnLine Analytical Processing. OLAP performs business data analysis and complex calculations on low volumes of data. With the support of OLAP, a user can gain insight into data coming from various sources.

What are cubes?

A data cube stores data in summarized form, which allows faster analysis. The data is stored in a way that makes reporting easy. E.g. using a data cube, a user may want to analyze the weekly and monthly performance of an employee. Here, month and week could be considered dimensions of the cube.

What are cubes?


Multidimensional data is logically represented by cubes in data warehousing. The dimensions and the data are represented by the edges and the body of the cube respectively. OLAP environments view the data in the form of a hierarchical cube. A cube typically includes the aggregations needed for business intelligence queries.
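
The aggregations a cube holds can be approximated in a few lines of Python. The sketch below is not how an OLAP engine stores a cube; it only shows the idea of summarizing a measure along chosen dimensions. The function name roll_up and the field names in the usage note are hypothetical.

```python
from collections import defaultdict

def roll_up(facts, dimensions, measure):
    """Summarize one measure along the given dimensions.

    facts      -- list of dicts, one per fact row
    dimensions -- names of the fields to group by (the cube's edges)
    measure    -- name of the numeric field to total (the cube's body)
    """
    totals = defaultdict(float)
    for fact in facts:
        cell = tuple(fact[d] for d in dimensions)  # one cell of the cube
        totals[cell] += fact[measure]
    return dict(totals)
```

For rows carrying employee, month, week and hours, roll_up(rows, ["employee", "month"], "hours") gives monthly totals per employee, and roll_up(rows, ["month"], "hours") aggregates one level higher.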

What is snowflake schema design in database?

A snowflake schema in its simplest form is an arrangement of fact tables and dimension tables. The fact table is usually at the center, surrounded by the dimension tables. Normally in a snowflake schema the dimension tables are further broken down into more dimension tables.

E.g. dimension tables include employee, projects and status. The status table can be further broken into status_weekly and status_monthly.

What is snowflake schema design in database?

The snowflake schema is one of the designs used in database design. It serves the purpose of dimensional modeling in data warehousing. The snowflake design is used when a dimension table is split into many tables, so that the schema leans towards normalization. Because the tables are split further, queries involve deeper joins.

What is analysis service?

Analysis service provides a combined view of the data used in OLAP or data mining. Services here refer to OLAP and data mining.

What is analysis service?

Analysis Services provides an integrated view of business data by combining OLAP and data mining functionality. It lets the user apply a wide variety of data mining algorithms to design and create data mining models.

What is surrogate key? Explain it with an example.

Data warehouses commonly use a surrogate key to uniquely identify an entity. A surrogate key is generated not by the user but by the system. In some databases, the primary difference between a primary key and a surrogate key is that the PK uniquely identifies a record while the SK uniquely identifies an entity. E.g. an employee may be recruited before the year 2000 while another employee with the same name may be recruited after the year 2000. Here, the primary key will uniquely identify the record, while the surrogate key will be generated by the system (say, a serial number), since the SK is NOT derived from the data.
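
A system-generated serial number of this kind can be sketched as follows. The class name and the shape of the natural key (a name plus a recruitment year) are assumptions made for the illustration; real warehouses usually rely on a database sequence or identity column instead.

```python
import itertools

class SurrogateKeyGenerator:
    """Hand out sequential surrogate keys that are not derived from the data."""

    def __init__(self):
        self._counter = itertools.count(1)  # 1, 2, 3, ...
        self._keys = {}                     # natural key -> surrogate key

    def key_for(self, natural_key):
        """Return the existing surrogate key, or assign the next serial number."""
        if natural_key not in self._keys:
            self._keys[natural_key] = next(self._counter)
        return self._keys[natural_key]
```

Two employees with the same name but different recruitment years receive different serial numbers, and asking again for the same natural key returns the key assigned earlier.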

What is surrogate key? Explain it with an example.

A surrogate key is a unique identifier in a database, either for an entity in the modeled world or for an object in the database. It is not derived from application data. A surrogate key is generated internally by the system and is invisible to the user. As several objects in the database may correspond to one surrogate, the surrogate key cannot always be used as the primary key. For example, a sequential number can be a surrogate key.

What is the purpose of Factless Fact Table?

Factless fact tables are so called because they contain only keys that refer to the dimension tables. Hence, they hold no facts or measures and are most commonly used for tracking events. E.g. to find the number of leaves taken by an employee in a month.
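
Because a factless fact table holds only keys, analysis amounts to counting rows. The sketch below assumes each leave event is stored as an (employee_key, date_key) pair, with the date key written as a YYYY-MM-DD string; both choices are illustrative.

```python
from collections import Counter

def leaves_per_employee(leave_events, month):
    """Count leave rows per employee for one month.

    leave_events -- iterable of (employee_key, date_key) pairs; no measure column
    month        -- month prefix such as "2024-03" (assumed date-key format)
    """
    return Counter(emp for emp, date in leave_events if date.startswith(month))
```

The "fact" being reported is simply how many key combinations exist for the employee and month in question.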

What is the purpose of Factless Fact Table?

Factless fact tables can be used for tracking a process or collecting status. The fact table has no numeric values to aggregate, hence the name. It holds only the key values that reference the dimensions from which the status is collected.

What is a level of granularity of a fact table?

A fact table is usually designed at a low level of granularity. This means that we need to find the lowest level of information that can be stored in a fact table. E.g. employee performance is a very high level of granularity; employee_performance_daily and employee_performance_weekly can be considered lower levels of granularity.

What is a level of granularity of a fact table?

Granularity is the lowest level of information stored in the fact table; it is the depth of the data level. In a date dimension, the level of granularity could be year, quarter, month, period, week or day. Determining it consists of the following two steps:
- Determining the dimensions that are to be included
- Determining where to place the hierarchy of each dimension of information
These decisions depend on the requirements.

Explain the difference between star and snowflake schemas.
A snowflake schema design is usually more complex than a star schema. In a star schema, a fact table is surrounded by multiple dimension tables. This is also how the snowflake schema is designed. However, in a snowflake schema, the dimension tables can be further broken down into sub-dimensions. Hence, data in a snowflake schema is more stable and standardized than in a star schema. E.g. Star schema: performance report is a fact table. Its dimension tables include performance_report_employee and performance_report_manager. Snowflake schema: the dimension tables can be broken down into performance_report_employee_weekly, monthly, etc.

Explain the difference between star and snowflake schemas.
Star schema: a highly denormalized technique. A star schema has one fact table associated with numerous dimension tables, which together depict a star.
Snowflake schema: a star schema to which normalization principles have been applied. Every dimension table is associated with sub-dimension tables.
Differences:
- A dimension table has no parent table in a star schema, whereas in a snowflake schema it has one or more parent tables.
- In a star schema the dimension table itself contains the hierarchies of the dimension, whereas in a snowflake schema the hierarchies are split into different tables. Data can be drilled down from the topmost hierarchies to the lowermost.

Differences between star and snowflake schema.

A snowflake schema is a more normalized form of a star schema. In a star schema, one fact table is stored with a number of dimension tables. In a snowflake schema, on the other hand, one dimension table can have multiple sub-dimensions. This means that in a star schema, the dimension table is independent, without any sub-dimensions.

What is a Cube and Linked Cube with reference to data warehouse?

A data cube stores data in summarized form, which allows faster analysis, whereas linked cubes use the data cube and are stored on another analysis server. Linking different data cubes reduces the possibility of sparse data. E.g. a data cube may store employee_performance. To know the hours from which this performance was calculated, one can create another cube and link it to the root cube (in this case employee_performance).

What is a Cube and Linked Cube with reference to data warehouse?

A cube is the logical representation of multidimensional data. Dimension members are represented by the edges of the cube and data values by its body. Linked cubes are cubes that are linked in order to keep the data consistent.

What are fundamental stages of Data Warehousing?


The stages of a data warehouse help to show how the data in the warehouse changes. At the initial stage of data warehousing, the transaction data is merely copied to another server. Here, even if the copied data is processed for reporting, the performance of the source system is not affected. In the next stage, the data in the warehouse is updated regularly from the source data. In the real-time data warehouse stage, data in the warehouse is updated for every transaction performed on the source data (e.g. booking a ticket). When the warehouse is at the integrated stage, it not only updates data as and when a transaction is performed but also generates transactions, which are passed back to the online source systems.

What are fundamental stages of Data Warehousing?


Offline Operational Databases: This is the initial stage of data warehousing. In this stage, the databases of an operational system are simply copied to an offline server.

Offline Data Warehouse: In this stage, the data warehouse is updated on a regular time cycle from the operational system, and the data is persisted in a reporting-oriented data structure.

Real-time Data Warehouse: In this stage, the data warehouse is updated on a transaction or event basis, every time the operational system performs a transaction.

Integrated Data Warehouse: In this stage, the warehouse generates activity or transactions that are passed back into the operational systems. These generated transactions are used in the daily activity of the organization.

What is Virtual Data Warehousing?

Virtual warehousing provides an aggregate view of the complete data inventory. Virtual data warehousing contains the metadata used to form a logical enterprise data model, which is part of the database-of-record infrastructure. The infrastructure consists of published legacy database systems with their metadata extracted. The standards JEE, JMS and EJBs are used in the infrastructure for transactional unit requests, and extract-transform-load tools are used for loading real-time bulk data.

What is Virtual Data Warehousing?

A virtual data warehouse provides a compact view of the data inventory. It contains metadata and uses middleware to build connections to different data sources. Virtual warehouses can be fast, as they allow users to filter the most important pieces of data from different legacy applications.

What is active data warehousing?

Transactional data is captured and stored in the active data warehouse. This repository can be used to find trends and patterns that inform future decision making.

What is active data warehousing?

An active data warehouse aims to capture data continuously and deliver real-time data. It provides a single integrated view of a customer across multiple business lines. It is associated with business intelligence systems.

Difference between dependent and independent data warehouse

A dependent data warehouse is built on an ODS, whereas an independent data warehouse does not depend on an ODS.

Difference between dependent and independent data warehouse

A dependent data warehouse stores its data in a central data warehouse. An independent data warehouse, on the other hand, does not make use of a central data warehouse.

What is data modeling and data mining?


Designing a model for data or a database is called data modeling. Data is stored in fact tables and dimension tables: a fact table holds data about transactions and a dimension table holds master data. A data model is used to design an abstract model of the database. The process of discovering hidden trends is called data mining. Data mining transforms hidden data into information. It is used in a wide range of profiling practices such as marketing, surveillance and fraud detection.

What is data modeling and data mining? What is this used for?

Data modeling aims to identify all entities that have data. It then defines the relationships between these entities. Data models can be conceptual, logical or physical. Conceptual models are typically used to explore high-level business concepts with stakeholders. Logical models are used to explore domain concepts, while physical models are used to explore database design. Data mining is used to examine or explore the data using queries. These queries can be fired on the data warehouse. Data mining helps in reporting, planning strategies, finding meaningful patterns, etc. It can be used to convert a large amount of data into a sensible form.

Difference between ER Modeling and Dimensional Modeling.

Dimensional modeling is very flexible from the user's perspective. A dimensional data model is mapped for creating schemas, whereas an ER model is not mapped for creating schemas and is not used to convert normalized data into denormalized form. The ER model is used for OLTP databases that use 1st, 2nd or 3rd normal form, whereas the dimensional data model is used for data warehousing. The ER model contains normalized data, whereas the dimensional model contains denormalized data.

Difference between ER Modeling and Dimensional Modeling.

ER modeling produces an ER diagram that represents the entire business or application processes. This diagram can be segregated into multiple dimensional models. That is to say, an ER model will have both a logical and a physical model, while the dimensional model will have only a physical model.

What is snapshot with reference to data warehouse?

A snapshot of a data warehouse is a persisted report from the catalogue. The report is persisted to a file after it is disconnected from the catalogue.

What is snapshot with reference to data warehouse?

A snapshot in a data warehouse can be used to track activities. For example, every time an employee attempts to change his address, the data warehouse can be alerted to take a snapshot. This means that each snapshot is taken when some event is fired. A snapshot has three components:
- The time when the event occurred.
- A key to identify the snapshot.
- Data that relates to the key.

What is degenerate dimension table?


Dimensions that are persisted in the fact table are called degenerate dimensions. These dimensions do not have their own dimension tables. No mapping takes place for these columns in the fact table. The values in them are neither dimensions nor measures.

What is degenerate dimension table?

A degenerate dimension does not have its own dimension table. It is derived from a fact table: a column (dimension) that is part of the fact table but does not map to any dimension table. E.g. employee_id.

What is Data Mart?

A data mart is a data repository that serves a community of knowledge workers. Its data can come from enterprise resources or from a data warehouse.

What is Data Mart?

A data mart stores particular data gathered from different sources. This data may belong to a specific community (group of people) or genre. Data marts can be used to focus on specific business needs.

Difference between metadata and data dictionary.

Metadata describes data; it is 'data about data'. It records how, when and by whom certain data was collected, and the data format. It is essential for understanding information stored in data warehouses and XML-based web applications.

A data dictionary is a file containing the basic definitions of a database. It lists the files available in the database, the number of records in each file, and information about the fields.

What is the difference between metadata and data dictionary?

A data dictionary is a repository that stores information about the database. Metadata is data about data: data that defines other data. Hence, the data dictionary can itself be metadata that describes some information about the database.

Describe the various methods of loading Dimension tables.

The following are the methods of loading dimension tables:

Conventional Load: In this method, all the table constraints are checked against the data before the data is loaded.

Direct Load or Faster Load: As the name suggests, the data is loaded directly without checking the constraints. The data is checked against the table constraints later, and bad data is not indexed.

Describe the various methods of loading Dimension tables.

The methods to load dimension tables:

Conventional load: Here the data is checked against any table constraints before loading.

Direct or Faster load: The data is loaded directly without checking for any constraints.

What is the difference between OLAP and data warehouse?

The following are the differences between OLAP and data warehousing:

Data Warehouse
Data from different data sources is stored in a relational database for end-use analysis. Data is organized in summarized, aggregated, non-volatile and subject-oriented patterns. It supports the analysis of data but does not support online analysis of data.

Online Analytical Processing
Using analytical queries, data in the data warehouse is analyzed and evaluated. Data aggregation and summarization are used to organize data in multidimensional models. It offers data analysts speed and flexibility for online data analysis in a real-time environment.

What is the difference between OLAP and data warehouse?

A data warehouse serves as a repository for historical data that can be used for analysis. OLAP is Online Analytical Processing, which can be used to analyze and evaluate data in a warehouse. The warehouse holds data coming from varied sources. An OLAP tool helps to organize data in the warehouse using multidimensional models.

Describe the foreign key columns in fact table and dimension table.
The primary keys of entity tables are the foreign keys of dimension tables. The primary keys of dimension tables are the foreign keys of fact tables.

Describe the foreign key columns in fact table and dimension table.
A foreign key of a fact table references a dimension table. A dimension table, on the other hand, is itself a referenced table, with foreign key references from one or more tables.

What is cube grouping?

A set of similar cubes built by Transformer is known as a cube group. Each cube group relates to a single level in one dimension of the model. Cube groups are generally used to create smaller cubes based on the data at that level of the dimension.

Define the term slowly changing dimensions (SCD).

The slowly changing dimension target operator is one of the SQL warehousing operators that can be used in a mining flow or a data flow. An SCD is applied when an attribute of a record varies over time.

Define the term slowly changing dimensions (SCD).

SCDs are dimensions whose data changes very slowly. An example is the city of an employee; this dimension will change very slowly. The row for this data in the dimension can be replaced completely without any track of the old record, OR a new row can be inserted, OR the change can be tracked.
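
The first two options (overwrite the row, or insert a new row) can be sketched as below. These are commonly called Type 1 and Type 2 handling; that terminology, the function names, and the row layout (employee_id, city, valid_from, valid_to, current) are assumptions added for the illustration.

```python
def apply_overwrite(dimension, employee_id, new_city):
    """Option 1 (often called Type 1): overwrite in place; the old value is lost."""
    for row in dimension:
        if row["employee_id"] == employee_id:
            row["city"] = new_city

def apply_new_row(dimension, employee_id, new_city, change_date):
    """Option 2 (often called Type 2): close the current row and insert a new one."""
    for row in dimension:
        if row["employee_id"] == employee_id and row["current"]:
            row["current"] = False
            row["valid_to"] = change_date
    dimension.append({
        "employee_id": employee_id,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,   # open-ended until the next change
        "current": True,
    })
```

With the second approach, history is preserved: the old row stays in the dimension with an end date, so facts recorded earlier can still be reported against the old city.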

Explain the use of lookup tables and Aggregate tables.

A lookup table is used when updating the data warehouse. Placed on the fact table or warehouse and keyed on the primary key of the target, it allows only new or updated records through, depending on the lookup condition. Aggregate tables are materialized views that contain summarized data. For example, to generate sales reports on a weekly, monthly or yearly basis instead of a daily basis, the date values are aggregated into week values, week values into month values, and month values into year values. An aggregate function is used to perform this process.

Explain the use of lookup tables and Aggregate tables.

An aggregate table contains a summarized view of data. Lookup tables, using the primary key of the target, allow records to be updated based on the lookup condition.

What is real-time data warehousing?

Real-time data warehousing is the combination of real-time activity and data warehousing. Real-time activity is activity that is happening at the current time; data becomes available once the activity completes. In real-time data warehousing, business activity data is captured as it occurs. As soon as the business activity completes and its data is available, that data flows into the data warehouse and is available instantly. Real-time data warehousing can be viewed and used as a framework for retrieving information from data as soon as the data is available.

What is real-time data warehousing?

In real-time data warehousing, the warehouse is updated every time the system performs a transaction. It reflects the business's real-time information. This means that when a query is fired on the warehouse, the state of the business at that time is returned.

What is a conformed fact? What are conformed dimensions used for?

Conformed facts are facts that may have the same name in different tables. They can be combined and compared mathematically. A dimension table that can be used with more than one fact table is called a conformed dimension. It is used across multiple data marts in combination with multiple fact tables. As long as the metadata of the conformed dimension tables is unchanged, the facts in an application can be used without further modification.

What is a conformed fact? What are conformed dimensions used for?

A conformed fact in a warehouse may have the same name in separate tables. Conformed facts can be compared and combined mathematically. Conformed dimensions can be used across multiple data marts and have a static structure. Any dimension table that is used by multiple fact tables can be a conformed dimension.

Define non-additive facts.

Facts that cannot be summed across the dimensions present in the fact table are called non-additive facts. These facts can still be useful if there are changes in dimensions. For example, profit margin is a non-additive fact, because it is meaningless to add margins up at the account level or the day level.

Define non-additive facts.

Non-additive facts are facts that cannot be summed up across any of the dimensions present in the fact table. This means that these columns cannot be added together to produce meaningful results.
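
A short worked example shows why a ratio such as profit margin cannot simply be added. The sketch below assumes each row carries the additive facts revenue and profit (hypothetical field names); the combined margin is recomputed from their totals rather than from the per-row margins.

```python
def combined_margin(rows):
    """Recompute the margin from additive facts instead of summing margins."""
    total_profit = sum(r["profit"] for r in rows)
    total_revenue = sum(r["revenue"] for r in rows)
    return total_profit / total_revenue

def naive_margin_sum(rows):
    """Adding per-row margins: shown only to illustrate the meaningless result."""
    return sum(r["profit"] / r["revenue"] for r in rows)
```

For two rows with revenue 100 and profit 10, and revenue 300 and profit 90, the per-row margins are 10% and 30%. Adding them gives 40%, while the margin over both rows is 100 / 400 = 25%.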

Difference between SAS tool and other tools

The differences between SAS and other tools are:
- SAS is a reporting tool.
- SAS is an ETL tool and also a forecasting tool.
- Tools other than SAS are either reporting tools (for example, Business Objects or Cognos), ETL tools (for example, Informatica), or both (for example, Business Objects).
- Other tools do not include a forecasting tool. For this reason, SAS is used mostly in clinical trials and the health care industry.

List out difference between SAS tool and other tools.
SAS provides more features than other tools. It supports almost ALL database interfaces and has its own extensive database engine.

Why is SAS so popular?

Statistical Analysis System (SAS) is an integration of various software products that allows developers to perform:
- Data entry, data retrieval, data management and data mining
- Report writing and graphics
- Statistical analysis, business planning, business forecasting and business decision support
- Operations research and project management, quality improvement, application development
- Extract, transform and load functions in data warehousing
- Platform-independent and remote computing
Because of these many features, SAS has become more and more popular.

Why is SAS so popular?

SAS is an ETL tool. Beyond that, it can be used for reporting and for forecasting business needs.

What is data cleaning? How can we do that?

Data cleaning is also known as data scrubbing. It is a process that ensures a set of data is correct and accurate. Data accuracy, consistency and integration are checked during data cleaning. Data cleaning can be applied to a single set of records or to multiple sets of data that need to be merged.

Data cleaning is performed by reading all the records in a set and verifying their accuracy. Typos and spelling errors are corrected. Mislabeled data, if any, is relabeled and filed. Incomplete or missing entries are completed. Unrecoverable records are purged so that they do not take up space or cause inefficient operations.

What is data cleaning? How can we do that?

Data cleaning is the process of identifying erroneous data. The data is checked for accuracy, consistency, typos, etc. Methods:
- Parsing: used to detect syntax errors.
- Data transformation: confirms that the input data matches the expected data in format.
- Duplicate elimination: gets rid of duplicate entries.
- Statistical methods: values of mean, standard deviation and range, or clustering algorithms, etc. are used to find erroneous data.
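
Two of the methods listed above, duplicate elimination and a simple statistical check, can be sketched in plain Python. The function names, the record layout (flat dicts with hashable values) and the threshold of two standard deviations are assumptions for the illustration; real cleaning rules depend on the data.

```python
import statistics

def drop_duplicates(records):
    """Duplicate elimination: keep the first occurrence of each identical record."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # field order does not matter
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def flag_outliers(values, threshold=2.0):
    """Statistical method: flag values far from the mean, in standard deviations."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values are identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]
```

Flagged values are only candidates for review; whether they are truly erroneous still has to be decided against the source data.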

Explain in brief about critical column.

A column (usually granular) whose values change over a period of time is called a critical column. For example, a customer named 'Anirudh' resided in Bangalore for some years and then shifted to Pune. While in Bangalore, he made purchases worth Rs. 30 Lakhs. Now the CITY changes in the data warehouse and the purchases will be shown under the city Pune only. This makes the data warehouse inconsistent. In this example, CITY is the critical column. A surrogate key can be used as a solution for this.

Explain in brief about critical column.

A critical column in a warehouse is a column whose value changes over a period of time, e.g. the city of a user. Suppose a user resides in city "abc" and the warehouse keeps track of his daily expenses. When the user changes city, the data warehouse becomes inconsistent, since the city has changed and the earlier expenses are now shown under the new city.
