
UNIT TEST-I KEY CS2032 Data warehousing and data mining

Part-A

1. What is a data mart?
A data mart is a repository of data gathered from operational data and other sources that is designed to serve a particular community of knowledge workers. In scope, the data may derive from an enterprise-wide database or data warehouse, or be more specialized.

2. What are the characteristics of a data warehouse?
a. Subject-oriented: data warehouses are designed to help you analyze data.
b. Integrated: integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format.
c. Non-volatile: once entered into the warehouse, data should not change.
d. Time-variant: in order to discover trends in business, analysts need large amounts of data.

3. What is metadata?
Metadata is the data about the data stored in a data warehouse. An example is the extraction mapping, by source system and table, to the destination tables and fields in the staging area.

4. Define Carleton's PASSPORT.
Carleton's PASSPORT is a powerful native database extraction and transformation tool used to prepare data for business-critical applications such as enterprise data warehouses, managed data marts and packaged application conversions. PASSPORT provides high-performance extraction from a broad range of corporate databases including ADABAS, IMS, DB2, VSAM, IDMS, Datacom and others.

5. What are the desktop reporting tools?
- Crystal Reports
- Actuate Reporting System
- IQ Software Corp.
- InfoReports

6. Define Informix.
IBM Informix is a family of relational database management system (RDBMS) products from IBM. It is positioned as IBM's flagship data server for online transaction processing (OLTP) as well as integrated solutions. IBM acquired the Informix technology in 2001.

7. Define bit-mapped indexing.
Bit-mapped indexing is a technique commonly used in relational databases in which the index represents data using binary coding. The technique was originally used for low-cardinality data, but more recent products such as Sybase IQ have applied it efficiently beyond that case.

8. What is a star schema?
A star schema is a database schema design that provides a multidimensional view of the data expressed using relational database semantics. Its tables are classified into two kinds: facts and dimensions.
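As an illustration of questions 7 and 8, the following Python sketch builds a bitmap index over a low-cardinality key in a toy fact table that references a small dimension table. The data and the helper names `build_bitmap_index` and `rows_matching` are hypothetical, chosen only for this example; it is a minimal sketch of the idea, not how any particular product implements it.

```python
# Hypothetical toy data: a miniature star schema.
region_dim = {1: "North", 2: "South", 3: "East"}   # dimension: key -> attribute
sales_fact = [                                     # fact rows: (region_key, amount)
    (1, 120.0), (2, 80.0), (1, 45.5), (3, 60.0), (2, 10.0), (1, 5.0),
]

def build_bitmap_index(rows, column):
    """Return {value: bitmap}; bit i is set when rows[i][column] == value."""
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index

def rows_matching(bitmap):
    """Yield the row positions whose bit is set in the bitmap."""
    i = 0
    while bitmap:
        if bitmap & 1:
            yield i
        bitmap >>= 1
        i += 1

region_index = build_bitmap_index(sales_fact, 0)

# The predicate "North OR East" becomes one bitwise OR of two bitmaps.
north_or_east = region_index[1] | region_index[3]
total = sum(sales_fact[i][1] for i in rows_matching(north_or_east))
print(total)
```

Because each distinct value needs its own bitmap, this layout stays compact only when the indexed column has few distinct values, which is why the technique is associated with low-cardinality data.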

9. What are the types of parallelism?
Data: splitting a single sequential file into smaller data files to provide parallel access.
Pipeline: allowing several components to run simultaneously on the same data stream, for example looking up a value on record 1 at the same time as adding two fields on record 2.
Component: running multiple processes simultaneously on different data streams in the same job, for example sorting one input file while removing duplicates on another file.

10. What are the components of ETL?
a. Extraction
b. Transformation
c. Metadata
d. Load
e. Administration and transport services

Part-B

11. What are the steps for building a data warehouse and the considerations in building it?

BUILDING A DATA WAREHOUSE
- Decisions need to be made quickly and correctly
- Users are business domain experts
- Added information value
- Legacy systems need to be integrated with new applications
- Workspace is increasingly heterogeneous
- Network bandwidth is increasing

BUSINESS CONSIDERATIONS:
- Approach
- Organisational issues
- Design considerations
- Data content
- Metadata
- Data distribution
- Tools
- Performance considerations
- Nine decisions in the design of a data warehouse

12. Explain the data warehouse architecture in detail.
Seven major components:

- Data extraction, cleanup, integration, transformation and migration tools
- Metadata repository
- Warehouse database technology
- Data marts
- Data query, reporting, analysis and mining tools
- Data warehouse administration and management
- Information delivery system
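Returning to the data parallelism described in question 9, the sketch below shows the idea in Python: one sequential list of records is split into smaller partitions that are then processed concurrently. The `split` and `process` helpers and the sample records are hypothetical. A thread pool is used only to keep the example short; threads in CPython do not necessarily give true CPU parallelism, so a production loader would likely use processes or a parallel database engine.

```python
from concurrent.futures import ThreadPoolExecutor

def split(records, parts):
    """Split one sequential list into `parts` roughly equal chunks."""
    size = -(-len(records) // parts)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def process(chunk):
    """Hypothetical per-partition work: sum the values in one chunk."""
    return sum(chunk)

records = list(range(1, 101))   # stand-in for a single sequential file
chunks = split(records, 4)      # data parallelism: four smaller partitions

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(process, chunks))

total = sum(partial_sums)       # combine the per-partition results
print(partial_sums, total)
```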

13. What are data extraction, cleanup, and transformation tools?
Functions:
- Data extraction
- Cleanup
- Migration
- Transformation
Vendor solutions:
- Prism Solutions
- Carleton's PASSPORT
- Information Builders
- SAS Institute
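The functions named in question 13 can be sketched as a small pipeline in Python. This is an illustrative example only: the sample data, the `sales` table, and the function names are assumptions made for the sketch, and the standard-library `csv` and `sqlite3` modules stand in for a real source system and warehouse database. It does not reflect how the vendor tools listed above work internally.

```python
import csv
import io
import sqlite3

# Hypothetical operational extract; a real job would read from a source system.
RAW = """customer,amount,currency
 alice ,100,usd
BOB,,usd
carol,250,USD
"""

def extract(text):
    """Extraction: read the source records as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def cleanup(records):
    """Cleanup: trim whitespace and drop rows with a missing amount."""
    cleaned = []
    for rec in records:
        rec = {key: value.strip() for key, value in rec.items()}
        if rec["amount"]:
            cleaned.append(rec)
    return cleaned

def transform(records):
    """Transformation: normalise case and convert types to the target format."""
    return [
        (rec["customer"].title(), float(rec["amount"]), rec["currency"].upper())
        for rec in records
    ]

def load(rows, conn):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(cleanup(extract(RAW))), conn)
loaded = conn.execute(
    "SELECT customer, amount, currency FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping each stage as a separate function mirrors the way the question separates extraction, cleanup and transformation, and makes it possible to test or replace one stage without touching the others.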
