
Question: 1 of 60

How should you use the pre-built extractors for a new project?

Correct.

A. Drag and drop them onto canvas.

B. Assign them to a new query.

C. Convert them to AQL Statements.

D. Right click on the extractor and select Edit Output.

Question: 2 of 60
Which feature of Text Analytics should you use to process Japanese or Chinese language text?

Correct.

A. Standard tokenizer

B. Multilingual tokenizer

C. Online Analytical Programming (OLAP)

D. Annotation Query Language (AQL)

Question: 3 of 60
What defines a relation in an AQL extractor?

Incorrect. The correct answer is: a view. (Explanation: DW654_Course_Guide.pdf, Page 3-5)

A. a schema

B. a row

C. a view

D. a column
Question: 4 of 60
What advantage does the Text Analytics Web UI give you?

Incorrect. The correct answer is: It generates the AQL syntax for you. (Explanation: DW654_Course_Guide.pdf, Page 1-11)

A. It allows only single data types.

B. It generates the AQL syntax for you.

C. It teaches you how to write AQL syntax.

D. It allows only one type of file extension.

Question: 5 of 60
Which Text Analytics runtime component is used for languages such as Spanish and English by
breaking a stream of text into phrases or words?

Incorrect. The correct answer is: Standard tokenizer. (Explanation: DW654_Course_Guide.pdf, Pages 1-13, 6-12)

A. Standard tokenizer

B. Named entity extractors

C. Multilingual tokenizer

D. Other extractors

Question: 6 of 60
Which AQL candidate rule combines tuples from two views with the same schema?

Correct.

A. Sequence

B. Union

C. Blocks

D. Select
Question: 7 of 60
What does a computer need to understand unstructured data?

Incorrect. The correct answer is: context. (Explanation: DW654_Course_Guide.pdf, Page 1-10)

A. attribute types

B. extractors

C. usage

D. context

Question: 8 of 60
What should you do in Text Analytics to fix an extractor that produces unwanted results?

Incorrect. The correct answer is: Create a new filter. (Explanation: DW654_Course_Guide.pdf, Page 5-8)

A. Edit the properties of the sequence.

B. Remove results with a consolidation rule.

C. Re-create the extractors.

D. Create a new filter.

Question: 9 of 60
How is a sequence created in Canvas?

Correct.

A. Click on the New Literal button.

B. Drag and drop one extractor onto another.

C. Right click on the extractor, and select Edit Output.

D. Select multiple extractors on the result pane.


Question: 10 of 60
Which format is used to export extractor results?

Correct.

A. RTF

B. CSV

C. JSON

D. TXT

Question: 11 of 60
Which basic feature rule of AQL helps find an exact match to a single word or phrase?

Correct.

A. Literals

B. Part of Speech

C. Splits

D. Dictionary

Question: 12 of 60
Which type of HBase column is mapped to multiple SQL columns?

Incorrect. The correct answer is: Dense. (Explanation: DW633_Course_Guide.pdf, Page 5-13)

A. Exclusive

B. Composite

C. Dense

D. Double

Question: 13 of 60
Which underlying data representation and access method does Big SQL use?

Incorrect. The correct answer is: Hive. (Explanation: DW633_Course_Guide.pdf, Page 2-9)

A. SMALLINT

B. Hive

C. MAP

D. TINYINT

Question: 14 of 60
Which Big SQL file format is human readable and supported by most tools, but is the least efficient
file format?

Incorrect. The correct answer is: Delimited. (Explanation: DW633_Course_Guide.pdf, Page 2-23)

A. Avro

B. Sequence

C. Parquet

D. Delimited

Question: 15 of 60
Which type of key does HBase require in each row in an HBase table?

Incorrect. The correct answer is: Unique. (Explanation: DW633_Course_Guide.pdf, Page 5-23)

A. Duplicate

B. Primary

C. Foreign

D. Unique

Question: 16 of 60
What are the two types of Spark operations? (Choose two.)
(Please select ALL that apply)
Correct.

A. Actions

B. Transformations

C. Vectors

D. Sequences

E. DataFrames

Question: 17 of 60
What privilege is required to execute an EXPLAIN statement with INSERT privileges in Big SQL?

Incorrect. The correct answer is: SQLADM authority. (Explanation: DW633_Course_Guide.pdf, Page 3-19)


Note: it appears that EXPLAIN, SQLADM, or DBADM authority allows this, so either of the first two choices would work, not just the first. (Glen Henry -
http://www.ibm.com/support/knowledgecenter/en/SSPT3X_4.0.0/com.ibm.swg.im.infosphere.biginsights.commsql.doc/doc/r0000952.html)

A. SQLADM authority

B. SYSMON authority

C. SECADM authority

D. SYSCTRL authority

Question: 18 of 60
What is used in a Big SQL file system to organize tables?

Correct.

A. DSM

B. JSqsh

C. schemas

D. partitions
Question: 19 of 60
Why is the SYSPROC.SYSINSTALLOBJECT procedure used with Big SQL?

Incorrect. The correct answer is: to create an EXPLAIN table. (Explanation: DW633_Course_Guide.pdf, Page 3-24)

A. to set the location of the EXPLAIN.DDL

B. to create a SNAPSHOT column

C. to specify the SQL statement to be explained

D. to create an EXPLAIN table

Question: 20 of 60
What is required to run an EXPLAIN statement in Big SQL?

Incorrect. The correct answer is: proper authorization. (Explanation: DW633_Course_Guide.pdf, Page 3-19)

A. proper authorization

B. a rule

C. the explainable-sql-statement clause

D. the SYSPROC.SYSINSTALLOBJECT procedure

Question: 21 of 60
You need to create multiple Big SQL tables with columns defined as CHAR. What needs to be set to
enable CHAR columns?

Correct.

A. ALTER CHAR DATATYPE TO byte

B. CREATE TABLE chartab

C. SET HADOOPCOMPATIBLITY_MODE=True
D. SET SYSHADOOP.COMPATIBILITY_MODE=1
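
As a quick illustration of the correct setting, a Big SQL session might enable CHAR support along these lines (a sketch; the table and column names are invented for the example):

```sql
-- Enable Hadoop compatibility mode so CHAR columns are accepted
SET SYSHADOOP.COMPATIBILITY_MODE = 1;

-- CHAR columns can now be used in table definitions
CREATE HADOOP TABLE chartab (
    id   INT,
    code CHAR(4)
);
```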

Question: 22 of 60
You need to populate a Big SQL table to test an operation. Which INSERT statement is
recommended for testing, only because it does not support parallel reads or writes?

Correct.

A. INSERT INTO ... VALUES (...)

B. INSERT INTO ... SELECT ... WHERE …

C. INSERT INTO ... SELECT …

D. INSERT INTO ... SELECT FROM ...
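
The recommendation above can be illustrated with a minimal sketch (table names invented):

```sql
-- Fine for small test data only: runs serially, with no support
-- for parallel reads or writes
INSERT INTO testtab VALUES (1, 'alpha'), (2, 'beta');

-- Preferred for real data movement: INSERT ... SELECT runs in parallel
INSERT INTO testtab SELECT id, name FROM stagingtab;
```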

Question: 23 of 60
Which Big SQL datatype should be avoided because it causes significant performance degradation?

Incorrect. The correct answer is: STRING. (Explanation: W673_Course_Guide.pdf, Page 2-15)

A. CHAR

B. UNION

C. VARCHAR

D. STRING

Question: 24 of 60
What is missing from the following statement when querying a remote table? CREATE _______ FOR
remotetable1 ...

Correct.
A. NICKNAME

B. TABLE

C. VIEW

D. INDEX
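
In a federated Big SQL setup, the missing keyword is used roughly as follows (the server, schema, and table names are invented, and the wrapper and server objects are assumed to have been defined beforehand):

```sql
-- A nickname makes the remote table queryable as if it were local
CREATE NICKNAME rt1 FOR remoteserver.remoteschema.remotetable1;

SELECT * FROM rt1;
```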

Question: 25 of 60
You need to set up the command-line interface JSqsh to connect to a bigsql database. What is the
recommended method to set up the connection?

Incorrect. The correct answer is: Run the JSqsh connection wizard. (Explanation: DW633_Course_Guide.pdf, Page 1-10)

A. Run the JSqsh connection wizard.

B. Run the JSqsh driver wizard.

C. Run the $JSQSH_HOME/bin/JSQSH script.

D. Modify database parameters in the .jsqsh/connections.xml file.

Question: 26 of 60
You have a very large Hadoop file system. You need to work on the data without migrating the data
out or changing the data format. Which IBM tool should you use?

Correct.

A. Big SQL

B. Pig

C. MapReduce

D. Data Server Manager

Question: 27 of 60
Which core component of the Hadoop framework is highly scalable and a common tool?

Correct.

A. Hive

B. Pig

C. Sqoop

D. MapReduce

Question: 28 of 60
Which action is performed prior to the Map step of a MapReduce v1 processing cycle?

Incorrect. The correct answer is: The job is broken into individual task pieces and distributed. (Explanation: DW613_Course_Guide_V2, 9-7)

A. The job is broken into individual task pieces and distributed.

B. The data required is moved to the fastest nodes.

C. Output result sets are simplified to a single answer.

D. The job is sent sequentially to all nodes.

Question: 29 of 60
What is the default replication factor for HDFS on a production cluster?

Correct.

A. 10

B. 5

C. 3

D. 1
Question: 30 of 60
Which command must be run after compiling a Java program so it can run on the Hadoop cluster?

Correct.

A. rm hadoop.class

B. jar cf name.jar *.class

C. jar tf name.jar

D. hadoop classpath
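
The answer can be sketched as a typical build sequence (the class and jar names are placeholders; this assumes a JDK and a configured Hadoop client on the node):

```shell
# Compile against the Hadoop classpath
javac -cp "$(hadoop classpath)" WordCount.java

# Package the compiled classes so the cluster can distribute them
jar cf wordcount.jar *.class

# Submit the jar to the cluster
hadoop jar wordcount.jar WordCount /input /output
```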

Question: 31 of 60
What is a feature of an Avro file?

Correct.

A. versioning of the data

B. columns delimited by commas

C. formal schema language

D. directly readable by JavaScript

Question: 32 of 60
What happens if a task fails during a Hadoop job execution?

Correct.

A. The job will be restarted with different compute nodes.

B. The entire job will fail.

C. The task will be restarted on another node.

D. The job will finish with incomplete results.


Question: 33 of 60
Which software is at the core of the IBM BigInsights platform?

Correct.

A. open source components

B. customer developed software

C. cloud-based web services

D. proprietary IBM libraries

Question: 34 of 60
What is the default data type in Big R?

Incorrect. The correct answer is: character. (Explanation: DW613_Course_Guide_V1, 4-36)

A. complex

B. character

C. integer

D. numeric

Question: 35 of 60
Which command is used to launch an interactive Python shell for Spark?

Correct.

A. spark-shell

B. hadoop pyshell

C. pyspark

D. python -spark
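
For reference, the correct command (and its Scala counterpart, which appears in a later question) is launched from the operating-system shell of a node with Spark installed:

```shell
pyspark       # interactive Python shell with a preconfigured SparkContext (sc)
spark-shell   # interactive Scala shell, likewise preconfigured
```
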
Question: 36 of 60
What is the primary core abstraction of Apache Spark?

Incorrect. The correct answer is: Resilient Distributed Dataset (RDD). (Explanation: DW613_Course_Guide_V2, 10-12)

A. Directed Acyclic Graph (DAG)

B. Resilient Distributed Dataset (RDD)

C. Spark Streaming

D. GraphX

Question: 37 of 60
What is a feature of Apache ZooKeeper?

Incorrect. The correct answer is: maintains configuration information for a cluster. (Explanation: DW613_Course_Guide_V2, 11-5)

A. maintains configuration information for a cluster

B. generates shell programs for running components of Hadoop

C. performance tunes a running cluster

D. monitors log files of cluster members

Question: 38 of 60
Which command is used to launch an interactive Apache Spark shell?

Correct.

A. hadoop spark

B. spark
C. spark-shell

D. scala --spark

Question: 39 of 60
Which action is performed during the Reduce step of a MapReduce v1 processing cycle?

Correct.

A. Intermediate results are aggregated.

B. The initial problem is broken into pieces.

C. The TaskTracker distributes the job to the cluster.

D. The JobTrackers execute their assigned tasks.

Question: 40 of 60
What are two major business advantages of using BigSheets? (Choose two.)
(Please select ALL that apply)

Incorrect. The correct answers are: built-in data readers for multiple formats; spreadsheet-like querying and discovery interface. (Explanation: DW613_Course_Guide_V1, 3-24)

A. feature rich programming environment

B. spreadsheet-like querying and discovery interface

C. built-in data readers for multiple formats

D. command-line-driven data analysis

Question: 41 of 60
A Hadoop file listing is performed and one of the output lines is: -rw-r--r-- 5 biadmin biadmin 871233
2015-09-12 09:33 data.txt What does the 5 in the output represent?
Incorrect. The correct answer is: replication factor. (Explanation: DW613_Course_Guide_V2, 8-54)

A. replication factor

B. login id of the file owner

C. permissions

D. data size
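
The fields of that listing can be read as sketched below (annotated sample output; the path is invented):

```shell
hadoop fs -ls /user/biadmin
# -rw-r--r--   5 biadmin biadmin    871233 2015-09-12 09:33 data.txt
#  |           |  |       |         |      |                |
#  permissions |  owner   group     size   modified         name
#              replication factor
```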

Question: 42 of 60
When creating a new table in Big SQL, what additional keyword is used in the CREATE TABLE
statement to create the table in HDFS?

Correct.

A. dfs

B. cloud

C. replicated

D. hadoop
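
A minimal sketch of the DDL (table and column names invented):

```sql
-- Without the HADOOP keyword the table would be a local Big SQL table;
-- with it, the table's data is stored in HDFS
CREATE HADOOP TABLE sales (
    id     INT,
    amount DECIMAL(10,2)
);
```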

Question: 43 of 60
What is the ApplicationMaster in YARN responsible for? (Choose two.)
(Please select ALL that apply)

Incorrect. The correct answers are: obtaining resources for computation; monitoring node execution status. (Explanation: DW613_Course_Guide_V2, 9-40)

A. monitoring node execution status

B. taking nodes offline for maintenance

C. obtaining resources for computation

D. allocating resources from all nodes


Question: 44 of 60
How does an end-user interact with the IBM BigSheets tool?

Correct.

A. command line

B. web browser

C. mobile app

D. IBM-built desktop app

Question: 45 of 60
What command is used to start a Flume agent?

Incorrect. The correct answer is: flume-ng. (Explanation: DW613_Course_Guide_V2, 12-39)

A. flume-agent

B. flume-start

C. flume-ng

D. flume-src

Question: 46 of 60
Which integration API does Apache Ambari support?

Correct.

A. SOAP

B. RMI

C. REST

D. RPC
Question: 47 of 60
What does the MLlib component of Apache Spark support?

Correct.

A. stream processing

B. scalable machine learning

C. graph computation

D. SQL and HiveQL

Question: 48 of 60
What are two benefits of using the IBM Big SQL processing engine? (Choose two.)
(Please select ALL that apply)

Correct.

A. Various data storage formats are supported.

B. It provides access to Hadoop data using SQL.

C. Core functionality is written in Java for portability.

D. The system is built to be started and stopped on demand.

Question: 49 of 60
What does the programmatic implementation of a Map function do?

Correct.

A. Computes the final result of the entire job.

B. Locates the data in the DFS.

C. Reads the data file and performs a transformation.

D. Combines previous results into an aggregate.


Question: 50 of 60
Which data inconsistency may appear while using ZooKeeper?

Correct.

A. excessively stale data views

B. simultaneously inconsistent cross-client views

C. out-of-order updates across clients

D. unreliable client updates across the cluster

Question: 51 of 60
What command will load the BigR package in R?

Correct.

A. dir(pattern="bigr")

B. source("bigr")

C. bigr.connect

D. library(bigr)
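
The loading step, and a typical follow-up connection, look roughly like this in an R session (the connection parameters are invented examples, and Big R must be installed; treat the `bigr.connect` arguments as assumptions):

```r
library(bigr)                      # load the Big R package
bigr.connect(host = "bivm", user = "biadmin", password = "password")
bigr.listfs()                      # list files on HDFS
```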

Question: 52 of 60
Which command must be run first to become the HDFS user?

Correct.

A. pwd

B. hadoop fs
C. su - hdfs

D. hdfs

Question: 53 of 60
What type of NoSQL datastore does HBase fall into?

Incorrect. The correct answer is: column. (Explanation: DW613_Course_Guide_V2, 13-18)

A. key-value

B. document

C. graph

D. column

Question: 54 of 60
Which two tasks can an Apache Ambari admin do that a regular Apache Ambari user cannot do?
(Choose two.)
(Please select ALL that apply)

Correct.

A. browse job information

B. modify configurations

C. view service status

D. run service checks

Question: 55 of 60
What does the federation feature of Big SQL allow?

Correct.
A. rewriting statements for better execution performance

B. querying multiple data sources in one statement

C. tuning server hardware performance

D. importing data into HDFS

Question: 56 of 60
Which statement is true regarding Reduce tasks in MapReduce?

Correct.

A. They only run on nodes that didn't generate data during the Map step.

B. They run only on nodes that generated data during the Map step.

C. They can run on any node.

D. They only run on one node.

Question: 57 of 60
How does Sqoop decide how to split data across mappers?

Incorrect. The correct answer is: examining the primary key. (Explanation: DW613_Course_Guide_V2, 12-45)

A. applying the split size to the data

B. moving the data to the closest network node

C. examining the primary key

D. dividing the input bytes by available nodes

Question: 58 of 60
What is the JSqsh tool used for?
Correct.

A. web-based SQL editing

B. command-line SQL queries

C. installing the IBM Data Server Manager (DSM)

D. deploying the SQL JDBC driver

Question: 59 of 60
What does the HCatalog component of Hive provide?

Correct.

A. providing a REST gateway for jobs

B. maintaining an inventory of cluster nodes

C. table and storage management layer for Hadoop

D. collecting common data transformations into a library

Question: 60 of 60
What command is used to retrieve multiple rows out of an HBase table?

Incorrect. The correct answer is: scan. (Explanation: DW613_Course_Guide_V2, 13-74)

A. get

B. pull

C. scan

D. select
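
In the HBase shell, the two row-retrieval commands differ as sketched below (the table name and row key are invented):

```shell
hbase shell
# get retrieves a single row by its row key
get 'customers', 'row1'
# scan retrieves multiple rows, optionally limited
scan 'customers', {LIMIT => 10}
```
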
Question: 1 of 60
Which statement will create a table with parquet files?

Correct.

A. CREATE HADOOP TABLE T ( i int, s VARCHAR(10)) SAVE AS PARQUET;

B. CREATE HADOOP TABLE T ( i int, s VARCHAR(10)) STORED AS PARQUETFILE;

C. CREATE HADOOP TABLE T ( i int, s VARCHAR(10)) STORED AS PARQUET;

D. CREATE HADOOP TABLE T ( i int, s VARCHAR(10)) SAVE AS PARQUETFILE;
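
As an illustration, a Parquet-backed table is created along these lines (whether a given Big SQL release accepts PARQUET, PARQUETFILE, or both as the format keyword is an assumption to verify against its documentation):

```sql
CREATE HADOOP TABLE t (
    i INT,
    s VARCHAR(10)
)
STORED AS PARQUET;
```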

Question: 2 of 60
In the ZooKeeper environment, what does atomicity guarantee?

Incorrect. The correct answer is: Updates completely succeed or fail. (Explanation: DW613_Course_Guide_V2, 11-13)

A. Updates are applied in the order created.

B. Every client sees the same view.

C. Updates completely succeed or fail.

D. If an update succeeds, then it persists.

Question: 3 of 60
Which two components make up a Hadoop node? (Choose two.)
(Please select ALL that apply)

Correct.

A. disk

B. memory

C. network

D. CPU
Question: 4 of 60
Which component connects sinks and sources in Flume?

Correct.

A. HDFS

B. channels

C. ElasticSearch

D. interceptors

Question: 5 of 60
How does Apache Ambari use the Ganglia component?

Correct.

A. to cluster job scheduling

B. to add new nodes to the cluster

C. to monitor cluster performance

D. to predict hardware failures

Question: 6 of 60
Why does YARN scale better than Hadoop v1 for multiple jobs? (Choose two.)
(Please select ALL that apply)

Incorrect. The correct answers are: There is one Application Master per job; job tracking and resource management are split. (Explanation: DW613_Course_Guide_V2, 9-44)

A. Job tracking and resource management are one process.

B. There is one Job Tracker per cluster.

C. There is one Application Master per job.

D. Job tracking and resource management are split.

Question: 7 of 60
What is a key factor in determining how to implement file compression with HDFS?

Correct.

A. the speed of network transfers between nodes

B. compression algorithm supports splitting

C. the CPU speed of the cluster members (MHz)

D. the amount of storage space needed for all files

Question: 8 of 60
An organization is developing a proof-of-concept for a big data system. Which phase of the big data
adoption cycle is the company currently in?

Correct.

A. Execute

B. Explore

C. Engage

D. Educate

Question: 9 of 60
Which component is required for Flume to work?

Incorrect. The correct answer is: Data source. (Explanation: DW613_Course_Guide_V2, 12-29)

A. Interceptor

B. Data source

C. Syslog

D. RDBMS

Question: 10 of 60
What is a limitation of Apache Spark?

Correct.
A. It does not have universal tools.

B. It does not support streams.

C. It does not run Hadoop.

D. It does not in itself interact with SQL.

Question: 11 of 60
Assuming the same data is stored in multiple data formats, which format will provide faster query
execution and require the least amount of IO operations to process?

Correct.

A. XML

B. flat file

C. JSON

D. Parquet

Question: 12 of 60
Which programming language is Apache Spark primarily written in?

Incorrect. The correct answer is: Scala. (Explanation: DW613_Course_Guide_V2, 10-14)

A. Scala

B. Java

C. C++

D. Python 2

Question: 13 of 60
What is the default install location for the IBM Open Data Platform on Linux?

Incorrect. The correct answer is: /usr/iop. (Explanation: DW613_Course_Guide_V2, 6-30)


A. /usr/iop

B. /usr/local/iop

C. /opt/ibm/iop

D. /var/iop

Question: 14 of 60
What does the bucketing feature of Hive do?

Correct.

A. distributes the data dynamically for faster processing

B. splits data into collections based on ranges

C. sub-partitioning/grouping of data by hash within partitions

D. allows data to be stored in arrays

Question: 15 of 60
What command will list files located on the HDFS in R?

Incorrect. The correct answer is: bigr.listfs(). (Explanation: DW613_Course_Guide_V1, 4-35)

A. list()

B. ls()

C. bigr.dir()

D. bigr.listfs()

Question: 16 of 60
Which open source component is a big data processing framework?

Correct.
A. IBM Big SQL

B. Apache Ambari

C. IBM BigSheets

D. Apache Spark

Question: 17 of 60
Data collected within your organization has a short period of time when it is relevant. Which
characteristic of a big data system does this represent?

Correct.

A. Velocity

B. Variety

C. Validation

D. Volume

Question: 18 of 60
What is a feature of an Avro file?

Correct.

A. formal schema language

B. columns delimited by commas

C. versioning of the data

D. directly readable by JavaScript

Question: 19 of 60
What does the MLlib component of Apache Spark support?

Correct.
A. SQL and HiveQL

B. graph computation

C. scalable machine learning

D. stream processing

Question: 20 of 60
Which integration API does Apache Ambari support?

Correct.

A. REST

B. RPC

C. SOAP

D. RMI

Question: 21 of 60
What is the ApplicationMaster in YARN responsible for? (Choose two.)
(Please select ALL that apply)

Correct.

A. taking nodes offline for maintenance

B. obtaining resources for computation

C. allocating resources from all nodes

D. monitoring node execution status

Question: 22 of 60
Which data inconsistency may appear while using ZooKeeper?

Correct.

A. unreliable client updates across the cluster


B. simultaneously inconsistent cross-client views

C. excessively stale data views

D. out-of-order updates across clients

Question: 23 of 60
What does the HCatalog component of Hive provide?

Correct.

A. table and storage management layer for Hadoop

B. maintaining an inventory of cluster nodes

C. collecting common data transformations into a library

D. providing a REST gateway for jobs

Question: 24 of 60
What command is used to start a Flume agent?

Correct.

A. flume-ng

B. flume-start

C. flume-agent

D. flume-src

Question: 25 of 60
What are two benefits of using the IBM Big SQL processing engine? (Choose two.)
(Please select ALL that apply)

Correct.
A. It provides access to Hadoop data using SQL.

B. Various data storage formats are supported.

C. Core functionality is written in Java for portability.

D. The system is built to be started and stopped on demand.

Question: 26 of 60
What are two major business advantages of using BigSheets? (Choose two.)
(Please select ALL that apply)

Incorrect. The correct answers are: built-in data readers for multiple formats; spreadsheet-like querying and discovery interface. (Explanation: DW613_Course_Guide_V1, 3-24)

A. built-in data readers for multiple formats

B. command-line-driven data analysis

C. spreadsheet-like querying and discovery interface

D. feature rich programming environment

Question: 27 of 60
What command will load the BigR package in R?

Correct.

A. source("bigr")

B. dir(pattern="bigr")

C. bigr.connect

D. library(bigr)

Question: 28 of 60
Which action is performed prior to the Map step of a MapReduce v1 processing cycle?

Correct.

A. The job is broken into individual task pieces and distributed.


B. The job is sent sequentially to all nodes.

C. The data required is moved to the fastest nodes.

D. Output result sets are simplified to a single answer.

Question: 29 of 60
Which action is performed during the Reduce step of a MapReduce v1 processing cycle?

Correct.

A. Intermediate results are aggregated.

B. The initial problem is broken into pieces.

C. The JobTrackers execute their assigned tasks.

D. The TaskTracker distributes the job to the cluster.

Question: 30 of 60
Which software is at the core of the IBM BigInsights platform?

Correct.

A. cloud-based web services

B. open source components

C. proprietary IBM libraries

D. customer developed software

Question: 31 of 60
How does an end-user interact with the IBM BigSheets tool?

Correct.

A. IBM-built desktop app

B. mobile app
C. command line

D. web browser

Question: 32 of 60
What does the federation feature of Big SQL allow?

Correct.

A. rewriting statements for better execution performance

B. importing data into HDFS

C. tuning server hardware performance

D. querying multiple data sources in one statement

Question: 33 of 60
Which command is used to launch an interactive Apache Spark shell?

Correct.

A. spark

B. hadoop spark

C. scala --spark

D. spark-shell

Question: 34 of 60
Which kind of HBase row key maps to multiple SQL columns?

Incorrect. The correct answer is: Composite. (Explanation: DW633_Course_Guide.pdf, Page 5-13)

A. Composite

B. Dense

C. Primary
D. Unique

Question: 35 of 60
How can you reduce the memory usage of the ANALYZE command in Big SQL?

Correct.

A. Run everything in one batch.

B. Include all the columns in the batch.

C. Run the command separately on different batches of columns.

D. Turn on distribution statistics.

Question: 36 of 60
Which statement best describes Spark?

Correct.

A. A logical view on top of Hadoop data.

B. A computing engine for a large-scale data set.

C. An instance of a federated database.

D. An open source database query tool.

Question: 37 of 60
Which two commands are used to load data into an existing Big SQL table from HDFS? (Choose
two.)
(Please select ALL that apply)

Correct.

A. Table

B. Select

C. Create
D. Load

E. Insert
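
The two correct commands can be sketched as follows (paths and table names are invented, and the LOAD HADOOP option syntax varies by release, so treat the details as assumptions):

```sql
-- LOAD pulls a file from HDFS directly into the table
LOAD HADOOP USING FILE URL '/user/biadmin/sales.csv'
    WITH SOURCE PROPERTIES ('field.delimiter' = ',')
    INTO TABLE sales;

-- INSERT ... SELECT copies rows from another table already in HDFS
INSERT INTO sales SELECT * FROM sales_staging;
```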

Question: 38 of 60
Which feature in a Big SQL federation is a library to access a particular type of data source?

Correct.

A. server

B. view

C. wrapper

D. table

Question: 39 of 60
How will the following column mapping command be encoded? cf_data:full_names mapped by
(last_name, First_name) separator ','

Correct.

A. Character

B. String

C. Binary

D. Hex

Question: 40 of 60
Which statement is used to set the correct compatible collation with Big SQL?

Incorrect. The correct answer is: CREATE SERVER. (Explanation: DW633_Course_Guide.pdf, Page 4-14)

A. SEQUENCE

B. PUSHDOWN
C. CREATE WRAPPER

D. CREATE SERVER

Question: 41 of 60
Which command should you use to set the default schema in a Big SQL table and also create the
schema if it does not exist?

Incorrect. The correct answer is: use. (Explanation: DW633_Course_Guide.pdf, Page 1-15)

A. default

B. use

C. format

D. create

Question: 42 of 60
Which core component of the Hadoop framework is highly scalable and a common tool?

Correct.

A. MapReduce

B. Sqoop

C. Pig

D. Hive

Question: 43 of 60
Which Big SQL file format is human readable and supported by most tools, but is the least efficient
file format?

Correct.

A. Delimited

B. Sequence
C. Avro

D. Parquet

Question: 44 of 60
You need to create multiple Big SQL tables with columns defined as CHAR. What needs to be set to
enable CHAR columns?

Correct.

A. ALTER CHAR DATATYPE TO byte

B. CREATE TABLE chartab

C. SET SYSHADOOP.COMPATIBILITY_MODE=1

D. SET HADOOPCOMPATIBLITY_MODE=True

Question: 45 of 60
Which underlying data representation and access method does Big SQL use?

Correct.

A. SMALLINT

B. Hive

C. MAP

D. TINYINT

Question: 46 of 60
Which Big SQL datatype should be avoided because it causes significant performance degradation?

Correct.

A. UNION

B. VARCHAR
C. STRING

D. CHAR

Question: 47 of 60
Which type of HBase column is mapped to multiple SQL columns?

Correct.

A. Composite

B. Dense

C. Double

D. Exclusive

Question: 48 of 60
What is missing from the following statement when querying a remote table? CREATE _______ FOR
remotetable1 ...

Correct.

A. TABLE

B. INDEX

C. VIEW

D. NICKNAME

Question: 49 of 60
You have a very large Hadoop file system. You need to work on the data without migrating the data
out or changing the data format. Which IBM tool should you use?

Correct.

A. MapReduce

B. Pig
C. Big SQL

D. Data Server Manager

Question: 50 of 60
How can you fix duplicate results generated by an extractor from the same text because the text
matches more than one dictionary entry?

Incorrect. The correct answer is: remove with a consolidation rule. (Explanation: DW654_Course_Guide.pdf, Page 5-7)

A. edit output with overlapping matches

B. remove with a consolidation rule

C. remove union statement

D. edit properties of the sequence

Question: 51 of 60
Where should you build extractors in the Information Extraction Web Tool?

Correct.

A. Documents

B. Regular expression

C. Property pane

D. Canvas

Question: 52 of 60
Which feature of Text Analytics allows you to rollback your extractors when necessary?

Correct.

A. Snapshots

B. Standard tokenizer

C. Multilingual tokenizer
D. Scalar functions

Question: 53 of 60
What are extractors transformed into when they are executed?

Correct.

A. Online Analytical Programming (OLAP) statements

B. Candidate generation statements

C. BigSheets function statements

D. Annotated Query Language (AQL) statements

Question: 54 of 60
In which text analytics phase are extractors developed and tested?

Correct.

A. Production

B. Analysis

C. Performance Tuning

D. Rule Development

Question: 55 of 60
How is a sequence created in Canvas?

Correct.

A. Drag and drop one extractor onto another.

B. Right click on the extractor, and select Edit Output.


C. Select multiple extractors on the result pane.

D. Click on the New Literal button.

Question: 56 of 60
Which Text Analytics runtime component is used for languages such as Spanish and English by
breaking a stream of text into phrases or words?

Correct.

A. Standard tokenizer

B. Other extractors

C. Multilingual tokenizer

D. Named entity extractors

Question: 57 of 60
Which AQL candidate rule combines tuples from two views with the same schema?

Correct.

A. Sequence

B. Select

C. Blocks

D. Union

Question: 58 of 60
What defines a relation in an AQL extractor?

Correct.

A. a row

B. a column
C. a schema

D. a view

Question: 59 of 60
Which basic feature rule of AQL helps find an exact match to a single word or phrase?

Correct.

A. Part of Speech

B. Literals

C. Splits

D. Dictionary

Question: 60 of 60
What should you do in Text Analytics to fix an extractor that produces unwanted results?

Correct.

A. Remove results with a consolidation rule.

B. Edit the properties of the sequence.

C. Create a new filter.

D. Re-create the extractors.
