HDP Questions

1 In the hadoop-env.sh file, what does the setting HADOOP_HEAPSIZE establish?
2 What is Avro?

3 When archiving Hadoop files, which of the following statements are true? (Choose two)

4 What is the difference between killed tasks and failed tasks?

5 What are the two methods of the org.apache.hadoop.mapred.InputFormat interface?

6 The file sample.txt has the following content:

7 Which of the following is false about RawComparator?

8 Which of the following are true about fsck?

9 Based on the following reduce method signature, which two of the following statements are true?

10 You are given a file that is split into 5 blocks when written to HDFS. What configuration changes have to be made in order for one mapper to read all five blocks?

11 Which one of the following statements is false regarding the Distributed Cache?

12 Which one of the following statements is false regarding the Combiner of a MapReduce job?

13 What is the Hadoop stack?

14 What is Writable?

15 HDFS is designed for:

16 What can a workflow expressed in Oozie contain?

17 In a MapReduce program, the reducer receives all values associated with the same key. Which statement is most accurate about the ordering of these values?

18 Given the following code from a MapReduce application:

19 What are the common problems with map-side join?

20 In order to apply a combiner, what is one property that has to be satisfied by the values emitted from the mapper?

21 In Flume, which of the following provides scalability at the collector tier?

22 Out of Pig, Hive, and Jaql, which of the following attributes is specific

23 When can a reducer class also serve as a combiner without affecting the output of a MapReduce program?

24 Which two of the following responsibilities of the JobTracker have been split into separate daemons in MapReduce v2?

25 Where is the information about the Hive metadata stored?

26 Which of the following are compelling reasons to benchmark a Hadoop deployment?

27 What is the purpose of the shuffle in Hadoop MapReduce?

28 Why would a developer create a MapReduce job without the reduce step?

29 When a job is run, your properties files are copied to the distributed cache so that your map tasks can access them. How do you access the properties file?

30 Which of the following are among the duties of the DataNodes in HDFS?

31 Which daemon is responsible for replication of data in Hadoop?

32 Which configuration file is required to run an Oozie job?

33 What are the supported programming languages for MapReduce?

34 What is Hive?

35 What is the Identity Mapper?

36 Can a custom data type for MapReduce processing be implemented?

37 Which of the following is true for the output of the shuffle and sort phase?

38 Which of the following job types is not supported in Oozie?

39 How does Hadoop process large volumes of data?

40 How can you use binary data in MapReduce?

41 What happens if the client requests to access a part of the data file during the processing stage?

42 When exactly does the reducer start?

43 Which of the following are Flume points of extension?

44 Which one of the following statements is false regarding a MapReduce job?

45 What is HBase?

46 What's the difference between having 0 reducers and 1 reducer?

47 Which one of the following statements is false regarding the Partitioner of a MapReduce job?

48 Put the following phases of a MapReduce program in the order in which they execute.

49 Which is faster: Map-side join or Reduce-side join? Why?

50 The input to a mapper takes the form <k1, v1>. What form does the mapper's output take?

51 Which of the following is a distributed, scalable, big data store that can be used when you need random, realtime read/write access to your Big Data?

52 What is a map-side join?

53 What is the difference between addInputPaths() and setInputPaths() of FileInputFormat?

54 What is Flume?

55 When writing data to HDFS, what is true if the replication factor is three? (Choose 2)

56 How can you disable the reduce step?

57 Which of the following components retrieves the input splits directly from HDFS to determine the number of map tasks?

58 Which two of the following statements are true regarding HCatalog?

59 What is true about Writable and WritableComparable?

60 What is the default input format?

61 When using HDFS, what occurs when a file is deleted from the command line?

62 Will settings using Java API overwrite values in configuration files?

63 Out of Pig, Hive, and Jaql, which of the following attributes is specific

64 Given the following code from a MapReduce program:

65 Which one of the following is not a main component of HBase?

66 In Flume, which reliability level guarantees an accepted event reaches the endpoint?

67 The org.apache.hadoop.io.Writable interface declares which two methods?

68 Based on the following map method signature, which two of the following statements are true?

69 Can you run MapReduce jobs directly on Avro data?

70 Can you suppress reducer output?

71 Which two of the following responsibilities of the JobTracker have been split into separate daemons in MapReduce v2?

72 The file sample.txt has the following content:

73 What is the distributed cache?

74 What is the default InputFormat of a MapReduce job?

75 Which one of the following statements is true regarding <key,value> pairs of a MapReduce job?

76 The output of shuffle and sort is an Iterator of values that are iterated over. What does iterator.next provide?

77 If a file is split into a large number of small chunks/blocks (i.e., the block size is very small), what is the problem?

78 There are 100 DataNodes with 100 TB capacity. How much data can one store (replication factor is 3)?

79 A file in HDFS is treated as a small file if its size is

80 What should be carefully coordinated by an administrator when decommissioning multiple DataNodes in a cluster?

81 Which of the following are configured in core-site.xml? (Choose three)

82 What happens if mapper output does not match reducer input?

83 What is a reduce-side join?

84 Which of the following is common to Pig, Hive, and Jaql?

85 Why doesn't the value in a key-value pair implement WritableComparable?

86 What is the most important feature of MapReduce?

87 On a cluster hosting 10 TB of data, the following command is executed on a JobTracker node. What is the anticipated activity on that cluster? (Choose 1)

88 What is the data type of the return value of the getPartition method in the org.apache.hadoop.mapred.Partitioner interface?

89 Keys from the output of shuffle and sort implement which of the following interfaces?

90 What happens when the io.sort.spill.percent threshold is exceeded when a Mapper is outputting <key,value> pairs?

91 Which one of the following is not a built-in Pig data type?

92 You have a file of 300 MB being written to HDFS. What happens if, after 200 MB is written, another user concurrently accesses the file?
