Question 1. Which of the following is a monitoring solution for Hadoop?
1. Sirona
2. Sentry
3. Slider
4. Streams
Question 2. __________ is a distributed machine learning framework on top of Spark.
1. MLlib
2. Spark Streaming
3. GraphX
4. RDDs
Question 3. Point out the correct statement.
1. Knox is a stateless reverse proxy framework
2. Knox also intercepts REST/HTTP calls and provides
authentication
3. Knox scales linearly by adding more Knox nodes as the load increases
4. All of the mentioned
Question 4. PCollection, PTable, and PGroupedTable all support a __________ operation.
1. Intersection
2. Union
3. OR
4. None of the mentioned
Question 5. How many types of modes are present in Hama?
1. 2
2. 3
3. 4
4. 5
Question 6. The IBM ____________ Platform provides all the foundational building blocks of trusted information, including data integration, data warehousing, master data management, big data and information governance.
1. Infostream
2. Infosphere
3. Infosurface
4. Infodata
Question 7. ________ is the name of the archive you would like to create.
1. Archive
2. Archive name
3. Name
4. None of the mentioned
Question 8. Ambari provides a _______ API that enables integration with existing tools, such as Microsoft System Center.
1. Restless
2. Web services
3. Restful
4. None of the mentioned
Question 9. _______ is forge software for the development of software projects.
1. Oozie
2. Allura
3. Ambari
4. All of the mentioned
Question 10. Posting format now uses a __________ API when writing postings, just like doc values.
1. Push
2. Pull
3. Read
4. All of the mentioned
Question 11. Point out the correct statement.
1. Building PyLucene requires GNU make, a recent version of Ant capable of building Java Lucene, and a C++ compiler
2. PyLucene is supported on Mac OS X, Linux, Solaris and Windows
3. Use of setuptools is recommended for Lucene
4. All of the mentioned
Question 12. ________ builds virtual machines of branches trunk and 0.3 for KVM, VMware and VirtualBox.
1. Bigtop-trunk-packagetest
2. Bigtop-trunk-repository
3. Bigtop-VM-matrix
4. None of the mentioned
Question 13. ZooKeeper is used for configuration and leader election in the cloud edition of
1. Solr
2. Solur
3. Solar101
4. Solr
Question 14. How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of MapReduce?
1. Keys are presented to reducer in sorted order; values for
a given key are not sorted
2. Keys are presented to reducer in sorted order; values for
a given key are sorted in ascending order
3. Keys are presented to reducer in random order; values
for a given key are not sorted
4. Keys are presented to reducer in random order; values
for a given key are sorted in ascending order
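The point of Question 14 is that the framework sorts keys, not values. A minimal pure-Python sketch of that shuffle behavior (illustrative only, not Hadoop code; the function name is ours):

```python
from itertools import groupby
from operator import itemgetter

def shuffle_and_sort(map_outputs):
    """Simulate the MapReduce shuffle: group (key, value) pairs by key.
    Keys reach the reducer in sorted order; the values for each key keep
    whatever order they arrived in (no secondary sort by default)."""
    by_key = sorted(map_outputs, key=itemgetter(0))  # stable sort by key only
    return [(k, [v for _, v in group])
            for k, group in groupby(by_key, key=itemgetter(0))]

pairs = [("b", 3), ("a", 2), ("b", 1), ("a", 9)]
print(shuffle_and_sort(pairs))  # → [('a', [2, 9]), ('b', [3, 1])]
```

Note the keys come out sorted while each key's value list stays in arrival order, matching option 1.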
Question 15. DataStage RTI is a real-time integration pack for:
1. STD
2. ISD
3. EXD
4. None of the above
Question16. Which mapreduce stage serves as a barrier,
where all the previous stages must be completed before it
may proceed?
1. Combine
2. Group (a.k.a. ‘shuffle’)
3. Reduce
4. Write
Question 17. Which of the following formats is more compression-aggressive?
1. Partition compressed
2. Record compressed
3. Block compressed
4. Uncompressed
Question 18. _________ is a way of encoding structured data in an efficient yet extensible format.
1. Thrift
2. Protocol buffers
3. Avro
4. None of the above
Question 19. Which of the following arguments is not supported by the import-all-tables tool?
1. Class name
2. Package name
3. Database name
4. Table name
Question 20. Which of the following operating systems is not supported by Bigtop?
1. Fedora
2. Solaris
3. Ubuntu
4. SUSE
Question 21. Distributed modes are mapped in the _____ file.
1. Groomservers
2. Grervers
3. Grsvers
4. Groom
Question 22. ________ is the architectural center of Hadoop that allows multiple data processing engines.
1. YARN
2. Hive
3. Incubator
4. Chukwa
Question 23. Users can easily run Spark on top of Amazon's ____________
1. Infosphere
2. EC2
3. EMR
4. None of the above
Question 24. Which of the following projects is an interface definition language for Hadoop?
1. Oozie
2. Mahout
3. Thrift
4. Impala
Question 25. Output of the mapper is first written on the local disk for the sorting and _____ process.
1. Shuffling
2. Secondary sorting
3. Forking
4. Reducing
Question 26. HDT projects work with Eclipse version _____ and above.
1. 3.4
2. 3.5
3. 3.6
4. 3.7
Question 27. Which of the following languages is not supported by Spark?
1. Java
2. Pascal
3. Scala
4. Python
Question 28. Data analytics scripts are written in __________
1. Hive
2. CQL
3. Pig Latin
4. Java
Question 29. Ripple is a browser-based mobile phone emulator designed to aid in the development of ______-based mobile applications.
1. JavaScript
2. Java
3. C++
4. HTML5
Question 30. If you set the inline LOB limit to ____, all large objects will be placed in external storage.
1. 0
2. 1
3. 2
4. 3
Question 31. Hadoop achieves reliability by replicating the data across multiple hosts, and hence does not require _____ storage on hosts.
1. RAID
2. Standard RAID levels
3. ZFS
4. Operating system
Question 32. The configuration file must be owned by the user running
1. Data manager
2. Node manager
3. Validation manager
4. None of the above
Question 33. ________ is a non-blocking, asynchronous, event-driven high performance web framework.
1. AWS
2. AWF
3. AWT
4. ASW
Question 34. Falcon provides seamless integration with
1. HCatalog
2. Metastore
3. HBase
4. Kafka
Question 35. One supported datatype that deserves special mention is:
1. Money
2. Counters
3. Smallint
4. Tinyint
Question 36. _______ are Chukwa processes that actually produce data.
1. Collectors
2. Agents
3. Hbase table
4. HCatalog
Question 37. Which of the following Hadoop file formats is supported by Impala?
1. SequenceFile
2. Avro
3. RCFile
4. All of the above
Question 38. Avro is said to be the future ___________ layer of Hadoop.
1. RMC
2. RPC
3. RDC
4. All of the above
Question 39. ______ nodes are the mechanism by which a workflow triggers the execution of a computation/processing task.
1. Server
2. Client
3. Mechanism
4. Action
Question 40. The _______ attribute in the join node is the name of the workflow join node.
1. Name
2. To
3. Down
4. All of the above
Question 41. YARN commands are invoked by the _____ script.
1. Hive
2. Bin
3. Hadoop
4. Home
Question 42. Which of the following functions is used to read data in Pig?
1. Write
2. Read
3. Load
4. None of the above
Question 43. Which of the following Hive commands is not supported by HCatalog?
1. Alter index rebuild
2. Create new
3. Show functions
4. Drop table
Question 44. Apache Hadoop Development Tools is an effort undergoing incubation at
1. ADF
2. ASF
3. HCC
4. AFS
Question 45. Kafka uses key-value pairs in the _________ file format for configuration.
1. RFC
2. Avro
3. Property
4. None of the above
Question 46. Facebook tackles big data with __________ based on Hadoop.
1. Project prism
2. Prism
3. Project big
4. Project data
Question 47. The size of a block in HDFS is
1. 512 bytes
2. 64 MB
3. 1024 kb
4. None of the above
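Given the classic 64 MB default (Hadoop 2.x later raised it to 128 MB), the number of blocks a file occupies is simple ceiling arithmetic. A small sketch, with a helper name of our own:

```python
import math

BLOCK_SIZE = 64 * 1024 * 1024  # classic HDFS default block size (64 MB)

def num_blocks(file_size_bytes: int) -> int:
    """Blocks a file occupies in HDFS; the final block may be partial."""
    return max(1, math.ceil(file_size_bytes / BLOCK_SIZE))

print(num_blocks(200 * 1024 * 1024))  # → 4 (three full blocks plus an 8 MB tail)
```

Unlike a disk filesystem, a partial final block does not consume a full 64 MB of physical storage.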
Question 48. Which is the most popular NoSQL database for scalable big data storage with Hadoop?
1. Hbase
2. mongoDB
3. Cassandra
4. None of the above
Question 49. A ________ can route requests to multiple Knox instances.
1. Collector
2. Load balancer
3. Comparator
4. All of the above
Question 50. HCatalog is installed with Hive, starting with Hive release
1. 0.10.0
2. 0.9.0
3. 0.11.0
4. 0.12.0
Question 51. Table metadata in Hive is:
1. Stored as metadata on the name node
2. Stored along with the data in HDFS
3. Stored in the metastore
4. Stored in zookeeper
Question 52. Avro schemas are defined with ________
1. JSON
2. XML
3. JAVA
4. All of the above
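Because Avro schemas are plain JSON documents, any JSON parser can inspect one. A sketch with a made-up record schema (the `User` record and its fields are illustrative, not from any real dataset):

```python
import json

# A record schema in Avro's JSON notation; names here are invented.
schema_json = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id",   "type": "long"},
    {"name": "name", "type": "string"}
  ]
}
"""

schema = json.loads(schema_json)  # a generic JSON parser reads an Avro schema
print(schema["name"], [f["name"] for f in schema["fields"]])  # → User ['id', 'name']
```

Real Avro libraries parse exactly this JSON form to serialize and validate records.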
Question 53. Spark was initially started by ___________ at UC Berkeley's AMPLab in 2009.
1. Matei Zaharia
2. Mahek Zaharia
3. Doug Cutting
4. Stonebraker
Question 54. __________ does rewrite data and pack rows into columns for certain time periods.
1. OpenTS
2. OpenTSDB
3. OpenTSD
4. OpenDB
Question 55. Which of the following phases occur simultaneously?
1. Shuffle and sort
2. Reduce and sort
3. Shuffle and map
4. All of the above
Question 56. The ________ command fetches the contents of a row or a cell.
1. Select
2. Get
3. Put
4. None of the above
Question 57. _______ are encoded as a series of blocks
1. Arrays
2. Enum
3. Unions
4. Maps
Question 58. Hive also supports custom extensions written in
1. C#
2. Java
3. C
4. C++
Question 59. How many types of nodes are present in a Storm cluster?
1. 1
2. 2
3. 3
4. 4
Question 60. All decision nodes must have a ____________ element to avoid bringing the workflow into an error state if none of the predicates evaluates to true.
1. Name
2. Default
3. Server
4. Client
Question 61. ________ is a REST API for HCatalog
1. WebHCat
2. Wbhcat
3. Inphcat
4. None of the above
Question 62. Streaming supports streaming command options as well as ____________ command options.
1. Generic
2. Tool
3. Library
4. Task
Question 63. By default, collectors listen on port
1. 8008
2. 8070
3. 8080
4. None of the above
Question 64. _______ communicates with the client and handles data-related operations.
1. Master server
2. Region server
3. Htable
4. All of the above
Question 65. We can declare the schema of our data either in a ________ file
1. JSON
2. XML
3. SQL
4. VB
Question 66. ________ provides a Couchbase Server Hadoop connector by means of Sqoop.
1. Memcache
2. Couchbase
3. Hbase
4. All of the above
Question 67. Storm integrates with _________ via Apache Slider.
1. Scheduler
2. Yarn
3. Compaction
4. All of the above
Question 68. An Avro-backed table can simply be created by using ___________ in a DDL statement.
1. Stored as avro
2. Stored as hive
3. Stored as avrohive
4. Stored as serd
Question 69. Drill analyzes semi-structured/nested data coming from ______ applications.
1. RDBMS
2. NoSQL
3. newSQL
4. none of the above
Question 70. The Hadoop list includes the HBase database, the Apache Mahout __________ system and matrix operations.
1. Machine learning
2. Pattern recognition
3. Statistical classification
4. Artificial classification
Question 71. Oozie workflow jobs are directed _______ graphs of actions.
1. Acyclical
2. Cyclical
3. Elliptical
4. All of the above
Question 72. ___ is an open source SQL query engine for Apache HBase.
1. Pig
2. Phoenix
3. Pivot
4. None of the above
Question 73. $ pig -x tez_local will enable _____ mode in Pig
1. Mapreduce
2. Tez
3. Local
4. None of the above
Question 74. In comparison to SQL, Pig uses
1. Lazy evaluation
2. ETL
3. Supports pipeline splits
4. All of the above
Question 75. For Apache _________ users, Storm utilizes the same ODBC interfaces.
1. C takes
2. Hive
3. Pig
4. Oozie
Question 76. If one or more actions started by the workflow job are executing when the _________ node is reached, the actions will be killed.
1. Kill
2. Start
3. End
4. Finish
Question 77. Which of the following data types is supported by Hive?
1. Map
2. Record
3. String
4. Enum
Question 78. HCatalog supports reading and writing files in any format for which a _____ can be written.
1. SerDe
2. SaerDear
3. Doc Sear
4. All
Question 79. _______ is a Python port of the Core project
1. Solr
2. Lucene core
3. Lucy
4. Pylucene
Question 80. Apache Storm added open source stream data processing to the _________ data platform.
1. Cloudera
2. Hortonworks
3. Local cloudera
4. Map R
Question 81. Which of the following is a spatial information system?
1. Sling
2. Solr
3. SIS
4. All of the above
Question 82. _______ properties can be overridden by specifying them in a job-xml file or configuration element.
1. Pipe
2. Decision
3. Flag
4. None of the above
Question 83. CDH processes and controls sensitive data and facilitates:
1. Multi-tenancy
2. Flexibility
3. Scalability
4. All of the above
Question 84. Avro supports _________ kinds of complex types
1. 3
2. 4
3. 6
4. 7
Question 85. With _________ we can store data and read it easily with various programming languages.
1. Thrift
2. Protocol buffers
3. Avro
4. None of the above
Question 86. A float parameter defaults to 0.0001f, which means we can deal with 1 error every ________ rows.
1. 1000
2. 10000
3. 1 million rows
4. None of the above
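The arithmetic behind the answer: an error rate of 0.0001 means one expected error per 1/0.0001 rows. A one-line check:

```python
error_rate = 0.0001  # the 0.0001f default from the question

rows_per_error = round(1 / error_rate)  # one expected error per this many rows
print(rows_per_error)  # → 10000
```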
Question 87. The ________ data mapper framework makes it easier to use a database with Java or .NET applications.
1. iBix
2. Helix
3. iBATIS
4. iBAT
Question 88. ___________ is the most popular high-level Java API in the Hadoop ecosystem.
1. Scalding
2. HCatalog
3. Cascalog
4. Cascading
Question 89. Spark includes a collection of over _________ operations for transforming data and familiar data frame APIs for manipulating semi-structured data.
1. 50
2. 60
3. 70
4. 80
Question 90. ZooKeeper's architecture supports high ________ through redundant services.
1. Flexibility
2. Scalability
3. Availability
4. Interactivity
Question 91. The Lucene ____________ is pleased to announce the availability of Apache Lucene 5.0.0 and Apache Solr 5.0.0.
1. PMC
2. RPC
3. CPM
4. All of the above
Question 92. EC2 capacity can be increased or decreased in real time from as few as one to more than ________ virtual machines simultaneously.
1. 1000
2. 2000
3. 3000
4. None of the above
Question 93. HDT has been tested on _________ and Juno, and can work on Kepler as well.
1. Rainbow
2. Indigo
3. Idiavo
4. Hadovo
Question 94. Each Kafka partition has one server which acts as the __________
1. Leaders
2. Followers
3. Staters
4. All of the above
Question 95. The right number of reduces seems to be
1. 0.9
2. 0.8
3. 0.36
4. 0.95
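The 0.95 figure comes from the classic Hadoop MapReduce tutorial's rule of thumb: set the number of reduces to 0.95 (all reduces launch immediately) or 1.75 (faster nodes run a second wave) times the cluster's total reduce slots. A sketch of that arithmetic (the function name is ours):

```python
def suggested_reduces(nodes: int, reduce_slots_per_node: int,
                      factor: float = 0.95) -> int:
    """Hadoop rule of thumb: 0.95 or 1.75 x (nodes * reduce slots per node)."""
    return round(factor * nodes * reduce_slots_per_node)

print(suggested_reduces(10, 2))        # → 19 with the 0.95 factor
print(suggested_reduces(10, 2, 1.75))  # → 35 with the 1.75 factor
```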
Question 96. Which of the following is a configuration management system?
1. Alex
2. Puppet
3. Acem
4. None of the above
Question 97. Which of the following is only for storage, with limited compute?
1. Hot
2. Cold
3. Warm
4. All_SSD
Question 98. Groom servers start up with a _______ instance and an RPC proxy to contact the BSP master.
1. RPC
2. BSPPeer
3. LPC
4. None of the above
Question 99. A ________ represents a distributed, immutable collection of elements of type T.
1. PCollect
2. PCollection
3. PCol
4. All of the above
Question 100. ________ is used to read data from byte buffers
1. Write{}
2. Read{}
3. Readwrite{}
4. All of the above
Q101. Which is the default InputFormat defined in Hadoop?
1. SequenceFileInputFormat
2. ByteInputFormat
3. KeyValueInputFormat
4. TextInputFormat
Q102. Which of the following is not an InputFormat in Hadoop?
1. TextInputFormat
2. ByteInputFormat
3. SequenceFileInputFormat
4. KeyValueInputFormat
Q103. Which of the following is a valid flow in Hadoop?
1. Input -> Reducer -> Mapper -> Combiner -> Output
2. Input -> Mapper -> Reducer -> Combiner -> Output
3. Input -> Mapper -> Combiner -> Reducer -> Output
4. Input -> Reducer -> Combiner -> Mapper -> Output
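Option 3's flow can be sketched in pure Python with a word count, where the combiner pre-aggregates mapper output locally and the reducer reuses the same logic (as Hadoop's own WordCount example does). This is illustrative only, not Hadoop API code:

```python
from collections import defaultdict

def mapper(line):                       # Input -> Mapper: emit (word, 1) pairs
    for word in line.split():
        yield word, 1

def combine_or_reduce(pairs):           # combiner and reducer share this logic
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())

lines = ["a b a", "b a"]
mapped = [kv for line in lines for kv in mapper(line)]
combined = combine_or_reduce(mapped)    # Mapper -> Combiner (local aggregation)
result = combine_or_reduce(combined)    # Combiner -> Reducer -> Output
print(result)  # → [('a', 3), ('b', 2)]
```

The combiner is an optimization that shrinks the data shuffled to the reducer; because it runs the same associative aggregation, the final output is unchanged.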
Q104. MapReduce was devised by ...
1.Apple
2. Google
3. Microsoft
4. Samsung
Q105. Which of the following is not a phase of Reducer?
1. Map
2. Reduce
3. Shuffle
4. Sort
Q106. How many instances of JobTracker can run on a Hadoop cluster?
1. 1
2. 2
3. 3
4. 4
Q107. Which of the following is not a daemon process that runs on a Hadoop cluster?
1. JobTracker
2. DataNode
3. TaskTracker
4. TaskNode
Q108. As companies move past the experimental phase with Hadoop, many cite the need for additional capabilities, including:
1. Improved data storage and information retrieval
2. Improved extract, transform and load features for data integration
3. Improved data warehousing functionality
4. Improved security, workload management and SQL support
Q109. Point out the correct statement:
1. Hadoop does need specialized hardware to process the data
2. Hadoop 2.0 allows live stream processing of real-time data
3. In the Hadoop programming framework output files are divided into lines or records
4. None of the mentioned
Q110. According to analysts, for what can traditional IT systems provide a foundation when they're integrated with big data technologies like Hadoop?
1. Big data management and data mining
2. Data warehousing and business intelligence
3. Management of Hadoop clusters
4. Collecting and storing unstructured data
Q111. Point out the wrong statement:
1. Hadoop's processing capabilities are huge and its real advantage lies in the ability to process terabytes and petabytes of data
2. Hadoop uses a programming model called "MapReduce"; all programs should conform to this model in order to work on the Hadoop platform
3. The programming model, MapReduce, used by Hadoop is difficult to write and test
4. All of the mentioned
Q112. What was Hadoop named after?
1. Creator Doug Cutting's favorite circus act
2. Cutting's high school rock band
3. The toy elephant of Cutting's son
4. A sound Cutting's laptop made during Hadoop's development
Q113. All of the following accurately describe Hadoop, EXCEPT:
1. Open source
2. Real-time
3. Java-based
4. Distributed computing approach
Q114. __________ can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of data.
1. MapReduce
2. Mahout
3. Oozie
4. All of the mentioned
Q115. __________ has the world's largest Hadoop cluster.
1. Apple
2. Datamatics
3. Facebook
4. None of the mentioned
Q116. Facebook tackles big data with _______ based on Hadoop.
1. 'Project Prism'
2. 'Prism'
3. 'Project Big'
4. 'Project Data'
Q117. What is the main problem faced while reading and writing data in parallel from multiple disks?
1. Processing high volumes of data faster.
2. Combining data from multiple disks.
3. The software required to do this task is extremely costly.
4. The hardware required to do this task is extremely costly.
Q118. Under Hadoop High Availability, fencing means
1. Preventing a previously active namenode from starting to run again.
2. Preventing the start of a failover in the event of network failure with the active namenode.
3. Preventing the power-down of the previously active namenode.
4. Preventing a previously active namenode from writing to the edit log.
Q119. The default replication factor for the HDFS file system in Hadoop is
1. 1
2. 2
3. 3
4. 4
Q120. The hadoop fs command put is used to
1. Copy files from the local file system to HDFS.
2. Copy files or directories from the local file system to HDFS.
3. Copy files from HDFS to the local filesystem.
4. Copy files or directories from HDFS to the local filesystem.
Q121. The namenode knows that a datanode is active using a mechanism known as
1. heartbeats
2. datapulse
3. h-signal
4. Active-pulse
Q122. When a machine is declared as a datanode, the disk space in it
1. Can be used only for HDFS storage
2. Can be used for both HDFS and non-HDFS storage
3. Cannot be accessed by non-hadoop commands
4. Cannot store text files.
Q123. The data from a remote hadoop cluster can
1. not be read by another hadoop cluster
2. be read using http
3. be read using hhtp
4. be read using hftp
Q124. Which one is not one of the big data features?
1. Velocity
2. Veracity
3. Volume
4. Variety
Q125. What is HBase?
1. HBase is a separate set of Java APIs for the Hadoop cluster.
2. HBase is a part of the Apache Hadoop project that provides an interface for scanning large amounts of data using Hadoop infrastructure.
3. HBase is a "database"-like interface to Hadoop cluster data.
4. HBase is a part of the Apache Hadoop project that provides a SQL-like interface for data processing.
Q125. Which of the following is false about RawComparator?
1. Compares the keys by byte.
2. Performance can be improved in the sort and shuffle phase by using RawComparator.
3. Intermediary keys are deserialized to perform a comparison.
Q126. Zookeeper ensures that
1. All the namenodes are actively serving the client requests
2. Only one namenode is actively serving the client requests
3. A failover is triggered when any of the datanodes fails.
4. A failover cannot be started by the hadoop administrator.
Q127. Which scenario demands the highest bandwidth for data transfer between nodes in Hadoop?
1. Different nodes on the same rack
2. Nodes on different racks in the same data center.
3. Nodes in different data centers
4. Data on the same node.
Q128. The hadoop framework is written in
1. C++
2. Python
3. Java
4. GO
Q129. When a client contacts the namenode for accessing a file, the namenode responds with
1. Size of the file requested.
2. Block ID of the file requested.
3. Block ID and hostname of any one of the data nodes containing that block.
4. Block ID and hostname of all the data nodes containing
that block.
Q130. Which of the following is not a goal of HDFS?
1. Fault detection and recovery
2. Handle huge datasets
3. Prevent deletion of data
4. Provide high network bandwidth for data movement
Q131. In HDFS the files cannot be
1. read
2. deleted
3. executed
4. Archived
Q132. The number of tasks a task tracker can accept depends on
1. Maximum memory available in the node
2. Not limited
3. Number of slots configured in it
4. As decided by the jobTracker
Q133. When using HDFS, what occurs when a file is deleted from the command line?
1. It is permanently deleted if trash is enabled.
2. It is placed into a trash directory common to all users for that cluster.
3. It is permanently deleted and the file attributes are recorded in a log file.
4. It is moved into the trash directory of the user who deleted it if trash is enabled.
Q134. The org.apache.hadoop.io.Writable interface declares which two methods? (Choose 2 answers.)
public void readFields(DataInput)
public void read(DataInput)
public void writeFields(DataOutput)
public void write(DataOutput)
1. 1 & 4
2. 2 & 3
3. 3 & 4
4. 2 & 4
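The two declared methods, write(DataOutput) and readFields(DataInput), form a serialization round trip: one writes the object's fields to a stream, the other repopulates them from it. A toy Python analog of that contract (class and field names invented for illustration; real Writables are Java classes):

```python
import io
import struct

class IntPairWritable:
    """Toy analog of Hadoop's Writable contract: write() serializes the
    fields, read_fields() repopulates them from a byte stream."""
    def __init__(self, first=0, second=0):
        self.first, self.second = first, second

    def write(self, out):
        # Big-endian ints, mirroring Java's DataOutput.writeInt
        out.write(struct.pack(">ii", self.first, self.second))

    def read_fields(self, inp):
        self.first, self.second = struct.unpack(">ii", inp.read(8))

buf = io.BytesIO()
IntPairWritable(3, 7).write(buf)
buf.seek(0)
p = IntPairWritable()
p.read_fields(buf)
print(p.first, p.second)  # → 3 7
```

The symmetry is the point: whatever write() emits, readFields() must consume in the same order.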
Question 135. MapReduce has undergone a complete overhaul in Hadoop
1. 0.21
2. 0.23
3. 0.24
4. 0.26
Question 136. __________ is the slave/worker node and holds the user data in the form of data blocks.
1.Data node
2.Name node
3.Data block
4.Replication
Question 137. Spark is engineered from the bottom up for performance, running __________ faster than Hadoop MapReduce.
1. 100x
2. 150x
3. 200x
4. None of the above
Question 138. _________ nodes are the mechanism by which a workflow triggers the execution of a computation/processing task.
1.Server
2.Client
3.Mechanism
4.Action
Question 139. __________ maps input key/value pairs to a set of intermediate key/value pairs.
1.Mapper
2.Reducer
3.Mapper and reducer
4.None of the above
Question 140. ZooKeeper keeps track of the cluster state, such as the ____________ table location.
1.Domain
2.Node
3.Root
4.All of the above
Question141. When __________ contents exceed a
configurable threshold, the memtable data, which
includes indexes, is put in a queue to be flushed to disk
1.Subtable
2.Memtable
3.Intable
4.Memorytable
Question 142. Apache Knox accesses the Hadoop cluster over
1.HTTP
2.TCP
3.ICMP
4.None of the above
Question 143. ___________ supports a new command shell Beeline that works with HiveServer2.
1.Hiveserver2
2.Hiveserver3
3.Hiveserver4
4.Hiveserver5
Question 144. The ________ sink can be a text file, the console display, a simple HDFS path or a null bucket where the data is simply deleted.
1.Collector tier event
2.Agent tier event
3.Basic
4.None of the above
Question 145. The __________ name node is used when the primary name node goes down.
1.Rack
2.Data
3.Secondary
4.None of the above
Question 146. Data transfer between the web console and clients is protected by using
1.SSL
2.Kerberos
3.SSH
4.None of the above
Question 147. Which of the following is one of the possible states for a workflow job?
1.PREP
2.START
3.RESUME
4.END
Question 148. Stratos will be a polyglot _______ framework.
1.DaaS
2.PaaS
3.SaaS
4.RaaS
Question 149. All file access uses Java's _______ APIs, which give Lucene stronger index safety.
1.NIO.2
2.NIO.3
3.NIO.4
4.NIO.5
Question 150. Which of the following is a standards-compliant XML Query processor?
1.Whirr
2.VXQuery
3.Knife
4.Lens
Question 151. ______ is a query processing and optimization system for large-scale, distributed data analysis.
1.MRQL
2.Nifi
3.Openaz
4.ODF toolkit
Question 152. reduceProgress() gets the progress of the job's reduce tasks as a float between
1. 0.0-1.0
2. 1.0-2.0
3. 2.0-3.0
4. 3.0-4.0
Question 153. _____ is a framework for building GUIs for Java server applications.
1.MyFaces
2.Muse
3.Flume
4.Bigtop
Question 154. Apache Flume 1.3.0 is the fourth release under the auspices of Apache of the so-called _____ codeline.
1.NG
2.ND
3.NF
4.NR
Question 155. Starting in Hive ______, the Avro schema can be inferred from the Hive table schema.
1. 0.14
2. 0.12
3. 0.13
4. 0.11
Question 156. A workflow definition is a ___ with control flow nodes or action nodes.
1.CAG
2.DAG
3.BAG
4.None of the above
Question 157. Lucene provides scalable, high-performance indexing of over ______ per hour on modern hardware.
1. 1TB
2. 150GB
3. 10GB
4. 200GB
Question 158. The right level of parallelism for maps seems to be around _____ maps per node.
1. 1 to 10
2. 10 to 100
3. 100 to 150
4. 150 to 200
Question 159. The LZO compression format is composed of approximately ______ blocks of compressed data.
1. 128k
2. 256k
3. 24k
4. 36k
Question 160. ___________ is the software development collaboration tool.
1.Buildr
2.Cassandra
3.Bloodhound
4.All of the above
Question 161. A ________ is an operation on the stream that can transform the stream.
1.Decorator
2.Source
3.Sinks
4.All of the above
Question 162. _________ has the world's largest Hadoop cluster.
1.Apple
2.Datamatics
3.Facebook
4.None of the above
Question 163. When a ___________ is triggered, the client receives a packet saying that the znode has changed.
1.Event
2.Watch
3.Row
4.Value
Question 164. Ambari leverages __________ for system alerting and will send emails when your attention is needed.
1.Nagios
2.Nagaond
3.Ganglia
4.None of the above
Question 165. ____ is a software distribution framework based on OSGi.
1.ACE
2.Abdera
3.Zeppelin
4.Accumulo
Question 166. Which of the following is a content management and publishing system based on Cocoon?
1.Lib cloud
2.Kafka
3.Lenya
4.All of the above
Question 167. If the failure is of ________ nature, Oozie will suspend the workflow job.
1.Transient
2.Non transient
3.Permanent
4.Non permanent
Question 168. The _________ node distributes code across the cluster.
1.Zookeeper
2.Nimbus
3.Supervisor
4.None of the above
Question 169. A workflow definition must have one _____ node.
1.Start
2.Resume
3.Finish
4.None of the above
Question 170. _________ is a REST API for HCatalog
1.WebHCat
2.WbhCAT
3.InpJcat
4.None of the above
Question 171. Which of the following files contains user defined functions (UDFs)?
1.Script2-local.pig
2.Pig.jar
3.Tutorial.jar
4.Excite.log.bz2
Question 172. Helprace is using ZooKeeper on a ______ cluster in conjunction with Hadoop and HBase.
1. 3-node
2. 4-node
3. 5-node
4. 6-node
Question 173. ____________ represent the logical computations of your Crunch pipelines.
1.DoFns
2.ThreeFns
3.DoFn
4.None of the above
Question 174. _____________ has stronger ordering guarantees than a traditional messaging system.
1.Kafka
2.Slider
3.Suz
4.None of the above
Question 175. HBase is _________; it defines only column families.
1.Row oriented
2.Schema-less
3.Fixed schema
4.All of the above
Question 176. An input ___________ is a chunk of the input that is processed by a single map.
1.Textformat
2.Split
3.Datanode
4.All of the above
Question 177. ___________ permits data written by one system to be efficiently sorted by another system.
1.Complex data type
2.Order
3.Sort order
4.All of the above
Question 178. __________ text is appropriate for most non-binary data types.
1.Character
2.Binary
3.Delimited
4.None of the above
Question 179. __________ is an open source set of libraries, tools, examples and documentation engineered to simplify building systems on top of the Hadoop ecosystem.
1.Kite
2.Kize
3.Ookie
4.All of the above
Question 180. Map output larger than _______ percent of the memory allocated to copying map outputs
1. 10
2. 15
3. 25
4. 35
Question 181. Cassandra creates a __________ for each table, which allows you to symlink a table to a chosen physical drive or data volume.
1.Directory
2.Subdirectory
3.Domain
4.Path
Question 182. Use ________ and embed the schema in the create statement.
1.Schema.literal
2.Schema.lit
3.Row.literal
4.All of the above
Question 183. Which of the following can be used to launch Spark jobs inside MapReduce?
1. SIM
2. SIMR
3. SIR
4. RIS
Question 184. HDFS works in a __________ fashion
1.Master worker
2.Master slave
3.Worker/slave
4.All of the above
Question 185. HDFS by default replicates each data block ____ times on different nodes and on at least ____ racks
1. 3, 2
2. 1, 2
3. 2, 3
4. 1, 3
Question 186. You can run Pig in batch mode using __________
1.Pig shell command
2.Pig scripts
3.Pig options
4.All of the above
Question 187. Which of the following is a primitive data type in Avro?
1.Null
2.Boolean
3.Float
4.All of the above
Question 188. The ___________ name node is used when the primary name node goes down.
1.Rack
2.Data
3.Secondary
4.None of the above
Question 189. Which command is used to disable all the tables matching the given regex?
1.Remove all
2.Drop all
3.Disable all
4.All of the above
Question 190. Ambari ___________ deliver a template approach to cluster deployment.
1.View
2.Stack advisor
3.Blueprints
4.All of the above
Question 191. Cassandra uses a protocol called __________ to discover location and state information.
1.Gossip
2.Intergos
3.Goss
4.All of the above
Question 192. Gzip (short for GNU zip) generates compressed files that have a ________ extension.
1..gzip
2..gz
3..gzp
4..g
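The .gz answer can be sanity-checked with Python's standard gzip module, which implements the same DEFLATE-based format that produces those files:

```python
import gzip

data = b"hello hello hello"
compressed = gzip.compress(data)

# The format is lossless: decompressing restores the original bytes.
assert gzip.decompress(compressed) == data
print("gzip output files conventionally carry the .gz extension")
```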
Question 193. Falcon provides a _________ workflow for copying data from source to target.
1.Recurring
2.Investment
3.Data
4.None of the above
Question 194. The ___________ is the node responsible for all reads and writes for the given partition.
1.Replicas
2.Leader
3.Follower
4.Isr
Question 195. The compression offset map grows to _______ GB per terabyte compressed.
1. 1-3
2. 42659
3. 20-22
4. 0-1
Question 196. Drill also provides intuitive extensions to SQL to work with ____________ data types.
1.Simple
2.Nested
3.Int
4.All of the above
Question 197. Hive uses _________ for logging
1.Logj4
2.Log4l
3.Log4i
4.Log4j
Question 198. Spark SQL provides a domain-specific language to manipulate _______________ in Scala, Java or Python.
1.Spark streaming
2.Spark SQL
3.RDDs
4.All of the above
Question 199. HBase is a distributed _________ database built on top of the Hadoop file system.
1.Column oriented
2.Row oriented
3.Tuple oriented
4.None of the above
Question 200. Which of the following has a method to deal with metadata?
1.Load push down
2.Load metadata
3.Load caster
4.All of the above
Question 201. Which of the following is a collaborative data analytics and visualization tool?
1.ACE
2.Abdera
3.Zeppelin
4.Accumulo
Question 202. Ignite is a unified _______ data fabric providing high-performance, distributed in-memory data management.
1.Column
2.In memory
3.Row oriented
4.Column oriented
Question 203. Avro messages are framed as a list of __________
1.Buffers
2.Frames
3.Rows
4.Column
Question 204. ___________ is a distributed and scalable OLAP engine built on Hadoop to support extremely large data sets.
1.Kylin
2.Lens
3.Log4cxx2
4.MRQL
Question 205. Sqoop is an open source tool written at ___________
1.Cloudera
2.IBM
3.Microsoft
4. All of the above
Question 206. ZooKeeper essentially mirrors the _______ functionality exposed in the Linux kernel.
1.Iread
2.Inotify
3.Iwrite
4.Icount
Question 207. Apache Bigtop uses _________ for continuous integration testing.
1.Jenkinstop
2.Jerry
3.Jenkins
4.None of the above
Question 208. Which of the following commands is used to show values of keys used in Pig?
1.Set
2.Declare
3.Display
4.All of the above
Question 209. For Apache ________ users, Storm utilizes the same ODBC interfaces.
1.C takers
2.Hive
3.Pig
4.Oozie
Question 210. The tokens are passed through a Lucene ___________ to produce n-grams of the desired length.
1.Shnglefil
2.Shingle filter
3.Single filter
4.Collfilter
Quiz bigdata

Quiz bigdata

  • 1. Question1. Which of the followingis a monitoring solution for hadoop? 1. Sirona 2. Sentry 3. Slider 4. Streams Question2. __________ is a distributed machine learning framework on top of spark 1. MLlib 2. Spark Streaming 3. GraphX 4. RDDs Question3. Point out the correct statement? 1. Knox is a stateless reverse proxy framework 2. Knox also intercepts REST/HTTP calls and provides authentication 3. Knox scales linearlyby adding more knox nodes as the load increases 4. All of the mentioned Question4. PCollection,PTable, and PGroupedTableall support a __________ operation. 1. Intersection 2. Union 3. OR
  • 2. 4. None of the mentioned Question 5. How many types of mode are present in Hama? 1. 2 2. 3 3. 4 4. 5 Question6. The IBM ____________ Platform provides all the foundational buildingblocks of trusted information, includingdata integration, data warehousing, master data management, big data and information governance. 1. Infostream 2. Infosphere 3. Infosurface 4. Infodata Question7. ________ is the name of the archive you would like to create. 1. Archive 2. Archive name 3. Name 4. None of the mentioned Question 8. Ambari provides a _______API that enables integration with existing tools, such as Microsoft System Center.
  • 3. 1. Restless 2. Web services 3. Restful 4. None of the mentioned Question9. _______ forge software for the development of software projects. 1. Oozie 2. Allura 3. Ambari 4. All of the mentioned Question10. Posting format now uses a __________ API when writing postings just like doc values. 1. Push 2. Pull 3. Read 4. All of the mentioned Question11. Point out the correct statement 1. BuildingPylucene requires CNU make, a recent version of ant capableof buildingjavalucene and a c++ compiler 2. Pylucene is supported on Mac OS X, linux, SOlaries and windows 3. Use of the setuptoolsis recommended for lucene 4. All the mentioned
  • 4. 5. Question12. ________ buildsvirtualmachines of branches trunk and 0.3 for KVM, VMWare and virtual box. 1. Bigtop-trunk-pakagetest 2. Bigtop-trunk-repository 3. Bigtop-VM-matrix 4. None of the mentioned Question13. Zookeeper is used for configuration,leader election in cloud edition of 1. Solr 2. Solur 3. Solar101 4. Solr Question14. How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of Mapreduce? 1. Keys are presented to reducer in sorted order; values for a given key are not sorted 2. Keys are presented to reducer in sorted order; values for a given key are sorted in ascending order 3. Keys are presented to reducer in random order; values for a given key are not sorted 4. Keys are presented to reducer in random order; values for a given key are sorted in ascending order Question15. Datastage RTI is real time integrationpack for:
  • 5. 1. STD 2. ISD 3. EXD 4. None of the above Question16. Which mapreduce stage serves as a barrier, where all the previous stages must be completed before it may proceed? 1. Combine 2. Group (a.k.a. ‘shuffle’) 3. Reduce 4. Write Question17. Which of the following format is more compression aggressive? 1. Partition compressed 2. Record compressed 3. Block compressed 4. Uncompressed Question18. _________ is the way of encodingstructured data in an efficient yet extensible format. 1. Thrift 2. Protocol buffers 3. Avro 4. None of the above
  • 6. Question19. Which of the following argument is not supported by import-all-tabletool? 1. Class name 2. Package name 3. Database name 4. Table name Question20. Which of the following operating system is not supported by big top? 1. Fedora 2. Solaris 3. Ubuntu 4. SUSE Question21. Distributed modes are mapped in the _____ file. 1. Groomservers 2. Grervers 3. Grsvers 4. Groom Question22. ________ is the architectural center of hadoop that allowsmultipledata processing engines. 1. YARN 2. Hive 3. Incubator 4. Chuckwa
  • 7. Question23. Users can easily run spark on top of amazons____________ 1. ‘infosphere 2. ‘EC2 3. EMR 4. None of the above Question24. Which of the following projects is interface definitionlanguage for hadoop? 1. Oozie 2. Mahout 3. Thrift 4. Impala Question25. Output of the mapper is first written on the local disk for sorting and _____ process. 1. Shuffling 2. Secondary sorting 3. Forking 4. Reducing Question26. HDT projects work with eclipse version _____ and above 1. 3.4 2. 3.5 3. 3.6 4. 3.7
  • 8. Question27. Which of the following languageis not supported by spark? 1. Java 2. Pascal 3. Scala 4. Python Question28. Data analyticsscripts are written in __________ 1. Hivw 2. CQL 3. Piglatin 4. Java Question29. Ripper is a browser based mobile phone emulatordesigned to aid in the development of ______ bases mobile application. 1. Javascript’ 2. Java 3. C++ 4. HTML5 Question30. If you set the inlineLOB limit to ____, all large objects will be placed in external storage. 1. 0 2. 1 3. 2 4. 3
  • 9. Question31. Hadoop archives reliabilityby replicatingthe data across multiplehosts, and hence does not require _____ storage on hosts. 1. RAID 2. Standard RAID levels 3. ZFS 4. Operating system Question32. The configuration file must be owned by the user running 1. Data manager 2. Node manager 3. Validationmanager 4. None of the above Question33. ________ is non blockinga synchronous event driven high performance web framework 1. AWS 2. AWF 3. AWT 4. ASW Question34. Falcon provides seamless integrationwith 1. HCatalog 2. Metastore 3. HBase 4. Kafka
  • 10. Question35. One supported datatype that deserves special mention are: 1. Money 2. Counters 3. Smallint 4. Tinyint Question36. _______ are chukwa processes that actually produce data 1. Collectors 2. Agents 3. Hbase table 4. HCatalog Question37. Which of the following hadoopfile formats is supported by impala? 1. Sequencefile’ 2. Avro 3. Rcfile 4. All of the above Question38. Avro is said to be the future ___________ layer of hadoop 1. RMC 2. RPC 3. RDC 4. All of the above
  • 11. Question39. ______ nodes are the mechanism by which a workflow triggers the execution of a computation/processing task 1. Server 2. Client 3. Mechanism 4. Action Question40. The _______ attribute in the join node is the name of the workflow join node 1. Name 2. To 3. Down 4. All of the above Question41. Yarn commands are invoked by the _____ script 1. Hive 2. Bin 3. Hadoop 4. Home Question42. Which of the following function is used to read data in PIG? 1. Write 2. Read 3. Load 4. None of the above
  • 12. Question43. Which of the following hive commands is not supported by hcatalog? 1. Alter index rebuild 2. Create new 3. Show functions 4. Drop table Question44. Apache hadoopdevelopmenttools is an effort undergoing incubationat 1. ADF 2. ASF 3. HCC 4. AFS Question45. Kafka users key value pairs in the _________ file format for configuration 1. RFC 2. Avro 3. Property 4. None of the above Question46. Facebook tackles big data with __________ based in hadoop 1. Project prism 2. Prism 3. Project big 4. Project data
  • 13. Question47. The size of block in HDFCs is 1. 512 bytes 2. 64 mb 3. 1024 kb 4. None of the above Question48. Which is the most popularNoSQL databases for scalable big data store with hadoop? 1. Hbase 2. mongoDB 3. Cassandra 4. None of the above Question 49. A ________- can route requests to multiple knox instances 1. Collector 2. Load balancer 3. Comparator 4. All of the above Question50. Hcatalog is installedwith hive, starting with hive release 1. 0.10..0 2. 0.9.0 3. 0.11.0 4. 0.12.0 Question51. Tablemetadata in hive is:
  • 14. 1. Stored as metadata on the name node 2. Stored along with the data in HDFCs 3. Stored in the metastore 4. Stored in zookeeper Question52. Avro schemes are defined with ________ 1. JSON 2. XML 3. JAVA 4. All of the above Question53. Spark was initiallystarted by ___________ at uc Berkeley AMPlab in 2009 1. Matei Zaharia 2. Mahek Zaharia 3. Doug cutting 4. Stonebreaker Question54. __________ does rewrite data and pack rows into column for certain time periods 1. Open TS 2. Open TSDB 3. Open TSD 4. Open DB Question55. Which of the following phrases occur simultaneously 1. Shuffle and sort
  • 15. 2. Reduce and sort 3. Shuffle and map 4. All of the above Question56. ________ command fetches the contents of row or a cell 1. Select 2. Get 3. Put 4. None of the above Quesiotn57. _______ are encoded as a series of blocks 1. Arrays 2. Enum 3. Unions 4. Maps Question58. Hive also support custom extensions written in 1. C# 2. Java 3. C 4. C++ Question59. How many types of nodes are present in storm cluster? 1. 1 2. 2 3. 3
  • 16. 4. 4 Question60. All decision nodes must have a ____________ element to avoidbringing the workflow into an error state if none of the predicatesevaluatesto true. 1. Name 2. Default 3. Server 4. Client Question61. ________ is a rest API for Hcatalog 1. Web hcat 2. Wbhcat 3. Inphcat 4. None of the above Question62. Streaming supports streaming commands option as well as ____________ command options 1. Generic 2. Tool 3. Library 4. Task Questio63. By default collectorslisten on port 1. 8008 2. 8070 3. 8080 4. None of the above
  • 17. Question64. _______ communicate with the client and handledata related operations. 1. Master server 2. Region server 3. Htable 4. All of the above Question65. We can declare the scheme of our data either in ________ file 1. JSON 2. XML 3. SQL 4. VB Question66. ________ provides a couchbase server hadoop connector by means of sqoop 1. Memcache 2. Couchbase 3. Hbase 4. All of the above Question67. Storm integrates with _________ via apache slider 1. Scheduler 2. Yarn 3. Compaction 4. All of the above
  • 18. Question68. Avro-backed table can simply be created by using ___________ in a DDL statement 1. Stored as avro 2. Stored as hive’ 3. Stored as avrohive 4. Stored as serd Question69. Drill analyze semistructured/nested data coming from ______applications 1. RDBMS 2. NoSQL 3. newSQL 4. none of the above Question70. The hadoop list includes the HBase Database, the apache mahout __________ system and matrix operations. 1. Machinelearning 2. Pattern recognition 3. Statistical classification 4. Articficial classification Question71. Oozie workflow jobs are directed _______ graphs of actions 1. Acyclical 2. Cyclical 3. Elliptical
  • 19. 4. All of the above Question72. ___ is an open source SQL query engine for apache Hbase 1. Pig 2. Phoenix 3. Pivot’ 4. None of the above Question73. $ pig x tez_local will enable _____ mode in pig 1. Mapreduce 2. Tez 3. Local 4. None of the above Question74. In comparison to SQl, pig uses 1. Lazy evaluation 2. ETL 3. Supports pipelinessplits 4. All of the above Question75. For Apache _________ users, storm utilizes the same ODBC interfaces 1. C takes 2. Hive 3. Pig 4. Oozie
  • 20. Question76. In one or more actionsstarted by the workflow job are executed when the _________ node is reached, the actionswill be killed. 1. Kill’ 2. Start 3. End 4. Finish Question77. Which of the following data type is supported by hive? 1. Map 2. Record 3. String 4. Enum Question78. Hcatalog supports reading and writing files in any format for which a _____ can be written 1. SerDE 2. SaerDear 3. Doc Sear 4. All Question79. _______ is python port of the core project 1. Solr 2. Lucene core 3. Lucy 4. Pylucene
  • 21. Question80. Apache storm added open source, stream data processing to _________ data platform 1. Cloudera 2. Hortonworks 3. Local cloudera 4. Map R Question81. Which of the following is spatialinformation system? 1. Sling 2. Solr 3. SIS 4. All of the above Question82. _______ properties can be overriddenby specifying them in a job-xml file or configuration element. 1. Pipe 2. Decision 3. Flag 4. None of the above Question83. CDH process and control sensitive data and facilities: 1. Multi-tenancy 2. Flexibility 3. Scalability 4. All of the above
  • 22. Qyestion84. Avro supports _________ kinds of complex types 1. 3 2. 4 3. 6 4. 7 Question85. With _________we can store data and read it easily with variousprogramming languages. 1. Thrift 2. Protocol buffers 3. Avro 4. None of the above Question86. A float parameter defaultsto 0.0001f, which means we can deal with 1 error every ________ rows 1. 1000 2. 10000 3. 1 millionrows 4. None of the above Question87. The ________ data mapper framework makes it easier to use a databasewith Java or.NET applications 1. iBix 2. Helix 3. iBATIS 4. iBAT
  • 23. Question88. ___________ is the most popularhigh level java API in Hadoop Ecosystem 1. scalding 2. HCatalog 3. Cascalog 4. Cascading Question89. Spark includesa collection over _________ operationsfor transforming data and familier data frame APIs for manipulatingsemi-structured data 1. 50 2. 60 3. 70 4. 80 Question90. Zookeper’s architecture supports high ________ through redundantservices 1. Flexibilty’ 2. Scalability 3. Availability 4. Interactivity Question91. The Lucene ____________ is pleasedto announcethe availabilityof Apache Lucene 5.0.0 and Apache solr 5.0.0 1. PMC 2. RPC
  • 24. 3. CPM 4. All of the above Question92. EC2 capacity can be increased or decreased in real time from as few as one to more than ________ virtual machines simultaneousl 1. 1000 2. 2000 3. 3000 4. None of the above Question93. HTD has been tested on_________- and Juno. And can work 0n kepler as well 1. Raibow 2. Indigo 3. Idiavo 4. Hadovo Question94. Each kafka partitionhas one server which acts as the __________ 1. Leaders 2. Followers 3. Staters 4. All of the above Question95. The right numbers of reduces seems to be 1. 0.9 2. 0.8
  • 25. 3. 0.36 4. 0.95 Question96. Which of the following is a configuration management system? 1. Alex 2. Puppet 3. Acem 4. None of the above Question97. Which of the following is the only for storage with limited compute? 1. Hot 2. Cold 3. Warm 4. All_SSD Question98. Grooms servers start up with a _______ instance and a RPC proxy to contact the bsp master 1. RPC 2. BSP Peer 3. LPC 4. None of the above Question99. A ________ represents a distributed, immutable collectionof elements of type t. 1. Pcollect 2. Pcollection
  • 26. 3. Pcol 4. All of the above Question100. ________ is used to read data from bytes buffers 1. Write{} 2. Read{} 3. Readwrite{} 4. All of the above Q101-Which is the default Input Formats defined in Hadoop? 1. SequenceFileInputFormat 2. ByteInputFormat 3. KeyValueInputFormat 4. TextInputFormat Q102. Which of the following is not an inputformat in Hadoop? 1.TextInputFormat 2. ByteInputFormat 3. SequenceFileInputFormat
  • 27. 4. KeyValueInputFormat Q103. Which of the following is a valid flow in Hadoop ? 1.Input -> Reducer -> Mapper -> Combiner -> -> Output 2. Input -> Mapper -> Reducer -> Combiner -> Output 3. Input -> Mapper -> Combiner -> Reducer -> Output 4. Input -> Reducer -> Combiner -> Mapper -> Output Q104. MapReduce was devised by ... 1.Apple 2. Google 3. Microsoft 4. Samsung Q105. Which of the following is not a phase of Reducer ? 1. Map 2. Reduce 3. Shuffle
  • 28. 4. Sort Q106. How many instances of Job tracker can run on Hadoop cluster ? 1.1 2. 2 3.3 4.4 Q107. Which of the following is not the Dameon process that runs on a hadoopcluster ? 1.JobTracker 2.DataNode 3.TaskTracker 4.TaskNode Q108-As companies move past the experimental phase with Hadoop,many cite the need for additionalcapabilities, including:
  • 29. 1.Improved data storage and informationretrieval 2.Improved extract, transform and loadfeatures for data integration 3.Improved data warehousing functionality 4.Improved security, workload management and SQL support Q109-Point out the correct statement : 1.Hadoopdo need specializedhardware to process the data 2.Hadoop2.0 allowslive stream processing of real time data 3.In Hadoop programming framework output files are dividedin to lines or records 4.None of the mentioned Q110-. According to analysts, for what can traditionalIT systems provide a foundationwhen they’re integrated with big data technologies like Hadoop? 1.Big data management and data mining 2. Data warehousing and business intelligence 3.Management of Hadoopclusters 4.Collectingand storing unstructured data
  • 30. Q111- Point out the wrong statement : 1.Hardtop’s processing capabilitiesare huge and its real advantagelies in the abilityto process terabytes & petabytes of data 2.Hadoopuses a programming model called “MapReduce”,all the programs should confirms to this model in order to work on Hadoopplatform 3.The programming model, MapReduce,used by Hadoop is difficult to write and test 4.All of the mentioned Q112- What was Hadoopnamed after? 1. Creator Doug Cutting’s favorite circus act 2.Cutting’s high school rock band 3.The toy elephantof Cutting’s son 4.A sound Cutting’slaptop made during Hadoop’s development Q113- All of the following accurately describe Hadoop, EXCEPT: 1.Open source 2. Real-time 3.Java-based 4. Distributed computing approach
  • 31. Q114- __________ can best be described as a programming model used to develop Hadoop-basedapplicationsthat can process massive amounts of data. 1.MapReduce 2.Mahout 3.Oozie 4.All of the mentioned Q115- __________ has the world’s largest Hadoop cluster. 1.Apple 2. Datamatics 3.Facebook 4.None of the mentioned Q116- Facebook Tackles Big Data With _______ based on Hadoop. 1.‘Project Prism’ 2.‘Prism’ 3.‘Project Big’ 4. ‘Project Data’
  • 32. Q 117- What is the main problem faced while reading and writing data in parallelfrom multiple disks? 1.Processing high volume of data faster. 2. Combining data from multipledisks. 3. The software required to do this task is extremely costly. 4. The hardware required to do this task is extremely costly. Q118 - Under Hadoop High Availability,Fencing means 1.Preventing a previously active namenode from start running again. 2. Preventing the start of a failover in the event of network failure with the active namenode. 3. Preventing the power down to the previously active namenode.
  • 33. 4. Preventing a previously active namenodefrom writing to the edit log. Q119 - The default replicationfactor for HDFS file system in hadoopis 1.1 2. 2 3. 3 4. 4 Q120 - The hadfscommand put is used to 1.Copy files from local file system to HDFS. 2. Copy files or directories from local file system to HDFS.
  • 34. 3. Copy files from from HDFS to local filesystem. 4. Copy files or directories from HDFS to localfilesystem. Q121 - The namenodeknows that the datanodeis active using a mechanism known as 1.heartbeats 2. datapulse 3. h-signal 4. Active-pulse Q122 - When a machine is declared as a datanode,the disk space in it 1.Can be used only for HDFS storage
  • 35. 2. Can be used for both HDFS and non-HDFs storage 3. Cannot be accessed by non-hadoop commands 4. cannot store text files. Q123 - The data from a remote hadoopcluster can 1. not be read by another hadoopcluster 2. be read using http 3. be read using hhtp 4. be read suing hftp Q124 - Which one is not one of the big data feature? 1. Velocity
  • 36. 2. Veracity 3. volume 4. variety Q125 - What is HBASE? 1. Hbase is separate set of the Java API for Hadoop cluster. 2. Hbase is a part of the Apache Hadoopproject that provides interface for scanning large amount of data using Hadoopinfrastructure. 3. Hbase is a "database"like interface to Hadoopcluster data. 4. HBase is a part of the Apache Hadoop project that provides a SQL like interface for data processing.
  • 37. Q125 - Which of the following is false about RawComparator ? 1.Compare the keys by byte. 2. Performance can be improved in sort and suffle phase by using RawComparator. 3. Intermediary keys are deserialized to perform a comparison. Q 126 - Zookeeper ensures that 1. All the namenodes are actively serving the client requests 2. Only one namenode is actively serving the client requests 3. A failover is triggered when any of the datanodefails. 4. A failover can not be started by hadoopadministrator.
  • 38. Q 127 - Which scenario demandshighest bandwidthfor data transfer between nodes in Hadoop? 1. Different nodes on the same rack 2. Nodes on different racks in the same data center. 3. Nodes in different data centers 4. Data on the same node. Q128 - The hadoopframe work is written in 1. C++ 2. Python 3. Java
  • 39. 4. GO Q129 - When a client contacts the namenode for accessing a file, the namenoderesponds with 1. Size of the file requested. 2. Block ID of the file requested. 3. Block ID and hostname of any one of the data nodes containingthat block. 4. Block ID and hostname of all the data nodes containing that block. Q130 - Which of the following is not a goal of HDFS? 1. Fault detection and recovery 2. Handle huge dataset
  • 40. 3. Prevent deletionof data 4. Provide high network bandwidthfor data movement Q 131 - In HDFS the files cannot be 1. read 2. deleted 3. executed 4. Archived Q132 - The number of tasks a task tracker can accept dependson 1. Maximum memory availablein the node
  • 41. 2. Not limited 3. Number of slots configured in it 4. As decided by the jobTracker Q133 - When using HDFS, what occurs when a file is deleted from the command line? 1. It is permanently deleted if trash is enabled. 2. It is placed into a trash directory common to all users for that cluster. 3. It is permanently deleted and the file attributesare recorded in a log file. 4. It is moved into the trash directory of the user who deleted it if trash is enabled.
  • 42. Q134 - The org.apache.hadoop.io.Writableinterface declares which two methods? (Choose 2 answers.) publicvoid readFields(DataInput). publicvoid read(DataInput). publicvoid writeFields(DataOutput). publicvoid write(DataOutput). 1. 1 & 4 2. 2 & 3 3. 3 & 4 4. 2 & 4
  • 43. Question135. Mapreduce has undergone a complete overhaul in hadoop? 1.0.21 2.0.23 3.0.24 4.0.26 Question136. __________ is the slave/worker node and holds the user data in the form of data blocks 1.Data node 2.Name node 3.Data block 4.Replication Qyestion137. Spark is engineered from the bottom up for performance running __________ 1.100x 2.150x 3.200x 4.None of the above Question138. _________nodes are the mechanism by which a workflow triggers the execution of a computation/processing task 1.Server
  • 44. 2.Client 3.Mechanism 4.Action Question139. __________ maps input key/value pairs to a set of intermediate key/value pairs 1.Mapper 2.Reducer 3.Mapper and reducer 4.None of the above Question140. Zookeeper keep track of the cluster state such as the ____________- table location 1.Domain 2.Node 3.Root 4.All of the above Question141. When __________ contents exceed a configurable threshold, the memtable data, which includes indexes, is put in a queue to be flushed to disk 1.Subtable 2.Memtable 3.Intable 4.Memorytable
  • 45. Question142. Apache knox accesses hadoop cluster over 1.HTTP 2.TCT 3.ICMP 4.None of the above Question143. ___________ supports a new command shell beeline that works with hiveserver2. 1.Hiveserver2 2.Hiveserver3 3.Hiveserver4 4.Hiveserver5 Question144. ________ sink can be a text file, the console display, a simple HDFC path or a null bucket where the data is simply deleted 1.Collector tier event 2.Agent tier event 3.Basic 4.None of the above Question145. __________ name node is used when the primary name node goes down 1.Rack
  • 46. 2.Data 3.Secondary 4.None of the above Question146. Data transfer between web-console and clients are protected by using 1.SSL 2.Kerberos 3.SSH 4.None of the above Question147. Which of the following is one of the possible state for a workflow jobs? 1.PREP 2.START 3.RESUME 4.END Question148. Stratus will be a polygot _______ framework 1.Daas 2.Paas 3.Saas 4.Raas
Question 149. All file access uses Java's _______ APIs, which give Lucene stronger index safety.
1. NIO.2
2. NIO.3
3. NIO.4
4. NIO.5
Question 150. Which of the following is a standards-compliant XQuery processor?
1. Whirr
2. VXQuery
3. Knife
4. Lens
Question 151. ______ is a query processing and optimization system for large-scale, distributed data analysis.
1. MRQL
2. NiFi
3. OpenAz
4. ODF Toolkit
Question 152. reduceProgress() gets the progress of the job's reduce tasks, as a float between ______.
1. 0.0 and 1.0
2. 1.0 and 2.0
3. 2.0 and 3.0
4. 3.0 and 4.0
Question 153. ______ is a framework for building Java server application GUIs.
1. MyFaces
2. Muse
3. Flume
4. Bigtop
Question 154. Apache Flume 1.3.0 is the fourth release under the auspices of Apache of the so-called _____ codeline.
1. NG
2. ND
3. NF
4. NR
Question 155. Starting in Hive ______, the Avro schema can be inferred from the Hive table schema.
1. 0.14
2. 0.12
3. 0.13
4. 0.11
Question 156. A workflow definition is a ___ with control flow nodes or action nodes.
1. CAG
2. DAG
3. BAG
4. None of the above
Question 157. Lucene provides scalable, high-performance indexing of over ______ per hour on modern hardware.
1. 1 TB
2. 150 GB
3. 10 GB
4. 200 GB
Question 158. The right level of parallelism for maps seems to be around _____ maps per node.
1. 1 to 10
2. 10 to 100
3. 100 to 150
4. 150 to 200
Question 159. The LZO compression format is composed of approximately ______ blocks of compressed data.
1. 128k
2. 256k
3. 24k
4. 36k
Question 160. ___________ is a software development collaboration tool.
1. Buildr
2. Cassandra
3. Bloodhound
4. All of the above
Question 161. A ________ is an operation on the stream that can transform the stream.
1. Decorator
2. Source
3. Sink
4. All of the above
Question 162. _________ has the world's largest Hadoop cluster.
1. Apple
2. Datamatics
3. Facebook
4. None of the above
Question 163. When a ___________ is triggered, the client receives a packet saying that the znode has changed.
1. Event
2. Watch
3. Row
4. Value
Question 164. Ambari leverages __________ for system alerting and will send emails when your attention is needed.
1. Nagios
2. Nagaond
3. Ganglia
4. None of the above
Question 165. ____ is a software distribution framework based on OSGi.
1. ACE
2. Abdera
3. Zeppelin
4. Accumulo
Question 166. Which of the following is a content management and publishing system based on Cocoon?
1. Libcloud
2. Kafka
3. Lenya
4. All of the above
Question 167. If the failure is of a ________ nature, Oozie will suspend the workflow job.
1. Transient
2. Non-transient
3. Permanent
4. Non-permanent
Question 168. The _________ node distributes code across the cluster.
1. ZooKeeper
2. Nimbus
3. Supervisor
4. None of the above
Question 169. A workflow definition must have one _____ node.
1. Start
2. Resume
3. Finish
4. None of the above
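The workflow questions above (a DAG of control flow and action nodes, exactly one start node) can be tied together with a minimal sketch of an Oozie workflow definition. The app name, action name, and elided action body are illustrative placeholders, not taken from any real deployment.

```xml
<!-- Minimal sketch of an Oozie workflow definition: one required start node,
     an action node whose ok/error transitions form the DAG edges, and an end
     node. Names here are placeholders. -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="first-action"/>
    <action name="first-action">
        <!-- the computation/processing task (e.g. a map-reduce action)
             would go here -->
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

A newly submitted job starts in the PREP state and moves to RUNNING once the start node's transition fires.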
Question 170. _________ is a REST API for HCatalog.
1. WebHCat
2. WbhCAT
3. InpJcat
4. None of the above
Question 171. Which of the following files contains user-defined functions (UDFs)?
1. script2-local.pig
2. pig.jar
3. tutorial.jar
4. excite.log.bz2
Question 172. Helprace is using ZooKeeper on a ______ cluster in conjunction with Hadoop and HBase.
1. 3-node
2. 4-node
3. 5-node
4. 6-node
Question 173. ____________ represent the logical computations of your Crunch pipelines.
1. DoFns
2. ThreeFns
3. DoFn
4. None of the above
Question 174. _____________ has stronger ordering guarantees than a traditional messaging system.
1. Kafka
2. Slider
3. Suz
4. None of the above
Question 175. HBase is _________ and defines only column families.
1. Row-oriented
2. Schema-less
3. Fixed-schema
4. All of the above
Question 176. An input ___________ is a chunk of the input that is processed by a single map.
1. Textformat
2. Split
3. Datanode
4. All of the above
Question 177. ___________ permits data written by one system to be efficiently sorted by another system.
1. Complex data type
2. Order
3. Sort order
4. All of the above
Question 178. __________ text is appropriate for most non-binary data types.
1. Character
2. Binary
3. Delimited
4. None of the above
Question 179. __________ is an open-source set of libraries, tools, examples, and documentation engineered to simplify building systems on top of the Hadoop ecosystem.
1. Kite
2. Kize
3. Ookie
4. All of the above
Question 180. Map output larger than _______ percent of the memory allocated to copying map outputs is written directly to disk.
1. 10
2. 15
3. 25
4. 35
Question 181. Cassandra creates a __________ for each table, which allows you to symlink a table to a chosen physical drive or data volume.
1. Directory
2. Subdirectory
3. Domain
4. Path
Question 182. Use ________ and embed the schema in the CREATE statement.
1. schema.literal
2. schema.lit
3. row.literal
4. All of the above
Question 183. Which of the following can be used to launch Spark jobs inside MapReduce?
1. SIM
2. SIMR
3. SIR
4. RIS
Question 184. HDFS works in a __________ fashion.
1. Master-worker
2. Master-slave
3. Worker/slave
4. All of the above
Question 185. HDFS by default replicates each data block ____ times on different nodes and on at least ____ racks.
1. 3, 2
2. 1, 2
3. 2, 3
4. 1, 3
Question 186. You can run Pig in batch mode using __________.
1. Pig shell command
2. Pig scripts
3. Pig options
4. All of the above
Question 187. Which of the following is a primitive data type in Avro?
1. Null
2. Boolean
3. Float
4. All of the above
Question 188. The ___________ name node is used when the primary name node goes down.
1. Rack
2. Data
3. Secondary
4. None of the above
Question 189. Which command is used to disable all the tables matching the given regex?
1. remove_all
2. drop_all
3. disable_all
4. All of the above
Question 190. Ambari ___________ deliver a template approach to cluster deployment.
1. Views
2. Stack Advisor
3. Blueprints
4. All of the above
Question 191. Cassandra uses a protocol called __________ to discover location and state information.
1. Gossip
2. Intergos
3. Goss
4. All of the above
Question 192. Gzip (short for GNU zip) generates compressed files that have a ________ extension.
1. .gzip
2. .gz
3. .gzp
4. .g
Question 193. Falcon provides a _________ workflow for copying data from source to target.
1. Recurring
2. Investment
3. Data
4. None of the above
Question 194. The ___________ is the node responsible for all reads and writes for the given partition.
1. Replica
2. Leader
3. Follower
4. ISR
Question 195. The compression offset map grows to _______ GB per terabyte compressed.
1. 1-3
2. 42659
3. 20-22
4. 0-1
Question 196. Drill also provides intuitive extensions to SQL to work with ____________ data types.
1. Simple
2. Nested
3. Int
4. All of the above
Question 197. Hive uses _________ for logging.
1. Logj4
2. Log4l
3. Log4i
4. Log4j
Question 198. Spark SQL provides a domain-specific language to manipulate _______________ in Scala, Java, or Python.
1. Spark Streaming
2. Spark SQL
3. RDDs
4. All of the above
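The gzip question above can be demonstrated with the JDK's own java.util.zip classes. On disk, gzip output conventionally carries the .gz extension; here the round-trip is done in memory to keep the sketch self-contained.

```java
import java.io.*;
import java.util.zip.*;

// Gzip compress/decompress round-trip using only JDK classes. Files produced
// this way are conventionally named with a .gz extension (e.g. part-00000.gz).
public class GzipDemo {
    public static void main(String[] args) throws IOException {
        byte[] original = "hello hadoop hello hadoop".getBytes("UTF-8");

        // Compress into an in-memory buffer.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(original);
        }

        // Decompress and verify the round-trip.
        GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(compressed.toByteArray()));
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) {
            restored.write(buf, 0, n);
        }

        System.out.println(new String(restored.toByteArray(), "UTF-8"));
        // prints: hello hadoop hello hadoop
    }
}
```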
Question 199. HBase is a distributed _________ database built on top of the Hadoop file system.
1. Column-oriented
2. Row-oriented
3. Tuple-oriented
4. None of the above
Question 200. Which of the following has a method to deal with metadata?
1. LoadPushDown
2. LoadMetadata
3. LoadCaster
4. All of the above
Question 201. Which of the following is a collaborative data analytics and visualization tool?
1. ACE
2. Abdera
3. Zeppelin
4. Accumulo
Question 202. Ignite is a unified _______ data fabric providing high-performance, distributed in-memory data management.
1. Column
2. In-memory
3. Row-oriented
4. Column-oriented
Question 203. Avro messages are framed as a list of __________.
1. Buffers
2. Frames
3. Rows
4. Columns
Question 204. ___________ is a distributed and scalable OLAP engine built on Hadoop to support extremely large data sets.
1. Kylin
2. Lens
3. Log4cxx2
4. MRQL
Question 205. Sqoop is an open-source tool written at ___________.
1. Cloudera
2. IBM
3. Microsoft
4. All of the above
Question 206. ZooKeeper essentially mirrors the _______ functionality exposed in the Linux kernel.
1. iread
2. inotify
3. iwrite
4. icount
Question 207. Apache Bigtop uses _________ for continuous integration testing.
1. Jenkinstop
2. Jerry
3. Jenkins
4. None of the above
Question 208. Which of the following commands is used to show the values of keys used in Pig?
1. set
2. declare
3. display
4. All of the above
Question 209. For Apache ________ users, Storm utilizes the same ODBC interface.
1. C takers
2. Hive
3. Pig
4. Oozie
Question 210. The tokens are passed through a Lucene ___________ to produce n-grams of the desired length.
1. Shnglefil
2. ShingleFilter
3. SingleFilter
4. Collfilter
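The shingling idea in the last question can be sketched without Lucene: slide a window over a token stream and emit word n-grams ("shingles") of the desired length. This is a conceptual illustration only; Lucene's actual ShingleFilter is a TokenStream filter with more options (separators, unigram output, and so on).

```java
import java.util.*;

// Plain-Java sketch of word shingling: a sliding window over a token list
// emits n-grams of length n, conceptually what Lucene's ShingleFilter does.
public class ShingleSketch {
    static List<String> shingles(List<String> tokens, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        System.out.println(shingles(tokens, 2));
        // prints: [please divide, divide this, this sentence]
    }
}
```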