TESTING BIG DATA
Adoniram Mishra
Benazeer Khan
Agenda
❖ What is Big Data?
❖ Hadoop Ecosystem
❖ Testing Challenges
❖ Test Pyramid
What is Big Data?
What is Hadoop?
Hadoop Ecosystem
HDFS
MapReduce
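To make the MapReduce model concrete, here is a minimal single-process sketch in plain Python. It does not use the Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

On a real cluster the map and reduce steps run in parallel across nodes; the sketch only shows the shape of the data at each step.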
High Level Example
[Diagram: three data sources → IMPORT → HBase / Hive, processed by MapReduce (MR) jobs → HQL → UI, Reports, Other applications]
Testing Challenges
Long Feedback Cycle
Data
Deadlines
Automation
Countering Challenges
❖ Sampling
❖ Basic testing concepts: boundary value testing, equivalence class partitioning
❖ Automation
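The first two techniques above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `age` column, its 0-17 "minor" range and all function names are hypothetical and do not come from the talk.

```python
import random


def sample_records(records, fraction, seed=42):
    """Return a reproducible random sample of roughly `fraction` of the records."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so a failing test can be replayed
    size = max(1, round(len(records) * fraction)) if records else 0
    return rng.sample(records, size)


def boundary_values(low, high):
    """Values just outside, on, and just inside an inclusive integer range."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


def equivalence_class(age):
    """Hypothetical partitions for an 'age' column (assumption for illustration)."""
    if age < 0:
        return "invalid"
    if age < 18:
        return "minor"
    return "adult"
```

Sampling shortens the feedback cycle by running jobs on a fraction of the data, while boundary values and equivalence classes keep the hand-built test data small but still representative of each partition.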
Nightly Build Tests
Hive Query Validator
Unit Tests
UI Test
Test Pyramid
Hive Query Validator
Query 1: Correct
Query 2: Correct
Query 3: Incorrect
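The slides do not describe how the validator works internally, so the following is only one possible reading: run each query against a small, known fixture and compare the rows with the expected result. SQLite stands in for Hive here purely so the sketch runs locally. The `visits` table, the queries and the `validate_queries` helper are all hypothetical.

```python
import sqlite3


def validate_queries(conn, cases):
    """Map each query name to 'Correct' or 'Incorrect'.

    `cases` maps a name to (sql, expected_rows). A query that fails to
    execute, or whose rows differ from the expectation, is 'Incorrect'.
    """
    verdicts = {}
    for name, (sql, expected) in cases.items():
        try:
            actual = conn.execute(sql).fetchall()
        except sqlite3.Error:
            verdicts[name] = "Incorrect"
            continue
        # Compare order-insensitively, since row order is not guaranteed.
        same = sorted(actual) == sorted(expected)
        verdicts[name] = "Correct" if same else "Incorrect"
    return verdicts


# Small in-memory fixture (assumption: SQLite as a stand-in for Hive).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)", [(1, "IN"), (2, "IN"), (3, "US")]
)

cases = {
    "Query 1": ("SELECT COUNT(*) FROM visits", [(3,)]),
    "Query 2": (
        "SELECT country, COUNT(*) FROM visits GROUP BY country",
        [("IN", 2), ("US", 1)],
    ),
    # Deliberately wrong filter: returns 0 rows counted, not the expected 3.
    "Query 3": ("SELECT COUNT(*) FROM visits WHERE country = 'UK'", [(3,)]),
}
results = validate_queries(conn, cases)
```

Checking queries against a tiny fixture gives feedback in seconds, which is the point of placing such a validator low in the pyramid rather than waiting for a full cluster run.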
Nightly Tests & Smoke Tests
SMOKE
MODULE 1
MODULE 2
MODULE 3
MODULE 4
About Us
● Adoniram Mishra (adoniram@thoughtworks.com)
● Benazeer Khan (benak@thoughtworks.com)
Follow us at:
Facebook: https://www.facebook.com/groups/vodqa/
Twitter: https://twitter.com/vodqancr