Agile Testing Alliance hosted its 16th Meetup in Pune on 9th Dec, 2017. Shreya Pal was one of the speakers at the meetup and gave an insightful session on Big Data Testing. All rights belong to the author.
4. Why not the old approach?
Traditional relational databases like Oracle, MySQL and SQL Server cannot be used for big data,
since most of the data we have will be in an unstructured format.
• Variety of data – Data can be in the form of images, video, text, audio, etc. This
could be military records, surveillance videos, biological records, genomic data, research
data, etc. Such data cannot be stored in the row-and-column format of an RDBMS.
• Volume of data – The volume of data stored in big data systems is huge. This data needs
to be processed fast, which requires parallel processing. Parallel processing of RDBMS
data would be extremely expensive and inefficient.
• Data creation velocity – Traditional databases cannot handle the velocity with which large
volumes of data are created. Example: 6000 tweets are created every second. 510,000
comments are created every minute. Traditional databases cannot handle this velocity of
data being stored or retrieved.
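To put the two rates quoted above on a common scale, the short sketch below converts them to per-day and per-second figures. It only does unit arithmetic on the numbers from the slide; the constant and function names are illustrative and not from the original talk.

```python
# Rates quoted on the slide.
TWEETS_PER_SECOND = 6_000
COMMENTS_PER_MINUTE = 510_000

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_day_from_per_second(rate_per_second):
    """Scale a per-second rate up to a full day."""
    return rate_per_second * SECONDS_PER_DAY


def per_second_from_per_minute(rate_per_minute):
    """Scale a per-minute rate down to one second."""
    return rate_per_minute / 60


tweets_per_day = per_day_from_per_second(TWEETS_PER_SECOND)
comments_per_second = per_second_from_per_minute(COMMENTS_PER_MINUTE)

print(tweets_per_day)        # 518400000
print(comments_per_second)   # 8500.0
```

At these rates a store would have to absorb roughly half a billion new tweet records per day, which is the sustained write load the slide argues a traditional database is not designed for.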
18. Properties | Traditional database testing | Big data testing
Data | Tester works with structured data | Tester works with both structured and unstructured data
Testing approach | Testing approach is well defined and time-tested | Testing approach requires focused R&D efforts
Testing strategy | Tester has the option of a "Sampling" strategy done manually or an "Exhaustive Verification" strategy done with an automation tool | A "Sampling" strategy in big data is a challenge
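The difference between the "Sampling" and "Exhaustive Verification" strategies can be illustrated with a minimal sketch. It assumes source and target data are small in-memory dictionaries keyed by record id; the function names and the toy data are hypothetical and not from the talk. Real big data sets would not fit in memory like this.

```python
import random


def exhaustive_mismatches(source, target):
    """Compare every record by key; return keys that differ or are missing."""
    keys = set(source) | set(target)
    return {k for k in keys if source.get(k) != target.get(k)}


def sampled_mismatches(source, target, sample_size, seed=0):
    """Compare only a random sample of source keys against the target."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    keys = sorted(source)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    return {k for k in sample if source[k] != target.get(k)}


# Toy data: 1000 records, one of which is corrupted in the target.
source = {i: f"row-{i}" for i in range(1000)}
target = dict(source)
target[417] = "corrupted"

all_bad = exhaustive_mismatches(source, target)
some_bad = sampled_mismatches(source, target, sample_size=50)

print(all_bad)               # {417}
print(some_bad <= all_bad)   # True: sampling never reports more than exists
```

The exhaustive check always finds the corrupted record, while a 5% sample finds it only if key 417 happens to be drawn. That gap is one way to read the slide's point that sampling becomes a challenge as data grows and defects become sparse.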
19. Properties | Traditional database testing | Big data testing
Validation tools | Tester uses either Excel-based macros or UI-based automation tools | No defined tools; the range is vast, from programming tools like MapReduce to HiveQL
Testing tools | Testing tools can be used with basic operating knowledge and little training | Operating the testing tools requires a specific set of skills and training. The tools are also at a nascent stage and may gain new features over time
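As a rough illustration of the programming-style validation mentioned above, the sketch below mimics the map and reduce phases in plain Python to compare per-key record counts between a source and a loaded data set. It does not use the Hadoop or Hive APIs; the record layout, the `country` field and all function names are assumptions made for the example.

```python
from collections import Counter
from functools import reduce


def map_phase(records):
    """Emit one (key, 1) pair per record, keyed by the 'country' field."""
    return [(rec["country"], 1) for rec in records]


def reduce_phase(pairs):
    """Sum the emitted values per key."""
    def add(counter, pair):
        key, value = pair
        counter[key] += value
        return counter
    return reduce(add, pairs, Counter())


# Hypothetical source data and the data that arrived after loading.
source_records = [
    {"id": 1, "country": "IN"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": "IN"},
]
loaded_records = [
    {"id": 1, "country": "IN"},
    {"id": 2, "country": "US"},
]

source_counts = reduce_phase(map_phase(source_records))
loaded_counts = reduce_phase(map_phase(loaded_records))

# Counter subtraction keeps only keys whose source count exceeds the loaded count.
missing = source_counts - loaded_counts
print(dict(missing))  # {'IN': 1}
```

The same check is what a grouped count query would express in a SQL-like language; the point of the sketch is only that the tester writes the comparison logic rather than relying on a ready-made UI tool.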