Presented by Ibrahim Muhammadi. Founder - AppWorx.cc
Big Data is revolutionizing how businesses make decisions. More and more decisions and strategies are now based on data.
3. With more and more digitalization, huge amounts
of structured, semi-structured and unstructured
data are being generated.
cc: phsymyst - https://www.flickr.com/photos/78624556@N08
4. In the early days of this explosive growth in digital
data, businesses used to discard additional data
because there was no feasible way to make any sense
out of it.
cc: Kentrosaurus - https://www.flickr.com/photos/86125591@N00
5. But this is changing rapidly with advancements in the
infrastructure needed for data storage and processing,
collectively known as BIG DATA.
cc: Tom Raftery - https://www.flickr.com/photos/67945918@N00
6. The 3Vs of big data: the extreme volume of data, the
wide variety of data types and the velocity at which
the data must be processed.
cc: dalbera - https://www.flickr.com/photos/72746018@N00
7. Such voluminous data can come from different
sources, such as business sales records, the collected
results of experiments, real-time sensors used in IoT
and more.
cc: bionicteaching - https://www.flickr.com/photos/29096601@N00
8. Adequate compute power is needed to achieve the desired
velocity. This can potentially demand hundreds or thousands
of servers that can distribute the work and operate
collaboratively
cc: midom - https://www.flickr.com/photos/81295370@N00
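As a rough illustration of distributing work across workers, here is a minimal single-machine sketch. Threads stand in for servers, and the names split_into_chunks, process_chunk and distributed_sum are hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks slices of roughly equal size."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Work done by one worker: here, just a partial sum."""
    return sum(chunk)


def distributed_sum(items, n_workers=4):
    """Fan the chunks out to workers, then combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


total = distributed_sum(list(range(1, 101)))
print(total)
```

On a real cluster each chunk would live on a different machine, but the split, process and combine steps keep the same shape.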
9. In this short presentation we will look at some of
the more popular tools that have made the Big
Data revolution possible.
cc: Glenn Zucman - https://www.flickr.com/photos/18182611@N00
11. Distributed data storage and processing on consumer-grade
hardware makes big data feasible. One open
source project for this is Hadoop.
cc: NASA Goddard Photo and Video - https://www.flickr.com/photos/24662369@N07
12. Hadoop enables distributed processing of large data sets
across clusters of computers using simple programming
models. It is designed to scale up to thousands of machines.
cc: solofotones - https://www.flickr.com/photos/14754973@N08
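One of the simple programming models associated with Hadoop is MapReduce. The sketch below imitates its three phases (map, shuffle, reduce) in plain Python on a word count. It does not use any Hadoop API; the function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a collection of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

In a real cluster the map and reduce calls run on many machines in parallel; only the two small functions are written by the programmer.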
13. Rather than rely on hardware to deliver high
availability, the Hadoop library is designed to detect
and handle failures at the application layer, thus
delivering a highly available service.
cc: neil cummings - https://www.flickr.com/photos/23874985@N07
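To make failure handling at the application layer concrete, here is a small sketch, assuming each data block is replicated on several nodes. The Node class, the NodeDown exception and read_with_failover are hypothetical names for this example, not part of Hadoop.

```python
class NodeDown(Exception):
    """Raised when a node cannot serve a request."""


class Node:
    """A toy storage node holding some data blocks."""

    def __init__(self, name, blocks, alive=True):
        self.name = name
        self.blocks = blocks
        self.alive = alive

    def read(self, block_id):
        if not self.alive:
            raise NodeDown(self.name)
        return self.blocks[block_id]


def read_with_failover(block_id, replicas):
    """Try each replica in turn and return the first successful read."""
    for node in replicas:
        try:
            return node.read(block_id)
        except NodeDown:
            continue  # software detects the failure and moves on
    raise RuntimeError(f"no live replica holds block {block_id!r}")


replicas = [
    Node("node-1", {"b1": b"sales"}, alive=False),
    Node("node-2", {"b1": b"sales"}),
]
print(read_with_failover("b1", replicas))
```

The caller still gets its data even though the first node is down, which is the idea behind tolerating failures in software rather than buying more reliable hardware.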
15. Another open source tool used for Big Data is
Elasticsearch, which can run very fast searches on
semi-structured or unstructured datasets.
cc: DocChewbacca - https://www.flickr.com/photos/49462908@N00
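Search engines of this kind are generally built around an inverted index, which maps each term to the documents that contain it. The sketch below is a toy version in plain Python; it is not the Elasticsearch API, and build_index and search are names assumed for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map every term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing all terms of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "error connecting to database",
    2: "user login successful",
    3: "database connection restored",
}
index = build_index(docs)
print(search(index, "database"))
```

A lookup touches only the entries for the query terms instead of scanning every document, which is why searches stay fast as the dataset grows.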
16. Elasticsearch is a part of the Elastic stack or the ELK
stack that also contains Logstash (a data collection and
log parsing tool) and Kibana (for analytics and
visualization).
cc: PLeia2 - https://www.flickr.com/photos/64684255@N00
18. Batch data migration using ETL (Extract - Transform - Load)
does not work well with Big Data, and hence the traditional ETL
architecture is now changing to real-time data streaming.
cc: SidPix - https://www.flickr.com/photos/22357152@N02
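The difference between the two architectures can be sketched in a few lines. Both functions below apply the same transformation, but the batch version waits for the full dataset while the streaming version handles each event as it arrives. The event fields and function names are assumptions made for this example.

```python
def transform(event):
    """Transform step shared by both styles: convert the amount to cents."""
    return {"user": event["user"], "amount_cents": round(event["amount"] * 100)}


def batch_etl(events):
    """Batch style: extract everything, transform everything, then load."""
    extracted = list(events)
    return [transform(e) for e in extracted]


def streaming_pipeline(events):
    """Streaming style: transform and hand over each event as it arrives."""
    for event in events:
        yield transform(event)


events = [
    {"user": "a", "amount": 12.5},
    {"user": "b", "amount": 3.0},
]

stream = streaming_pipeline(iter(events))
first = next(stream)  # available before the second event is even read
print(first)
```

With streaming, the first result is ready as soon as the first event is processed, instead of after the whole batch has been collected.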
19. Apache Kafka is a high-throughput distributed messaging
system that is being adopted by hundreds of
companies to manage their real-time data.
cc: r2hox - https://www.flickr.com/photos/72764087@N00
20. Kafka is a perfect tool for building data
pipelines: it is reliable, scalable, and
efficient.
cc: ikarusmedia - https://www.flickr.com/photos/32650580@N06
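A core idea behind Kafka-style pipelines is an append-only log that several consumer groups read at their own pace, each tracking its own offset. The class below is an in-memory sketch of that idea only. It is not the Kafka client API, and TopicLog, append and poll are names assumed for this example.

```python
class TopicLog:
    """A toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer group -> next offset to read

    def append(self, record):
        """Add a record to the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return the next unread records for a group and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for order in ["order-1", "order-2", "order-3"]:
    log.append(order)

print(log.poll("billing", max_records=2))
print(log.poll("analytics"))
```

Because records are never removed by reading, the billing and analytics groups can consume the same data independently, which is what lets one pipeline feed many systems.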
21. R - the language and environment for
statistical computing
22. R is an integrated suite of software
facilities for data manipulation, calculation
and graphical display.
cc: Crystal Writer - https://www.flickr.com/photos/17483452@N00
23. With over 2 million users worldwide, R is rapidly
becoming the leading programming language in
statistics and data science.
cc: Marc_Smith - https://www.flickr.com/photos/49503165485@N01
24. It is a great tool for data analysis and
can be efficiently used on very large
data sets.
cc: Régis Gaidot - https://www.flickr.com/photos/22019171@N00
25. Big Data is the next frontier for innovation,
competition and productivity - in all fields from
healthcare to retail, from manufacturing to personal
and location data.
cc: danielfoster437 - https://www.flickr.com/photos/17423713@N03
26. In most industries, established competitors and new
entrants will leverage data-driven strategies to
innovate, compete, and capture value from deep
real-time information.
cc: verbeeldingskr8 - https://www.flickr.com/photos/35429044@N04
27. We at appworx.cc offer data services that can help
retail and other clients achieve their big data goals
quickly.
https://www.appworx.cc/data
cc: Jason Michael - https://www.flickr.com/photos/70194213@N00