The document discusses various big data technologies, describing their purpose, developer, programming languages used, and current stable versions. It covers technologies for data storage like Hadoop, MongoDB, and Splunk Hunk. Data mining technologies mentioned include Presto, RapidMiner, and Elasticsearch. Data analytics technologies covered are Apache Kafka, Splunk, KNIME, Apache Spark, and R-Language. The document also discusses data visualization technologies like Tableau and Plotly, as well as emerging technologies like TensorFlow, Apache Beam, Docker, Apache Airflow, and Kubernetes.
Big-data technologies are used to perform accurate analysis and to generate conclusions and predictions, so as to minimise risk in real time.
BIG-DATA TECHNOLOGY
www.edureka.co
WHAT IS A BIG-DATA TECHNOLOGY?
Apache Hadoop
• Distributed data processing environment
• Uses commodity hardware
• Developed by: Apache Software Foundation on 10 December 2011
• Written in: Java
• Current stable version: Hadoop 3.11
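To make "distributed data processing" concrete, here is a minimal single-process sketch of the MapReduce pattern that Hadoop is known for. It is not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and everything runs in one Python process rather than across commodity machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence (illustration only, no cluster involved)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the map and reduce calls would run on different nodes, each working on the slice of data stored closest to it.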
MongoDB
• NoSQL document database
• Developed by: MongoDB on 11 February 2009
• Written in: C++, Go, JavaScript, Python
• Current stable version: MongoDB 4.0.10
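A "document database" stores records as self-describing documents rather than fixed table rows. The sketch below imitates an equality query over such documents using plain Python dictionaries. It is an illustration only and does not use MongoDB or its driver; the `find` helper and the `category` field are assumptions made for this example.

```python
def find(collection, query):
    """Return the documents whose fields equal every key/value pair in `query`."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]


# Documents in one collection need not share the same set of fields.
technologies = [
    {"name": "MongoDB", "category": "data storage",
     "written_in": ["C++", "Go", "JavaScript", "Python"]},
    {"name": "Docker", "category": "emerging", "written_in": ["Go"]},
    {"name": "Plotly", "written_in": ["JavaScript"]},  # no "category" field at all
]

storage = find(technologies, {"category": "data storage"})
```

Here `storage` holds only the MongoDB document; the Plotly document is skipped without error even though it lacks the queried field.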
Splunk Hunk
• Accesses data from remote Hadoop clusters
• Developed by: Splunk Inc. in the year 2013
• Written in: Java
• Current stable version: Splunk 6.2
Presto
• Open-source distributed SQL query engine
• Developed by: Facebook, open-sourced in the year 2013
• Written in: Java
• Current stable version: Presto 0.22
RapidMiner
• Powerful and robust graphical user interface
• Developed by: RapidMiner in the year 2001
• Written in: Java
• Current stable version: RapidMiner 9.2
Elasticsearch
• Based on the Lucene library
• Developed by: Elastic NV in the year 2012
• Written in: Java
• Current stable version: Elasticsearch 7.1
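Lucene-based search engines rest on the idea of an inverted index: a map from each term to the documents that contain it. The following is a minimal sketch of that idea in plain Python. It is not the Elasticsearch or Lucene API, the names `build_index` and `search` are hypothetical, and real engines add analysis, scoring and sharding on top.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lower-cased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Distributed streaming platform",
    2: "Distributed SQL query engine",
}
index = build_index(docs)
```

A lookup such as `search(index, "distributed sql")` touches only the entries for those two terms, which is why this structure scales better than scanning every document.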
Apache Kafka
• Distributed streaming platform
• Developed by: Apache Software Foundation in the year 2011
• Written in: Scala, Java
• Current stable version: Apache Kafka 2.2.0
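A streaming platform of this kind can be pictured as an append-only log that several consumers read at their own pace. The sketch below models that idea in memory. It is not the Kafka client API: `TopicLog` and `Consumer` are hypothetical classes, and there are no partitions, brokers or persistence.

```python
class TopicLog:
    """An append-only list of records, each addressed by its offset."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Store a record and return the offset it was written at."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to `max_records` records starting at `offset`."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Tracks its own read position, so consumers do not affect each other."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_records=10):
        """Fetch the next batch and advance this consumer's offset."""
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch
```

Because records are never removed when read, a slow consumer and a fast consumer can both process the full stream independently.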
Splunk
• Used in application management, web analytics, and security and compliance
• Developed by: Splunk Inc. on 6 May 2014
• Written in: AJAX, C++, Python, XML
• Current stable version: Splunk 7.3
KNIME
• Used to create data flows and uses an extension mechanism
• Developed by: KNIME in the year 2008
• Written in: Java
• Current stable version: KNIME 3.7.2
Tableau
• Creates no-code data queries
• Can import data of any size
• Developed by: Tableau on 17 May 2013
• Written in: Java, C++, Python, C
• Current stable version: Tableau 8.2
Plotly
• Creates graphs faster and more efficiently
• Developed by: Plotly in the year 2012
• Written in: JavaScript
• Current stable version: Plotly 1.47.4
TensorFlow
• End-to-end open-source platform for machine learning
• Developed by: Google Brain Team in the year 2015
• Written in: Python, C++, CUDA
• Current stable version: TensorFlow 2.0 beta
Apache Beam
• Provides a portable API layer
• Developed by: Apache Software Foundation on 15 June 2016
• Written in: Java, Python
• Current stable version: Apache Beam 0.1.0 incubating
Docker
• Create, deploy and run applications using containers
• Developed by: Docker Inc. on 13 March 2013
• Written in: Go
• Current stable version: Docker 18.09
Apache Airflow
• Workflow automation and scheduling system
• Developed by: Apache Software Foundation on 15 May 2019
• Written in: Python
• Current stable version: Apache Airflow 1.10.3
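Workflow scheduling comes down to running tasks in an order that respects their dependencies. The sketch below shows that core idea with Python's standard `graphlib` module (Python 3.9 or later). It does not use Airflow itself: `run_workflow` and the extract/transform/load task names are assumptions chosen for illustration.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, tasks):
    """Run each task after all of its upstream tasks and collect the results.

    `dependencies` maps a task name to the set of task names it depends on.
    `tasks` maps a task name to a zero-argument callable.
    """
    order = TopologicalSorter(dependencies).static_order()
    return [tasks[name]() for name in order]


# A hypothetical three-step pipeline: extract, then transform, then load.
dependencies = {"transform": {"extract"}, "load": {"transform"}}
tasks = {
    "extract": lambda: "EXTRACT",
    "transform": lambda: "TRANSFORM",
    "load": lambda: "LOAD",
}

results = run_workflow(dependencies, tasks)
```

If the dependencies contained a cycle, `TopologicalSorter` would raise `graphlib.CycleError`, which is the reason such workflows are defined as directed acyclic graphs.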
Kubernetes
• Vendor-agnostic cluster
• Developed by: Cloud Native Computing Foundation on 21 July 2015
• Written in: Go
• Current stable version: Kubernetes 1.14