SlideShare a Scribd company logo
XGem
XGem
Big Data
Agenda
 What is Big Data?
 Big Data Technologies
 What is Hadoop?
 Big Data Components
 Hadoop Distributions
 HortonWorkd Data Platform
 Log Analyzed
What is Big Data?
Ernst and Young offers the following definition:
Big Data refers to the dynamic, large and disparate volumes of data being created by people, tools and machines. It
requires new, innovative, and scalable technology to collect, host and analytically process the vast amount of data
gathered in order to derive real-time business insights that relate to consumers, risk, profit, performance, productivity
management and enhanced shareholder value.
The research firm Gartner, defines Big Data as follows:
Big Data is high-volume, high-velocity, and/or high-variety information assets that demand cost-effective,
innovative forms of information processing that enable enhanced insight, decision making and process
automation.
5V’s del Big Data
BigData
3
Variety is the diversity of the data. We have structured
data that fits neatly into rows and columns, or
relational databases and unstructured data that is not
organized in a pre-defined way, for example Tweets,
blogposts, pictures, numbers, and even video data.
Variety
1
Velocity
Velocity is the idea that data is being generated
extremely fast, a process that never stops. Attributes
include near or real-time streaming and local and
cloud-based technologies that can process information
very quickly.
4
Veracity is the conformity to facts and accuracy.
Is the information real, or is it false?
Veracity
2
Volume
Volume is the scale of the data, or the increase in
the amount of data stored.
5
VALUE
Big Data
Value isn't just profit. It may be medical or social benefits, or
customer, employee, personal satisfaction or crime prevention. The
main reasons for why people invest time to understand Big Data is to
derive value from it.
VALUE
Big Data Technologies
What is Apache Hadoop?
• Hadoop is an open-source software
framework used to store and process huge
amounts of data.
• Owned by Apache Software Foundation
• Transforms commodity hardware into a
service that:
• Stores petabytes of data reliably (HDFS)
• Allows huge distributed computations
(MapReduce)
• Key attributes:
• Redundant and reliable
• Doesn’t stop or lose data even if hardware
fails
• Easy to program
• Extremely powerful
• Allows the development of big data
algorithms & tools
• Batch processing centric
• Runs on commodity hardware
• Computers & network
Who build Hadoop?
Who use Hadoop?
2006 2008 2009 2010
The Datagraph Blog
2007
How HDFS Works?
Namenode
Persistent Namespace
Metadata & Journal
Namespace
State
Block
Map
Heartbeats & Block Reports
Block ID  Block Locations
Datanodes
Block ID  Data
Hierarchal Namespace
File Name  BlockIDs
Horizontally Scale IO and Storage
b1
b5
b3
JBOD
BlockStorageNamespace
b2
b3
b1
JBOD
b3
b5
b2
JBOD
b1
b5
b2
JBOD
HDFS Data Reliability
Namenode
Namespace
State
Block
Map
b1
b5
b3
JBOD
b2
b3
b4
JBOD
b3
b5
b2
JBOD
b1b5
b2
JBOD
2. copy
3.
blockReceived
1.
replicate
Bad/lost block
replica
Periodically
check block
checksums
What is the Hadoop framework?
Hadoop framework Components
Hadoop Distributions
Agenda
Hortonworks Solutions
Log Analytics Systems Today
LOG
ANALYTICS
PLATFORMNetwork
Device Logs
• Not all data can be captured
• Not all captured data is valuable
• Transport all data
LOG
ANALYTICS
PLATFORM
Network
Device Logs
HDP
HDF
2. Content-based routing based on dynamic
evaluation of content, attributes, priority
1. Integrate and enrich logs across
data centers and security zones
3. Cost effectively expand collection and grow
timescale of logs collected
Expand Storage Options of Log Data
Thanks!

More Related Content

What's hot

Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US InformationJulian Tong
 
Big data analytics - Introduction to Big Data and Hadoop
Big data analytics - Introduction to Big Data and HadoopBig data analytics - Introduction to Big Data and Hadoop
Big data analytics - Introduction to Big Data and Hadoop
SamiraChandan
 
Big Idea For Big Data
Big Idea For Big DataBig Idea For Big Data
Big Idea For Big Data
Dexlab Analytics
 
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Denodo
 
Big Data Analytics MIS presentation
Big Data Analytics MIS presentationBig Data Analytics MIS presentation
Big Data Analytics MIS presentationAASTHA PANDEY
 
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo
 
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Denodo
 
Big data tools
Big data toolsBig data tools
Big data tools
Novita Sari
 
Denodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and BusinessDenodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo
 
Big data analysis
Big data analysisBig data analysis
Big data analysis
SAishwaryaDinesh
 
Introduction to Data Mining, Business Intelligence and Data Science
Introduction to Data Mining, Business Intelligence and Data ScienceIntroduction to Data Mining, Business Intelligence and Data Science
Introduction to Data Mining, Business Intelligence and Data Science
IMC Institute
 
Organising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data WorldOrganising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data World
DataWorks Summit/Hadoop Summit
 
Big data competitive landscape overview
Big data competitive landscape overviewBig data competitive landscape overview
Big data competitive landscape overviewBisakha Praharaj
 
Top 10 renowned big data companies
Top 10 renowned big data companiesTop 10 renowned big data companies
Top 10 renowned big data companies
Robert Smith
 
Big data and data mining
Big data and data miningBig data and data mining
Big data and data mining
Emran Hossain
 
Die Big Data Fabric als Enabler für Machine Learning & AI
Die Big Data Fabric als Enabler für Machine Learning & AIDie Big Data Fabric als Enabler für Machine Learning & AI
Die Big Data Fabric als Enabler für Machine Learning & AI
Denodo
 
Forecast of Big Data Trends
Forecast of Big Data TrendsForecast of Big Data Trends
Forecast of Big Data Trends
IMC Institute
 
Earley Executive Roundtable Summary - Data Analytics
Earley Executive Roundtable Summary - Data AnalyticsEarley Executive Roundtable Summary - Data Analytics
Earley Executive Roundtable Summary - Data Analytics
Earley Information Science
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 
Big data(1st presentation)
Big data(1st presentation)Big data(1st presentation)
Big data(1st presentation)
Takrim Ul Islam Laskar
 

What's hot (20)

Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
Big data analytics - Introduction to Big Data and Hadoop
Big data analytics - Introduction to Big Data and HadoopBig data analytics - Introduction to Big Data and Hadoop
Big data analytics - Introduction to Big Data and Hadoop
 
Big Idea For Big Data
Big Idea For Big DataBig Idea For Big Data
Big Idea For Big Data
 
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
 
Big Data Analytics MIS presentation
Big Data Analytics MIS presentationBig Data Analytics MIS presentation
Big Data Analytics MIS presentation
 
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
 
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
 
Big data tools
Big data toolsBig data tools
Big data tools
 
Denodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and BusinessDenodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and Business
 
Big data analysis
Big data analysisBig data analysis
Big data analysis
 
Introduction to Data Mining, Business Intelligence and Data Science
Introduction to Data Mining, Business Intelligence and Data ScienceIntroduction to Data Mining, Business Intelligence and Data Science
Introduction to Data Mining, Business Intelligence and Data Science
 
Organising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data WorldOrganising the Data Lake - Information Management in a Big Data World
Organising the Data Lake - Information Management in a Big Data World
 
Big data competitive landscape overview
Big data competitive landscape overviewBig data competitive landscape overview
Big data competitive landscape overview
 
Top 10 renowned big data companies
Top 10 renowned big data companiesTop 10 renowned big data companies
Top 10 renowned big data companies
 
Big data and data mining
Big data and data miningBig data and data mining
Big data and data mining
 
Die Big Data Fabric als Enabler für Machine Learning & AI
Die Big Data Fabric als Enabler für Machine Learning & AIDie Big Data Fabric als Enabler für Machine Learning & AI
Die Big Data Fabric als Enabler für Machine Learning & AI
 
Forecast of Big Data Trends
Forecast of Big Data TrendsForecast of Big Data Trends
Forecast of Big Data Trends
 
Earley Executive Roundtable Summary - Data Analytics
Earley Executive Roundtable Summary - Data AnalyticsEarley Executive Roundtable Summary - Data Analytics
Earley Executive Roundtable Summary - Data Analytics
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Big data(1st presentation)
Big data(1st presentation)Big data(1st presentation)
Big data(1st presentation)
 

Similar to xGem BigData

Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
FredReynolds2
 
Big Data Hadoop
Big Data HadoopBig Data Hadoop
Big Data Hadoop
Techsparks
 
Big data peresintaion
Big data peresintaion Big data peresintaion
Big data peresintaion
ahmed alshikh
 
Big Data-Survey
Big Data-SurveyBig Data-Survey
Big Data-Survey
ijeei-iaes
 
The Forrester Wave - Big Data Hadoop
The Forrester Wave - Big Data HadoopThe Forrester Wave - Big Data Hadoop
The Forrester Wave - Big Data Hadoop
IBM Software India
 
Hadoop and Big Data Analytics | Sysfore
Hadoop and Big Data Analytics | SysforeHadoop and Big Data Analytics | Sysfore
Hadoop and Big Data Analytics | Sysfore
Sysfore Technologies
 
Big data data lake and beyond
Big data data lake and beyond Big data data lake and beyond
Big data data lake and beyond
Rajesh Kumar
 
The book of elephant tattoo
The book of elephant tattooThe book of elephant tattoo
The book of elephant tattoo
Mohamed Magdy
 
How Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help business
Ajay Ohri
 
Big data
Big dataBig data
Haven 2 0
Haven 2 0 Haven 2 0
Big data
Big dataBig data
Big data
Nimish Kochhar
 
Big data
Big dataBig data
Big data
Nimish Kochhar
 
Learn About Big Data and Hadoop The Most Significant Resource
Learn About Big Data and Hadoop The Most Significant ResourceLearn About Big Data and Hadoop The Most Significant Resource
Learn About Big Data and Hadoop The Most Significant Resource
Assignment Help
 
How to tackle big data from a security
How to tackle big data from a securityHow to tackle big data from a security
How to tackle big data from a security
Tyrone Systems
 
Big data analytics - hadoop
Big data analytics - hadoopBig data analytics - hadoop
Big data analytics - hadoop
Vishwajeet Jadeja
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
Sai Paravastu
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data
Big DataBig Data
Big Data
Kirubaburi R
 

Similar to xGem BigData (20)

Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Big Data Hadoop
Big Data HadoopBig Data Hadoop
Big Data Hadoop
 
Big data peresintaion
Big data peresintaion Big data peresintaion
Big data peresintaion
 
Big Data-Survey
Big Data-SurveyBig Data-Survey
Big Data-Survey
 
The Forrester Wave - Big Data Hadoop
The Forrester Wave - Big Data HadoopThe Forrester Wave - Big Data Hadoop
The Forrester Wave - Big Data Hadoop
 
Hadoop and Big Data Analytics | Sysfore
Hadoop and Big Data Analytics | SysforeHadoop and Big Data Analytics | Sysfore
Hadoop and Big Data Analytics | Sysfore
 
Big data data lake and beyond
Big data data lake and beyond Big data data lake and beyond
Big data data lake and beyond
 
The book of elephant tattoo
The book of elephant tattooThe book of elephant tattoo
The book of elephant tattoo
 
How Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help business
 
Big data
Big dataBig data
Big data
 
Haven 2 0
Haven 2 0 Haven 2 0
Haven 2 0
 
Big data
Big dataBig data
Big data
 
Big data
Big dataBig data
Big data
 
Learn About Big Data and Hadoop The Most Significant Resource
Learn About Big Data and Hadoop The Most Significant ResourceLearn About Big Data and Hadoop The Most Significant Resource
Learn About Big Data and Hadoop The Most Significant Resource
 
How to tackle big data from a security
How to tackle big data from a securityHow to tackle big data from a security
How to tackle big data from a security
 
Big data analytics - hadoop
Big data analytics - hadoopBig data analytics - hadoop
Big data analytics - hadoop
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data in Azure
 
Big Data
Big DataBig Data
Big Data
 

More from Julio Castro

Blockchain zero administration with python
Blockchain zero administration with pythonBlockchain zero administration with python
Blockchain zero administration with python
Julio Castro
 
Jasper
JasperJasper
Jasper
Julio Castro
 
Nifi
NifiNifi
Digital transformation
Digital transformationDigital transformation
Digital transformation
Julio Castro
 
Mobile Offline First
Mobile Offline FirstMobile Offline First
Mobile Offline First
Julio Castro
 
Keynote xgem
Keynote xgemKeynote xgem
Keynote xgem
Julio Castro
 

More from Julio Castro (6)

Blockchain zero administration with python
Blockchain zero administration with pythonBlockchain zero administration with python
Blockchain zero administration with python
 
Jasper
JasperJasper
Jasper
 
Nifi
NifiNifi
Nifi
 
Digital transformation
Digital transformationDigital transformation
Digital transformation
 
Mobile Offline First
Mobile Offline FirstMobile Offline First
Mobile Offline First
 
Keynote xgem
Keynote xgemKeynote xgem
Keynote xgem
 

Recently uploaded

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 

xGem BigData

  • 3. Agenda  What is Big Data?  Big Data Technologies  What is Hadoop?  Big Data Components  Hadoop Distributions  HortonWorkd Data Platform  Log Analyzed
  • 4. What is Big Data? Ernst and Young offers the following definition: Big Data refers to the dynamic, large and disparate volumes of data being created by people, tools and machines. It requires new, innovative, and scalable technology to collect, host and analytically process the vast amount of data gathered in order to derive real-time business insights that relate to consumers, risk, profit, performance, productivity management and enhanced shareholder value. The research firm Gartner, defines Big Data as follows: Big Data is high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation.
  • 5. 5V’s del Big Data BigData 3 Variety is the diversity of the data. We have structured data that fits neatly into rows and columns, or relational databases and unstructured data that is not organized in a pre-defined way, for example Tweets, blogposts, pictures, numbers, and even video data. Variety 1 Velocity Velocity is the idea that data is being generated extremely fast, a process that never stops. Attributes include near or real-time streaming and local and cloud-based technologies that can process information very quickly. 4 Veracity is the conformity to facts and accuracy. Is the information real, or is it false? Veracity 2 Volume Volume is the scale of the data, or the increase in the amount of data stored. 5 VALUE
  • 6. Big Data Value isn't just profit. It may be medical or social benefits, or customer, employee, personal satisfaction or crime prevention. The main reasons for why people invest time to understand Big Data is to derive value from it. VALUE
  • 8. What is Apache Hadoop? • Hadoop is an open-source software framework used to store and process huge amounts of data. • Owned by Apache Software Foundation • Transforms commodity hardware into a service that: • Stores petabytes of data reliably (HDFS) • Allows huge distributed computations (MapReduce) • Key attributes: • Redundant and reliable • Doesn’t stop or lose data even if hardware fails • Easy to program • Extremely powerful • Allows the development of big data algorithms & tools • Batch processing centric • Runs on commodity hardware • Computers & network
  • 10. Who use Hadoop? 2006 2008 2009 2010 The Datagraph Blog 2007
  • 11. How HDFS Works? Namenode Persistent Namespace Metadata & Journal Namespace State Block Map Heartbeats & Block Reports Block ID  Block Locations Datanodes Block ID  Data Hierarchal Namespace File Name  BlockIDs Horizontally Scale IO and Storage b1 b5 b3 JBOD BlockStorageNamespace b2 b3 b1 JBOD b3 b5 b2 JBOD b1 b5 b2 JBOD
  • 12. HDFS Data Reliability Namenode Namespace State Block Map b1 b5 b3 JBOD b2 b3 b4 JBOD b3 b5 b2 JBOD b1b5 b2 JBOD 2. copy 3. blockReceived 1. replicate Bad/lost block replica Periodically check block checksums
  • 13.
  • 14. What is the Hadoop framework?
  • 18. Log Analytics Systems Today LOG ANALYTICS PLATFORMNetwork Device Logs • Not all data can be captured • Not all captured data is valuable • Transport all data
  • 19. LOG ANALYTICS PLATFORM Network Device Logs HDP HDF 2. Content-based routing based on dynamic evaluation of content, attributes, priority 1. Integrate and enrich logs across data centers and security zones 3. Cost effectively expand collection and grow timescale of logs collected Expand Storage Options of Log Data