Big Data Analytics and
Machine Intelligence
Giza At A Glance
• We are a systems integrator
• 43 years in the market
• Work in 25 countries
• 4 Regions of operation
• Enterprise Business
Solutions
• SCADA
• Transmission &
Distribution
• Transportation
Infrastructure
• Field Solutions
• Smart Buildings
Contents
• Introduction
• When Data is “Big”
• Big Data Information System Layers
Data Platform
Data Science & Advanced Analytics
Information Presentation
Actionable Insights
• Machine Intelligence
Introduction
• 2014, EMC & IDC digital universe report
• A study to analyze and forecast the amount of
data produced annually
• It is the universe of digital data
• Like the physical universe
It expands fast
Includes stars
Includes dark matter
About everything
The Digital Universe
Digital Universe Expands Fast
• Digital data doubles every two years
• Expected 44 ZB by 2020 = 44 trillion GB
– 1 ZB = 10^3 EB = 10^6 PB = 10^9 TB
• Every second, 205,000 new GB
• During this presentation: ~550 million new GB
• Less than 25% of recorded data is tagged
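The unit arithmetic on this slide can be checked in a few lines of Python. This is a minimal sketch that assumes decimal (SI) units and a hypothetical 45-minute talk; the constant and function names are illustrative and do not come from the report.

```python
# Decimal (SI) storage units, expressed in bytes.
GB = 10**9
TB = 10**12
PB = 10**15
EB = 10**18
ZB = 10**21


def zb_to_gb(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * ZB // GB


# 44 ZB expressed in GB: 44 * 10**12, i.e. 44 trillion GB.
forecast_gb = zb_to_gb(44)

# The rate is the slide's figure; the talk length is an assumption.
RATE_GB_PER_SECOND = 205_000
TALK_SECONDS = 45 * 60
new_gb_during_talk = RATE_GB_PER_SECOND * TALK_SECONDS

print(f"{forecast_gb:,} GB forecast")
print(f"{new_gb_during_talk:,} GB created during the talk")
```

With a 45-minute talk, the result lands close to the ~550 million GB quoted above.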
Telecommunication Revolution
• Smart phones full of
sensors
• Smart phone cameras
• High speed networks
• Mobile penetration
• Multiple devices per
customer
• Huge amount of data
transferred
• Communication
control data
Social Networks
• YouTube Statistics
– 1,300,000,000 users
– 300 hours of video uploaded per minute
– 30 million visitors per day
Internet of Things: Smart Cities
• Metering
• Smart homes
• Smart buildings
• Smart parking
• Street lighting
• Traffic monitoring
• And others
Internet of Things: Smart Farming
• Weather measuring
• Air sensors
• Water sensors
• Water leakage sensors
• Soil monitoring
• Irrigation monitoring and control
• Harvesting machines tracking and monitoring
• Farm animals tracking and monitoring
• And others
Internet of Things: Industrial
• Aircraft sensors gather ~1 TB per flight
• Jet engines produce ~25 MB per flight hour per
engine
• Think about
– power plants,
– oil plants,
– water plants, etc.
When Big Data is “Big”
• Gartner, the recognized source of the 3Vs of Big Data, defines
Big Data as: High-volume, high-velocity and high-variety
information assets that demand cost-effective, innovative
forms of information processing for enhanced insight and
decision making.
• IDC defines Big Data technologies as: A new
generation of technologies and architectures, designed to
economically extract value from very large volumes of a
wide variety of data by enabling high-velocity capture,
discovery, and/or analysis.
Definitions
3-Vs Gartner Model
• Structured, semi-structured and non-structured data
• Semi-structured
Log files
Manually edited Excel files
Others
• Non-structured
Chat conversations
Emails
Images & videos
Others
• Most of this data already belongs to organizations, but it is
sitting there unused, which is why Gartner calls it dark data
Data Variety
• The speed at which data is:-
Created
Stored
Analyzed
• In Big Data systems, data is created in real-time or
near real-time
Data Velocity
• 90% of all data ever created was created in the past 2
years
• Estimated amount of data doubles every two years
• The era of a trillion sensors is upon us
Data Volume
Big Data Information
System Layers
Actionable Insights
Information
Presentation
Data Science &
Advanced Analytics
Data Platform
Data Platform
Hadoop Distributed File System
(HDFS)
• Open source project
• Java-based file system
• Scalable up to 200 PB
• Up to 4,500 servers in a single cluster
• Close to a billion files and blocks
• Concurrent access through
“YARN”
Map-Reduce Algorithm
• A framework for
processing problems in
parallel
• Uses multiple computing
cluster nodes
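The map, shuffle, and reduce steps can be sketched with a word count, the customary introductory example. This is a minimal single-machine sketch, not Hadoop code: worker threads stand in for cluster nodes, and all function names are illustrative.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key across every chunk."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Assumption: threads play the role of cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


chunks = ["big data is big", "data doubles every two years"]
counts = word_count(chunks)
print(counts)
```

Each chunk is mapped independently, which is what lets a real cluster spread the work across nodes; only the shuffle step needs to see the output of every mapper.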
Apache HBase
• Open source project
• Non-relational database
• Column-oriented key-value
data store
• Part of the Hadoop ecosystem
• Can serve as input & output of
map-reduce jobs in Hadoop
• Data access through Java API
Apache Phoenix
• Open source
• Part of the Apache Hadoop
ecosystem
• Based on Apache HBase
• Provides JDBC and
ODBC drivers for HBase
Hadoop Distributions
• Best known:-
- Cloudera
- MapR
- Hortonworks
- IBM
- Pivotal HD
- Intel distribution
• Cloud based:-
- Azure HDInsight
- Amazon Elastic MapReduce
Hadoop Hortonworks Ecosystem
Massively Parallel Processing
(MPP) Data Warehouse
Architecture
• Shared-nothing architecture, no single point of failure
• Scale horizontally by adding nodes
• Breaks large queries across nodes for parallel
processing
• Higher data ingestion rates through parallelized data
movement
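How a shared-nothing system breaks one query across nodes can be sketched as follows. This is a toy model under stated assumptions: the "nodes" are plain Python lists processed one after another, the hash function is a deliberately simple stand-in, and the table and column names are hypothetical. A real MPP database runs the local step on all nodes at the same time.

```python
from collections import Counter


def partition(rows, node_count):
    """Distribute rows across nodes by hashing the distribution key."""
    nodes = [[] for _ in range(node_count)]
    for region, amount in rows:
        # Toy deterministic hash so the example is reproducible.
        node_id = sum(region.encode()) % node_count
        nodes[node_id].append((region, amount))
    return nodes


def local_sum(node_rows):
    """Each node aggregates only the rows it owns (shared nothing)."""
    totals = Counter()
    for region, amount in node_rows:
        totals[region] += amount
    return totals


def parallel_sum(rows, node_count=3):
    """Coordinator: run the local step per node, then merge partial results."""
    partials = [local_sum(node_rows) for node_rows in partition(rows, node_count)]
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)


# Equivalent of: SELECT region, SUM(amount) FROM sales GROUP BY region
sales = [("north", 10), ("south", 5), ("north", 7), ("east", 3)]
totals = parallel_sum(sales)
print(totals)
```

Because rows with the same key always hash to the same node, no node needs another node's data to compute its partial result, and adding nodes only changes how the rows are spread.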
MPP Database Examples
• Teradata
• Netezza
• Vertica
• Greenplum
• Microsoft PDW (Parallel
Data Warehouse)
• DB2 UDB with database
partitioning feature
(DPF)
Pivotal Greenplum Architecture
Data Science and Advanced
Analytics Layer
Types of Data Analytics
Analytics
Descriptive
Diagnostic
Predictive
Prescriptive
Descriptive Analytics
• What happened
- Which KPIs
- Which time frame
- Which filter
- What chart type
- How to remove noise
Diagnostic Analytics
• Why it happened
- Why this KPI is low
- Which factors drive the KPI
- Which factors to use
for comparison
- How to compare
by changing a
single factor and holding
the others fixed
On-Line Analytical Processing
(OLAP)
Predictive Analytics
• Prediction / Forecasting
• Segmentation
• Classification
• Anomaly detection
• Sentiment Prediction
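As one illustration from this list, anomaly detection can be sketched with a simple z-score rule. This is only one of many possible techniques, chosen here for brevity; the threshold of 2.0 and the sample readings are assumptions made up for the example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Flag values whose distance from the mean exceeds `threshold` standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 50]
anomalies = find_anomalies(readings)
print(anomalies)
```

A single large spike inflates both the mean and the standard deviation, so on real data a more robust rule (for example one based on the median) may be preferable.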
Prescriptive Analytics
• What is the best
course of action?
• Simulation
• Optimization
• What-if analysis
Data Mining
• Data mining is the computing process of discovering
patterns in large data sets.
• Cross Industry Standard Process for Data Mining
(CRISP-DM):-
- Business understanding
- Data understanding
- Data preparation
- Modeling
- Evaluation
- Deployment
Data Mining Techniques
• Regression
• Classification
• Cluster Analysis
• Correlation Analysis
• Outlier Analysis
• Anomaly Detection
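Regression, the first technique listed, can be illustrated with an ordinary least-squares fit of a straight line. This is a minimal pure-Python sketch with made-up data points; the function name is illustrative, and a real project would normally use one of the data mining tools listed in the following slides.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept
print(slope, intercept, prediction)
```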
Proprietary Data Mining Tools
• SAS Analytics
• IBM SPSS
• SAP Predictive Analytics
• Angoss Predictive
Analytics
• KXEN Predictive Analytics
• Oracle Data Mining (ODM)
• Statistica
• TIBCO Analytics
• Matlab
Open Source
• Python packages
• R Project
• RapidMiner
• KNIME
• Weka
• Octave
• GGobi
• Tangara
• Prediction IO
Information Presentation
Reporting / Dashboards
• Reporting
- Richly formatted, interactive reports
- Reports with or without parameters
- Scheduling capabilities
• Dashboards
- Publishing web-based / mobile reports
- Interactive display for KPI comparisons with targets
- Integration with operational applications and/or event
processing engines
Alerts
• Alerts on business intelligence and analytics content
via:
- Emails
- SMS
- A customized receiver (e.g. a custom web service)
Geospatial and Location
Intelligence
• Combining geographical
and location-related data
from data sources
including:-
- Aerial maps
- GISs
- Consumer
demographics
• Displaying relationships by
overlaying data on
interactive maps
Mobile Information Presentation
• Develop and deliver
content to mobile devices
• Publishing mode and/or
interactive mode
• Takes advantage of mobile
devices’ native capabilities, e.g.:-
- Touch screens
- Camera
- Location awareness
- Natural-Language
query
Actionable Insights
Linking Insights to Actions
• Forrester reports that
74% of firms want to be
“data driven”
• But only 29% are
actually successfully
connecting analytics to
action
• Actionable insights are
the missing link
Attributes of Actionable Insights
Aligned with your
business goals
Insight results have
context
Relevance: insights
delivered to the right
person, at the right time
and in the right setting
Insights are Specific
Novel insights have an
advantage over familiar
ones
Clarity of the insight
Machine Intelligence
Machine Learning
“Machine Learning is giving
computers the ability to learn
without being explicitly
programmed.”
~ Arthur Samuel
Why Machine Learning for Big
Data Analytics
• Dark data makes up more than 90% of the digital
universe
• This is too large a volume, with too many formats and
sources, to be handled in a conventional way
• Analysis of non-structured data like images, videos,
and sound files is usually done using Machine
Learning algorithms
• More data → better training results
Artificial Neural Networks (ANN)
• Computing systems
inspired by biological neural
networks
• Based on a collection of
artificial “neurons” connected
by “synaptic connections”
• Synaptic connections have
weights to control transmitted
signal strength
• Neurons may have thresholds
to control aggregated signal
transmission
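A single artificial neuron with weighted connections and a threshold can be sketched as below. The weights and threshold are hand-picked for illustration so that the neuron behaves like a logical AND; in practice they would be learned from data.

```python
def neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of inputs reaches the threshold."""
    # Each weight scales the strength of the signal on one connection.
    signal = sum(x * w for x, w in zip(inputs, weights))
    return 1 if signal >= threshold else 0


# Assumption: weights and threshold chosen by hand to model logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_THRESHOLD = 2.0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, neuron([a, b], AND_WEIGHTS, AND_THRESHOLD))
```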
Deep Neural Networks (DNN)
• ANN with multiple hidden
layers between the input
and output layers
• The extra layers enable
composition of features from
lower layers
• Applied to tagging
huge amounts of dark data:
images, videos,
speech, music, etc.
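How stacked layers compose features can be sketched with a tiny forward pass. This is an illustrative toy, not a trained model: the network has two hidden layers with hand-picked weights that compute XOR, a function no single threshold neuron can represent. ReLU is assumed as the activation function.

```python
def relu(vector):
    """Rectified linear unit: negative signals are clipped to zero."""
    return [max(0.0, x) for x in vector]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per output neuron."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass inputs through every hidden layer, then a linear output layer."""
    activation = inputs
    for weights, biases in layers[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]
    return dense(activation, weights, biases)


# Assumption: hand-picked weights, chosen so the network computes XOR.
XOR_LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden 1: two low-level features
    ([[1.0, -2.0]], [0.0]),                   # hidden 2: combines those features
    ([[1.0]], [0.0]),                         # output layer
]

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, forward([a, b], XOR_LAYERS)[0])
```

The second hidden layer never sees the raw inputs; it works only with the features produced by the first, which is the composition the slide describes.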
Graphics Processing Units (GPU)
• Originally designed to rapidly create images in a frame buffer
for output to a display device
• General Purpose GPU (GPGPU), stream
processor or vector processor running compute
kernels
• Well suited to training deep neural networks
• Parallel throughput can be orders of magnitude higher than a CPU's
• GPU clusters
• Cloud-based GPU (IaaS)
Combining HDFS with GPU
Conventional Large Scale Distributed Deep
Learning on Hadoop Clusters
©2017 Giza Systems. All rights reserved.
Giza Systems, a leading systems integrator in the MEA region, designs and deploys industry-specific technology solutions for asset-intensive industries
such as the Telecoms, Utilities, Oil & Gas, Transportation and other market sectors. We help our clients streamline their operations and businesses
through our portfolio of solutions, managed services, and consultancy practice. Our team of 800 professionals is spread throughout the region with
anchor offices in Cairo, Riyadh, Dubai, Nairobi, Dar-es-Salaam and Abuja, allowing us to service an ever-increasing client base in over 40 countries.
Thank You!

More Related Content

What's hot

The modern analytics architecture
The modern analytics architectureThe modern analytics architecture
The modern analytics architectureJoseph D'Antoni
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
Guido Schmutz
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Rizaldy Ignacio
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
DataWorks Summit/Hadoop Summit
 
High Performance and Scalable Geospatial Analytics on Cloud with Open Source
High Performance and Scalable Geospatial Analytics on Cloud with Open SourceHigh Performance and Scalable Geospatial Analytics on Cloud with Open Source
High Performance and Scalable Geospatial Analytics on Cloud with Open Source
DataWorks Summit
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Big Data Spain
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
Rittman Analytics
 
AGIT 2015 - Hans Viehmann: "Big Data and Smart Cities"
AGIT 2015  - Hans Viehmann: "Big Data and Smart Cities"AGIT 2015  - Hans Viehmann: "Big Data and Smart Cities"
AGIT 2015 - Hans Viehmann: "Big Data and Smart Cities"jstrobl
 
Log I am your father
Log I am your fatherLog I am your father
Log I am your father
DataWorks Summit/Hadoop Summit
 
Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...
Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...
Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...
Dataconomy Media
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data Platform
MapR Technologies
 
Operating a secure big data platform in a multi-cloud environment
Operating a secure big data platform in a multi-cloud environmentOperating a secure big data platform in a multi-cloud environment
Operating a secure big data platform in a multi-cloud environment
DataWorks Summit
 
Big Data Architecture and Design Patterns
Big Data Architecture and Design PatternsBig Data Architecture and Design Patterns
Big Data Architecture and Design Patterns
John Yeung
 
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
Big Data Spain
 
Operationalizing Machine Learning Using GPU-accelerated, In-database Analytics
Operationalizing Machine Learning Using GPU-accelerated, In-database AnalyticsOperationalizing Machine Learning Using GPU-accelerated, In-database Analytics
Operationalizing Machine Learning Using GPU-accelerated, In-database Analytics
Kinetica
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR Technologies
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
Rittman Analytics
 
Solving Performance Problems on Hadoop
Solving Performance Problems on HadoopSolving Performance Problems on Hadoop
Solving Performance Problems on Hadoop
Tyler Mitchell
 
Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»
Anna Shymchenko
 

What's hot (20)

The modern analytics architecture
The modern analytics architectureThe modern analytics architecture
The modern analytics architecture
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
 
High Performance and Scalable Geospatial Analytics on Cloud with Open Source
High Performance and Scalable Geospatial Analytics on Cloud with Open SourceHigh Performance and Scalable Geospatial Analytics on Cloud with Open Source
High Performance and Scalable Geospatial Analytics on Cloud with Open Source
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
 
AGIT 2015 - Hans Viehmann: "Big Data and Smart Cities"
AGIT 2015  - Hans Viehmann: "Big Data and Smart Cities"AGIT 2015  - Hans Viehmann: "Big Data and Smart Cities"
AGIT 2015 - Hans Viehmann: "Big Data and Smart Cities"
 
Log I am your father
Log I am your fatherLog I am your father
Log I am your father
 
Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...
Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...
Sören Eickhoff, Informatica GmbH, "Informatica Intelligent Data Lake – Self S...
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data Platform
 
Operating a secure big data platform in a multi-cloud environment
Operating a secure big data platform in a multi-cloud environmentOperating a secure big data platform in a multi-cloud environment
Operating a secure big data platform in a multi-cloud environment
 
Big Data Architecture and Design Patterns
Big Data Architecture and Design PatternsBig Data Architecture and Design Patterns
Big Data Architecture and Design Patterns
 
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
The Enterprise and Connected Data, Trends in the Apache Hadoop Ecosystem by A...
 
Operational-Analytics
Operational-AnalyticsOperational-Analytics
Operational-Analytics
 
Operationalizing Machine Learning Using GPU-accelerated, In-database Analytics
Operationalizing Machine Learning Using GPU-accelerated, In-database AnalyticsOperationalizing Machine Learning Using GPU-accelerated, In-database Analytics
Operationalizing Machine Learning Using GPU-accelerated, In-database Analytics
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
 
Solving Performance Problems on Hadoop
Solving Performance Problems on HadoopSolving Performance Problems on Hadoop
Solving Performance Problems on Hadoop
 
Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»
 

Similar to Big data analytics and machine intelligence v5.0

"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
Dataconomy Media
 
Big data.ppt
Big data.pptBig data.ppt
Big data.ppt
IdontKnow66967
 
Lecture1
Lecture1Lecture1
Lecture1
Manish Singh
 
Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013IntelAPAC
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Perficient, Inc.
 
Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017
Dr. Anita Goel
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
MSAdvAnalytics
 
Customer value analysis of big data products
Customer value analysis of big data productsCustomer value analysis of big data products
Customer value analysis of big data products
Vikas Sardana
 
How to build a data stack from scratch
How to build a data stack from scratchHow to build a data stack from scratch
How to build a data stack from scratch
Vinayak Hegde
 
Data & Analytics - Session 1 - Big Data Analytics
Data & Analytics - Session 1 -  Big Data AnalyticsData & Analytics - Session 1 -  Big Data Analytics
Data & Analytics - Session 1 - Big Data Analytics
Amazon Web Services
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
RojaT4
 
10 Big Data Technologies you Didn't Know About
10 Big Data Technologies you Didn't Know About 10 Big Data Technologies you Didn't Know About
10 Big Data Technologies you Didn't Know About
Jesus Rodriguez
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
Michael Hiskey
 
Ankus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration frameworkAnkus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration framework
Ashrith Mekala
 
An overview of modern scalable web development
An overview of modern scalable web developmentAn overview of modern scalable web development
An overview of modern scalable web development
Tung Nguyen
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
Nagarjuna D.N
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Tech Triveni
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in details
AbhishekKumarAgrahar2
 
Cloud computing infrastructure
Cloud computing infrastructure Cloud computing infrastructure
Cloud computing infrastructure
Dr. Anita Goel
 

Similar to Big data analytics and machine intelligence v5.0 (20)

"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
 
Big data.ppt
Big data.pptBig data.ppt
Big data.ppt
 
Lecture1
Lecture1Lecture1
Lecture1
 
Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
 
Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
 
Customer value analysis of big data products
Customer value analysis of big data productsCustomer value analysis of big data products
Customer value analysis of big data products
 
How to build a data stack from scratch
How to build a data stack from scratchHow to build a data stack from scratch
How to build a data stack from scratch
 
Data & Analytics - Session 1 - Big Data Analytics
Data & Analytics - Session 1 -  Big Data AnalyticsData & Analytics - Session 1 -  Big Data Analytics
Data & Analytics - Session 1 - Big Data Analytics
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 
10 Big Data Technologies you Didn't Know About
10 Big Data Technologies you Didn't Know About 10 Big Data Technologies you Didn't Know About
10 Big Data Technologies you Didn't Know About
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
 
Ankus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration frameworkAnkus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration framework
 
An overview of modern scalable web development
An overview of modern scalable web developmentAn overview of modern scalable web development
An overview of modern scalable web development
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in details
 
Cloud computing infrastructure
Cloud computing infrastructure Cloud computing infrastructure
Cloud computing infrastructure
 

More from Amr Kamel Deklel

Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...
Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...
Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...
Amr Kamel Deklel
 
Cognitive Architectures - Research Circle
Cognitive Architectures - Research CircleCognitive Architectures - Research Circle
Cognitive Architectures - Research CircleAmr Kamel Deklel
 
Cognitive Architectures - Amr Kamel - 2015
Cognitive Architectures - Amr Kamel - 2015Cognitive Architectures - Amr Kamel - 2015
Cognitive Architectures - Amr Kamel - 2015Amr Kamel Deklel
 
Quantum computing
Quantum computingQuantum computing
Quantum computing
Amr Kamel Deklel
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
Amr Kamel Deklel
 
Mahfazty mobile payment in egypt
Mahfazty   mobile payment in egyptMahfazty   mobile payment in egypt
Mahfazty mobile payment in egypt
Amr Kamel Deklel
 
Turkcell Financial Position
Turkcell Financial PositionTurkcell Financial Position
Turkcell Financial Position
Amr Kamel Deklel
 
Cloud computing
Cloud computingCloud computing
Cloud computing
Amr Kamel Deklel
 
TURKCELL CASE STUDY
TURKCELL CASE STUDYTURKCELL CASE STUDY
TURKCELL CASE STUDY
Amr Kamel Deklel
 
Evolving Comprehensible Neural Network Trees
Evolving Comprehensible Neural Network TreesEvolving Comprehensible Neural Network Trees
Evolving Comprehensible Neural Network Trees
Amr Kamel Deklel
 

More from Amr Kamel Deklel (10)

Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...
Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...
Transfer learning with LTANN-MEM & NSA for solving multi-objective symbolic r...
 
Cognitive Architectures - Research Circle
Cognitive Architectures - Research CircleCognitive Architectures - Research Circle
Cognitive Architectures - Research Circle
 
Cognitive Architectures - Amr Kamel - 2015
Cognitive Architectures - Amr Kamel - 2015Cognitive Architectures - Amr Kamel - 2015
Cognitive Architectures - Amr Kamel - 2015
 
Quantum computing
Quantum computingQuantum computing
Quantum computing
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
Mahfazty mobile payment in egypt
Mahfazty   mobile payment in egyptMahfazty   mobile payment in egypt
Mahfazty mobile payment in egypt
 
Turkcell Financial Position
Turkcell Financial PositionTurkcell Financial Position
Turkcell Financial Position
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
TURKCELL CASE STUDY
TURKCELL CASE STUDYTURKCELL CASE STUDY
TURKCELL CASE STUDY
 
Evolving Comprehensible Neural Network Trees
Evolving Comprehensible Neural Network TreesEvolving Comprehensible Neural Network Trees
Evolving Comprehensible Neural Network Trees
 



Big data analytics and machine intelligence v5.0

  • 1. Big Data Analytics and Machine Intelligence
  • 2. Giza At A Glance • We are a system integrator • 43 years in the market • Work in 25 countries • 4 Regions of operation • Enterprise Business Solutions • SCADA • Transmission & Distribution • Transportation Infrastructure • Field Solutions • Smart Buildings
  • 3. Contents • Introduction • When Data is “Big” • Big Data Information System Layers Data Platform Data Science & Advanced Analytics Information Presentation Actionable Insights • Machine Intelligence
  • 5. • 2014, EMC & IDC digital universe report • A study to analyze and forecast the amount of data produced annually • It is the universe of digital data • Like the physical universe It expands fast Includes stars Includes dark matter About everything The Digital Universe
  • 6. Digital Universe Expands Fast • Digital data doubles every two years • Expected 44 ZB by 2020 = 44 trillion GB – 1 ZB = 10^3 EB = 10^6 PB = 10^9 TB • Every second 205,000 new GB • During this presentation ~550 million new GB • Less than 25% of recorded data is tagged
  • 7. Telecommunication Revolution • Smart phones full of sensors • Smart phone cameras • High speed networks • Mobile penetration • Multiple devices per customer • Huge amount of data transferred • Communication control data
  • 8. Social Networks • YouTube Statistics – 1,300,000,000 users – 300 hours / minute uploaded – 30 million visitors / day
  • 9. Internet of Things: Smart Cities • Metering • Smart homes • Smart buildings • Smart parking • Street lighting • Traffic monitoring • And others
  • 10. Internet of Things: Smart Farming • Weather measuring • Air sensors • Water sensors • Water leakage sensors • Soil monitoring • Irrigation monitoring and control • Harvesting machines tracking and monitoring • Farm animals tracking and monitoring • And others
  • 11. Internet of Things: Industrial • Aircraft sensors gather ~1 TB per flight • Jet engines produce ~25 MB per flight hour per engine • Think about – power plants, – oil plants, – water plants, etc.
  • 12. When Big Data is “Big”
  • 13. • Gartner, the originator of the 3Vs of Big Data, defines Big Data as: High-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. • IDC defines Big Data technologies as: A new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis. Definitions
  • 15. • Structured, semi-structured and non-structured data • Semi-structured: Log files, Manually edited Excel files, Others • Non-structured: Chat conversations, Emails, Images & videos, Others • Most of this data already belongs to organizations but sits unused, which is why Gartner calls it dark data Data Variety
  • 16. • The speed at which data is:- Created Stored Analyzed • In Big Data systems, data is created in real-time or near real-time Data Velocity
  • 17. • 90% of all data ever created was created in the past 2 years • Estimated amount of data doubles every two years • The era of a trillion sensors is upon us Data Volume
  • 19. Big Data Information System Layers Actionable Insights Information Presentation Data Science & Advanced Analytics Data Platform
  • 20. Data Platform Actionable Insights Information Presentation Data Science & Advanced Analytics Data Platform
  • 21. Hadoop Distributed File System (HDFS) • Open source project • Java-based file system • Scalable up to 200 PB • Up to 4,500 servers in a single cluster • Close to a billion files and blocks • Concurrent access through “YARN”
  • 22. Map-Reduce Algorithm • A framework for processing problems in parallel • Uses multiple computing cluster nodes
  • 23. Apache HBase • Open source project • Non-relational database • Column-oriented key-value data store • Part of Hadoop project • Can serve as input & output of map-reduce jobs in Hadoop • Data access through Java API
  • 24. Apache Phoenix • Open source • Part of Apache Hadoop Project • Based on Apache HBase • Provides JDBC and ODBC drivers for HBase
  • 25. Hadoop Distributions • Top Known:- - Cloudera - MapR - Hortonworks - IBM - Pivotal HD - Intel distribution • Cloud based:- - Azure HDInsight - Amazon Elastic MapReduce
  • 27. Massively Parallel Processing (MPP) Data Warehouse Architecture • Shared-nothing architecture, no single point of failure • Scales horizontally by adding nodes • Breaks large queries across nodes for parallel processing • Higher data ingestion rates through parallelized data movement
  • 28. MPP Database Examples • Teradata • Netezza • Vertica • Greenplum • Microsoft PDW (Parallel Data Warehouse) • DB2 UDB with database partitioning feature (DPF)
  • 30. Actionable Insights Information Presentation Data Science & Advanced Analytics Data Platform Data Science and Advanced Analytics Layer
  • 31. Types of Data Analytics Analytics Descriptive Diagnostic Predictive Prescriptive
  • 32. Descriptive Analytics • What happened - Which KPIs - Which time frame - Which filter - What chart type - How to remove noise
  • 33. Diagnostic Analytics • Why it happened - Why this KPI is low - Which factors drive the KPI - Which factors to use for comparison - How to compare by changing a single factor while fixing the others
  • 35. Predictive Analytics • Predict / Forecasting • Segmentation • Classification • Anomaly detection • Sentiment Prediction
  • 36. Prescriptive Analytics • What is the best course of action? • Simulation • Optimization • What-if analysis
  • 37. Data Mining • Data mining is the computing process of discovering patterns in large data sets. • Cross Industry Standard Process for Data Mining (CRISP-DM):- - Business understanding - Data understanding - Data preparation - Modeling - Evaluation - Deployment
  • 38. Data Mining Techniques • Regression • Classification • Cluster Analysis • Correlation Analysis • Outlier Analysis • Anomaly Detection
  • 39. Proprietary Data Mining Tools • SAS Analytics • IBM SPSS • SAP Predictive Analytics • Angoss Predictive Analytics • KXEN Predictive Analytics • Oracle Data Mining (ODM) • Statistica • TIBCO Analytics • Matlab
  • 40. Open Source • Python packages • R Project • RapidMiner • KNIME • Weka • Octave • GGobi • Tanagra • PredictionIO
  • 42. Reporting / Dashboards • Reporting: Rich formatted and interactive reports; Reports with or without parameters; Using scheduling capabilities • Dashboards: Publishing web based / mobile reports; Interactive display for KPI comparisons with targets; Integration with operational applications and/or event processing engines
  • 43. Alerts • Alerts of business intelligence and analytics content via: Emails, SMS, or a customized receiver (e.g. a custom web service)
  • 44. Geospatial and Location Intelligence • Combining geographical and location-related data from data sources including:- - Aerial maps - GISs - Consumer demographics • Displaying relationships by overlaying data on interactive maps
  • 45. Mobile Information Presentation • Develop and deliver content to mobile devices • Publishing mode and/or interactive mode • Takes advantage of mobile devices’ native capabilities, e.g.:- - Touch screens - Camera - Location awareness - Natural-Language query
  • 46. Actionable Insights Information Presentation Data Science & Advanced Analytics Data Platform Actionable Insights
  • 47. Linking Insights to Actions • Forrester reports that 74% of firms want to be “data driven” • But only 29% are actually successfully connecting analytics to action • Actionable insights are the missing link
  • 48. Attributes of Actionable Insights Aligned with your business goals Insight results have context Relevance; Insights delivered to the right person, in the right time and settings Insights are Specific Novel insights have an advantage over familiar ones Clarity of the insight
  • 50. Machine Learning “Machine Learning is giving computers the ability to learn without being explicitly programmed.” ~ Arthur Samuel
  • 51. Why Machine Learning for Big Data Analytics • Dark data makes up more than 90% of the digital universe • This is too large a volume and variety of data, formats, and sources to be handled in a conventional way • Analysis of non-structured data like images, videos, and sound files is usually done using Machine Learning algorithms • More data → better training results
  • 52. Artificial Neural Networks (ANN) • Computing systems inspired by biological neural networks • Based on a collection of artificial “neurons” connected by “synaptic connections” • Synaptic connections have weights to control transmitted signal strength • Neurons may have thresholds to control aggregated signal transmission
  • 53. Deep Neural Networks (DNN) • ANN with multiple hidden layers between the input and output layers • The extra layers enable composition of features from lower layers • Applied technology for tagging huge amounts of dark data: images, videos, speech, music, etc.
  • 54. Graphics Processing Units (GPU) • Rapidly create images in frame buffers for output to display device • General Purpose GPU (GPGPU), stream processor or vector processor running compute kernels • Suitable for deep neural networks learning • Parallel throughput can be several orders of magnitude higher than CPU • GPU clusters • Cloud-based GPU (IaaS)
  • 55. Combining HDFS with GPU Conventional Large Scale Distributed Deep Learning on Hadoop Clusters
  • 56. ©2017 Giza Systems. All rights reserved. Giza Systems, a leading systems integrator in the MEA region, designs and deploys industry-specific technology solutions for asset-intensive industries such as the Telecoms, Utilities, Oil & Gas, Transportation and other market sectors. We help our clients streamline their operations and businesses through our portfolio of solutions, managed services, and consultancy practice. Our team of 800 professionals are spread throughout the region with anchor offices in Cairo, Riyadh, Dubai, Nairobi, Dar-es-Salaam and Abuja, allowing us to service an ever-increasing client base in over 40 countries. Thank You!
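
Slide 22 describes Map-Reduce as a framework that processes a problem in parallel across multiple cluster nodes. The following is a minimal single-machine sketch of that idea, not Hadoop's actual API: a thread pool stands in for the cluster nodes, and the function names (map_phase, shuffle, reduce_phase, word_count) and the sample chunks are illustrative assumptions.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Each chunk stands in for a data block handled by one cluster node;
    # the thread pool is only a local stand-in for that parallelism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Hypothetical input, split into three chunks
chunks = ["big data big insight", "data platform", "big platform"]
counts = word_count(chunks)
print(counts)
```

In a real Hadoop job the map and reduce steps would run on separate nodes against HDFS blocks, and the framework would perform the shuffle over the network; the three-step structure is the part this sketch is meant to show.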

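
Slide 52 describes artificial neurons whose incoming connections carry weights and which may apply a threshold to the aggregated signal. A minimal sketch of one such neuron is given below; the function name, the step activation, and the weights and threshold are assumptions chosen for illustration (here they make the neuron behave like a logical AND), not values from the presentation.

```python
def neuron(inputs, weights, threshold):
    """Return 1 when the weighted sum of the inputs reaches the threshold, else 0."""
    # Each weight scales the strength of the signal on one connection.
    aggregated = sum(x * w for x, w in zip(inputs, weights))
    # The threshold decides whether the aggregated signal is transmitted.
    return 1 if aggregated >= threshold else 0


# Hypothetical parameters: with these values the neuron fires only
# when both inputs are 1.
and_weights = [1.0, 1.0]
and_threshold = 2.0

outputs = [
    neuron([a, b], and_weights, and_threshold)
    for a in (0, 1)
    for b in (0, 1)
]
print(outputs)
```

A deep neural network, as on slide 53, stacks many such units in layers and learns the weights from data rather than fixing them by hand.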