Oracle Big Data Cloud Service
Presented by: Mandeep Kaur Sandhu
Senior Oracle DBA (University of Auckland)
Download these slides from: mandysandhu.com
Goals
• Introduction to Big Data
• Oracle Big Data deployment models
• Oracle Big Data Cloud Service
• Core principles
• Access and admin tasks
• Data management tools
• Event Hub
• Conclusion
What is Big Data?
Variety, Velocity, Volume
• Big data is a term that describes large or complex datasets
• Traditional data processing systems cannot analyse this data
• Big data analysis identifies the value of the data
What is Hadoop?
An open source software platform for distributed storage and processing: highly scalable, reliable and available
• Logically distributed file system
• Framework for processing
• Designed to run on small or large machines for parallel processing
• Allows resource growth
• Avoids vendor lock-in
Two Components
• HDFS: stores the data in the cluster
  • NameNode
  • DataNode
• MapReduce
MapReduce
Programming model for processing large data sets
• Map: takes a set of data and converts it into another set of data
• Reduce: takes the output of Map as input and combines it into a smaller set
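The Map and Reduce steps can be illustrated with a word count. This is a minimal single-process sketch in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for the example, and the grouping step that Hadoop performs between Map and Reduce is written out by hand.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(lines: Iterable[str]) -> Iterator[tuple[str, int]]:
    """Map: convert each input line into (word, 1) pairs."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group values by key, as the framework does between Map and Reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: dict[str, list[int]]) -> dict[str, int]:
    """Reduce: combine the values of each key into one smaller result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run the three steps in order on an in-memory input."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is valuable"]))
```

On a real cluster the map calls run in parallel on the nodes that hold the data, and the grouped keys are distributed across reducers; the sketch only shows the data flow.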
Oracle Big Data Deployment Models
• Oracle Big Data Cloud at Customer (BDCC): Oracle Big Data Cloud service model delivered in your data centre, behind your firewall
• Oracle Big Data Appliance X6: on-premises engineered system designed to deliver predictable Hadoop infrastructure
• Oracle Big Data Cloud Service (BDCS): Oracle public cloud infrastructure with cluster nodes and data sources
BDCS - Core Principles
Operational Efficiency
• Out-of-the-box installation
• Automated cluster management
• Cloudera Manager
Security
• Data is encrypted at rest and in motion
• Authorization and authentication
• Network firewall
Versatility
• Cloudera distribution: Apache Hadoop Enterprise Data Hub
• Install and operate third-party software
BDCS - Features
Highly Efficient Cluster Management
• Fault tolerant: HA Hadoop infrastructure
• Fully tested Hadoop upgrades
Cluster Nodes
• A cluster is a collection of nodes
• Permanent nodes
• Edge nodes
• Compute nodes
Permanent Nodes
• Master or data node
• Last for the lifetime of the cluster
• Each node has:
  • 32 OCPUs
  • 256 GB RAM
  • 48 TB storage
• Full Cloudera distribution: licence and support
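The per-node figures make rough cluster sizing a matter of multiplication. The sketch below is an estimate only: it assumes every permanent node contributes its full 48 TB to HDFS and that HDFS keeps its default replication factor of 3, and it ignores space reserved for the OS and non-HDFS use. `NodeSpec` and `cluster_totals` are names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node resources as quoted on the Permanent Nodes slide."""
    ocpus: int = 32
    ram_gb: int = 256
    storage_tb: int = 48


def cluster_totals(node_count: int,
                   spec: NodeSpec = NodeSpec(),
                   replication: int = 3) -> dict[str, float]:
    """Aggregate resources for node_count permanent nodes.

    replication=3 is the HDFS default; treating all raw storage as
    HDFS space is an assumption, so the result is an upper bound.
    """
    if node_count < 1 or replication < 1:
        raise ValueError("node_count and replication must be positive")
    raw_tb = node_count * spec.storage_tb
    return {
        "ocpus": node_count * spec.ocpus,
        "ram_gb": node_count * spec.ram_gb,
        "raw_storage_tb": raw_tb,
        "hdfs_capacity_tb": raw_tb / replication,
    }


if __name__ == "__main__":
    # A three-node starter cluster.
    print(cluster_totals(3))
```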
Edge Nodes
• Empty nodes: OS and disk
• Hadoop client configs
• Interface between the Hadoop cluster and the outside network
• Permanent node
Note: No DataNode role
Compute Nodes
• CPU and memory
• No disks
• Temporary nodes
• A cluster must already exist before compute nodes can be added
• A cluster can be extended by up to 15 compute nodes
• No HDFS data
BDCS - Included Software
• Oracle Linux 6 and Oracle Java (JDK 8)
• Cloudera Enterprise (Data Hub Edition)
• CDH 5.x with support for YARN and MR2
• Cloudera Impala
• Apache HBase
• Cloudera Search
• Apache Spark
• Oracle R Distribution
• Oracle Big Data Spatial and Graph
BDCS - Additional Component
Oracle Big Data SQL Cloud Service
• Unified SQL access
• Dedicated instances
(Diagram labels: Oracle Cloud, Cloudera, 12c)
Oracle BDCS - Service Instance
• Log in to Oracle Cloud
• Choose Oracle Big Data Cloud Service
• Starter Pack 1: 3 nodes
• Additional nodes can be added later
• Big Data SQL node
Oracle BDCS - Service Cluster
• Go to the Oracle Big Data service instance
• Create a service cluster
• Provide tags and an instance name
Oracle BDCS - Service Cluster
• Select the Big Data Appliance system (service instance)
• Provide SSH keys
Oracle BDCS - Admin page
• Starter Pack 1 provides 3 instances
• The instance with the lowest IP address is the master node
Oracle BDCS - Connect
• Connect as the opc user
• CLI: bdacli
• Shows overall information about the cluster
Access Cloudera console
• Open the Cloudera console
• Log in with username and password
Administrative Tasks
• Add nodes in one-node increments, up to a total of 60 nodes
• Four permanent Hadoop nodes allow additional edge nodes to be added
• Extend or shrink the service
Hue - Group/user and File upload
• Open Hue from the Cloudera console
• Same account details as Cloudera Manager (CM)
• Add group
• Add user
• Upload file
Big Data Manager Console
• GUI-based console
• Login username: bigdatamgr
• Explore jobs and stored data
• Usage and health of the cluster
• YARN jobs
Oracle Big Data Manager - Notebook
• Zeppelin notebooks: interactive analysis using R and Python
Data Management - odcp
• odcp: command-line tool for copying large files
• Takes the input and splits it into chunks
• Uses Spark to provide parallel transfer
Examples:
odcp hdfs:///user/mandy/bigdata01.csv hdfs:///user/mandy/bigdata01.csv_copy
odcp hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
odcp hdfs:///user/mandy/bigdata01.csv s3://aserver/bigdata01.csv_copy
odcp s3://user/mandy/bigdata01.csv s3://mandy01/bigdata01.csv_copy
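The split-into-chunks, copy-in-parallel idea can be shown on a local file. This is only an illustration of the technique, not how odcp is implemented: it uses Python threads instead of Spark, works on local paths rather than HDFS or object storage, and `chunk_ranges`, `copy_chunk` and `parallel_copy` are hypothetical names.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split the byte range [0, size) into (offset, length) chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(offset, min(chunk_size, size - offset))
            for offset in range(0, size, chunk_size)]


def copy_chunk(src: str, dst: str, offset: int, length: int) -> int:
    """Copy one chunk; each call opens its own file handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        data = fin.read(length)
        fout.write(data)
    return len(data)


def parallel_copy(src: str, dst: str,
                  chunk_size: int = 8 * 1024 * 1024,
                  workers: int = 4) -> int:
    """Copy src to dst chunk by chunk using a pool of worker threads."""
    size = os.path.getsize(src)
    # Pre-size the destination so every worker writes to its own region.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        copied = pool.map(lambda r: copy_chunk(src, dst, *r),
                          chunk_ranges(size, chunk_size))
        return sum(copied)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "bigdata01.csv")
        target = os.path.join(tmp, "bigdata01.csv_copy")
        with open(source, "wb") as f:
            f.write(os.urandom(1_000_000))
        print(parallel_copy(source, target, chunk_size=64 * 1024))
```

Because chunks are independent, a failed chunk can be retried on its own, which is one reason chunking suits very large files.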
Data Management - odiff
• odiff (Oracle Distributed Diff): compares large data sets
• Compatible with the Cloudera distribution
• Minimum block size to compare: 5 MB
• Maximum block size: 2 GB
Examples:
/usr/bin/odiff hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
/usr/bin/odiff -V hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
/usr/bin/odiff -d hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
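Comparing data block by block can be sketched with checksums. The example below is an assumption about the general technique, not a description of odiff's internals: it hashes fixed-size blocks of two byte streams with SHA-256 and reports the indexes of blocks that differ. `block_digests` and `differing_blocks` are hypothetical names, and the demo uses a tiny block size for readability, whereas the slide quotes 5 MB as the minimum for odiff.

```python
import hashlib
import io
from itertools import zip_longest
from typing import BinaryIO, Iterator

MIN_BLOCK = 5 * 1024 * 1024  # 5 MB minimum block size quoted on the slide


def block_digests(stream: BinaryIO, block_size: int) -> Iterator[str]:
    """Yield one SHA-256 digest per fixed-size block of the stream."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    while True:
        block = stream.read(block_size)
        if not block:
            return
        yield hashlib.sha256(block).hexdigest()


def differing_blocks(a: BinaryIO, b: BinaryIO,
                     block_size: int = MIN_BLOCK) -> list[int]:
    """Return indexes of blocks whose digests differ.

    A block present in only one stream counts as a difference.
    """
    pairs = zip_longest(block_digests(a, block_size),
                        block_digests(b, block_size))
    return [index for index, (da, db) in enumerate(pairs) if da != db]


if __name__ == "__main__":
    left = io.BytesIO(b"aaaabbbbcccc")
    right = io.BytesIO(b"aaaaXbbbcccc")
    # Small block size so the three blocks are visible by eye.
    print(differing_blocks(left, right, block_size=4))
```

Only digests need to be exchanged between the two sides, so the data itself does not have to be moved to compare it.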
Data Management - bda-oss-admin
• bda-oss-admin: manages data and resources
• Environment variables can be set for its options
• Configures the cluster with a storage provider
Examples:
bda-oss-admin --cm-username admin --cm-password abce1234
bda-oss-admin restart_cluster
#!/bin/bash
export CM_ADMIN="my_CM_admin_username"
Data Management - bdm-cli
• bdm-cli: big data command-line interface to copy data and manage copy jobs
• Duplicates the odcp commands
bdm-cli copy
bdm-cli create_job
Data ingest options
(Diagram: data flows from the customer data centre into Oracle Big Data Cloud Service)
• Direct ingest into Oracle BDCS
• Flume
• SCP (SSH protocol)
• Common ingests using Flume or ETL work
• VPN and FastConnect
Apache Kafka
• Open source stream processing
• Real-time streaming
• High-throughput, low-latency platform
Use cases:
• Stream processing: IoT, anomaly detection
• Data integration: data lakes, HDFS, object storage
• Log aggregation: click streams, server logs
• Messaging: traditional apps, microservices
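The messaging and log aggregation use cases both rest on one idea: producers append to a log, and each group of consumers reads from its own position in it. The toy class below illustrates that idea in memory. It is not the Kafka client API, and the `Topic` class and its methods are invented for the example; real Kafka also partitions and replicates the log across brokers, which is omitted here.

```python
from collections import defaultdict


class Topic:
    """In-memory append-only log with one read offset per consumer group."""

    def __init__(self) -> None:
        self._log: list[str] = []
        self._offsets: dict[str, int] = defaultdict(int)

    def produce(self, message: str) -> int:
        """Append a message and return its offset in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def consume(self, group: str, max_messages: int = 10) -> list[str]:
        """Return the next unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._log[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


if __name__ == "__main__":
    clicks = Topic()
    clicks.produce("page=/home")
    clicks.produce("page=/pricing")
    # Two groups read the same messages independently of each other.
    print(clicks.consume("analytics"))
    print(clicks.consume("archiver", max_messages=1))
```

Because messages stay in the log after being read, a new consumer group can start from the beginning without affecting existing ones.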
Oracle Event Hub Cloud Service
• Fully managed streaming data platform
• Provides the world’s most popular message broker (Kafka)
• Flexible
  • Available as fully managed and dedicated deployment options
  • Elastic: scales horizontally and vertically
• Access
  • REST API access
  • SSH access to the Kafka cluster
Conclusion
• Start your big data journey now
• Build and populate a data lake
• Help the business solve problems by using data
• Register for the Oracle Cloud free trial
https://cloud.oracle.com/tryit
Thank you for your time!
Follow and subscribe:
Blog mandysandhu.com
Twitter @mandysandhu14
LinkedIn kaurmandeep88

More Related Content

What's hot

Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...Pini Dibask
 
MariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introductionMariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introductionMariaDB plc
 
How QBerg scaled to store data longer, query it faster
How QBerg scaled to store data longer, query it fasterHow QBerg scaled to store data longer, query it faster
How QBerg scaled to store data longer, query it fasterMariaDB plc
 
Winning Performance Challenges in Oracle Multitenant
Winning Performance Challenges in Oracle MultitenantWinning Performance Challenges in Oracle Multitenant
Winning Performance Challenges in Oracle MultitenantPini Dibask
 
Oracle 12c and its pluggable databases
Oracle 12c and its pluggable databasesOracle 12c and its pluggable databases
Oracle 12c and its pluggable databasesGustavo Rene Antunez
 
Database Consolidation using Oracle Multitenant
Database Consolidation using Oracle MultitenantDatabase Consolidation using Oracle Multitenant
Database Consolidation using Oracle MultitenantPini Dibask
 
Oracle 12c PDB insights
Oracle 12c PDB insightsOracle 12c PDB insights
Oracle 12c PDB insightsKirill Loifman
 
How DBAs can garner the power of the Oracle Public Cloud?
How DBAs can garner the  power of the Oracle Public  Cloud?How DBAs can garner the  power of the Oracle Public  Cloud?
How DBAs can garner the power of the Oracle Public Cloud?Gustavo Rene Antunez
 
Faster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDBFaster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDBMariaDB plc
 
Oracle data guard for beginners
Oracle data guard for beginnersOracle data guard for beginners
Oracle data guard for beginnersPini Dibask
 
How we switched to columnar at SpendHQ
How we switched to columnar at SpendHQHow we switched to columnar at SpendHQ
How we switched to columnar at SpendHQMariaDB plc
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
 
Oracle 12c Multitenant architecture
Oracle 12c Multitenant architectureOracle 12c Multitenant architecture
Oracle 12c Multitenant architecturenaderattia
 
Oracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra PasalapudiOracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra Pasalapudipasalapudi123
 
RMAN in 12c: The Next Generation (WP)
RMAN in 12c: The Next Generation (WP)RMAN in 12c: The Next Generation (WP)
RMAN in 12c: The Next Generation (WP)Gustavo Rene Antunez
 
Fast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud ServiceFast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud ServiceGustavo Rene Antunez
 
My First 100 days with a MySQL DBMS
My First 100 days with a MySQL DBMSMy First 100 days with a MySQL DBMS
My First 100 days with a MySQL DBMSGustavo Rene Antunez
 
Oracle database 12c intro
Oracle database 12c introOracle database 12c intro
Oracle database 12c intropasalapudi
 

What's hot (20)

Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...Collaborate 17 - Database consolidation using the oracle multitenant architec...
Collaborate 17 - Database consolidation using the oracle multitenant architec...
 
MariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introductionMariaDB Enterprise Tools introduction
MariaDB Enterprise Tools introduction
 
How QBerg scaled to store data longer, query it faster
How QBerg scaled to store data longer, query it fasterHow QBerg scaled to store data longer, query it faster
How QBerg scaled to store data longer, query it faster
 
Winning Performance Challenges in Oracle Multitenant
Winning Performance Challenges in Oracle MultitenantWinning Performance Challenges in Oracle Multitenant
Winning Performance Challenges in Oracle Multitenant
 
Oracle 12c and its pluggable databases
Oracle 12c and its pluggable databasesOracle 12c and its pluggable databases
Oracle 12c and its pluggable databases
 
Database Consolidation using Oracle Multitenant
Database Consolidation using Oracle MultitenantDatabase Consolidation using Oracle Multitenant
Database Consolidation using Oracle Multitenant
 
Oracle 12c
Oracle 12cOracle 12c
Oracle 12c
 
Oracle 12c PDB insights
Oracle 12c PDB insightsOracle 12c PDB insights
Oracle 12c PDB insights
 
How DBAs can garner the power of the Oracle Public Cloud?
How DBAs can garner the  power of the Oracle Public  Cloud?How DBAs can garner the  power of the Oracle Public  Cloud?
How DBAs can garner the power of the Oracle Public Cloud?
 
Faster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDBFaster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDB
 
Oracle data guard for beginners
Oracle data guard for beginnersOracle data guard for beginners
Oracle data guard for beginners
 
How we switched to columnar at SpendHQ
How we switched to columnar at SpendHQHow we switched to columnar at SpendHQ
How we switched to columnar at SpendHQ
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
 
Oracle 12c Multitenant architecture
Oracle 12c Multitenant architectureOracle 12c Multitenant architecture
Oracle 12c Multitenant architecture
 
Oracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra PasalapudiOracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra Pasalapudi
 
RMAN in 12c: The Next Generation (WP)
RMAN in 12c: The Next Generation (WP)RMAN in 12c: The Next Generation (WP)
RMAN in 12c: The Next Generation (WP)
 
Fast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud ServiceFast, Flexible Application Development with Oracle Database Cloud Service
Fast, Flexible Application Development with Oracle Database Cloud Service
 
My First 100 days with a MySQL DBMS
My First 100 days with a MySQL DBMSMy First 100 days with a MySQL DBMS
My First 100 days with a MySQL DBMS
 
Oracle GoldenGate for Oracle DBAs
Oracle GoldenGate for Oracle DBAsOracle GoldenGate for Oracle DBAs
Oracle GoldenGate for Oracle DBAs
 
Oracle database 12c intro
Oracle database 12c introOracle database 12c intro
Oracle database 12c intro
 

Similar to Oracle Big Data Cloud service

Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopMike Pittaro
 
Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecturesaipriyacoool
 
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauBig Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauSam Palani
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresmkorremans
 
Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem
Things Every Oracle DBA Needs to Know about the Hadoop EcosystemThings Every Oracle DBA Needs to Know about the Hadoop Ecosystem
Things Every Oracle DBA Needs to Know about the Hadoop EcosystemZohar Elkayam
 
The Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-ServiceThe Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-ServiceBlueData, Inc.
 
Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016Zohar Elkayam
 
Apache Cassandra introduction
Apache Cassandra introductionApache Cassandra introduction
Apache Cassandra introductionfardinjamshidi
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoopChiou-Nan Chen
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016StampedeCon
 
Managing Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive ComputingManaging Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive ComputingCollin Bennett
 
Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...
Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...
Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...SpringPeople
 
Oracle big data appliance and solutions
Oracle big data appliance and solutionsOracle big data appliance and solutions
Oracle big data appliance and solutionssolarisyougood
 

Similar to Oracle Big Data Cloud service (20)

Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
 
Hadoop ppt on the basics and architecture
Hadoop ppt on the basics and architectureHadoop ppt on the basics and architecture
Hadoop ppt on the basics and architecture
 
Deploying Big-Data-as-a-Service (BDaaS) in the Enterprise
Deploying Big-Data-as-a-Service (BDaaS) in the EnterpriseDeploying Big-Data-as-a-Service (BDaaS) in the Enterprise
Deploying Big-Data-as-a-Service (BDaaS) in the Enterprise
 
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauBig Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-features
 
Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem
Things Every Oracle DBA Needs to Know about the Hadoop EcosystemThings Every Oracle DBA Needs to Know about the Hadoop Ecosystem
Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem
 
The Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-ServiceThe Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-Service
 
Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016Rapid Cluster Computing with Apache Spark 2016
Rapid Cluster Computing with Apache Spark 2016
 
Apache Cassandra introduction
Apache Cassandra introductionApache Cassandra introduction
Apache Cassandra introduction
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoop
 
List of Engineering Colleges in Uttarakhand
List of Engineering Colleges in UttarakhandList of Engineering Colleges in Uttarakhand
List of Engineering Colleges in Uttarakhand
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Hadoop.pptx
Hadoop.pptxHadoop.pptx
Hadoop.pptx
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
 
Managing Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive ComputingManaging Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive Computing
 
Kanthaka - High Volume CDR Analyzer
Kanthaka - High Volume CDR AnalyzerKanthaka - High Volume CDR Analyzer
Kanthaka - High Volume CDR Analyzer
 
Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...
Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...
Best Practices for Administering Hadoop with Hortonworks Data Platform (HDP) ...
 
Big data applications
Big data applicationsBig data applications
Big data applications
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Oracle big data appliance and solutions
Oracle big data appliance and solutionsOracle big data appliance and solutions
Oracle big data appliance and solutions
 

Recently uploaded

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 

Recently uploaded (20)

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

Oracle Big Data Cloud service

  • 1. Oracle Big Data Cloud Service Presented by : Mandeep Kaur Sandhu Senior Oracle DBA (university Of Auckland) Download these slides from : mandysandhu.com
  • 2. • Introduction to Big Data • Oracle Big data deployment models • Oracle Big Data cloud Service • Core Principles • Access and Admin tasks • Data Management tools • Event Hub • Conclusion 2 Goals
  • 3. 3 What is Big Data?? Variety Velocity Volume • Big data is a term that describe Large or complex datasets • Traditional data Processing system failed to analyse this data • Big data identify the value of data
  • 4. An open Source Software Platform for distributed storage and processing – Highly Scalable , Reliable and Available 4 What is Hadoop?? Hadoop Logically Distributed file system Framework for processing Designed to run on small/large machine for parallel processing Allow resource Growth Avoid Vendor Locks in
  • 5. HDFS MapReduce HDFS stores the data in cluster • NameNode • DataNode 5 Two Components
  • 6. Programming Model for processing large data sets • Map - set of data and converts into another set of data • Reduce – Take output of Map as input and combine into smaller set MapReduce 6
  • 7. 7 Oracle Big Data Deployment Models Oracle Big Data Cloud service model delivered in your data centre, behind your firewall Oracle Big Data Cloud at Customer(BDCC) On- Premises engineered system designed to deliver predictable Hadoop infrastructure Oracle Big Data Appliance X6 Oracle public cloud infrastructure with cluster nodes and data sources Oracle Big Data Cloud Service (BDCS)
  • 8. BDCS - Core Principles. Operational Efficiency: out-of-box installation; automated cluster management; Cloudera Manager. Security: data is encrypted at rest and in motion; authorization and authentication; network firewall. Versatility: Cloudera distribution (Apache Hadoop, Enterprise Data Hub); install and operate third-party software
  • 9. BDCS - Features. Highly Efficient Cluster Management: fault tolerant (HA Hadoop infrastructure); fully tested Hadoop upgrades. Cluster Nodes: a cluster is a collection of nodes; permanent nodes; edge nodes; compute nodes
  • 10. Permanent Nodes: master or data node; last for the lifetime of the cluster; each node has 32 OCPUs, 256 GB RAM and 48 TB storage; full Cloudera distribution (licence and support)
  • 11. Edge Nodes: empty nodes (OS and disk); Hadoop client configs; interface between the Hadoop cluster and the outside network; permanent node. Note: no DataNode role
  • 12. Compute Nodes: CPU and memory; no disks; temporary nodes; a cluster must exist before compute nodes can be added; a cluster can be extended by up to 15 compute nodes; no HDFS data
  • 13. BDCS - Included Software: Oracle Linux 6 and Oracle Java (JDK 8); Cloudera Enterprise (Data Hub Edition); CDH 5.x with support for YARN and MR2; Cloudera Impala; HBase; Cloudera Search; Apache Spark; Oracle R Distribution; Oracle Big Data Spatial and Graph
  • 14. BDCS - Additional Component: Oracle Big Data SQL Cloud Service; unified SQL access; dedicated instances (diagram labels: Oracle Cloud, Cloudera, 12c)
  • 15. Oracle BDCS - Service Instance: log in to Oracle Cloud; choose Oracle Big Data Cloud Service; Starter Pack 1 -> 3 nodes; additional nodes can be added later; Big Data SQL node
  • 16. Oracle BDCS - Service Cluster: go to the Oracle Big Data service instance; create a service cluster; provide tags and an instance name
  • 17. Oracle BDCS - Service Cluster: select the Big Data Appliance system (service instance); SSH keys
  • 18. Oracle BDCS - Admin page: Starter Pack 1 -> 3 instances; lowest IP address -> master node
  • 19. Oracle BDCS - Connect: you can connect via opc; CLI: bdacli; overall information about the cluster
  • 20. Access Cloudera console: open the Cloudera console; username/password
  • 21. Administrative Tasks: add nodes in one-node increments, up to a total of 60 nodes; four permanent Hadoop nodes allow an additional edge node; extend/shrink the service
  • 22. Hue - Group/user and File upload: open the Cloudera console, then Hue; same account details as CM; add group; add user; upload file
  • 23. Big Data Manager Console: GUI-based console; login username: bigdatamgr; explore jobs and stored data; usage and health of the cluster; YARN jobs
  • 24. Oracle Big Data Manager - Notebook: Zeppelin notebooks for interactive analysis using R and Python
  • 25. Data Management - odcp: command-line tool for copying large files; takes the input and splits it into chunks; uses Spark to provide parallel transfer
    Examples:
      odcp hdfs:///user/mandy/bigdata01.csv hdfs:///user/mandy/bigdata01.csv_copy
      odcp hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
      odcp hdfs:///user/mandy/bigdata01.csv s3://aserver/bigdata01.csv_copy
      odcp s3://user/mandy/bigdata01.csv s3://mandy01/bigdata01.csv_copy
  • 26. Data Management - odiff: Oracle Distributed Diff, used to compare large data sets; compatible with the Cloudera distribution; minimum block size to compare: 5 MB; maximum: 2 GB
    Examples:
      /usr/bin/odiff hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
      /usr/bin/odiff -V hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
      /usr/bin/odiff -d hdfs:///user/mandy/bigdata01.csv swift://aserver.1234/bigdata01.csv_copy
  • 27. Data Management - bda-oss-admin: manages data and resources; can set the environment variables; configures the cluster with a storage provider
    Examples:
      bda-oss-admin --cm-username admin --cm-password abce1234
      bda-oss-admin restart_cluster
    Environment variables can be set in a script:
      #!/bin/bash
      export CM_ADMIN="my_CM_admin_username"
  • 28. Data Management - bdm-cli: Big Data command-line interface to copy data and manage copy jobs; duplicates the odcp commands
    Examples:
      bdm-cli copy
      bdm-cli create_job
  • 29. Data ingest options: direct ingest into Oracle BDCS from the customer data centre using Flume or SCP (SSH protocol); common ingests use Flume or ETL work; connectivity over VPN and FastConnect
  • 30. Apache Kafka: open-source stream processing; real-time streaming; high-throughput, low-latency platform. Use cases (from the diagram): streams processing (IoT, anomaly detection); data integration (data lakes, HDFS, object storage); log aggregation (click streams, server logs); messaging (traditional apps, microservices)
  • 31. Oracle Event Hub Cloud Service: fully managed streaming data platform; provides the world's most popular message broker (Kafka). Flexible: available as fully managed and dedicated deployment options. Elastic: scales horizontally and vertically. Access: REST API access; SSH access to the Kafka cluster
  • 32. Conclusion: start your big data journey now; build and populate a data lake; help the business solve problems by using data; register for the Oracle Cloud free trial: https://cloud.oracle.com/tryit
  • 33. Thank you for your time! Follow and subscribe. Blog: mandysandhu.com; Twitter: @mandysandhu14; LinkedIn: kaurmandeep88
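
Slide 6 describes MapReduce as a Map step that converts one set of data into another and a Reduce step that combines the Map output into a smaller set. A minimal single-process sketch of that idea in Python is shown here, using word counting as the example. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for illustration; this is not Hadoop's API, and a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: convert each input line into (word, 1) pairs."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into one smaller result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

The shuffle step is not named on the slide; it is included because the Reduce step needs all values for a key gathered together before it can combine them.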
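
Slide 25 says that odcp takes its input, splits it into chunks and transfers the chunks in parallel using Spark. The sketch below illustrates only the chunk-and-copy idea on local files, with a thread pool standing in for Spark. It is an assumption-laden illustration, not odcp's implementation: the names `split_into_chunks`, `copy_chunk` and `parallel_copy`, the default chunk size and the worker count are all hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(total_size, chunk_size):
    """Return (offset, length) pairs that cover total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]


def copy_chunk(src, dst, offset, length):
    """Copy one byte range from src to the same offset in dst."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))


def parallel_copy(src, dst, chunk_size=1024 * 1024, workers=4):
    """Copy src to dst chunk by chunk using a pool of worker threads."""
    size = os.path.getsize(src)
    # Pre-size the destination so every worker can write at its own offset.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(copy_chunk, src, dst, offset, length)
            for offset, length in split_into_chunks(size, chunk_size)
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Because each chunk is written at a fixed offset, the workers do not depend on each other and can finish in any order.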
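
Slide 26 says that odiff compares large data sets in blocks of between 5 MB and 2 GB. One plausible way to compare two files block by block is to hash each block and report the positions where the hashes differ, as sketched below for local files. This is an assumed approach for illustration and not a description of how odiff works internally; `block_digests`, `differing_blocks` and `DEFAULT_BLOCK_SIZE` are hypothetical names, and reading 5 MB as 5 * 1024 * 1024 bytes is also an assumption.

```python
import hashlib
from itertools import zip_longest

# Assumed default, taken from the 5 MB minimum block size on the slide.
DEFAULT_BLOCK_SIZE = 5 * 1024 * 1024


def block_digests(path, block_size):
    """Yield a SHA-256 digest for each block_size-byte block of the file."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield hashlib.sha256(block).hexdigest()


def differing_blocks(path_a, path_b, block_size=DEFAULT_BLOCK_SIZE):
    """Return the indexes of blocks whose digests differ between two files.

    A block that exists in only one file counts as a difference.
    """
    pairs = zip_longest(
        block_digests(path_a, block_size),
        block_digests(path_b, block_size),
    )
    return [index for index, (a, b) in enumerate(pairs) if a != b]
```

An empty result means the two files are identical at the chosen block size; a non-empty result narrows the search to specific blocks rather than the whole file.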