October 11, 2014 
Getting started with 
Hadoop on the Cloud 
Nicolas Morales – Solutions Engineer – nicolasm@us.ibm.com 
@NicolasJMorales 
© 2014 IBM Corporation
Welcome 
Goal: Get you started with Hadoop on the Cloud 
 Hadoop 
− What technical problem is it helping solve? → BIG DATA
− What is Hadoop? 
− BigInsights (IBM’s Hadoop distro) 
 Bluemix (IBM’s PaaS cloud solution) 
− What technical problem is it helping solve? 
− Analytics for Hadoop in the Cloud 
 Demo & Get hands-on
− Bluemix: bluemix.net 
− Hadoop Dev: ibm.biz/hadoopdev 
It starts with a line of code. 
What is Big Data? 
A way to describe data problems that are 
unsolvable using traditional tools 
More Analytics on More Data for More People 
What Data? 
Transactional & Application Data
Machine Data
Social Data
Enterprise Content
In 2005 there were 1.3 billion RFID tags in circulation around the world...
...by the end of 2011, this was about 30 billion and growing even faster.
An increasingly sensor-enabled and instrumented 
business environment generates HUGE volumes of 
data with MACHINE SPEED characteristics… 
1 BILLION lines of code 
EACH engine generating 10 TB every 30 minutes! 
Welcome to the Instrumented Interconnected World! 
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
4.6 billion camera phones worldwide
100s of millions of GPS-enabled devices sold annually
2+ billion people on the Web by end 2011
30 billion RFID tags today (1.3B in 2005)
76 million smart meters in 2009… 200M by 2014
6,000,000 users on Twitter pushing out 300,000 tweets per day
500,000,000 users on Twitter pushing out 400,000,000 tweets per day
83x more users
1333x more tweets per day
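The two multipliers on this slide can be checked with a few lines of arithmetic. This is only a sketch: the helper name `growth_factor` is made up for illustration, and the figures are the ones quoted on the slide.

```python
def growth_factor(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# Figures quoted on the slide.
users_growth = growth_factor(6_000_000, 500_000_000)
tweets_growth = growth_factor(300_000, 400_000_000)

print(f"Users grew about {users_growth:.0f}x")
print(f"Tweets per day grew about {tweets_growth:.0f}x")
```

Tweet volume grew far faster than the user base, which is the point of the slide: each user produces more data over time.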
We’ve Moved into a New Era of Computing 
Volume: 12+ terabytes of Tweets created daily.
Velocity: 5+ million trade events per second.
Variety: 100’s of different types of data.
Veracity: Only 1 in 3 decision makers trust their information.
Imagine the Possibilities of Harnessing Your Data Resources 
Big data challenges exist in every organization today 
Government cuts acoustic analysis from hours to 70 milliseconds
Retailer reduces time to run queries by 80% to optimize inventory
Utility avoids power failures by analyzing 10 PB of data in minutes
Stock Exchange cuts queries from 26 hours to 2 minutes on 2 PB
Hospital analyses streaming vitals to detect illness 24 hours earlier
Telco analyses streaming network data to reduce hardware costs by 90%
Every Industry can Leverage Big Data and Analytics 
Insurance
• 360° View of Domain or Subject
• Catastrophe Modeling
• Fraud & Abuse
• Producer Performance Analytics
• Analytics Sandbox
Banking
• Optimizing Offers and Cross-sell
• Customer Service and Call Center Efficiency
• Fraud Detection & Investigation
• Credit & Counterparty Risk
Telco
• Pro-active Call Center
• Network Analytics
• Location Based Services
Energy & Utilities
• Smart Meter Analytics
• Distribution Load Forecasting/Scheduling
• Condition Based Maintenance
• Create & Target Customer Offerings
Media & Entertainment
• Business process transformation
• Audience & Marketing Optimization
• Multi-Channel Enablement
• Digital commerce optimization
Retail
• Actionable Customer Insight
• Merchandise Optimization
• Dynamic Pricing
Travel & Transport
• Customer Analytics & Loyalty Marketing
• Predictive Maintenance Analytics
• Capacity & Pricing Optimization
Consumer Products
• Shelf Availability
• Promotional Spend Optimization
• Merchandising Compliance
• Promotion Exceptions & Alerts
Government
• Civilian Services
• Defense & Intelligence
• Tax & Treasury Services
Healthcare
• Measure & Act on Population Health Outcomes
• Engage Consumers in their Healthcare
Automotive
• Advanced Condition Monitoring
• Data Warehouse Optimization
• Actionable Customer Intelligence
Life Sciences
• Increase visibility into drug safety and effectiveness
Chemical & Petroleum
• Operational Surveillance, Analysis & Optimization
• Data Warehouse Consolidation, Integration & Augmentation
• Big Data Exploration for Interdisciplinary Collaboration
Aerospace & Defense
• Uniform Information Access Platform
• Data Warehouse Optimization
• Airliner Certification Platform
• Advanced Condition Monitoring (ACM)
Electronics
• Customer/Channel Analytics
• Advanced Condition Monitoring
Enabling everybody to leverage Big Data 
GPS 
External Data 
Business Users
...offer personalized price promotions to different customer segments in real-time
Business Development
...find and deliver new mechanisms to monetize network traffic and partner with upstream content providers
Administrators
...secure, manage, and optimize data access and analysis operations
Executive Leaders
...get real-time reports and analysis based on data inside as well as outside the enterprise (web, social media etc.)
Business Analysts
...analyze social media buzz for the new services/offerings to gauge initial success and any course correction needed
Developers
...develop new Apps and detailed algorithms in response to user and business requirements
Data Scientists
...analyze subscriber usage patterns in real-time and combine that with the profile for delivering promotional or retention offers
Leveraging Big Data Requires Multiple Platform Capabilities 
Understand and navigate federated big data sources: Federated Discovery and Navigation
Manage & store huge volume of any data: Hadoop File System, MapReduce
Structure and control data: Data Warehousing
Manage streaming data: Stream Computing
Analyze unstructured data: Text Analytics Engine
Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM
What is Hadoop? 
 Apache open source software framework for reliable, scalable, distributed computing on massive amounts of data
 Hides underlying system details and complexities from user
 Developed in Java
 Core sub projects:
− MapReduce
− Hadoop Distributed File System a.k.a. HDFS
 Supported by several Hadoop-related projects
− HBase
− Zookeeper
− Avro
− Flume
− etc.
 Meant for heterogeneous commodity hardware
Design Principles of Hadoop 
 New way of storing and processing the data: 
− Let system handle most of the issues automatically: 
• Failures 
• Scalability 
• Reduce communications 
• Distribute data and processing power to where the data is 
• Make parallelism part of operating system 
• Relatively inexpensive hardware 
 Bring processing to Data! 
 Hadoop = HDFS + MapReduce infrastructure + … 
 Optimized to handle 
− Massive amounts of data through parallelism 
− A variety of data (structured, unstructured, semi-structured) 
− Using inexpensive commodity hardware 
 Reliability provided through replication 
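The MapReduce model behind these principles can be sketched in plain Python. This is a single-process illustration of the map, shuffle, and reduce phases on a word count, not Hadoop's actual Java API. All function names here are hypothetical, and a real cluster would run the map and reduce steps in parallel on the nodes that hold the data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single total."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["Bring processing to data", "Big data needs big processing"]
    for word, total in sorted(word_count(sample).items()):
        print(word, total)
```

Because each map call only sees one line and each reduce call only sees one key, the framework is free to spread both phases across many machines, which is where the parallelism and fault tolerance come from.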
Map-Reduce → Hadoop → BigInsights
Hadoop Open Source Projects 
 Hadoop is supplemented by an ecosystem of open source projects 
What’s a Hadoop Distribution? 
 What’s a Linux Distribution? 
− Linux Kernel 
− Open Source Tools around Kernel 
− Installer 
− Administration UI 
 Open Source Distribution Formula 
− Kernel 
− Core Projects around Kernel 
− Value Add 
• Test Components 
• Installer 
• Administration UI 
• Apps 
IBM Enriches Hadoop 
 Scalable
− New nodes can be added on the fly
 Affordable
− Massively parallel computing on commodity servers
 Flexible
− Hadoop is schema-less, and can absorb any type of data
 Fault Tolerant
− Through MapReduce software framework
 Performance & reliability
− Adaptive MapReduce, Compression, Indexing, Flexible Scheduler, +++
 Enterprise Hardening of Hadoop
 Productivity Accelerators
− Web-based UIs and tools
− End-user visualization
− Analytic Accelerators
− +++
 Enterprise Integration
− To extend & enrich your information supply chain
IBM BigInsights – Open Source and IBM Value Adds 
ANSI SQL: Big SQL optimized SQL support
Search: BigIndex and Data Explorer
Predictive Modeling: Big R “scalable data mining” on R
Real-time Analytics: InfoSphere Streams
Application Tooling: Toolkits and accelerators
Data Exploration: BigSheets “schema-on-read” tooling
Text Analytics: Text processing with AQL
Data Governance and Security: Data Click, LDAP and Secured Cluster
Enterprise Performance: Adaptive MapReduce & Big SQL
Storage Integration: GPFS POSIX Distributed Filesystem
Open source components: Oozie, Jaql, ZooKeeper, Hive, HDFS, MapReduce, HBase, Flume, Pig, Lucene, HCatalog, Sqoop
100% based on Apache Open Source Hadoop Components
Manage your cluster from the integrated Web Console 
 Start or stop services 
 Monitor overall system health 
 Inspect status of specific services 
 Add / remove nodes 
 Manage your Apps and workflows from the console 
 Drill down into Map/Reduce, Tasks, Attempts 
 Access status, logs, counters of individual flows / jobs
Manage your HDFS Files 
 Navigate the distributed file system to see what’s stored 
 Create/remove/rename directories 
 Modify permissions 
 Upload / download files, remove/rename files, Edit files 
 Execute Hadoop file system shell commands 
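The file operations listed above map onto the standard `hadoop fs` shell. The sketch below only builds and prints the command lines so it runs without a cluster. The helper `hadoop_fs` and all paths (`/user/demo`, `local.txt`) are made up for illustration; the subcommands themselves (`-ls`, `-mkdir -p`, `-put`, `-chmod`, `-mv`, `-rm`) are part of the stock Hadoop file system shell.

```python
import shlex
from typing import List


def hadoop_fs(*args: str) -> List[str]:
    """Build the argument vector for one `hadoop fs` shell command."""
    return ["hadoop", "fs", *args]


# Hypothetical paths, chosen only to illustrate each operation.
commands = [
    hadoop_fs("-ls", "/user/demo"),                               # see what is stored
    hadoop_fs("-mkdir", "-p", "/user/demo/input"),                # create a directory
    hadoop_fs("-put", "local.txt", "/user/demo/input/"),          # upload a file
    hadoop_fs("-chmod", "640", "/user/demo/input/local.txt"),     # modify permissions
    hadoop_fs("-mv", "/user/demo/input/local.txt",
              "/user/demo/input/data.txt"),                       # rename a file
    hadoop_fs("-rm", "/user/demo/input/data.txt"),                # remove a file
]

for argv in commands:
    print(shlex.join(argv))
```

On a machine with a configured Hadoop client, each list could be passed to `subprocess.run(argv, check=True)` to execute it.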
Monitoring cluster, components and applications 
 Cluster: system load average, CPU/Disk/Memory/Network utilization, nodes live status
 HDFS: block and file info, NameNode JVM and GC info, throughput bytes written/read
 MapReduce: Jobs status, Mapper, Reducer, JobTracker
 HBase: region split info, # of queries/stored files/regions etc.
 Hive: metadata store (call frequency and duration)
 Oozie statistics
 Zookeeper: queries, latency, watcher count, followers etc.
 Flume: source and sink, # of retries and bytes written etc.
EXTENSIBLE! Build your own monitoring dashboards with the KPIs that interest you.
Text Analytics: Getting measurable insights 
 Most of the world’s data is in unstructured or semi-structured text. 
 Social media is full of discussions about products and services
 Company Internal Information is locked in blobs, description fields, and 
sometimes even discarded 
 How do you get a metrics-based understanding of facts from unstructured text?
Healthcare Analytics: E-Medical records, hospital reports
Public Sector: Case files, police records, emergency calls…
Automotive Quality Insight: Tech notes, call logs, online media
Insurance Fraud: Insurance claims
Social Media for Marketing: Twitter, Facebook, blogs, forums
Big R 
“End-to-end integration of R into IBM BigInsights”
1. Explore, visualize, transform, and model big data using familiar R syntax and paradigm
2. Scale out R
• Partitioning of large data (“divide”)
• Parallel cluster execution of pushed down R code (“conquer”)
• All of this from within the R environment (Jaql, Map/Reduce are hidden from you)
• Almost any R package can run in this environment
Diagram: R clients with R packages either (1) pull data (summaries) from the data sources to the R client, or (2) push R functions right on the data, where embedded R execution runs them with R packages.
BigSheets - Spreadsheet-style Analytic Tool 
How it works 
 Model “big data” collected from various sources as collections
 Filter and enrich content with built-in functions
 Combine data in different collections
 Visualize results through spreadsheets, charts
 Export data into common formats (if desired)
No programming knowledge needed!
Overview of Application Development Lifecycle 
How it works
 Sample your Data
 Develop your application using BigInsights tools
 Test your application
 Package and publish your application
 Deploy your application on the cluster
Editors for: Java, Java MapReduce, Hive, Jaql, Pig, Big SQL, BigSheets Reader, BigSheets Macro, AQL module, Jaql Module, etc.
Task wizards for ease of use when developing applications
Package and publish your application using the BigInsights Eclipse Task Launcher
Running Applications in Big Data 
How it works
Built-in apps make it easy to run Big Data applications & tasks:
 Import and Export Data from a Database or files
 Import and Export Web and Social Data
 Perform Text Analytics on specified content
 Query HBase Content
 Query content stored in BigInsights using Big SQL.
 Execute Pig or JAQL applications.
EXTENSIBLE! Build your own applications and make them easy to execute from an appealing Application launcher
Big SQL 
 IBM’s SQL engine for Hadoop
 Comprehensive, standard SQL
– SELECT: joins, unions, aggregates, subqueries . . .
– GRANT/REVOKE, INSERT … INTO
– PL/SQL
– Stored procs, user-defined functions
– IBM data server JDBC and ODBC drivers
 Optimization and performance
– Java MapReduce layer replaced with high performance IBM MPP engine (C++)
– Continuous running daemons (no start up latency)
– Message passing allows data to flow between nodes without persisting intermediate results
– In-memory operations with ability to spill to disk (useful for aggregations, sorts that exceed available RAM)
– Cost-based query optimization with 140+ rewrite rules
 Various storage formats supported
– Data persisted in DFS, Hive
– No IBM proprietary format required
 Integration with RDBMSs via LOAD, query federation
Diagram: SQL-based application → IBM data server client → Big SQL Engine (SQL MPP run-time) in BigInsights → data sources (CSV, Seq, Parquet, RC, ORC, Avro, JSON, Custom)
Big Data Accelerators Make it Easier than Ever to Build Big Data Applications
Telecommunications Event Data
CDR streaming analytics, Deep Customer Event Analytics
Ships with InfoSphere Streams
Social Data Analytics
Sentiment Analytics, Intent to purchase
Ships with InfoSphere BigInsights & Streams
Machine Data Analytics
Operational data including logs for operations efficiency
Ships with InfoSphere BigInsights
Social Data Analytics 
Using social media as a rich source of information 
Maybe our politicians should take 
a playbook out of the rivalry 
between duke/unc and take it 
to the courts 
http://ity.com/wfUsir 
Behavior 
I'm at Mickey's Irish Pub Downtown 
(206 3rd St, Court Ave, Raleigh) w/ 
2 others http://4sq.com/gbsaYR 
@silliesylvia good!!! U 
shouldnt! Think about the 
important stuff, like ur 43rd 
birthday ;) 
btw happy birthday Sylvia ;) 
Location 
Interest 
@silliesylvia I <3 your leather
leggings!! Its so katniss!!
Interest 
@bamagirl can’t wait to 
watch sherlock with you! 
Oh, robert downey jr, I still 
love you but bbc is so 
amazing 
Intent to consume 
Age 
360 degree profile 
Personal Attributes 
• Sylvia Campbell, Female, In a 
Relationship 
• 32 years old, birthday on 7/17 
• Lives near Raleigh, NC 
• College graduate; Income of 80-120k 
Buzz/Sentiment 
• Retweets BF’s comments 
• Interest in BBC shows: Downton Abbey, 
Sherlock, Fringe, (PP?) 
• Sherlock Holmes, Robert Downey, Jr. 
• Hunger Games, Katniss/J. Lawrence 
Interests/Behavior 
• Watch movies, tv shows 
• Romance plots, “hero types”, strong 
women 
• Uses iPad 3, Redbox, Hulu 
• Shopping , interest in sales/deals 
• Duke/ UNC basketball 
Consumption 
dear redbox please have 
kings speech for my new tv 
colin firth movie marathon 
Intent to consume 
@silliesylvia $10 dollars says
matthew & mary get married
next season :)
#downtownabbey
OMG OMG. just 
dropped my new ipad3 
crappola!!! 
Consumption 
Prediction 
Machine Data Analysis is a Business Imperative 
 Cost of system down-time 
− 49 percent of Fortune 500 companies experience more than 80 hours of system down time 
annually1 
• Cost of down-time varies from $90,000/hour in the media sector to $6.48 million / hour for large 
online brokerages 
• 80 hours * $6.48M = approx $500M per year 
− System downtime costs North American businesses $26.5 billion a year in lost revenue2 
 When systems go down 
− Sales and other processes stop 
− Work in progress may be destroyed 
− Failure to meet SLA’s and contractual obligations can result in damages, fees, adverse publicity 
and damage to reputation 
− Customers are lost to competitors, some permanently 
− Productivity suffers and remediation costs additional $$$’s 
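The annual downtime estimate on this slide is simple multiplication, sketched below with the hourly costs and the 80-hour figure quoted above. The function name `annual_downtime_cost` is made up for illustration.

```python
def annual_downtime_cost(hours_down: float, cost_per_hour: float) -> float:
    """Estimate yearly loss as hours of downtime times the hourly cost."""
    return hours_down * cost_per_hour


# Hourly costs quoted on the slide, in US dollars.
hourly_cost = {
    "media sector": 90_000,
    "large online brokerage": 6_480_000,
}

for sector, rate in hourly_cost.items():
    total = annual_downtime_cost(80, rate)
    print(f"{sector}: ${total:,.0f} per year")
```

For the brokerage case this gives $518.4 million, which the slide rounds to approximately $500M per year.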
Evolution of Cloud Technologies 
Virtualization: “I want to get more out of my existing hardware”
Cloud Enabled: “I want to move my existing middleware workloads to the cloud”
Cloud Native: “I want to rapidly build new, born on the cloud, engaging applications in a continuous delivery model”
Business Services (SaaS): “I want to use an app without having to own it”
Dynamic Hybrid: “I want to strategically use public and private cloud together”
PaaS sits at the center of the cloud delivery model 
Infrastructure as a Service (IT Admin)
Platform as a Service (Developer)
Software as a Service (Business Person)
Stack layers shown for each model: Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking
In each model the client manages the upper layers and the vendor manages the rest in the cloud; the vendor-managed share grows from IaaS to PaaS to SaaS.
Toward IaaS: Customization; higher costs; slower time to value
Toward SaaS: Standardization; lower costs; faster time to value
Developers, Developers, Developers!
• Move quickly, see results fast.
• Learn by tinkering and playing.
• Need to learn new skills through playing and experimenting safely.
• Need freedom to experiment without worrying about pricing right away.
What is Bluemix?
Bluemix is an open-standard, cloud-based platform for building, managing, and running applications of all types (web, mobile, big data, new smart devices, and so on).
Go Live in Seconds
The developer can choose any language runtime or bring their own. Zero to production in one command.
DevOps
Development, monitoring, deployment, and logging tools allow the developer to run the entire application.
APIs and Services
A catalog of IBM, third party, and open source API services allows the developer to stitch an application together in minutes.
On-Prem Integration
Build hybrid environments. Connect to on-premises assets plus other public and private clouds.
Flexible Pricing
Sign up in minutes. Pay as you go and subscription models offer choice and flexibility.
Layered Security
IBM secures the platform and infrastructure and provides you with the tools to secure your apps.
Create apps quickly with prebuilt services 
Choice
• Runtimes, services, and tooling up to you
Industry Leading IBM Capabilities
• Services leveraging the depth of IBM software
• Full range of capabilities
Completeness
• Open source platform and services
• Third party to enable key use cases
Service categories: Watson Services, Security Services, Web and application services, Cloud Integration Services, Mobile Services, Database services, Big Data services, Internet of Things Services, DevOps Services
A full range of capabilities to suit any great idea.
Embracing Cloud Foundry as an Open Source PaaS 
Continuing our history of embracing and extending Open Source 
Cloud Foundry is more than code 
Meets Developer’s 
Needs 
Focus on app 
development, not 
provisioning VMs, 
databases, messaging 
servers, etc. 
Agile development 
model 
Deploy and scale in 
seconds 
Open Cloud Platform 
There is an increasing 
appetite for cloud-based 
mobile, social and analytics 
applications 
from line-of-business 
executives - drives the need 
for a more open cloud 
development platform 
Compelling Community 
Cloud Foundry has a 
compelling community and 
emerging ecosystem as well 
as a mature set of 
capabilities and robustness 
IBM extends CF by adding developer tools, runtimes, & services 
Capabilities include Java, mobile backend 
development, application monitoring, as 
well as capabilities from ecosystem 
partners and open source — all through 
an as-a-service model in the cloud. 
An Entire Continuum Working Together 
Diagram: a continuum from Infrastructure Services, through Defined Pattern Services (virtual appliances, each bundling an application server or HTTP server, an operating system, and metadata), to Composable Services, connecting Systems of Record, Business Services, and Analytics.

Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data Platform
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
 
Get Started Quickly with IBM's Hadoop as a Service
Get Started Quickly with IBM's Hadoop as a ServiceGet Started Quickly with IBM's Hadoop as a Service
Get Started Quickly with IBM's Hadoop as a Service
 
Big data Introduction by Mohan
Big data Introduction by MohanBig data Introduction by Mohan
Big data Introduction by Mohan
 
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
 
Key Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareKey Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShare
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forward
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
IBM Smarter Analytics
IBM Smarter AnalyticsIBM Smarter Analytics
IBM Smarter Analytics
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics
 
Digital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility companyDigital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility company
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusers
 
Making Hadoop Ready for the Enterprise
Making Hadoop Ready for the Enterprise Making Hadoop Ready for the Enterprise
Making Hadoop Ready for the Enterprise
 
SendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingSendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data Warehousing
 

More from Nicolas Morales

Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?
Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?
Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?
Nicolas Morales
 
InfoSphere BigInsights for Hadoop @ IBM Insight 2014
InfoSphere BigInsights for Hadoop @ IBM Insight 2014InfoSphere BigInsights for Hadoop @ IBM Insight 2014
InfoSphere BigInsights for Hadoop @ IBM Insight 2014
Nicolas Morales
 
IBM Big SQL @ Insight 2014
IBM Big SQL @ Insight 2014IBM Big SQL @ Insight 2014
IBM Big SQL @ Insight 2014
Nicolas Morales
 
60 minutes in the cloud: Predictive analytics made easy
60 minutes in the cloud: Predictive analytics made easy60 minutes in the cloud: Predictive analytics made easy
60 minutes in the cloud: Predictive analytics made easy
Nicolas Morales
 
Challenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop EngineChallenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop Engine
Nicolas Morales
 
SQL-on-Hadoop without compromise: Big SQL 3.0
SQL-on-Hadoop without compromise: Big SQL 3.0SQL-on-Hadoop without compromise: Big SQL 3.0
SQL-on-Hadoop without compromise: Big SQL 3.0
Nicolas Morales
 
Text Analytics
Text Analytics Text Analytics
Text Analytics
Nicolas Morales
 
Social Data Analytics using IBM Big Data Technologies
Social Data Analytics using IBM Big Data TechnologiesSocial Data Analytics using IBM Big Data Technologies
Social Data Analytics using IBM Big Data Technologies
Nicolas Morales
 
Security and Audit for Big Data
Security and Audit for Big DataSecurity and Audit for Big Data
Security and Audit for Big Data
Nicolas Morales
 
Machine Data Analytics
Machine Data AnalyticsMachine Data Analytics
Machine Data Analytics
Nicolas Morales
 

More from Nicolas Morales (10)

Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?
Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?
Benchmarking SQL-on-Hadoop Systems: TPC or not TPC?
 
InfoSphere BigInsights for Hadoop @ IBM Insight 2014
InfoSphere BigInsights for Hadoop @ IBM Insight 2014InfoSphere BigInsights for Hadoop @ IBM Insight 2014
InfoSphere BigInsights for Hadoop @ IBM Insight 2014
 
IBM Big SQL @ Insight 2014
IBM Big SQL @ Insight 2014IBM Big SQL @ Insight 2014
IBM Big SQL @ Insight 2014
 
60 minutes in the cloud: Predictive analytics made easy
60 minutes in the cloud: Predictive analytics made easy60 minutes in the cloud: Predictive analytics made easy
60 minutes in the cloud: Predictive analytics made easy
 
Challenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop EngineChallenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop Engine
 
SQL-on-Hadoop without compromise: Big SQL 3.0
SQL-on-Hadoop without compromise: Big SQL 3.0SQL-on-Hadoop without compromise: Big SQL 3.0
SQL-on-Hadoop without compromise: Big SQL 3.0
 
Text Analytics
Text Analytics Text Analytics
Text Analytics
 
Social Data Analytics using IBM Big Data Technologies
Social Data Analytics using IBM Big Data TechnologiesSocial Data Analytics using IBM Big Data Technologies
Social Data Analytics using IBM Big Data Technologies
 
Security and Audit for Big Data
Security and Audit for Big DataSecurity and Audit for Big Data
Security and Audit for Big Data
 
Machine Data Analytics
Machine Data AnalyticsMachine Data Analytics
Machine Data Analytics
 

Recently uploaded

Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
IES VE
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
Srikant77
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 

Recently uploaded (20)

Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 

Getting started with Hadoop on the Cloud with Bluemix

  • 1. October 11, 2014. Getting started with Hadoop on the Cloud. Nicolas Morales – Solutions Engineer – nicolasm@us.ibm.com, @NicolasJMorales. © 2014 IBM Corporation
  • 2. Welcome. Goal: Get you started with Hadoop on the Cloud. Hadoop: What technical problem is it helping solve? (BIG DATA); What is Hadoop?; BigInsights (IBM’s Hadoop distro). Bluemix (IBM’s PaaS cloud solution): What technical problem is it helping solve?; Analytics for Hadoop in the Cloud. Demo and hands-on: Bluemix: bluemix.net; Hadoop Dev: ibm.biz/hadoopdev
  • 3. It starts with a line of code.
  • 7. What is Big Data? A way to describe data problems that are unsolvable using traditional tools. More Analytics on More Data for More People.
  • 8. What Data? Transactional and Application Data; Machine Data; Social Data; Enterprise Content. More Analytics on More Data for More People.
  • 10. In 2005 there were 1.3 billion RFID tags in circulation around the world; by the end of 2011, this was about 30 billion and growing even faster.
  • 11. An increasingly sensor-enabled and instrumented business environment generates HUGE volumes of data with MACHINE SPEED characteristics… 1 BILLION lines of code. EACH engine generating 10 TB every 30 minutes!
  • 12. Welcome to the Instrumented Interconnected World! 12+ TBs of tweet data every day; 25+ TBs of log data every day; ? TBs of data every day; 4.6 billion camera phones worldwide; 100s of millions of GPS-enabled devices sold annually; 2+ billion people on the Web by end of 2011; 30 billion RFID tags today (1.3B in 2005); 76 million smart meters in 2009, 200M by 2014.
  • 13. 6,000,000 users on Twitter pushing out 300,000 tweets per day; 500,000,000 users on Twitter pushing out 400,000,000 tweets per day. Growth: 83x in users, 1333x in tweets per day.
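The two multipliers on this slide can be checked with one division each. A minimal sketch, assuming 83x refers to the growth in users and 1333x to the growth in tweets per day; the variable names are illustrative and not from the slides:

```python
# Figures as stated on the slide.
users_before, users_after = 6_000_000, 500_000_000
tweets_before, tweets_after = 300_000, 400_000_000

# Integer division drops the fractional part of each growth factor.
user_growth = users_after // users_before
tweet_growth = tweets_after // tweets_before

print(f"users: {user_growth}x, tweets per day: {tweet_growth}x")
```

Under that reading, tweet volume grew roughly sixteen times faster than the user base.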
  • 14. We’ve Moved into a New Era of Computing. Volume: 12+ terabytes of Tweets created daily. Velocity: 5+ million trade events per second. Variety: 100’s of different types of data. Veracity: Only 1 in 3 decision makers trust their information.
  • 15. Imagine the Possibilities of Harnessing Your Data Resources. Big data challenges exist in every organization today. Government cuts acoustic analysis from hours to 70 milliseconds. Retailer reduces time to run queries by 80% to optimize inventory. Utility avoids power failures by analyzing 10 PB of data in minutes. Stock Exchange cuts queries from 26 hours to 2 minutes on 2 PB. Hospital analyses streaming vitals to detect illness 24 hours earlier. Telco analyses streaming network data to reduce hardware costs by 90%.
  • 16. Every Industry can Leverage Big Data and Analytics. Insurance: 360 View of Domain or Subject; Catastrophe Modeling; Fraud and Abuse; Producer Performance Analytics; Analytics Sandbox. Banking: Optimizing Offers and Cross-sell; Customer Service and Call Center Efficiency; Fraud Detection and Investigation; Credit and Counterparty Risk. Telco: Pro-active Call Center; Network Analytics; Location Based Services. Energy and Utilities: Smart Meter Analytics; Distribution Load Forecasting/Scheduling; Condition Based Maintenance; Create and Target Customer Offerings. Media and Entertainment: Business process transformation; Audience and Marketing Optimization; Multi-Channel Enablement; Digital commerce optimization. Retail: Actionable Customer Insight; Merchandise Optimization; Dynamic Pricing. Travel and Transport: Customer Analytics and Loyalty Marketing; Predictive Maintenance Analytics; Capacity and Pricing Optimization. Consumer Products: Shelf Availability; Promotional Spend Optimization; Merchandising Compliance; Promotion Exceptions and Alerts. Government: Civilian Services; Defense and Intelligence; Tax and Treasury Services. Healthcare: Measure and Act on Population Health Outcomes; Engage Consumers in their Healthcare. Automotive: Advanced Condition Monitoring; Data Warehouse Optimization; Actionable Customer Intelligence. Life Sciences: Increase visibility into drug safety and effectiveness. Chemical and Petroleum: Operational Surveillance, Analysis and Optimization; Data Warehouse Consolidation, Integration and Augmentation; Big Data Exploration for Interdisciplinary Collaboration. Aerospace and Defense: Uniform Information Access Platform; Data Warehouse Optimization; Airliner Certification Platform; Advanced Condition Monitoring (ACM). Electronics: Customer/Channel Analytics; Advanced Condition Monitoring.
  • 17. Enabling everybody to leverage Big Data. Business Users: offer personalized price promotions to different customer segments in real time. Business Development: find and deliver new mechanisms to monetize network traffic and partner with upstream content providers. Administrators: secure, manage, and optimize data access and analysis operations. Executive Leaders: get real-time reports and analysis based on data inside as well as outside the enterprise (web, social media, etc.). Business Analysts: analyze social media buzz for the new services/offerings to gauge initial success and any course correction needed. Developers: develop new Apps and detailed algorithms in response to user and business requirements. Data Scientists: analyze subscriber usage patterns in real time and combine that with the profile for delivering promotional or retention offers.
  • 18. Leveraging Big Data Requires Multiple Platform Capabilities. Understand and navigate federated big data sources: Federated Discovery and Navigation. Manage and store huge volume of any data: Hadoop File System, MapReduce. Structure and control data: Data Warehousing. Manage streaming data: Stream Computing. Analyze unstructured data: Text Analytics Engine. Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM.
  • 19. What is Hadoop? Apache open source software framework for reliable, scalable, distributed computing on massive amounts of data. Hides underlying system details and complexities from the user. Developed in Java. Core sub-projects: MapReduce; Hadoop Distributed File System, a.k.a. HDFS. Supported by several Hadoop-related projects: HBase, Zookeeper, Avro, Flume, etc. Meant for heterogeneous commodity hardware.
  • 20. Design Principles of Hadoop. New way of storing and processing the data. Let the system handle most of the issues automatically: Failures; Scalability; Reduce communications; Distribute data and processing power to where the data is; Make parallelism part of operating system; Relatively inexpensive hardware. Bring processing to Data! Hadoop = HDFS + MapReduce infrastructure + … Optimized to handle: Massive amounts of data through parallelism; A variety of data (structured, unstructured, semi-structured); Using inexpensive commodity hardware. Reliability provided through replication.
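The MapReduce model named on these slides can be illustrated with a word count. This is a minimal single-process sketch in plain Python, not the Hadoop API: the function names `map_phase`, `shuffle`, and `reduce_phase` are illustrative assumptions, and a real cluster would run the map and reduce steps in parallel on the nodes that hold the data.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # Each line could be mapped independently, which is what makes the
    # map step easy to parallelize across many machines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data on hadoop", "hadoop on the cloud"])
print(counts)
```

Because every map call sees only its own line and every reduce call sees only one key, neither step needs shared state, which is the property the framework relies on to scale out.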
  • 21. Map-Reduce, Hadoop, BigInsights
  • 22. Hadoop Open Source Projects. Hadoop is supplemented by an ecosystem of open source projects.
  • 23. What’s a Hadoop Distribution? What’s a Linux Distribution? Linux Kernel; Open Source Tools around Kernel; Installer; Administration UI. Open Source Distribution Formula: Kernel; Core Projects around Kernel; Value Add (Test Components, Installer, Administration UI, Apps).
  • 24. IBM Enriches Hadoop. Scalable: New nodes can be added on the fly. Affordable: Massively parallel computing on commodity servers. Flexible: Hadoop is schema-less, and can absorb any type of data. Fault Tolerant: Through MapReduce software framework. Enterprise Hardening of Hadoop. Performance and reliability: Adaptive MapReduce, Compression, Indexing, Flexible Scheduler, +++. Productivity Accelerators: Web-based UIs and tools; End-user visualization; Analytic Accelerators; +++. Enterprise Integration: To extend and enrich your information supply chain.
  • 25. IBM BigInsights – Open Source and IBM Value Adds. ANSI SQL: BigSQL, optimized SQL support. Search: BigIndex and Data Explorer. Predictive Modeling: BigR, scalable data mining on R. Real-time Analytics: InfoSphere Streams. Application Tooling: Toolkits and accelerators. Data Exploration: BigSheets “schema-on-read” tooling. Text Analytics: Text processing with AQL. Data Governance and Security: Data Click, LDAP and Secured Cluster. Enterprise Performance: Adaptive MapReduce, Big SQL. Storage Integration: GPFS POSIX Distributed Filesystem. Components: Oozie, Jaql, ZooKeeper, Hive, HDFS, MapReduce, HBase, Flume, Pig, Lucene, HCatalog, Sqoop. 100% based on Apache Open Source Hadoop Components.
  • 26. Manage your cluster from the integrated Web Console. Start or stop services. Monitor overall system health. Inspect status of specific services. Add / remove nodes. Manage your Apps and workflows from the console. Drill down into Map/Reduce, Tasks, Attempts. Access status, logs, counters of individual flows / jobs.
  • 27. Manage your HDFS Files. Navigate the distributed file system to see what’s stored. Create/remove/rename directories. Modify permissions. Upload / download files, remove/rename files, edit files. Execute Hadoop file system shell commands.
  • 28. Monitoring cluster, components and applications. Cluster: system load average, CPU/Disk/Memory/Network utilization, nodes live status. HDFS: block and file info, NameNode JVM and GC info, throughput bytes written/read. MapReduce: Jobs status, Mapper, Reducer, JobTracker. HBase: region split info, # of queries/stored files/regions, etc. Hive: metadata store (call frequency and duration). Oozie: statistics. Zookeeper: queries, latency, watcher count, followers, etc. Flume: source and sink, # of retries and bytes written, etc. EXTENSIBLE!! Build your own Monitoring Dashboards, with the key KPIs that are of interest to you!
  • 29. Text Analytics: Getting measurable insights. Most of the world’s data is in unstructured or semi-structured text. Social media is full of discussions about products and services. Company internal information is locked in blobs, description fields, and sometimes even discarded. How do you get a metrics-based understanding of facts from unstructured text?
  • 31. Healthcare Analytics: E-Medical records, hospital reports. Public Sector: Case files, police records, emergency calls… Automotive Quality Insight: Tech notes, call logs, online media. Insurance Fraud: Insurance claims. Social Media for Marketing: Twitter, Facebook, blogs, forums.
  • 35. Big R: “End-to-end integration of R into IBM BigInsights”. 1. Explore, visualize, transform, and model big data using familiar R syntax and paradigm. 2. Scale out R: Partitioning of large data (“divide”); Parallel cluster execution of pushed down R code (“conquer”); All of this from within the R environment (Jaql, Map/Reduce are hidden from you); Almost any R package can run in this environment. Pull data (summaries) to the R client, or push R functions right on the data. Diagram labels: R Clients, Data Sources, R Packages, Embedded R Execution.
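The “divide” and “conquer” steps on the Big R slide can be sketched in a few lines. This is an illustrative Python example rather than Big R or R code: `partition`, `partial_summary`, and `distributed_mean` are hypothetical names, and a local thread pool stands in for cluster nodes. It assumes non-empty input.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, n_parts):
    """Divide: split the data into at most n_parts roughly equal chunks."""
    size = -(-len(data) // n_parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_summary(chunk):
    """Runs against one partition and returns a small (sum, count) summary."""
    return sum(chunk), len(chunk)

def distributed_mean(data, n_parts=4):
    chunks = partition(data, n_parts)
    # Conquer: push the function to every partition and run them in parallel.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        summaries = list(pool.map(partial_summary, chunks))
    # Only the small summaries travel back to the client to be combined.
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count

mean_value = distributed_mean(list(range(1, 101)))
print(mean_value)
```

The design point is that the client never holds the full data set: it receives one (sum, count) pair per partition, which mirrors the “pull data (summaries) to R client” path on the slide.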
  • 36. BigSheets - Spreadsheet-style Analytic Tool. How it works: Model “big data” collected from various sources as collections. Filter and enrich content with built-in functions. Combine data in different collections. Visualize results through spreadsheets, charts. Export data into common formats (if desired). No programming knowledge needed!
  • 37. Overview of Application Development Lifecycle. How it works: Sample your data; Develop your application using BigInsights tools; Test your application; Package and publish your application; Deploy your application on the cluster. Editors for: Java, Java MapReduce, Hive, Jaql, Pig, Big SQL, BigSheets Reader, BigSheets Macro, AQL module, Jaql Module, etc. Task Wizards for ease of use when developing applications. Package and publish your application using the BigInsights Eclipse Task Launcher.
  • 38. Running Applications in Big Data. How it works: Built-in Apps make it easy to run Big Data application tasks: Import and Export Data from a Database or files; Import and Export Web and Social Data; Perform Text Analytics on specified content; Query HBase Content; Query content stored in BigInsights using Big SQL; Execute Pig or Jaql applications. EXTENSIBLE!! Build your own applications and make them easy to execute from an appealing Application launcher.
  • 39. Big SQL: IBM’s SQL engine for Hadoop. Comprehensive, standard SQL: • SELECT: joins, unions, aggregates, subqueries . . . • GRANT/REVOKE, INSERT … INTO • PL/SQL • Stored procs, user-defined functions • IBM data server JDBC and ODBC drivers. Optimization and performance: • Java MapReduce layer replaced with high-performance IBM MPP engine (C++) • Continuously running daemons (no start-up latency) • Message passing allows data to flow between nodes without persisting intermediate results • In-memory operations with ability to spill to disk (useful for aggregations, sorts that exceed available RAM) • Cost-based query optimization with 140+ rewrite rules. Various storage formats supported: • Data persisted in DFS, Hive • No IBM proprietary format required. Integration with RDBMSs via LOAD, query federation. Diagram: SQL-based application, IBM data server client, Big SQL engine (SQL MPP run-time) inside BigInsights, data sources (CSV, Seq, Parquet, RC, ORC, Avro, JSON, Custom).
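The "spill to disk" bullet describes a general technique that a small example can make concrete. The sketch below is a generic external merge sort in Python, not Big SQL's actual implementation (the engine is C++ and its internals are not shown in the slides); the function names and the `max_in_memory` parameter are hypothetical.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def _spill(sorted_chunk: List[int]) -> str:
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w") as handle:
        handle.writelines(f"{value}\n" for value in sorted_chunk)
    return path


def _read_run(path: str) -> Iterator[int]:
    """Stream a spilled run back from disk, one value at a time."""
    with open(path) as handle:
        for line in handle:
            yield int(line)


def external_sort(values: Iterable[int], max_in_memory: int) -> List[int]:
    """Sort values, spilling a sorted run to disk whenever the buffer fills up."""
    run_paths: List[str] = []
    buffer: List[int] = []
    for value in values:
        buffer.append(value)
        if len(buffer) >= max_in_memory:
            run_paths.append(_spill(sorted(buffer)))
            buffer = []
    try:
        runs = [_read_run(path) for path in run_paths]
        runs.append(iter(sorted(buffer)))  # the last partial run stays in memory
        # Merge lazily; the result is collected into a list only for the demo.
        return list(heapq.merge(*runs))
    finally:
        for path in run_paths:
            os.remove(path)


result = external_sort([5, 3, 9, 1, 4, 8, 2], max_in_memory=3)
```

Only the sort buffer is bounded here; a real engine would also stream the merged output instead of materializing it, and would apply the same idea to hash aggregations.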
  • 40. Big Data Accelerators Make it Easier than Ever to Build Big Data Applications. Telecommunications Event Data: CDR streaming analytics, deep customer event analytics; ships with InfoSphere Streams. Social Data Analytics: sentiment analytics, intent to purchase; ships with InfoSphere BigInsights and Streams. Machine Data Analytics: operational data including logs for operations efficiency; ships with InfoSphere BigInsights.
  • 41. Social Data Analytics: using social media as a rich source of information. Sample posts: • “Maybe our politicians should take a playbook out of the rivalry between duke/unc and take it to the courts http://ity.com/wfUsir” • “I'm at Mickey's Irish Pub Downtown (206 3rd St, Court Ave, Raleigh) w/ 2 others http://4sq.com/gbsaYR” • “@silliesylvia good!!! U shouldnt! Think about the important stuff, like ur 43rd birthday ;) btw happy birthday Sylvia ;)” • “@silliesylvia I <3 your leather leggings!! Its so katniss!!” • “@bamagirl can’t wait to watch sherlock with you! Oh, robert downey jr, I still love you but bbc is so amazing” • “dear redbox please have kings speech for my new tv colin firth movie marathon” • “@silliesylvia $10 dollars says matthew mary get married next season :) #downtownabbey” • “OMG OMG. just dropped my new ipad3 crappola!!!” Signals extracted from the posts: Behavior, Location, Age, Interest, Intent to consume, Consumption, Prediction. 360 degree profile: Personal Attributes: • Sylvia Campbell, Female, In a Relationship • 32 years old, birthday on 7/17 • Lives near Raleigh, NC • College graduate; Income of 80-120k. Buzz/Sentiment: • Retweets BF’s comments • Interest in BBC shows: Downton Abbey, Sherlock, Fringe, (PP?) • Sherlock Holmes, Robert Downey, Jr. • Hunger Games, Katniss/J. Lawrence. Interests/Behavior: • Watch movies, TV shows • Romance plots, “hero types”, strong women • Uses iPad 3, Redbox, Hulu • Shopping, interest in sales/deals • Duke/UNC basketball.
  • 42. Machine Data Analysis is a Business Imperative. Cost of system downtime: − 49 percent of Fortune 500 companies experience more than 80 hours of system downtime annually [1] • Cost of downtime varies from $90,000/hour in the media sector to $6.48 million/hour for large online brokerages • 80 hours × $6.48M ≈ $518M (approx. $500M) per year − System downtime costs North American businesses $26.5 billion a year in lost revenue [2]. When systems go down: − Sales and other processes stop − Work in progress may be destroyed − Failure to meet SLAs and contractual obligations can result in damages, fees, adverse publicity and damage to reputation − Customers are lost to competitors, some permanently − Productivity suffers and remediation adds further cost.
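The downtime arithmetic on this slide is easy to reproduce. The sketch below uses only the figures quoted on the slide (80 hours per year, $90,000/hour and $6.48 million/hour); the variable and function names are illustrative.

```python
# Figures quoted on the slide; names are illustrative.
HOURS_DOWN_PER_YEAR = 80
COST_PER_HOUR_USD = {
    "media": 90_000,
    "large online brokerage": 6_480_000,
}


def annual_downtime_cost(hours: int, cost_per_hour: int) -> int:
    """Annual cost of downtime: hours lost times the hourly cost."""
    return hours * cost_per_hour


annual_costs = {
    sector: annual_downtime_cost(HOURS_DOWN_PER_YEAR, rate)
    for sector, rate in COST_PER_HOUR_USD.items()
}

for sector, cost in annual_costs.items():
    print(f"{sector}: ${cost:,} per year")
```

At the brokerage rate this gives $518.4 million per year, which the slide rounds to roughly $500M; at the media rate the same 80 hours cost $7.2 million.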
  • 43.
  • 44. Evolution of Cloud Technologies. Virtualization: “I want to get more out of my existing hardware”. Dynamic Hybrid: “I want to strategically use public and private cloud together”. Cloud Enabled: “I want to move my existing middleware workloads to the cloud”. Cloud Native: “I want to rapidly build new, born on the cloud, engaging applications in a continuous delivery model”. Business Services (SaaS): “I want to use an app without having to own it”.
  • 45. PaaS sits at the center of the cloud delivery model. Each model covers the same stack: Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking. Infrastructure as a Service (IT admin): client manages Applications, Data, Runtime, Middleware, O/S; vendor manages Virtualization, Servers, Storage, Networking in the cloud. Platform as a Service (developer): client manages Applications, Data; vendor manages Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking in the cloud. Software as a Service (business person): vendor manages the entire stack in the cloud. Moving toward IaaS: customization; higher costs; slower time to value. Moving toward SaaS: standardization; lower costs; faster time to value.
  • 46. Developers, Developers, Developers! • Move quickly, see results fast. • Learn by tinkering and playing. • Need to learn new skills through playing and experimenting safely. • Need freedom to experiment without worrying about pricing right away.
  • 47. What is Bluemix? Bluemix is an open-standard, cloud-based platform for building, managing, and running applications of all types (web, mobile, big data, new smart devices, and so on). Go Live in Seconds: the developer can choose any language runtime or bring their own. Zero to production in one command. DevOps: development, monitoring, deployment, and logging tools allow the developer to run the entire application. APIs and Services: a catalog of IBM, third-party, and open source API services allows the developer to stitch an application together in minutes. On-Prem Integration: build hybrid environments. Connect to on-premise assets plus other public and private clouds. Flexible Pricing: sign up in minutes. Pay-as-you-go and subscription models offer choice and flexibility. Layered Security: IBM secures the platform and infrastructure and provides you with the tools to secure your apps.
  • 48. Create apps quickly with prebuilt services. A full range of capabilities to suit any great idea. Choice: • Runtimes, services, and tooling up to you. Industry-Leading IBM Capabilities: • Services leveraging the depth of IBM software • Full range of capabilities. Completeness: • Open source platform and services • Third party to enable key use cases. Service categories: Watson Services, Security Services, Web and Application Services, Cloud Integration Services, Mobile Services, Database Services, Big Data Services, Internet of Things Services, DevOps Services.
  • 49. Embracing Cloud Foundry as an Open Source PaaS: continuing our history of embracing and extending Open Source.
  • 50. Cloud Foundry is more than code. Meets Developer’s Needs: focus on app development, not provisioning VMs, databases, messaging servers, etc.; agile development model; deploy and scale in seconds. Open Cloud Platform: an increasing appetite among line-of-business executives for cloud-based mobile, social and analytics applications drives the need for a more open cloud development platform. Compelling Community: Cloud Foundry has a compelling community and emerging ecosystem as well as a mature set of capabilities and robustness.
  • 51. IBM extends CF by adding developer tools, runtimes, services. Capabilities include Java, mobile backend development, application monitoring, as well as capabilities from ecosystem partners and open source, all through an as-a-service model in the cloud.
  • 52. An Entire Continuum Working Together. Diagram labels: Infrastructure Services; Defined Pattern Services; Composable Services; Systems of Record; Business Services; Analytics; three virtual appliances (two with an Application Server, one with an HTTP Server), each with an Operating system and Metadata.
  • 53. IBM Analytics for Hadoop Service. Powered by: BigInsights 3.0, Bluemix. Get started with Hadoop in minutes: − Tutorial: https://developer.ibm.com/hadoop/docs/tutorials/ Dedicated single-node environment: • BIAdmin authority • Access to the Web console • Secure HTTPS channel powered by SSL certificates • Bluemix Single Sign On (SSO)
  • 54. Register today at bluemix.net. 1. Rapidly bring products and services to market at lower cost: with on-demand services and infrastructure, developers can go from 0 to running code in a matter of minutes. 2. Continuously deliver new functionality to their applications: when coupled with DevOps, teams both large and small can automate the development and delivery of many applications. 3. Extend existing investments in IT infrastructure: by connecting securely to on-prem infrastructure, organizations can extend their existing investments.
  • 55. Want to learn more? Download Quick Start Edition. Test drive the technologies: − Follow online tutorials − Enroll in online classes − Watch video demos, read articles, etc. Links all available from HadoopDev: https://developer.ibm.com/hadoop/
  • 56. BigInsights Quick Start Edition. Download: http://ibm.co/QuickStart
  • 57. Big Data Developers: FREE; all types of practitioners; all skill levels; hands-on labs. Future Meetups: − Hadoop − Text Analytics − Real-time Analytics − SQL for Hadoop − HBase − Social Media Analytics − Machine Data Analytics − Security and Privacy. http://www.meetup.com/BigDataDevelopers/ http://bigdatadevelopers.meetup.com/