This document discusses how hyperconverged infrastructure from Nutanix can help kick-start big data projects. It provides an overview of Nutanix's capabilities, including its web-scale design principles, use of open source technologies like Cassandra and ZooKeeper, local flash storage, data locality, automatic disk balancing, snapshots, compression, and erasure coding. It also discusses how Nutanix can simplify management and provide analytics. Specific big data workloads that can benefit, including Hadoop, NoSQL, Splunk, and databases, are also covered.
Big Data LDN 2016: Kick Start your Big Data project with Hyperconverged Infrastructure
1. KICK START YOUR BIG DATA PROJECT WITH
HYPERCONVERGED INFRASTRUCTURE
Ray Hassan, Solutions & Performance Engineering
ray@nutanix.com
@cannybag
2. 2
About Nutanix
3750+ customers
Over 70 countries
6 continents
Founded in 2009
1980+ employees
Make datacenter infrastructure invisible,
elevating IT to focus on applications and services
8. 8
Local (Flash + HDD)
Single Storage Pool
Evolution of the Datacenter
High Perf and Capacity Optimization
Data Protection and DR
Resilience
Security
[Diagram: Node 1 through Node N, each an x86 server running a hypervisor and a Nutanix CVM over local flash + HDD, pooled into a single storage pool.]
9. 9
Data Locality
Keep data on the same node
as VM
All read operations localized
on same node
ILM transparently moves
remote data to local
controller
Reduces network chattiness
significantly
Data follows VM during
vMotion/Live Migration
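To make the read-path preference described above concrete, here is a minimal Python sketch of the idea; the Extent class and choose_read_source helper are hypothetical names for illustration, not anything in the actual NDFS code:

# Illustrative sketch only: a toy model of read locality, not Nutanix code.
from dataclasses import dataclass

@dataclass
class Extent:
    extent_id: str
    replica_nodes: list          # nodes currently holding a copy of this extent

def choose_read_source(extent: Extent, vm_node: str) -> str:
    """Prefer the replica on the VM's own node; otherwise read remotely."""
    if vm_node in extent.replica_nodes:
        return vm_node           # local read, no network hop
    # Remote read; ILM would then migrate a copy onto vm_node so that
    # subsequent reads become local again.
    return extent.replica_nodes[0]

extent = Extent("e1", replica_nodes=["node-2", "node-3"])
print(choose_read_source(extent, "node-2"))   # node-2 (local)
print(choose_read_source(extent, "node-1"))   # node-2 (remote, then migrated)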
10. 10
Flash Made Easy
• Advanced Auto-Tiering reduces complicated configuration
• Ability to bypass the flash tier
• All flash is made accessible to all of the nodes
• Ability to handle a variety of Big Data workloads – Hadoop, NoSQL, Kafka,
Spark, Splunk….
[Diagram: NDFS/EDE I/O path. Writes are fingerprinted, with the fingerprint stored in metadata; random writes land in the OpLog (SSD) and drain into the Extent Store, while sequential writes go straight to the Extent Store. Reads are served from the Content Cache (memory + SSD), with HDD and extensible targets (cloud, NAS, etc.) behind it.]
11. 11
Automatic Disk Balancing
Real-time balancing of storage within the cluster/nodes
Supports heterogeneous and homogeneous node types (compute heavy/storage heavy)
Uniform distribution of data
Leverages the MapReduce framework
Requires no manual intervention
Balancing happens both at runtime, via node/disk placement, and via the background Curator process
Larger, "storage heavy" nodes have larger capacities and hence hold more data
After disk balancing has run, utilization is uniform
[Diagram: NDFS spanning three nodes, each running a hypervisor, VMs, and a CVM over local storage; after balancing, each node sits at roughly 35% utilization.]
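As a rough illustration of what "uniform distribution" means in practice, here is a hedged Python sketch; the 5% threshold and the plan_rebalance helper are made up for the example and are not the Curator's actual logic:

# Illustrative sketch only: rebalancing when node utilization is skewed.
def plan_rebalance(utilization: dict, threshold: float = 0.05):
    """Return (source, destination, amount) moves that bring every node
    close to the cluster-wide mean utilization."""
    mean = sum(utilization.values()) / len(utilization)
    donors = {n: u - mean for n, u in utilization.items() if u - mean > threshold}
    receivers = {n: mean - u for n, u in utilization.items() if mean - u > threshold}
    moves = []
    for src, excess in donors.items():
        for dst, room in receivers.items():
            if room <= 0 or excess <= 0:
                continue
            shift = min(excess, room)
            moves.append((src, dst, round(shift, 3)))
            excess -= shift
            receivers[dst] -= shift
    return moves

# Storage-heavy node-3 sits at 75% while the others are lighter;
# node-3 sheds data to node-1 and node-2 until all sit near the mean.
print(plan_rebalance({"node-1": 0.30, "node-2": 0.35, "node-3": 0.75}))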
12. 12
Nutanix Local Snapshots (Time Stream)
RPO: minutes
RTO: minutes
Use Cases
Protection against Guest OS corruption
Snapshot VM environments
Self-Service File Level Restore
Points of differentiation
VM granularity
No performance impact
VM and application level consistency
Lower $ / GB with storage heavy/only
nodes
[Diagram: local, VM-centric snapshots of vdisks within the primary cluster. Nutanix snapshots have byte-level resolution.]
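A minimal sketch of the arithmetic behind "RPO: minutes", assuming a simple fixed snapshot interval (the meets_rpo helper is hypothetical):

# Illustrative sketch only: worst-case data loss equals the snapshot interval,
# so the interval must not exceed the RPO.
from datetime import timedelta

def meets_rpo(snapshot_interval: timedelta, rpo: timedelta) -> bool:
    return snapshot_interval <= rpo

print(meets_rpo(timedelta(minutes=15), rpo=timedelta(minutes=30)))  # True
print(meets_rpo(timedelta(hours=4), rpo=timedelta(minutes=30)))     # False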
13. 13
Compression
Inline and post-process compression
Inline: Data compressed as it’s written
MapReduce: Data compressed after “cold”
data is migrated to lower-performance
storage tiers
No impact to normal IO path
Ideal for random batch workloads
Uses Snappy algorithm
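The Snappy algorithm named above can be experimented with via the third-party python-snappy package (pip install python-snappy); a quick sketch of the lossless round trip:

# Illustrative sketch only; uses the python-snappy bindings, not Nutanix code.
import snappy

data = b"10100101" * 100_000            # highly repetitive data compresses well
compressed = snappy.compress(data)

print(f"{len(data)} -> {len(compressed)} bytes "
      f"({len(data) / len(compressed):.1f}x smaller)")

assert snappy.decompress(compressed) == data   # lossless round trip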
14. 14
Erasure Coding (EC-X)
RAID-5, RAID-6, RAID-DP on disks:
• Bottlenecked by single disk
• Hardware defined
• Hot spares waste space
• Decreased write performance
• Slow rebuilds
Erasure Coding across nodes:
• Storage optimization, keeping resiliency unchanged
• Optimizes availability (fast rebuilds)
• Uses the power of the entire cluster
• Up to 75% increase in usable capacity
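A back-of-the-envelope sketch of where capacity savings like the figure above come from, comparing two-copy replication with a hypothetical 4 data + 1 parity stripe (actual stripe widths vary with cluster size):

# Illustrative sketch only: usable fraction of raw capacity under each scheme.
def usable_fraction_replication(copies: int) -> float:
    return 1 / copies

def usable_fraction_ec(data_blocks: int, parity_blocks: int) -> float:
    return data_blocks / (data_blocks + parity_blocks)

rf2 = usable_fraction_replication(2)   # 0.50 of raw capacity usable
ec = usable_fraction_ec(4, 1)          # 0.80 of raw capacity usable

# 4+1 already gives a 60% uplift; wider stripes approach the quoted 75%.
print(f"{ec / rf2 - 1:.0%} more usable capacity than 2-copy replication")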
15. 16
Save on Archiving and Licensing
NX-8150 Series
Compute and storage
NX-6035C Series
Storage only
[Diagram: NX-8150 (IOPS tier) and NX-6035C (storage tier) connected over 10 Gbps Ethernet.]
18. 19
Enterprise Databases Meet Web-Scale
Transactional Databases
(OLTP)
• Localized I/O for low latency random
operations
• SSD for working set, indexes and key
database files
• Ability to automatically tier data
depending on usage
Analytical Databases
(OLAP/DSS)
• Local read I/O for high-performance
queries and reporting
• Abundant sequential write and read
throughput
• Scalable performance and capacity
[Diagram: Distributed Storage Fabric with intelligent tiering, VM-centric management, snapshots, clones, compression, and deduplication, pooling local + remote flash and HDD across Node 1 through Node N; each x86 node runs ESXi, Hyper-V, or AHV plus a Nutanix Controller VM (one per node), with the Acropolis App Mobility Fabric and Tier 1 workloads running on all nodes.]
19. 20
Splunk on Nutanix
Ability to ingest GBs of data per day: 1 TB+/day of data ingest, ample for most deployments
Quick search capabilities for mission-critical applications: accelerated search through server-side flash
Ability to support growth in data ingest rates: predictable, linear performance through distributed architecture
Self-contained deployments for data security and privacy: quick, manageable deployment through the appliance
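A rough, hedged sizing sketch for an ingest-driven workload like this; the 0.5 on-disk factor below is an assumption for the example, not a Splunk or Nutanix figure:

# Illustrative sketch only: estimate usable capacity for a given ingest rate.
def storage_needed_tb(daily_ingest_gb: float, retention_days: int,
                      on_disk_factor: float = 0.5) -> float:
    """Raw data plus indexes typically end up smaller than the ingested
    volume once compressed; on_disk_factor models that (assumed here)."""
    return daily_ingest_gb * retention_days * on_disk_factor / 1024

# 1 TB/day of ingest retained for 90 days
print(f"{storage_needed_tb(1024, 90):.1f} TB of usable capacity")   # 45.0 TB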
20. 21
Nutanix Big Data Resources
Big Data on Nutanix:
http://www.nutanix.com/solutions/big-data/
Reference Architectures:
http://go.nutanix.com/virtualizing-splunk-on-nutanix-AHV.html
http://go.nutanix.com/hadoop-virtualization.html
Solution Notes:
http://go.nutanix.com/virtualizing-elastic-stack-on-ahv.html
Best Practice Guides:
http://go.nutanix.com/best-practices-to-virtualizing-mongoDB.html
http://www.nutanix.com/go/docker-container-best-practices-guide-with-AHV.html
Points to land:
Goal is to make IT infrastructure invisible
One of the fastest growing infrastructure companies in the last decade, if not the fastest. Founded in 2009, we have been selling for over 4 years now.
Numbers are not current, but over 1,750 customers, growing at a phenomenal rate across 70 different countries. Focused on international presence very early.
Over 1100 employees across 70 different countries.
Purpose of the Slide:
Talk about where and how web-scale IT originated, and what some of the commonalities are between different web-scale data centers. Web-scale data centers embody many of the principles of invisible infrastructure, and serve as the inspiration for Nutanix enterprise computing solutions.
Main Points:
Web companies like Google and Facebook started pushing the limits of existing infrastructure systems and processes in ways that traditional businesses did not. They needed infrastructure that could support their business requirements (rapid application development cycles, scale on demand, cost containment). They tried using existing infrastructure solutions, but quickly realized that legacy infra was a poor fit for their needs.
Over time, these companies developed an alternate approach to IT that enabled them to get past limitations in infrastructure. Some common traits of web-scale IT:
Infrastructure built from commodity server hardware pooled together using intelligent software. This allows customers to start small and scale one server at a time – true scale-out
The software in the system is distributed across all the nodes. You don’t have central metadata servers or name nodes. You don’t see controller bottlenecks
Embarrassingly parallel operations – everything in the system, including storage functions like deduplication and metadata management and system cleanup, is distributed across all nodes. There are no hotspots or bottlenecks, allowing for massive scale
Compute and storage sit very close to each other. Data does not have to go back and forth between storage and compute over a network. Data has gravity, so co-locating storage and compute eliminates network bottlenecks and system slowdown
Heavy automation eliminates the need for expensive, error-prone manual operations
Hyperconvergence solves this issue and a lot more. We still have servers with direct-attached storage, but we pool the storage from all the servers in a cluster into a shared storage pool, so that storage from all the independent nodes is available to every node in the cluster and its associated VMs.
You can start off with as little as 3 nodes in a cluster and incrementally grow from there. When you need additional capacity, you just add a node, either compute heavy or storage heavy depending on what you need.
All the enterprise storage capabilities that shared storage solutions such as NetApp and EMC provide, Nutanix provides as well.
Now, addressing the bottleneck/hotspot problem in shared storage: the storage controller is virtualized and is present on every node in the system. When you can virtualize mission-critical workloads such as Oracle and SQL, why not virtualize storage as well? To us, storage is an app too, and it should be the first app to be virtualized.
Every time a node is added, a CVM gets added. All the requests from a user VM are handled by the CVM that sits on the same node, so requests typically don't have to go over the network.
Rebuilds take less time because the work is spread across the entire cluster.
Purpose of this slide: Emphasize the capacity reduction made possible through optimization
Key Points:
Nutanix Virtual Computing Platform delivers data reduction functionality to help drive down the cost of storage. This includes
Inline compression
Data compressed as it's written (synchronously)
Ideal for archival data such as Exchange logs for compliance
High performance for sequential workloads (OLAP databases)
Post-process compression
Data compressed after “cold” data is migrated to lower-performance storage tiers
Processed only when data and compute resources are available
No impact to normal IO path
Ideal for random batch workloads
Purpose-built for virtualization
Increased usable capacity across all storage tiers
Compression policies align with VM-centric workflows
Maximum compression/decompression performance with Snappy algorithm
Sub-block compression for granularity and maximum efficiency
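To illustrate the inline versus post-process split above, here is a small sketch of an age-based "cold data" rule; the ColdDataPolicy name and the one-day threshold are assumptions for the example only, not the platform's actual policy:

# Illustrative sketch only: decide whether data is cold enough to compress
# in the background, keeping the inline write path unaffected.
from datetime import datetime, timedelta
from typing import Optional

class ColdDataPolicy:
    def __init__(self, cold_after: timedelta = timedelta(days=1)):
        self.cold_after = cold_after

    def eligible(self, last_access: datetime, now: Optional[datetime] = None) -> bool:
        now = now or datetime.utcnow()
        return now - last_access > self.cold_after

policy = ColdDataPolicy()
print(policy.eligible(datetime.utcnow() - timedelta(hours=36)))    # True: compress
print(policy.eligible(datetime.utcnow() - timedelta(minutes=5)))   # False: leave hot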
The 6035C will migrate data off compute nodes and save on licensing
Simplicity
Prism brings simplicity to infrastructure management, and features innovative One-Click technology that streamlines time-consuming IT tasks
One-Click Infrastructure Management - Nutanix Prism gives administrators an easy way to manage virtual environments running on Acropolis. It simplifies and streamlines common workflows for hypervisor and virtual machine (VM) management, from VM creation and migration to virtual network setup and hypervisor upgrades. Rather than replicating the full set of features found in other virtualization solutions, virtualization management in Prism has been designed for an uncluttered, consumer-grade experience.
One-click Operational Insight - Prism provides rich insights about day-to-day operations of the datacenter. Prism is powered by advanced machine learning technology with built-in heuristics and business intelligence to easily and quickly mine large volumes of system data and generate actionable insights for optimizing all aspects of infrastructure performance.
One-Click Remediation - Nutanix Prism reduces mean time to resolution significantly by proactively analyzing trends and predicting when certain failures may occur. In addition, Prism provides customized recommendations to resolve issues with a single click. Instead of showing individual alerts, Prism intelligently maps alerts to specific services and shows each error's impact on applications, helping IT admins gain more confidence about how their applications are running.
The Nutanix platform is built for mixed workloads and supports the different I/O footprints those workloads and use cases require. It can simultaneously handle the random I/O footprint of VDI, the mixed random and sequential I/O of SQL Server, and the sequential I/O of general server workloads, while eliminating the 'noisy neighbor' problem.
In the case of SQL Server, it comes down to two scenarios:
1. Transactional databases (OLTP): Minimizing latency is very important for OLTP databases. Because the Nutanix platform has a local storage controller model, I/O requests do not go over the network, which reduces latency; in a traditional architecture the network lies in the data path, which adds latency. The platform also has server-attached flash plus in-memory and SSD caches, so indexes and heavily accessed files can live on SSD for accelerated performance while less frequently accessed files reside on the HDD tier.
The Nutanix platform handles both random and sequential I/O equally well. The random I/O of database data files is served from the flash tier, latency-sensitive sequential log files and tempdb stay on the SSD tier, and standard sequential data can be placed on the HDD tier.
2. Analytical databases (OLAP/DSS): These databases involve heavy read workloads, batch reporting, and bulk insertion. Nutanix serves read I/O for these reports from local controllers, which accelerates queries and stored procedures and leads to faster reporting times for end users. The platform also has ample sequential and random throughput: whether you need to write or load terabytes of data or read gigabytes to terabytes, the platform can handle it.
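A minimal sketch of the tiering decision described in these two scenarios, assuming a hypothetical choose_tier helper rather than the platform's actual ILM logic:

# Illustrative sketch only: pick a tier from a workload's I/O pattern.
def choose_tier(io_pattern: str, latency_sensitive: bool) -> str:
    """Random or latency-sensitive I/O stays on flash; cold sequential data
    can live on the HDD tier."""
    if io_pattern == "random" or latency_sensitive:
        return "SSD"
    return "HDD"

print(choose_tier("random", latency_sensitive=True))        # OLTP data file -> SSD
print(choose_tier("sequential", latency_sensitive=True))     # log / tempdb  -> SSD
print(choose_tier("sequential", latency_sensitive=False))    # OLAP bulk data -> HDD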
As the quantity of data grows over time, you will need to scale your compute and/or storage infrastructure. Nutanix simplifies scaling with a portfolio of turnkey appliances you can choose from based on your needs. And because of the Nutanix distributed architecture, additional storage resources are seamlessly pooled into existing resources and compute capacity is added to your virtualization pool.
Goal: Explain that Splunk has a set of requirements that the Nutanix distributed architecture caters well to.
- Ability to ingest GBs of data per day – Splunk is collecting and storing large amounts of machine data from many sources
- Given Splunk is often used for making mission-critical decisions, quick search capability is key. Nutanix server-side flash and local data accelerate search.
- As enterprises and business users get more comfortable with Splunk, the infrastructure needs to easily support growth. Nutanix provides predictable, linear scale-out through the storage controller that runs on every single node.
- Given the sensitivity of the data collected and the number of business units within an org using Splunk, it is important to have self-contained deployments.