How a Cloud Computing Provider Reached the Holy Grail of Visibility
 

Usage Rights

© All Rights Reserved

    How a Cloud Computing Provider Reached the Holy Grail of Visibility: Presentation Transcript

    • SPO3378: How a Cloud Computing Provider Reached the Holy Grail of Visibility. Elad Gotfrid, CloudShare; Leena Joshi, Splunk Inc. #vmworldsponsor
    • How a Cloud Computing Provider Reached the Holy Grail of Visibility (SPO3378). Elad Gotfrid, Director of IT, CloudShare; Leena Joshi, Director, Solutions Marketing, Splunk. CONFIDENTIAL
    • Company Overview
      • About:
        • Headquartered in San Mateo, CA
        • Founded in 2007
        • 70,000+ users worldwide
        • Backed by leading VCs: Sequoia, CRV, Globespan, Gemini
      • The Leading Cloud for Pre-Production:
        • Focus on the dev/test/pre-production segment
        • Many Fortune 500 customers, including McAfee, HP, SAP, Cisco, Dell, Microsoft, IBM, and Juniper
        • 40% of Microsoft SharePoint MVPs and MCMs have already adopted CloudShare for development, testing, and training
    • Company Platform Benefits
      • The CloudShare IaaS (infrastructure as a service) platform gives each customer a private, multi-VM networked environment, including compute resources, networking, IP, and a preinstalled OS.
    • CloudShare Operations Overview
      • The CloudShare platform is designed to handle high load:
        • Running 150,000 customer virtual machines per month
        • During peak hours, our system performs ~500 VM resume/suspend operations per hour
      • Robust dynamic assignment of infrastructure resources, including:
        • ESX servers
        • Storage units
        • Firewalls
        • Switches
        • VLANs
        • Public IPs
    • CloudShare Custom Cloud
      • CloudShare uses its own patent-pending backend "private cloud" system, designed to handle the entire virtual machine and datacenter life cycle:
        • Environment operations
        • Environment lifecycle
        • Self-healing and error correction
        • Resource management
      • Manages large-scale infrastructure:
        • 15 VMware Virtual Centers
        • 20 storage units
        • Hundreds of switch port/gateway configurations
    • IT/Operations Challenges
      • Looking for a centralized console for complete IT/Operations visibility
      • Business requirements:
        • Data aggregation: aggregate all IT/infrastructure data into a single console
        • Data correlation: correlate business data with performance/application data
        • Data analysis: analyze and search the data; find patterns and correlations between events
    • The Trick Is Finding a Way to Interact
    • Enter Splunk
      • Evaluated Splunk initially for a narrow use
      • Quickly realized it could do a lot more
      • Eventually standardized on it
    • Splunk Collects and Indexes Any Machine Data
      • Customer-facing data: click-stream data, shopping cart data, online transaction data
      • Outside the datacenter: manufacturing, logistics…; CDRs & IPDRs; power consumption; RFID data; GPS data
      • Data types: logfiles, configs, messages, traps, alerts, metrics, scripts, changes, tickets
      • Windows: registry, event logs, file system, sysinternals
      • Linux/Unix: configurations, syslog, file system, ps, iostat, top
      • Virtualization & Cloud: hypervisor, guest OS, apps, cloud
      • Applications: web logs, Log4J, JMS, JMX, .NET events, code and scripts
      • Databases: configurations, audit/query logs, tables, schemas
      • Networking: configurations, syslog, SNMP, netflow
      • Copyright © 2012, Splunk Inc. Listen to your data.
    • Splunk Collects and Indexes Any Machine Data (continued)
      • Any amount, any location, any source
      • No upfront schema
      • No custom connectors
      • No RDBMS
    • Turn Machine Data Into Operational Intelligence
      • Business insights: gain real-time insight from your machine data to make better-informed business decisions.
      • Operational visibility: gain operational visibility to make better-informed IT decisions.
      • Proactive monitoring: monitor infrastructure to identify issues, problems, and attacks before they impact your customers and services.
      • Search and investigation: find and fix problems across the organization using machine data.
    • A Single Solution for Operational Intelligence
      • Single data store, single UI across use cases, three primary capabilities:
        • Search/Navigation: data drilldown; "needle in a haystack"; root cause analysis/troubleshooting; incident investigations
        • Real-time Visibility: live dashboards; event correlation; monitoring and alerting; performance issues; transaction levels; SLA tracking
        • Historical Analytics: baseline and thresholds; trending; operational insights; historical patterns; compliance reports
    • Create and Share Dashboards in Minutes
      • Audiences: auditors, IT executives, marketing and business analysts, other executives and business owners
      • Deliver new levels of visibility and insight for IT and the business from operational data
    • Splunk Adoption
      • Splunk adoption was driven by IT and R&D:
        • From hundreds of daily e-mail alerts to a few actionable e-mail alerts
        • Heavy use in QA for finding anomalies and issues
      • Dashboards for:
        • Performance trends
        • Current system status
        • Capacity planning
        • Root cause analysis
        • Business metrics
      • Viral adoption within the organization, from DevOps to IT, R&D, Marketing, and Management
    • Splunk As a Data Aggregator
      • Data sources: App, IIS, Google Docs, VMware, Backend, SQL data, Network/GWs/FW, Storage, API actions, Incident Management, Salesforce
    • Splunk As a Central Platform in CloudShare
      • Support/NOC: performance data; system alerts; system performance; usage data; logs
      • IT/Ops: capacity planning; performance monitoring
      • R&D: debug/error logs; health measurements
      • Management: SLA; BI (cohort analysis, dashboards)
      • Marketing: tracking of visits, leads, deals, and usage patterns; qualifying leads; A/B testing
      • Also shown: IT Operations, Security, Compliance, Developer Framework
    • Splunk Provides Operational Intelligence
      • Allows CloudShare to correlate business data (users, usage) with IT/infrastructure data
      • Examples:
        • Understand how many resources each customer consumes (CPU, memory, network, etc.) and when
        • A customer can have more than one VM or environment; Splunk helps us aggregate the data easily and look at usage at the customer level
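
The customer-level roll-up described on this slide can be sketched in a few lines of Python. This is only an illustration of the aggregation idea, not CloudShare's Splunk search: the sample rows, field names, and the `usage_by_customer` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-VM usage samples: (customer, vm, cpu_mhz, memory_mb).
samples = [
    ("acme", "vm-1", 1200, 2048),
    ("acme", "vm-2", 800, 1024),
    ("globex", "vm-3", 500, 4096),
]


def usage_by_customer(rows):
    """Sum per-VM usage into one record per customer."""
    totals = defaultdict(lambda: {"cpu_mhz": 0, "memory_mb": 0, "vms": set()})
    for customer, vm, cpu_mhz, memory_mb in rows:
        entry = totals[customer]
        entry["cpu_mhz"] += cpu_mhz
        entry["memory_mb"] += memory_mb
        entry["vms"].add(vm)
    # Replace the set of VM names with a simple count for reporting.
    return {
        customer: {
            "cpu_mhz": entry["cpu_mhz"],
            "memory_mb": entry["memory_mb"],
            "vm_count": len(entry["vms"]),
        }
        for customer, entry in totals.items()
    }


customer_usage = usage_by_customer(samples)
print(customer_usage)
```

Grouping by customer rather than by VM is what lets infrastructure data line up with business data such as accounts and plans.
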
    • Splunk Dashboards  Management Dashboard – full visibility to business critical Metrics SLA Dashboards - Measure service level - Analyze and present statistics according to business guidelines Capacity Planning - High Level status for management on capacity - True visibility into operational data19 | CONFIDENTIAL
    • Splunk Dashboards  Dashboard for high utilization storage consumers:  All storage related data is collected by splunk  List of number of IOPS per business unit or customer  We can easily identify our top storage consumer20 | CONFIDENTIAL
    • Splunk Dashboards  Dashboard for storage latency :  All VMware storage related data is collected by splunk  We can easily identify poor performing storage unit  Latency is calculated when a 20ms threshold was breached21 | CONFIDENTIAL
    • Splunk Dashboards  High Level Dashboard of system usage (used by the NOC/Support)22 | CONFIDENTIAL
    • Splunk Dashboards  High Level Dashboard of Network usage (used by the IT/Network Ops)  Show drilldown per user on number of connections, packets, traffic  Full visibility on traffic usage and patterns per customer23 | CONFIDENTIAL
    • BI Usage for Marketing and Management
      • Tracking of conversion:
        • Visit to lead
        • Lead to deal
      • Lead qualification
      • A/B testing
      • Churn analysis
      • Cohort analysis
      • User engagement score
      • Feature usage
      • Customer usage patterns
    • Splunk App for VMware  Active participants in the beta group for Splunk app for VMware  Splunk app collect data from ESX,VC including :  ESX Logs, VC Logs  Performance  Tasks/Events  Inventory  Topology  Collect metrics from the host ESX/ESXi servers at a low level of granularity (20 second granularity)25 | CONFIDENTIAL
    • Splunk App for VMware #2
      • Collecting VMware performance data at large scale is a "Big Data" problem:
        • 50,000,000 events per day
        • ~2 million events per hour
      • Five dedicated Splunk FAs (data forwarder appliances) are used to gather all data in real time
      • Data flow (diagram): ESX → Forwarder Appliance → Splunk → Real-Time Data Dashboard
    • Splunk Data Statistics
      • How Splunk handles CloudShare "Big Data":
        • A total of 6,000,000,000 events stored in the Splunk datastore
        • 90,000,000 events per day
        • ~3.5 million events per hour
      • CloudShare deployed a Splunk scale-out architecture:
        • 2 indexers
        • 1 search head
        • 32 GB RAM per server
        • 1.5 TB (total DB size)
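
The figures on this slide can be sanity-checked with simple arithmetic. The sketch below assumes a steady ingest rate and decimal terabytes, neither of which the slide states; the function names are illustrative. Straight division gives 3.75 million events per hour, slightly above the slide's "~3.5 million", which may reflect rounding or a different measurement window.

```python
HOURS_PER_DAY = 24


def events_per_hour(events_per_day):
    """Average hourly rate, assuming events arrive evenly over the day."""
    return events_per_day / HOURS_PER_DAY


def days_of_data(total_events, events_per_day):
    """How many days of ingest the stored events represent at the current rate."""
    return total_events / events_per_day


def avg_bytes_per_event(total_bytes, total_events):
    """Average on-disk footprint per stored event."""
    return total_bytes / total_events


total_events = 6_000_000_000      # from the slide
daily_events = 90_000_000         # from the slide
db_bytes = 1_500_000_000_000      # 1.5 TB, assuming decimal terabytes

hourly = events_per_hour(daily_events)
retention_days = days_of_data(total_events, daily_events)
bytes_per_event = avg_bytes_per_event(db_bytes, total_events)

print(f"{hourly:,.0f} events/hour")
print(f"{retention_days:.1f} days of data at the current rate")
print(f"{bytes_per_event:.0f} bytes per event on average")
```

Under these assumptions the datastore holds roughly two months of events at about 250 bytes each, which is the kind of estimate that feeds indexer and storage sizing.
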
    • Best Practices for Deploying Splunk in Your Cloud Environments
      • Splunk App for VMware:
        • Take into consideration the large amount of data each ESX host generates
        • Properly size the FA hardware (CPU, memory)
        • Add more engine processes to each FA if needed
        • Consider creating a dedicated indexer for the VMware data in order to reduce the load
      • Storage monitoring:
        • Let Splunk collect both physical storage latency (from storage disks) and VMware vDisk latency to better understand latency problems and find their root cause
        • In a linked-clone environment, consider monitoring both volume-level and vDisk-level latency in order to understand whether the problem is on the master disk or the clones
      • Network monitoring:
        • Monitor network traffic for anomalies such as a high rate of open connections
        • Set up real-time searches in order to react quickly and shut down abuse
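
The network-monitoring advice (watch for a high rate of open connections) can be illustrated with a fixed-window counter. This is a sketch of the general idea only; the one-minute window, the limit of 3, the event tuples, and the `flag_connection_spikes` helper are all hypothetical and not taken from the presentation.

```python
from collections import Counter

WINDOW_SECONDS = 60
MAX_NEW_CONNECTIONS_PER_WINDOW = 3  # hypothetical limit for illustration

# Hypothetical connection-open events: (epoch_seconds, customer).
connection_events = [
    (0, "acme"),
    (10, "acme"),
    (20, "acme"),
    (30, "acme"),
    (45, "globex"),
    (70, "acme"),
]


def flag_connection_spikes(events, limit=MAX_NEW_CONNECTIONS_PER_WINDOW):
    """Return (customer, window_start, count) for windows above the limit."""
    per_window = Counter(
        (customer, timestamp // WINDOW_SECONDS) for timestamp, customer in events
    )
    return sorted(
        (customer, window * WINDOW_SECONDS, count)
        for (customer, window), count in per_window.items()
        if count > limit
    )


spikes = flag_connection_spikes(connection_events)
print(spikes)
```

A real-time search would apply the same kind of rule continuously, so that an alert fires while the abusive traffic is still in progress.
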
    • Summary
      • For CloudShare, Splunk is our platform for operational intelligence
      • Once all data is placed in a central repository, it's very easy to correlate events and understand patterns
      • Use Splunk dashboards to give other groups in the organization visibility
    • Experiment with Splunk
      • All users joining the session will receive a two-week trial account in CloudShare with a dedicated, preinstalled, and fully configured Splunk environment
      • To get access to your dedicated environment, browse to: http://tinyurl.com/SplunkDemo
    • Visit Splunk & CloudShare at
      • Booth #1909
      • Theater presentations every hour
      • Learn more:
        • www.splunk.com/goto/vmware
        • vmware@splunk.com
        • sales@splunk.com
    • Company Highlights
      • Company:
        • Founded 2004, first software release in 2006
        • Headquarters: San Francisco, CA
        • Regional headquarters in Hong Kong and London
        • Over 580 employees, based in 10 countries
        • Q1 revenue: $37.2 million; +80% year-over-year
      • Business model / product:
        • Free download
        • Current release: Splunk Enterprise 4.3
      • 4,000+ customers:
        • Customers in over 75 countries
        • 54 of the Fortune 100
        • Largest customer: 100 terabytes per day
    • Fill out a survey: every completed survey is entered into a drawing for a $25 VMware company store gift certificate