Sukanta Nanda Sr. Database Admin.
Cisco IT
© 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 2
locations in
countries
offices
employees
2000+ Applications
1500+ Databases (Prod & Non-Prod)
HANA, Legacy EDW, Hadoop
Supporting Mission Critical Environments
32 data centers and server rooms
of data center space
of UPS power to raised floors
servers virtualized in new DCs, overall
Virtualization goal =
CCIX
Quote
CCW -
96TB,
12 node RAC,
ERP + Custom
OCM
X-track
SOM
Oracle CM-
Offline Jobs
POM
Upload, Save,
Order, SNIF,
Convert contract, email
NGVS – VMs
(2012+)
Validation Jobs
QAS
Advanced Search
(Lucene VM)
SVE
Admin
sprice
AQS
AQS Opportunity - 2014+
(Lucene VM)
Web requests
Advanced Search
Requests
$U Jobs
CAAS
ASFAIL
BID/Customer
contract data
Opportunity data
69K
unique users accessed CSCC/SMS3
32K external users
628K
# of estimates/quotes
$2.7B
booked (99.7% portal, <1% B2B)
18%
order touch rate
10.6 hrs
avg. order cycle time
1.3 million
hits per day
213K
service orders
92%
of services booked thru CSCC/SMS3
99.78%
availability
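The hits-per-day and availability figures above are easier to reason about as a request rate and as hours of downtime. The sketch below does that conversion. It assumes traffic is spread evenly over the day and that the availability figure covers a 365-day year; the slide states neither, so treat the results as rough averages rather than measured values.

```python
# Back-of-the-envelope conversions for the slide's headline numbers.
# Assumptions (not stated on the slide): traffic is spread evenly over a
# 24-hour day, and the availability figure covers a 365-day year.

SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24


def average_hits_per_second(hits_per_day: float) -> float:
    """Mean request rate implied by a daily hit count."""
    return hits_per_day / SECONDS_PER_DAY


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of unavailability per year implied by an availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"{average_hits_per_second(1_300_000):.1f} hits/s on average")
    print(f"{downtime_hours_per_year(99.78):.1f} hours of downtime per year")
```

Under these assumptions, 1.3 million hits per day is roughly 15 hits per second on average, and 99.78% availability corresponds to roughly 19 hours of downtime per year. Peak load would be higher than the average.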
“I can't even see/find my contracts. I used to
expect so much from Cisco and would have
recommended your products to anyone, but
not anymore. Why does everything, including
managing contracts or getting support, have to
be so complex. For being a company that I
thought of as innovative and a big reason for
the success of the internet, Cisco has fallen a
long way. It's like Cisco is stuck in 1999.”
Source: CSCC Feedback Form
“I am new to services from the product team and
have quite a few views on our tools”
Source: Services Sentiment Survey
“Simplify. I've spoken to many partners, and
resellers and managing your portal is a full time
job. Cisco needs less engineers developing your
website and a few administrators with some
common sense. Some of support folks I've spoken
to have trouble with the website...”
Source: CSCC Feedback Form
“This tool is extremely complex and slow. It
takes hours and hours to do simple low value
quotes, coming up with error after error, and
regularly requiring manual intervention from
Cisco to get it just to do something that should
be simple. I really hope you can come up with
something better and quickly.”
Source: CSCC Feedback Form
“Our Customers are telling us that they feel like they
did back in the CSCC CAP days. (Policy, process &
tool?)”
Source: Juli Clark, Cisco Director Management Operations – CPE
“We have agents that cry using CSCC (Sales
turnaround expectations)”
Source: Katie, Convergys
Performance
• Performance: online requests take several seconds to a few minutes; offline
processing takes several minutes up to a few hours
• Scalability: non-linear degradation with large data volumes
• Quarterly release : 16+ hours downtime
• Weekend downtimes – backups, purge table jobs,
EBF - 2+ hours downtime
• Unplanned downtime – (minimal)
• Quarterly releases mechanism
• Large number of people effort
• Stretched in doing 4 releases/year
• Need downtime + DBAs + SCM + Manual
preparation for deploys
• Complex, non-intuitive
• Difficult to change workflows
Scalability Latency Uptime
Agility Resiliency User Interface
• Data Reliability
• Resilient across data center
Clients
IaaS - OpenStack: Storage, Compute, Networking
Platform
Cassandra (Database)
Applications
Notifications Pricing Search Quoting
Data Loader Validation (drools)
Cisco UCE Browser
App
Android, Partner app, iOS
Upload
Conversion
Nginx
(Web Server)
Platform, Build, Test Automation
(Puppet, Nagios, Jenkins)
Ordering Web
Tomcat
(Java App Server)
Elastic Search
(Search Engine, Log Mining)
RabbitMQ
(Messaging)
HAProxy
(Load Balancer)
Memcached
(In Memory Cache)
Logstash
(Log Forwarder)
Kibana
(Log Visualizer)
Quartz
(Scheduler)
Performance
• Performance: online queries across multiple Service Orders using Elastic
Search & Cassandra return in a few seconds
• Scalability: scales linearly as nodes are added
• Elastic Search improved by 300%
• Quarterly release : Can follow ITDT model
• Weekend downtimes – NONE
• EBF - NONE
• Unplanned downtime – NONE
• Quarterly releases mechanism
• Large number of people effort
• Stretched in doing 4 releases/year
• Need downtime + DBAs + SCM + Manual
preparation for deploys
• Complex, non-intuitive
• Difficult to change workflows
Scalability Latency Uptime
Agility Resiliency User Interface
• Completely resilient
• Resilient across data center
[Diagram: a ring of six nodes, numbered 1 to 6 (Transactional, Physical/SSD), and a ring of three nodes, numbered 1 to 3 (ETL/Spark, Physical/SSD)]
• C220 M4 Servers
• 256 GB Memory each
• 7 SSD Drives 960GB each
• RHEL 6.5 OS 64bit
• JBOD Configuration
• DataStax Enterprise 4.8.6
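The per-server figures above imply a raw storage budget that can be worked out directly. The sketch below does the arithmetic. The drive count and drive size come from the slide; the node counts (six transactional, three ETL/Spark) are read off the ring diagram, and the replication factor is purely illustrative, so both are assumptions rather than stated facts.

```python
# Raw SSD capacity implied by the hardware list above.
# Per-node figures come from the slide; the node counts (6 transactional,
# 3 ETL/Spark) are read off the ring diagram and are an assumption, as is the
# replication factor used for the unique-data estimate.

DRIVES_PER_NODE = 7
DRIVE_SIZE_GB = 960


def raw_capacity_gb(nodes: int) -> int:
    """Total raw SSD capacity across `nodes` servers in a JBOD layout."""
    return nodes * DRIVES_PER_NODE * DRIVE_SIZE_GB


def unique_data_capacity_gb(nodes: int, replication_factor: int) -> float:
    """Upper bound on unique data when every row is stored `replication_factor` times.

    Ignores compaction headroom, commit logs and snapshots, so real usable
    space is lower.
    """
    return raw_capacity_gb(nodes) / replication_factor


if __name__ == "__main__":
    print(raw_capacity_gb(1))  # one server
    print(raw_capacity_gb(6))  # assumed transactional ring
    print(unique_data_capacity_gb(6, replication_factor=3))
```

With JBOD there is no RAID overhead, so each server contributes 7 x 960 GB = 6,720 GB of raw space. Six such nodes give 40,320 GB raw, which with an assumed replication factor of 3 is at most 13,440 GB of unique data.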
[Diagram: deployment across two data centers, DC1 [US] and DC2 [US], fronted by GSS]
GSS
• https://ccrc.cisco.com/
• https://ccrc-external-1.cisco.com/ and https://ccrc-external-2.cisco.com/
• https://ccrc-internal-1.cisco.com/ and https://ccrc-internal-2.cisco.com/
DC1 [US] and DC2 [US], each containing:
• DMZ HAProxy-1 and DMZ HAProxy-2
• Public VIP and Internal VIP
• HAProxy-1 and HAProxy-2 (each shown with Process: 80 and Process: 8000)
• Module1 to Module5 VMs
• Elastic Search Cluster
Cassandra Cluster
• Cassandra DC1
• Cassandra DC2
Thank you.

Cassandra Adoption on Cisco UCS & OpenStack

Editor's Notes

  • #10 Based on another mission-critical application, CCRC, which required high throughput and linear scaling to support up to 2.5 million hits per day (check with CCRC team), we decided to put it on physical C-Series servers with local SSD storage after comparing the throughput with that of the OpenStack platform. Here we used a JBOD configuration for better I/O throughput. The major differences are the storage and the additional compute (CPU and memory) resources. With local SSDs and large CPU and memory capacity, we achieved very good application performance, but we did not appear to be using the full CPU and memory resources. So we came up with a new architecture, which I am going to share.
  • #12 This architecture covers the complete Active/Active multi-DC database setup. GSS directs the connections. Module1-5 contain the C* (Cassandra) drivers.
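
In an Active/Active multi-DC Cassandra setup like the one in note #12, how many replicas must answer a request depends on the replication factor and the consistency level. The notes state neither, so the sketch below uses illustrative values (a replication factor of 3 in each data center) to show the standard quorum arithmetic, floor(RF / 2) + 1. The data center names and replication factors are assumptions.

```python
# Quorum arithmetic for a two-data-center Cassandra cluster.
# The data center names and replication factors are illustrative assumptions.


def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


replication = {"DC1": 3, "DC2": 3}  # assumed RF per data center

for dc, rf in replication.items():
    # LOCAL_QUORUM only counts replicas in the coordinator's own data center.
    print(dc, "LOCAL_QUORUM needs", quorum(rf), "replicas; tolerates",
          tolerated_failures(rf), "down")

# QUORUM counts replicas across all data centers.
total_rf = sum(replication.values())
print("QUORUM needs", quorum(total_rf), "of", total_rf, "replicas")
```

With these assumed values, LOCAL_QUORUM needs 2 of the 3 local replicas, so each data center can serve requests with one replica down and without waiting on the other data center. A cluster-wide QUORUM would need 4 of 6 replicas and therefore cross-DC round trips.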