Migrating Jive To The Cloud - Presentation Transcript
Sept 17, 2009
Migrating Jive to the Cloud: Practical Tips and Tricks
Matt Tucker, CTO, Jive Software
Jive Overview
  Founded in 2001 – Series A with Sequoia in Fall ’07
  Growing revenue 100% Y/Y and generating cash
  Industry analyst recognition as leader in our space
  Only vendor to bridge external and internal communities
  More than 2,500 customers
  Operations in 5 countries, new office in Palo Alto
  Over 150 employees (currently hiring +20)
Jive’s Cloud Evolution
  Summer of 2008
    All deployments were installed software or ASP managed hosting
    Customers lacked an easy, cheap way to start with Social Business Software
    Evaluated and selected Amazon EC2 environment
  October 2008 – February 2009
    Dedicated, skunk-works team devoted to migrating Jive’s offering to the cloud
    First instances up and running in January 2009 – 3 months total development
  Summer 2009
    Over 250 customers on Jive Express
    All customer sandbox sites migrated to the EC2 cloud
    Costs to run a Jive Express environment are 1/10 the cost of ASP
    In process of launching additional products via the cloud
Advantages for Jive’s Cloud Offering
  ASP takes too long and costs too much
    6 weeks to procure and install new servers
    Approximately $20K per installation, all up-front and no linear ramp
    No ability to turn off or manage capacity
  Easier to manage
    Minutes, not days or months, to get up and running
    Instances spin up and down automatically
    Fail-over happens with admin tools in the background
    Vastly lower operational expense due to automation
  Easy deployment for customers
    Tools to manage and track adoption
    Customized wizards guide through typical use cases
    Customers can migrate from EC2 to ASP or their own SW deployments
The Cloud & Enterprise
  Enterprise Readiness Issues
    No SAS70 Type II certification for AWS
    Need to improve SLA for high-end customers, on the Jive side as well as AWS
    Enterprise security reviews have not caught up with the cloud yet. Standard evaluation criteria still focus on things like hardware vs. virtualization, data center tours, etc.
  Ramifications
    Cloud is generally for smaller or “starter” Jive implementations; low percentage of revenue but gets us into larger deals
    We’ve built an easy migration path to on-prem or ASP
    At least 2 years away from widespread enterprise cloud readiness, but the trend is happening
PaaS or IaaS?
Key Technical Challenges
  Bring multi-tenant cost efficiency to a single-tenant app
  Jive is a “fat” application. How do we fit in a small EC2 instance?
    Cut down app startup time from 10 mins to 2 mins, use small Java heap
  Any customizations break easy/automated upgrading
    Built new simplified admin console and did other simplifications via product overlay
  Must eliminate per-instance manual labor
    Invested in radical level of automation that maintains the environment with very little manual intervention
Architecture Overview
  [Diagram; components shown: S3, EC2 Instances, EBS, XMPP, Controller Service, Redirect Service, Provisioning Site, SQS]
Trick: Scripting Java Install
  Basic tip: fully script the creation of your AMI!
  Ran into the problem that the Java install can’t be automated as-is; workaround:
    # Install Sun JDK (messing with whiptail to avoid license prompt)
    # Temporarily replace whiptail with a stub that always exits 0
    mv /usr/bin/whiptail /usr/bin/whiptail.orig
    printf '#!/bin/sh\nexit 0\n' > /usr/bin/whiptail
    chmod +x /usr/bin/whiptail
    apt-get install -y sun-java6-jdk
    # Restore the real whiptail
    rm /usr/bin/whiptail
    mv /usr/bin/whiptail.orig /usr/bin/whiptail
    # Point /usr/bin/java at the Sun JDK
    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    rm /usr/bin/java
    ln -s "$JAVA_HOME/bin/java" /usr/bin/java
Trick: Hibernation
  Further cut costs by automatically turning off instances that don’t get active use
  Trick is to use DNS redirect so that they can be turned back on within minutes via self-service
  [Diagram: Hibernate: redirect DNS for the stale EC2 instance to the Redirect Service and set TTL to 60s. Re-awaken: redirect DNS from the Redirect Service to a new EC2 instance and set normal TTL]
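The slide does not say how "active use" is measured, so the following is only a minimal sketch of the selection step, assuming a hypothetical input of one `instance-id last-active-epoch` pair per line and a hypothetical idle threshold in seconds. The DNS redirect and the shutdown itself are not shown.

```shell
# stale_instances NOW_EPOCH MAX_IDLE_SECONDS
# Reads "instance-id last-active-epoch" lines on stdin (hypothetical format)
# and prints the ids that have been idle longer than MAX_IDLE_SECONDS.
stale_instances() {
  local now=$1 max_idle=$2 id last_active
  while read -r id last_active; do
    [ -n "$id" ] || continue            # skip blank lines
    if [ $((now - last_active)) -gt "$max_idle" ]; then
      echo "$id"
    fi
  done
}

# Example with made-up ids and timestamps: only site-a exceeds the threshold.
printf 'site-a 1000\nsite-b 9000\n' | stale_instances 10000 5000
```

Each id printed would then be handed to whatever performs the hibernate steps from the slide (redirect DNS, lower the TTL, stop the instance).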
Trick: Upgrades
  The trick: use elastic compute to do things you hadn’t imagined previously
  In the ASP environment, upgrades are run in-place and manually; this requires multiple hours of scheduled downtime in case something goes wrong
  At EC2 we upgrade “alongside” rather than in-place
  Upgrades at EC2 are fully automated and performed en masse
  Have achieved a low 2% failure rate (a fix generally requires only minor intervention)
Trick: Upgrades
  How upgrades are done:
    Make an instance read-only by putting up an upgrade message
    Take an EBS snapshot of instance data
    Create a NEW instance with NEW EBS volume from snapshot
    Run upgrade on new instance using scripts
    Run tests to ensure upgrade worked
    Change elastic IP from old to new instance
    Delete old EC2 instance and EBS volume
  If any step fails, remove maintenance message on existing instance and log error message. Failed attempts only cost $0.10
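The control flow of the steps above can be sketched roughly as follows. Every function name here is a hypothetical stand-in, not Jive's actual tooling; each stub only prints what the real step would do, so the sketch runs without touching AWS. `SIMULATE_TEST_FAILURE` is likewise an invented switch for trying the failure path.

```shell
# Hypothetical step stubs; real versions would wrap the EC2/EBS tooling.
set_maintenance_message()   { echo "old: show upgrade message (read-only)"; }
clear_maintenance_message() { echo "old: remove upgrade message"; }
snapshot_volume()           { echo "snapshot EBS volume of old instance"; }
launch_from_snapshot()      { echo "launch new instance with new volume from snapshot"; }
run_upgrade_scripts()       { echo "new: run upgrade scripts"; }
run_upgrade_tests()         { echo "new: run tests"; [ "${SIMULATE_TEST_FAILURE:-0}" != "1" ]; }
move_elastic_ip()           { echo "move elastic IP from old to new"; }
delete_old_resources()      { echo "delete old instance and volume"; }

upgrade_alongside() {
  # The && chain stops at the first failing step; the old instance is only
  # deleted after every earlier step has succeeded.
  if set_maintenance_message && snapshot_volume && launch_from_snapshot &&
     run_upgrade_scripts && run_upgrade_tests && move_elastic_ip; then
    delete_old_resources
    return 0
  fi
  # Failure path from the slide: put the old instance back in service and log.
  clear_maintenance_message
  echo "upgrade failed; old instance left in service" >&2
  return 1
}

upgrade_alongside
```

Because the elastic IP moves only after the tests pass, a failed attempt never affects the instance customers are using. Cleanup of a half-built new instance is not described on the slide and is left out here.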
Trick: XMPP
  SQS is fantastic for asynchronous message processing; we use it to deliver things like hourly stats. But it doesn’t solve all problems
  Use XMPP for real-time controller-to-instance communication
  Enables multi-step synchronous actions like creating a downloadable data backup
  Simpler and faster development than complicated web services
Tip: Reserved Instances
  Lower costs by >= 30%: purchase reserved instances
  Updated provisioning code to ensure that we always use an availability zone that has reserved instances first
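One way to express the "reserved zone first" rule is sketched below. The input format (one `zone unused-reserved-count` pair per line) and the fallback argument are assumptions made for illustration; the slide does not describe how the provisioning code tracks reserved capacity.

```shell
# choose_zone FALLBACK_ZONE
# Reads "zone unused-reserved-count" lines on stdin (hypothetical format) and
# prints the first zone that still has unused reserved instances, or
# FALLBACK_ZONE when none does.
choose_zone() {
  local fallback=$1 zone free
  while read -r zone free; do
    if [ "${free:-0}" -gt 0 ]; then
      echo "$zone"
      return 0
    fi
  done
  echo "$fallback"
}

# Example with sample data: us-east-1a has no unused reservations left,
# so us-east-1b is chosen.
printf 'us-east-1a 0\nus-east-1b 3\n' | choose_zone us-east-1c
```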
Tip: Retry AWS Calls
  We’ve found that 2-5% of AWS web services calls fail
  Work around this by adding retry logic to critical code paths; retrying major functional actions has been easier than retrying individual AWS calls (e.g., retry everything that goes into creating a new instance, and include robust cleanup code)
  Added reporting to track all “orphaned” resources for edge cases where cleanup isn’t perfect
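A minimal sketch of retrying a whole functional action, rather than each individual call, might look like this. The `retry` helper and the three step functions are hypothetical; `attach_volume` is rigged to fail on its first call to mimic a flaky web-service call.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt $attempt failed, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Hypothetical stand-ins for the real provisioning steps.
CALLS=0
launch_instance() { echo "launch instance"; }
attach_volume()   { CALLS=$((CALLS + 1)); [ "$CALLS" -gt 1 ] && echo "attach volume"; }
cleanup_partial() { echo "cleanup partial resources"; }

# The unit of retry is the whole action: on any failure, clean up what was
# created so the next attempt starts from scratch.
create_instance() {
  if launch_instance && attach_volume; then
    return 0
  fi
  cleanup_partial
  return 1
}

retry 3 0 create_instance
```

Here the first attempt fails partway, the cleanup runs, and the second attempt succeeds. A real version would use a non-zero delay and record anything the cleanup could not remove, in line with the orphaned-resource reporting mentioned above.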
Tip: Use Userdata
  Possible to pass dynamic data to an instance at boot as userdata
  Userdata has a small size limit, so we securely download the full startup script from S3 and then execute it
    $ export INSTANCESTARTUP_VERSION=instanceStartup-1.0.1.sh
    $ /usr/local/jive/bin/s3-curl/s3curl.pl --id $AWS_ACCESS_ID --key $AWS_SECRET_KEY -- -f --retry 5 --connect-timeout 10 -y 10 http://xxx.s3.amazonaws.com/$INSTANCESTARTUP_VERSION > $INSTANCESTARTUP_VERSION
    $ chmod +x $INSTANCESTARTUP_VERSION
    $ ./$INSTANCESTARTUP_VERSION
Tip: Handling Email
  Sending email directly from EC2 doesn’t work: reverse DNS won’t resolve the way it needs to, and big providers simply mark all of EC2 as spam
  Solution: relay mail to an external server at a trusted IP address. We use the same infrastructure that the ASP environment does. Large amount of email being sent = high sender score
  Also, check out Sendgrid (http://www.sendgrid.com)