Cloud Tran At Cloud Crowd No Cats

    Notes on slide 1

    Best seen in slide show mode ... Billed as CloudSave, now CloudTran.

    This is the answer ... before I start asking the questions.

    In early 2001, we were doing a lot of consultancy for BEA as architects, trouble-shooters and trainers on early J2EE systems in the UK and Europe, for all sorts of companies from small to large. The consultancy salesman approached us with the basic problem: "Lots of companies mess up their applications, and that costs us money because it delays our deployments. Can you help solve it, e.g. with a master class?" We didn't think that would work, but we did think automating development would. So we worked intimately with web applications and three/four-tier architectures and how they work in Java/J2EE. EJBs, Hibernate, Spring, Struts, JSF, AJAX and its libraries ... they all give a platform to make it easier.

    Tracy's story - hitting the wall with Ruby on Rails, followed by modularising and caching as much as possible. The pattern of successful applications is: - DB - quick and simple - Cache the hell out of it - Grid/cloud - in-memory database - rearchitecting along the way, which is a cost. With CloudTran, you can start with a simple application and scale out along the way. 100,000 users online used to be monstrous (in 2000); now millions of users online is fairly standard. The "50,000 club": I hear a lot about the size of applications growing for the foreseeable future. The scalable and cost-effective way to do this is the IMDG.

    This is the business benefit side of fast response time. People are impatient and it's getting worse. The current figure, 50% abandoning after 2 seconds, is a lot worse than four years ago, when 28% abandoned after 4 seconds. If you can keep customers, on average you're ahead. If you can make your site super fast, you'll get people wanting to stay because of the pleasant experience.

    A lot of this will be in iBatis, Hibernate, JPA - ORMs that take care of some of the database. So what does it take to build a finance or e-commerce web site in GigaSpaces? Is it possible for it to go mainstream? Right now it's the preserve of leading programmers. The problem CloudTran solves for regular application developers is the difficulty of distributed programming. You can't just say Customer.getOrders() - you have to worry about finding the orders, pulling them back, timeouts, transient failures. Every little thing becomes magnified - how best to di
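
To make the Customer.getOrders() point concrete, here is a minimal sketch of what a remote lookup tends to need once the orders live on another node: a per-attempt timeout and a bounded number of retries for transient failures. `RemoteCall` and `withRetry` are hypothetical names invented for this illustration; they are not part of CloudTran or GigaSpaces, and the lookup itself is passed in as a plain `Supplier`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not a CloudTran or GigaSpaces API.
final class RemoteCall {
    private RemoteCall() {}

    /**
     * Runs a remote lookup with a per-attempt timeout and a bounded number
     * of attempts. Returns the first successful result; if every attempt
     * times out or fails, throws an exception describing the last failure.
     */
    static <T> T withRetry(Supplier<T> lookup, int maxAttempts, long timeoutMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Run the lookup on another thread so the caller can give up waiting.
            CompletableFuture<T> future = CompletableFuture.supplyAsync(lookup);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Marks the future as cancelled; it does not interrupt the lookup itself.
                future.cancel(true);
                last = new IllegalStateException("attempt " + attempt + " timed out", e);
            } catch (ExecutionException e) {
                // The lookup threw: treat it as a transient failure and try again.
                last = new IllegalStateException("attempt " + attempt + " failed", e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        throw last;
    }
}
```

A caller would wrap the fetch, for example `RemoteCall.withRetry(() -> fetchOrders(customerId), 3, 500)`, where `fetchOrders` is whatever code actually reaches the other node. Every such call site is boilerplate that an in-process `getOrders()` never needed.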

    The sort of transactions I'm talking about are used in mission-critical applications where money changes hands or orders are taken. One node and one database is a common scenario for doing transactional work - but it's a well-understood scenario. The only thing we add there is faster time-to-commit and buffering against slow databases. Things get more exciting - by which I mean difficult - when we [ to get the full effect here, run this as a slide show! ] - add a second machine - or a second disk - or go for the scalable solution where you can have as many nodes and as many databases or persistence stores as you want. This is how you get scalability, but it raises additional issues - how to spread large datasets across the grid and how to coordinate all the data into an ordered update. The other thing I'm going to note is that this is for high-performance applications where the system of record - the master data - is in the In-Memory Data Grid. This makes the whole thing fit with GigaSpaces' strengths and SBA. Using private or public clouds is problematic right now if you want to do mission-critical transactions - there is no standard infrastructure that lets you do it quickly and robustly. Equally, for experienced application programmers, there is the problem of how to organise a distributed array of computers to quickly develop an application. Standard solutions like distributed transactions are too slow and unreliable in these situations. But this is what a lot of developers are interested in - as long as the speed and reliability are there. It's a big, technically complicated problem - but it's valuable too.

    To address mainstream development teams building BigApps: 1. This is what you'll be forced to do anyway - put the system of record in the IMDG.

    This is the positive side of using IMDGs. It's well worth doing the numbers on a given data volume to see if the key transactionality can be done in the cloud.
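
As a worked version of "doing the numbers", the sketch below reproduces the figures from slide 9 of the transcript: 2 million records of 16KB each, 8 servers, and 5% of a $100m/year revenue at risk. The class and method names are hypothetical. Two assumptions are made that the slides do not state: 1KB is taken as 1,000 bytes (which is what makes the total come out at 32GB), and the data is spread evenly over the servers with no allowance for backup copies or JVM overhead.

```java
// Hypothetical sizing helper; the inputs come from slide 9, the names do not.
final class GridSizing {
    private GridSizing() {}

    /** Total live data in bytes for a number of records of a given size. */
    static long liveDataBytes(long records, long bytesPerRecord) {
        return records * bytesPerRecord;
    }

    /** Bytes each server holds if the data is spread evenly (rounded up). */
    static long bytesPerServer(long totalBytes, int servers) {
        if (servers < 1) {
            throw new IllegalArgumentException("servers must be at least 1");
        }
        return (totalBytes + servers - 1) / servers;
    }

    /** Revenue at stake per year, given the percentage considered vulnerable. */
    static long revenueAtStake(long annualRevenue, long vulnerablePercent) {
        return annualRevenue * vulnerablePercent / 100;
    }
}
```

With the slide's inputs, `liveDataBytes(2_000_000, 16_000)` is 32,000,000,000 bytes (32GB), `bytesPerServer` of that over 8 servers is 4GB each, and `revenueAtStake(100_000_000, 5)` is $5m/year, matching the figures on the slide.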

    So working with data in the cloud is a real challenge - everything you read from Amazon, eBay and so on says: you'll be sorry! - Standard distributed transactions are slow and unreliable at cloud level. - Machines go down: what are you going to do about that in a distributed transaction? - Distributed application programming is really complicated and most application programmers won't get it.

    It took us three goes to get the design right. Dan Stone ("Dan's the Man") proved invaluable in destructing design versions 1 & 2 ... and then saving us when we thought V3 was no good - he remembered the magic sauce we had forgotten.

    There will be many people scoffing - people have tried to do distributed transactions that are fast and reliable, and the industry reaction is that they don't work. (Isn't it incredible this hasn't been solved?!) So we have to be clear we're not fixing distributed transactions - we're redefining the problem. - Grid connected: Helland specifically mentions that DTXs won't work, except maybe in a "tightly-connected cluster". That's what the cloud is! - System of Record in the grid, in other words "IMDG". So this adds value to the GigaSpaces IMDG approach. The application knows if the data is all right - it owns it. This means there is no need for a prepare/commit split: the "prepare" phase can be done as an object is put into the space, which saves time and reduces complexity. - Commit to backed-up memory: in other words, this is a pretty neat application of GigaSpaces. What we are relying on here is that backed-up memory is usually good enough to store a transaction. A computer node typically goes down once every year, whereas a disk has an outage once every three years. So a pair of backed-up nodes is very unlikely to go down in the time it takes to get the information to the persistence source. However ... sometimes it will, of course. So to cover that eventuality, we write a temporary record of the transaction to disk. When the transaction is persisted to all target datasources, the temporary log is deleted. (The number of logs clearly gets huge and performance is better if we get rid of them.) - Some of the problems that users experience are in understanding how to deploy an application into the space, getting the ... - Uses GigaSpaces to help solve the total problem: CloudSave shows developers how to easily use the GigaSpaces SBA pattern to achieve great performance (or low cost). In the implementation we also use many GigaSpaces features. The complete list from memory is: - Local Transactions - JavaSpaces/GigaSpaces - Automatic Back-up - SQL view of memory (Query) - SBA - Partitioned clusters.
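
The commit path described in these notes (commit to backed-up memory, write a temporary log, delete the log once every target datasource has persisted the transaction) can be modelled in a few lines. This is a toy model under stated assumptions, not CloudTran's implementation: plain maps stand in for the primary node, the backup node and the on-disk log, and all class and method names are made up for the illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the lifecycle in the notes; every name here is hypothetical.
final class TransactionBufferSketch {
    private final Map<Long, String> primaryMemory = new HashMap<>();
    private final Map<Long, String> backupMemory = new HashMap<>();
    // Stands in for the temporary on-disk record of each transaction.
    private final Map<Long, String> temporaryLog = new HashMap<>();
    // Datasources that have not yet confirmed each transaction.
    private final Map<Long, Set<String>> pending = new HashMap<>();
    private final Set<String> datasources;

    TransactionBufferSketch(Set<String> datasources) {
        if (datasources.isEmpty()) {
            throw new IllegalArgumentException("at least one datasource is required");
        }
        this.datasources = new HashSet<>(datasources);
    }

    /** Commit: hold the transaction in primary and backup memory, and log it. */
    void commit(long txId, String payload) {
        primaryMemory.put(txId, payload);
        backupMemory.put(txId, payload);
        temporaryLog.put(txId, payload);
        pending.put(txId, new HashSet<>(datasources));
    }

    /** One datasource confirms it has persisted the transaction. */
    void confirmPersisted(long txId, String datasource) {
        Set<String> waitingOn = pending.get(txId);
        if (waitingOn == null) {
            return; // unknown or already fully persisted
        }
        waitingOn.remove(datasource);
        if (waitingOn.isEmpty()) {
            // Persisted everywhere: the temporary log is no longer needed.
            pending.remove(txId);
            temporaryLog.remove(txId);
        }
    }

    boolean isLogged(long txId) {
        return temporaryLog.containsKey(txId);
    }

    boolean isInBackedUpMemory(long txId) {
        return primaryMemory.containsKey(txId) && backupMemory.containsKey(txId);
    }
}
```

The point of the model is the ordering: the caller's commit returns once memory and the log hold the transaction, while the slower datasources catch up afterwards, and the log entry only disappears after the last confirmation.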

    The TxB not only gives you good speed - it gives you buffering if the persistent store goes down. Regular disk writes get you of the order of 200 (tiny) writes per second. The transaction buffer today runs at 2,000 (tiny) transactions per second on a single-socket quad-core machine (i7 920 chip). The transaction buffer is at 60% CPU at that point. With a higher-spec CPU, 4 or 6 CPUs and more tuning, we hope to get to 15-20,000 transactions per second.
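
The buffering claim can be put into rough numbers. The sketch below estimates how many transactions pile up in the buffer while the store is slower than the arrival rate, and how long the backlog takes to clear afterwards. The 2,000/second and 200/second rates are the ones quoted in these notes; treating one tiny transaction as one tiny disk write is an assumption made only for this illustration, and the names are hypothetical.

```java
// Back-of-envelope backlog arithmetic; names and the 1:1 tx-to-write mapping are assumptions.
final class BufferBacklog {
    private BufferBacklog() {}

    /** Transactions queued after `seconds` of arrivals outpacing the store. */
    static long backlogAfter(long arrivalPerSecond, long drainPerSecond, long seconds) {
        long growthPerSecond = Math.max(0L, arrivalPerSecond - drainPerSecond);
        return growthPerSecond * seconds;
    }

    /** Seconds needed to clear a backlog once the store drains faster than arrivals. */
    static long secondsToDrain(long backlog, long arrivalPerSecond, long drainPerSecond) {
        long netDrain = drainPerSecond - arrivalPerSecond;
        if (netDrain <= 0) {
            throw new IllegalArgumentException("backlog never clears at these rates");
        }
        return (backlog + netDrain - 1) / netDrain; // round up to whole seconds
    }
}
```

For example, one minute at 2,000 arrivals against 200 writes per second leaves `backlogAfter(2000, 200, 60)` = 108,000 transactions in the buffer, and a one-minute outage with nothing draining leaves 120,000. That is the memory the buffer has to hold for the speed-up to be safe.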

    This is the standard today – you can use simple (native) transactions with no connections between the boxes.

    Key points Even if you’ve got a single database and two machines, then you need a database. We’ve had to use this for customer projects. Here kitty, kitty...

    What a Java developer has to do to use distributed, scalable data.

    GigaSpaces local and distributed transactions work extremely well ... at the level of the grid. CloudTran uses local transactions for atomicity and isolation. The Mirror service also works well when you do not need strong transactionality for distributed operations (two or more DBs, or two or more spaces). When you do need it, this is where CloudTran comes in.

    Best seen in display view, with animation. Key features: - distribution of data across spaces, and across partitions - entity grouping; ORM gets the right groups - any number of hops to services on other nodes - arbitrarily scalable - number of nodes involved, numbers of rows, size of data, number of datasources or messages - default datasource is JDBC; plug-ins for other datasources
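
Entity grouping is easiest to see with a small routing example: if an order is routed by its customer's key rather than its own, the customer and all of its orders land in the same partition and can be handled together. The sketch below uses hash-modulo routing, which is an assumption for illustration and not a statement of how CloudTran or GigaSpaces assigns partitions; `EntityGroupRouting` and `OrderRef` are hypothetical names.

```java
// Hypothetical routing sketch; hash-modulo placement is an assumption.
final class EntityGroupRouting {
    private EntityGroupRouting() {}

    /** Partition index for an entity group, derived from the group's root key. */
    static int partitionFor(Object groupRootKey, int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("partitionCount must be at least 1");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(groupRootKey.hashCode(), partitionCount);
    }
}

// An order belongs to its customer's entity group, so it routes by customerId.
final class OrderRef {
    final String orderId;
    final String customerId;

    OrderRef(String orderId, String customerId) {
        this.orderId = orderId;
        this.customerId = customerId;
    }

    /** The key used for placement: the group root, not the order's own id. */
    String routingKey() {
        return customerId;
    }
}
```

With this scheme, `partitionFor(order.routingKey(), n)` always equals `partitionFor(customerId, n)` for that order's customer, so resolving the object reference from customer to orders stays within one partition.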

    This is about performance tuning and the sort of things you have to do to remove bottlenecks. Hopefully there's another 50% to come on our single-socket quad-core box.

    This is a bit of fun - this is the application of GigaSpaces and CloudTran that I dream about when I'm using LastMinute.com, which I just tried - it takes 28 seconds to look up a flight from London to Boston. I'd like to see - eventually - companies having a presence in a shared cloud, and then using CloudTran as a federated transaction buffer. One commentator has called us "The Twitch Generation" because we're so impatient. This setup would allow companies like LastMinute to aggregate all their temporary reservations in the cloud very quickly ... and respond in milliseconds. And just as in finance, on a web site latency really matters: Amazon found that every 100ms of latency cost them 1% in lost sales. So companies that want a competitive advantage could use GigaSpaces, in-memory databases and CloudTran for persistence to boost their performance.

    Cloud Tran At Cloud Crowd No Cats - Presentation Transcript

    1. Scalable Transactions in the Cloud Matthew Fowler, NT/e CloudTran CloudSave
    2. ?
      • And the answer is
        • platform for mainstream Java developers
        • to use IMDG
        • for scalable, commercial applications
        • without worry and minimal hassle
        • for commercial advantage
      • It's a lump of middleware
        • built on, adding value to GigaSpaces
    3. 2001
      • WebLogic/J2EE specialisation
      • One week training course
        • 4-point architecture for dummies
      • Messed-up architecture
        • revenue down
      • Automating server-side applications
        • J2EE/EJB
        • Spring/Hibernate
    4. 3-5...5-10...10-20...1,000,000
      • Tracy's story: the path of successful apps
        • Database
        • Caching
        • In-memory Data Grid
      • The 50,000 club
      • Application scale drivers
        • Mobile phone growth, iPhone Apps
        • Micropayments
        • e-commerce continued growth
    5. Get an edge with performance
      • Please wait .....................
      • "Latency really matters ... 100ms of latency costs 1% in sales." Amazon
      • "... almost half of visitors will abandon a site if they perceive a page or feature takes longer than 2 seconds to load." GetElastic
      • "An extra 0.5 seconds in search page generation time dropped traffic by 20%." Google
    6. 6.5m, x10yrs, $400bn/yr
      • Mainstream Java developers
        • 6.5m
        • most have 5-10 years experience
        • 50 million man-years experience
      • Plain old application development market
        • $400bn/year
      • Can they build an IMDG application?
        • How can IMDG go mainstream?
    7. Explaining it to your Mom / Boss IMDG - SOR Persistent Storage
    8. Explaining it to a techie
      • System of Record in IMDG. Keep DB for
        • warehouse apps/BI
        • sleeping at night.
      • Catching the money:
        • ACID transactions
        • throughput, scalability, bullet-proof reliability
        • distributed, data + messaging
      • ORM - Object references, not foreign keys. Easy to program. Entity groups for performance.
    9. In-Memory Data Bases - Are You Crazy?
      • What's it worth:
        • Loss of sales, traffic - 5% vulnerable, saved by speed of IMDG
      • For $100m/year co:
        • $5m/year revenue for good behaviour
        • Customer/order/product data - 2 million * 16KB
        • 8 servers in grid for 32GB live data
          • 8 servers isn't a lot
          • Worth doing the numbers!
    10. Fear and loathing ... Low Reliability Complicated Programming Unintended Consequences of Unknowing Distributed Transactions
    11.  , 1, 2, 3, ... 
      • Other alternatives
        • forget transactions, forget databases
      • Dan's the Man
      • GoogleApps on V2 last we heard
    12. Distributed Cloud Transactions
      • Grid connected
        • Helland's get out clause
      • System of Record is in the grid
        • No voting - 1PC not 2PC
      • Commit to backed-up memory
      • Leverage the GigaSpaces platform
        • SBA/Entity Groups, Transactions, SQL Queries, Backups
      How is it possible? Redefining the problem
    13. 200/ ... 2,000 ... 20,000/second
    14. Transactions you can count on
    15. Transactions you can count on
    16. Herding Cats - Java Style
      • How to distribute data
      • How to find it
      • How to resolve references
        • IMDG versus user view: FK ↔ OO
      • Atomicity on failure
      • Timeouts
      • Scalability
      • Consistency and isolation
    17. The 'T' Word
      • GigaSpaces Local Transactions
      • GigaSpaces Distributed Transactions
      • Mirror service
        • see Cat-Herding 101
    18. How CloudTran ORM works Client TxB Gridsearch OL Data Data Order Service Partitioning (entity groups) Commit Confirm Confirm Commit Commit Datasources Tx Messaging
    19. 300 .. 700 .. 900 .. 2,100
      • Performance of transaction buffer
        • Tiny Transactions per second
    20. In-Cloud Federated Applications Virgin Airways LastMinute.com IMDG IMDG CloudTran - Federated Transaction Buffer
    21. Scalable transactions in the cloud?
        • platform for mainstream Java developers
        • to use IMDG
        • for scalable, commercial applications
        • without worry and minimal hassle
        • for commercial advantage
        • GigaSpaces
      Cloud Tran
    22. End

    These are slides of the session that Matthew Fowler more

    © All Rights Reserved
