Scaling Blackboard Learn™ for High Performance and Delivery


Published in: Technology
  • When I got here in 2003, we were working on Bb 6.0; 6.0.11 was the first release put out within a few weeks of my arrival. The technology had shifted from a pure Perl app to a hybrid Java and Perl app. In fact, Bb was the largest PerlEx ISV in the world. Go figure! The way LM systems were being used before 2003 was very different from today. We started to see a transformation of usage and adoption. Customers were having issues getting comfortable with the freedom to optimize Java, Oracle, SQL Server, etc. Went to our first benchmark at Sun in early 2004, the Tunathon, and were able to scale Bb. Timeline visual from 1998 to present day: customers and versions; technologies via call-outs; benchmarks.
  • In late 2004, we started building the Ref Arch as a model for customers. Proved it out in benchmarks, as well as in our own hosting facilities. We needed other players to come in and work with us to help us learn and validate a solution. Key to our success: an aggressive port from Perl to Java and the earliest adoption of technologies: Solaris 10, Oracle 10g, RHL 4 and 5, SQL 2005 and Java 5/6. Willingness to adopt virtualization very early on. Willingness to open our technology stack to affordable solutions such as NFS and CIFS.
  • Need to add a visual of RefArch I, which was mainly a server/storage/network approach with basic monitoring.
  • Faster = hotter systems, high-frequency CPUs (scale-out). Solaris side = bigger is better (scale-up). Fiber storage = performance. Low-cost storage = JBODs. Keep the environment simple.
  • RefArch II expanded to cover: hardware/storage/networking; clustering (app and DB); virtualization; user experience management; LB optimization techniques; services; DRM; early social media integration.
  • Virtualization; clustering; low-cost storage can still give performance if you use storage for the right purpose (7,200 RPM for shared storage, 10k to 15k RPM for DB); data center consolidation; 64-bit; compression/caching; image optimization; RUM; enterprise monitoring; move to blades; thread computing (Niagara); synthetic monitoring.
  • RefArch III is a very different world: mobility; security; logging; monitoring services; web optimization; ALM; cloud integration; benchmarking/testing; SaaS app integration; provision management; CDNs and content integration points; data warehousing; synchronous communication.
  • Slide of definitions: Performance, Scalability and Availability. Format: 2-column view, definition on left and example on right. Performance: response times and expectations (immediate, instantaneous); visual of performance. Scalability: throughput, workload and concurrency; visual of performance. Availability: system and application up-time; chart of uptime metrics.
  • Non-Functional Requirements: what non-functional requirements are and why they are important for Performance, Scalability and Availability. Institutions need to define NFRs and SLAs for operating their environments. You can’t achieve scale unless scale is tangible. Scale has to be discrete, but also realistic. Wanting something doesn’t always mean it’s entirely possible.
  • Great Paper:
  • Why automated provisioning: simple routine of provisioning systems; master processes and reduce human error; balance workloads; quick recovery. Puppet/Chef, Dell AIM, vCenter.
  • Group them over Performance, Scalability and Availability categories. Infrastructure monitoring: tools that manage hardware and network infrastructure. User experience monitoring: measuring response times. ALM: deep insight into the application container. Database monitoring: remote monitoring/services tools. Synthetic monitoring. Log management tools.
  • First need to help customers understand how to monitor our caches. What caches are and why they are important. Insight into the caches via the Admin Console and other JMX access points. Caches can and should be controlled via the cache settings. Next generation of caches: pluggable caches and exploration of distributed caches.
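
To make the point about watching caches a little more concrete, here is a minimal sketch in Python of a size-bounded cache that counts hits and misses. It is not Blackboard Learn code and not the Ehcache API; the class name `MonitoredCache`, the `max_entries` setting and the `loader` callback are all hypothetical, chosen only to show why a cache size setting and a hit ratio are worth monitoring together.

```python
from collections import OrderedDict


class MonitoredCache:
    """Illustrative LRU cache with hit/miss counters (hypothetical, not a real API)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # the "cache setting" an admin would tune
        self._data = OrderedDict()      # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        """Return the cached value, calling loader(key) only on a miss."""
        if key in self._data:
            self.hits += 1
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = loader(key)
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry
        return value

    def hit_ratio(self):
        """Fraction of lookups served from the cache (0.0 if never used)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A low hit ratio on a cache that is constantly evicting is the kind of signal that suggests the size setting is too small for the workload; a real deployment would read the equivalent counters from the Admin Console or a JMX tool rather than from a class like this.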

    1. Scaling Blackboard Learn™ for High Performance and Availability
       Stephen Feldman
       Sr. Director Performance, Security and Architecture
    2. Quick Bio
       Blackboard since 2003
       Performance Engineering from the start
       Platform Architecture in 2005
       Security Engineering in 2010
       “Love my job…love my team. If you email me, I will respond.”
       @seven_seconds
    3. A Quick History Lesson of Bb…
       First release after I arrived was 6.0.11, launched within a few weeks of my arrival.
       Technology shift from Perl to Java through Release 5 and Release 8.
       Blackboard was the largest PerlEx ISV in the world in 2003.
       Customers were having issues with optimizing Java, Oracle and SQL Server.
       First benchmark was at Sun in 2004, called the Tunathon.
       Learned that Blackboard Learn could scale, and could scale to high levels with a little TLC.
    4. As We Started Growing and Scaling
       In late 2004, we started building the Ref Arch as a model for customers.
       Proved it out in benchmarks, as well as our own hosting facilities.
       We needed other players to come in and work with us to help us learn and validate a solution.
       Key to our success: aggressive port from Perl to Java, earliest adoption of technologies: Solaris 10, Oracle 10g, RHL 4 and 5, SQL 2005 and Java 5/6
       Willingness to adopt virtualization very early on
       Willingness to open our technology stack for affordable solutions such as NFS and CIFS
    5.
    6. Where We Are Today
       We have multiple customers supporting nearly 1 million users and dozens well over 250k live production users.
       Our benchmarks have been successful supporting over 1 million users with greater than 100k simultaneous sessions with sub-3s response times.
       The majority of our customers have benefitted from the Reference Architecture and have completely transformed their deployment to support the adoption and growth of the product.
    7.
    8. In The Beginning: RefArch I
    9. Focus of RefArch I
       Distribution of application and database
       Need for load-balancing the application server
       Early JVM clustering
       Fiber Storage and High-Speed Disks
       Low-cost option to use JBODs
       Basic operational monitoring
       Hardware, Network and Storage
       Database
       Keep it simple and you will succeed
    10.
    11. A Few Years Later Came RefArch II
    12. …Then Marketing Got their Hands on It
    13. Focus of RefArch II
    14. What are we modeling today and future…
    15. Unified Approach Working Together
        No Longer Center, but Parallel…
    16. Introducing RefArch III
        Logging and Monitoring
        Cloud Services
        Secure Performance
        Analytics
        Identity & Access Management
        Immunity
        Mobility
        Web Optimization
        Virtualization & Provisioning
        Data Management
    17. Now Comes RefArch III
    18.
    19. Defining SLAs
    20. Defining SLAs
    21. What is Performance?
        Performance is quantifiable and measurable
        Performance is also perception
        Mostly recognized from a cognitive perspective
        Instantaneous
        Immediate
        Continuous
        Captive
    22. What is Scalability?
    23. What is Availability?
        High-availability offerings mask the effects of a system failure in order to minimize the impact on access to, and functional use of, a system by a community of users.
        Simple Definition:
        Percentage of time the system is in its operational state.
        You will often hear the concept of 3x9’s, 4x9’s or even 5x9’s
        Planned versus Unplanned
        Availability = (Total Units of Time – Downtime) / Total Units of Time
        8760 hours in a year
        Downtime = 10 hours
        Availability = (8760 – 10)/8760 ≈ 99.89%
    24. Quick View into Availability Statistics
    25.
    26. Automated Provisioning
        Simple routine of provisioning systems
        Master processes and reduce human error
        Balance workloads
        Quick recovery
        Emphasis on efficient computing
    27. Complete Monitoring and Logging Solutions
    28. Application Lifecycle Management
        True application insight and visibility
        Business processing mapping to transaction SLAs
        Multi-layer correlation
        Transaction workflow mapping
    29. Web Optimization Services
        Typical Optimization Services
        Compression
        Domain Sharding
        Minification
        Consolidation
        Inlining
        Asynchronous JavaScript
        Response Prediction
        Browser Caching
    30. Present and Future of Caches
        Caches are used throughout Blackboard Learn to manage the life and reuse of data.
        Leveraging Ehcache presently in Release 9.1
        Caches can and should be controlled via the file
        Insight into the caches can be achieved in the Admin Console and other JMX tools.
        Next generation of caches: pluggable caches (use your own) and distributed caches
    31. Steve Feldman
        @seven_seconds
    32. Please provide feedback for this session by
        Scaling Blackboard Learn™
        for High Performance and Delivery
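
The availability formula on the "What is Availability?" slide can be worked through in a few lines of Python. This is only a sketch of the arithmetic: the function names `availability` and `downtime_budget_hours` are hypothetical, and reading "3x9's" as 99.9% (three nines), "4x9's" as 99.99% and so on is the usual industry convention, assumed here rather than taken from the slide.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours, the figure used on the slide


def availability(total_hours, downtime_hours):
    """Availability = (total time - downtime) / total time, as a fraction."""
    return (total_hours - downtime_hours) / total_hours


def downtime_budget_hours(nines, total_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed per period for a given number of nines.

    Assumes the usual convention: 3 nines = 99.9%, 4 nines = 99.99%, etc.
    """
    allowed_unavailability = 10 ** -nines
    return total_hours * allowed_unavailability


if __name__ == "__main__":
    # The slide's example: 10 hours of downtime in a year.
    print(f"{availability(HOURS_PER_YEAR, 10):.2%}")
    for nines in (3, 4, 5):
        print(nines, "nines:", round(downtime_budget_hours(nines), 4), "hours/year")
```

Under that convention, three nines leaves roughly 8.76 hours of downtime per year, so the slide's example of 10 hours of downtime falls just short of three nines. Whether planned downtime counts against the budget is an SLA decision and is not modelled here.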