Scaling Blackboard Learn™ for High Performance and Delivery
Published in: Technology
  • When I got here in 2003, we were working on Bb 6.0; 6.0.11 was the first release, put out within a few weeks of my getting here. The technology had shifted from a pure Perl app to a hybrid Java and Perl app. In fact, Bb was the largest PerlEx ISV in the world. Go figure! The way LM systems were being used before 2003 was very different from today, and we started to see a transformation of usage and adoption. Customers were having issues getting comfortable with the freedom to optimize Java, Oracle, SQL Server, etc. We went to our first benchmark at Sun in early 2004, the Tunathon, and were able to scale Bb. Timeline visual from 1998 to present day: customers and versions, technologies via call-outs, benchmarks.
  • In late 2004, we started building the Ref Arch as a model for customers. We proved it out in benchmarks, as well as in our own hosting facilities. We needed other players to come in and work with us to help us learn and validate a solution. Key to our success: an aggressive port from Perl to Java and the earliest adoption of technologies (Solaris 10, Oracle 10, RHL 4 and 5, SQL 2005 and Java 5/6); a willingness to adopt virtualization very early on; and a willingness to open our technology stack to affordable solutions such as NFS and CIFS.
  • Need to put a visual of RefArch I which was mainly a server/storage/network approach w/ basic monitoring
  • Faster = hotter systems, high-frequency CPUs (scale-out). Solaris side = bigger is better (scale-up). Fiber storage = performance. Low-cost storage = JBODs. Keep the environment simple.
  • RefArch II expanded to cover: Hardware/Storage/Networking; Clustering (App and DB); Virtualization; User Experience Management; LB optimization techniques; Services; DRM; Early Social Media Integration.
  • Virtualization. Clustering. Low-cost storage can still give performance: use storage for the right purpose (7,200 RPM for shared storage, 10k to 15k RPM for the DB). Data center consolidation. 64-bit. Compression/caching. Image optimization. RUM. Enterprise monitoring. Move to blades. Thread computing (Niagara). Synthetic monitoring.
  • RefArch III is a very different world: Mobility; Security; Logging; Monitoring Services; Web Optimization; ALM; Cloud Integration; Benchmarking/Testing; SaaS App Integration; Provision Management; CDNs and Content Integration Points; Data Warehousing; Synchronous Communication.
  • Slide of definitions: Performance, Scalability and Availability. Format: two-column view, with the definition on the left and an example on the right. Performance: response times and expectations (immediate, instantaneous); visual of performance. Scalability: throughput, workload and concurrency; visual of performance. Availability: system and application up-time; chart of uptime metrics.
  • Non-Functional Requirements: what non-functional requirements are and why they are important for Performance, Scalability and Availability. Institutions need to define NFRs and SLAs for operating their environments. You can’t achieve scale unless scale is tangible. Scale has to be discrete, but also realistic. Wanting something doesn’t always mean it’s entirely possible.
  • Great Paper:
  • Why Automated Provisioning: simplify the routine of provisioning systems; master processes and reduce human error; balance workloads; quick recovery. Puppet/Chef, Dell AIM, vCenter.
  • Group them over the Performance, Scalability and Availability categories. Infrastructure Monitoring: tools that manage hardware and network infrastructure. User Experience Monitoring: measuring response times. ALM: deep insight into the application container. Database Monitoring: remote monitoring/services tools. Synthetic Monitoring. Log Management Tools.
  • First we need to help customers understand how to monitor our caches: what caches are and why they are important. Insight into the caches comes via the Admin Console and other JMX access points. Caches can and should be controlled via the cache settings. Next generation of caches: pluggable caches and exploration of a distributed cache.
  • Transcript

    • 1. Scaling Blackboard Learn™ for High Performance and Availability
      Stephen Feldman
      Sr. Director Performance, Security and Architecture
    • 2. Quick Bio
      Blackboard since 2003
      Performance Engineering from the start
      Platform Architecture in 2005
      Security Engineering in 2010
      “Love my job…love my team. If you email me, I will respond.”
    • 3. A Quick History Lesson of Bb…
      First release was 6.0.11, launched within a few weeks of my arriving.
      Technology shift from Perl to Java through Release 5 and Release 8.
      Blackboard was the largest PerlEx ISV in the world in 2003.
      Customers were having issues with optimizing Java, Oracle and SQL Server
      First benchmark was at Sun in 2004 called the Tunathon.
      Learned that Blackboard Learn could scale, and could scale to high levels with a little TLC.
    • 4. As We Started Growing and Scaling
      In late 2004, we started building the Ref Arch as a model for customers
      Proved it out in benchmarks, as well as our own hosting facilities.
      We needed other players to come in and work with us to help us learn and validate a solution
      Key to our success: aggressive port from Perl to Java, earliest adoption of technologies: Solaris 10, Oracle 10g, RHL 4 and 5, SQL 2005 and Java 5/6
      Willingness to adopt virtualization very early on
      Willingness to open our technology stack for affordable solutions such as NFS and CIFS
    • 5.
    • 6. Where We Are Today
      We have multiple customers supporting nearly 1 million users, and dozens supporting well over 250k live production users.
      Our benchmarks have been successful supporting over 1 million users with greater than 100k simultaneous sessions with sub-3s response times.
      The majority of our customers have benefitted from the Reference Architecture and have completely transformed their deployment to support the adoption and growth of the product.
    • 7.
    • 8. In The Beginning: RefArch I
    • 9. Focus of RefArch I
      Distribution of application and database
      Need for load-balancing the application server
      Early JVM clustering
      Fiber Storage and High-Speed Disks
      Low-cost option to use JBODs
      Basic operational monitoring
      Hardware, Network and Storage
      Keep it simple and you will succeed
    • 10.
    • 11. A Few Years Later Came RefArch II
    • 12. …Then Marketing Got their Hands on It
    • 13. Focus of RefArch II
    • 14. What are we modeling today and future…
    • 15. Unified Approach Working Together
      No Longer Center, but Parallel…
    • 16. Introducing RefArch III
      Logging and Monitoring
      Cloud Services
      Secure Performance
      Identity & Access Management
      Web Optimization
      & Provisioning
      Data Management
    • 17. Now Comes RefArchIII
    • 18.
    • 19. Defining SLAs
    • 20. Defining SLAs
    • 21. What is Performance?
      Performance is quantifiable and measurable
      Performance is also perception
      Mostly recognized from a cognitive perspective
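One way to make "quantifiable and measurable" concrete is to check a percentile of observed response times against a target, such as the sub-3s figure quoted earlier in the deck. The sketch below is only an illustration: the sample data, the choice of the 90th and 95th percentiles, and the function names are assumptions, not something the slides specify.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of response times."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_target(samples, pct, target_seconds):
    """True if the pct-th percentile response time is within the target."""
    return percentile(samples, pct) <= target_seconds

# Hypothetical response times in seconds (illustrative only).
response_times = [0.8, 1.1, 1.3, 0.9, 2.7, 1.0, 4.2, 1.2, 1.4, 1.1]

print(percentile(response_times, 90))         # 90th percentile response time
print(meets_target(response_times, 90, 3.0))  # within a 3-second target?
print(meets_target(response_times, 95, 3.0))  # stricter percentile, same target
```

A percentile is used here rather than an average because a few slow requests can hide behind a good mean while still shaping how users perceive the system.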
    • 22. What is Scalability?
    • 23. What is Availability?
      High-availability offerings mask the effects of a system failure in order to minimize the impact of access and functional use of a system to a community of users.
      Simple Definition:
      Percentage of time the system is in its operational state.
      You will often hear the concept of 3x9’s, 4x9’s or even 5x9’s
      Planned versus Unplanned
      Availability = (Total Units of Time – Downtime) / Total Units of Time
      8760 hours in a year
      Downtime = 10 hours
      Availability = (8760 – 10)/8760 ≈ 99.89%
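The availability formula on this slide can be turned into a small calculation, and inverted to show how much downtime per year each "nines" target allows. This is a minimal sketch of the arithmetic only; the function names are chosen for illustration and it assumes a non-leap year of 8760 hours, as the slide does.

```python
HOURS_PER_YEAR = 8760  # non-leap year, as on the slide

def availability(total_hours, downtime_hours):
    """Availability = (total time - downtime) / total time."""
    return (total_hours - downtime_hours) / total_hours

def downtime_budget_hours(target, total_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed per period for a target availability (0..1)."""
    return total_hours * (1 - target)

# The slide's example: 10 hours of downtime in a year.
print(f"{availability(HOURS_PER_YEAR, 10):.2%}")

# Downtime budgets for the common "nines" targets.
for label, target in [("3x9's", 0.999), ("4x9's", 0.9999), ("5x9's", 0.99999)]:
    hours = downtime_budget_hours(target)
    print(f"{label}: {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

Whether planned maintenance counts against the downtime figure is a policy decision for the SLA; the formula itself does not distinguish planned from unplanned outages.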
    • 24. Quick View into Availability Statistics
    • 25.
    • 26. Automated Provisioning
      Simple routine of provisioning systems
      Master processes and reduce human error
      Balance workloads
      Quick recovery
      Emphasis on efficient computing
    • 27. Complete Monitoring and Logging Solutions
    • 28. Application Lifecycle Management
      True application insight and visibility
      Business processing mapping to transaction SLAs
      Multi-layer correlation
      Transaction workflow mapping
    • 29. Web Optimization Services
      Typical Optimization Services
      Domain Sharding
      Asynchronous JavaScript
      Response Prediction
      Browser Caching
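Of the techniques listed, domain sharding is the easiest to illustrate: static assets are spread across several hostnames so that browsers open more parallel connections. The sketch below assumes hypothetical shard hostnames and a CRC32-based mapping; it is not a description of how any particular optimization service or Blackboard Learn does it. The key property is that a given asset always maps to the same host, so browser caching still works.

```python
import zlib

# Hypothetical shard hostnames, for illustration only.
SHARD_HOSTS = ["static1.example.edu", "static2.example.edu", "static3.example.edu"]

def shard_url(path, hosts=SHARD_HOSTS):
    """Map an asset path to a stable shard hostname.

    CRC32 is deterministic across processes and runs, so the same path
    always resolves to the same host and cached copies stay valid.
    """
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return f"https://{hosts[index]}{path}"

for asset in ["/images/logo.png", "/css/theme.css", "/js/app.js"]:
    print(shard_url(asset))
```

Python's built-in `hash()` is avoided on purpose: it is randomized per process for strings, which would send the same asset to different hosts after a restart.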
    • 30. Present and Future of Caches
      Caches are used throughout Blackboard Learn to manage the life and reuse of data.
      Leveraging ehCache presently in Release 9.1
      Caches can and should be controlled via the file
      Insight into the caches can be achieved in the Admin Console and other JMX tools.
      Next generation of caches: pluggable caches (use your own) and distributed caches
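To show what "insight into the caches" typically means in practice, here is a minimal sketch of a size-bounded cache that tracks hits and misses. It is written in plain Python for illustration and is not Ehcache's API or Blackboard Learn's implementation; the class name, the least-recently-used eviction policy, and the loader callback are all assumptions.

```python
from collections import OrderedDict

class MonitoredCache:
    """Size-bounded LRU cache that counts hits and misses (illustrative only)."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        """Return the cached value, or call loader(key) on a miss and store it."""
        if key in self._data:
            self.hits += 1
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = loader(key)
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used entry
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

cache = MonitoredCache(max_entries=2)
for course_id in ["a", "b", "a", "c", "b"]:
    cache.get(course_id, loader=lambda k: f"course-{k}")
print(cache.hits, cache.misses, cache.hit_ratio())
```

A low hit ratio on a cache that is constantly evicting entries is the usual signal that its size setting is too small for the workload, which is the kind of tuning decision the slide is pointing at.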
    • 31. Steve Feldman
    • 32. Please provide feedback for this session by
      Scaling Blackboard Learn™
      for High Performance and Delivery