BBC Research & Development are deploying a department-wide virtualisation solution, catering for use cases including web development, machine learning, transcoding, media ingest and system testing. This talk discusses the implementation of a high-performance Ceph storage backend and the challenges of virtualisation in a broadcast research and development environment.
Bio:
Joel Merrick has been involved in system administration and engineering for well over a decade. He is the project lead for an internal VM platform at BBC Research & Development. He first became involved in virtualisation more than five years ago, both professionally and through the non-profit Sahana Foundation, where live production deployments of the software have been running under KVM ‘in the field’.
OpenNebulaConf 2013 - Adventures in Research by Joel Merrick
1. Adventures in Research
Joel Merrick
BBC Research & Development
OpenNebula Conference 2013
2. About me
• From Manchester, UK
• Sysadmin by day, Project Lead for Internal Cloud by night
• Involved with Sahana Foundation in 2008, helping with administration
• First production release running on KVM during the 2010 Haiti Earthquake.
• It’s ready for prime-time
3. About BBC R&D
• Established in 1922, shortly after the main organisation
• Initially two divisions, Research Department and Development
• Grew rapidly, moving homes several times
• Eventually settled at Kingswood Warren, Surrey
• Amalgamated into R&D in 1993
• Now 3 sites: Centre House, MediaCity UK, 1 Euston Square
4. About BBC R&D
[Map of BBC R&D sites: Kingswood Warren (Surrey), Centre House (London), MediaCity (Manchester), 1 Euston Square (London)]
5. Previous Technologies Developed
• Noise Cancelling Microphones
• Conversion from 405-line to 625-line
• Colour Television
• Transatlantic Cable & Satellite
• BBC Micro
• NICAM Stereo
• DAB Digital / DTV / Freeview
• YouView
6. Collaboration
Super Hi-Vision with NHK for the London 2012 Olympic Games
http://www.bbc.co.uk/blogs/researchanddevelopment/2012/08/the-olympics-in-super-hi-visio.shtml
7. Areas of Research
• Capture
This area covers learning how to recognise and isolate objects within audio and video files automatically, such as individual sound sources or the motion of an actor or athlete, as well as how best to record and store media so it is durable and compatible with other systems.
• Produce
Our research in this area helps keep costs down and makes production more efficient by developing the kinds of technology that might radically improve the way programmes are made in the future.
• Deliver
This research aims to develop new ways to distribute our programmes, while ensuring audiences receive them in the best possible quality, wherever they are, whenever they want them and whatever device they are using.
• Discover
This area sees us experimenting with new types of programmes and, with the BBC about to open more than 70 years’ worth of archives, how audiences might find and interact with them.
• Experience
How our audiences experience BBC programmes is our focus here. In this area we anticipate their future expectations and ensure new technology, however complex, is easy to use and accessible for everyone.
8. Every Day is Different
• We don’t have one specific kind of workload on the shared platform
• Make it as flexible as possible, but also keep it performant
• Most users don’t really care about backend technology; they just want a simple yet effective service.
9. Some Current Projects (not all, by any means!)
• IP Studio
• Object Based Audio
• Enhanced Subtitling
• World Service Archive
• Voice Analysis & Scrubbing
10. Challenges
• Engineers left with the flexibility to do their own thing
• Silos of knowledge hinder cross-team interactions
• Time taken to provision
• Inconsistencies
• Harder to manage asset utilisation
• Demand for compute resources and storage will only increase
11. Legacy
• Robust internal systems
• Virtualisation in use, but only really on single nodes and in ad hoc situations
• Each team had their favourite distribution
• Very little / no config management or deployment tools in most project areas
12. A Different Approach
• Reduce the time drains
• Automate everything (eventually!)
• Try to standardise where appropriate
• Take ownership of assets
• Make it easy to extend and reproduce the platform
13. Early Stages
• Project has been running for about 6 months
• Available to users for only 2 months
• 2 clusters currently online
• Project teams already committing to procurement
• Pan-BBC interest
• Opportunity to develop best practice as well as better interactions with other areas of the organisation
14. Current Uses
• Started hosting Internal Systems Infrastructure
• Build slaves
• Indexing (100GB VM!)
• General hosting
• Hacking on ideas!
15. Why Build a Cloud?
• We have ownership!
• We can be more confident in security policy
• We can guarantee the execution venue, so legal stipulations can be met
• Network access is much faster for users, and latency is a lot better
16. High Level Component View
• OpenNebula 4.2
• KVM
• Ceph (RBD for VMs) - using the snapshot layering driver and custom libvirt (sketch after this list)
• Ubuntu 13.04 - may transition back to LTS
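As a rough sketch of the snapshot-layering workflow that RBD-backed VM disks rely on (the pool and image names here are hypothetical): a golden image is snapshotted, the snapshot is protected, and each VM disk is a thin copy-on-write clone of it:

    # Import a base image into the pool, snapshot it, and protect the
    # snapshot so it can be cloned (names are illustrative only)
    rbd import ubuntu-13.04.img one/ubuntu-base
    rbd snap create one/ubuntu-base@golden
    rbd snap protect one/ubuntu-base@golden

    # Each new VM disk is a copy-on-write clone of the golden snapshot
    rbd clone one/ubuntu-base@golden one/one-42-disk-0

Clones share unmodified blocks with the parent snapshot, which is what makes VM provisioning fast and space-efficient compared with full image copies.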
18. Network
• All hosts have 10Gbit interconnectivity
• Intel Corporation 82599EB 10-Gigabit SFP+
• Copper TwinAx
• Cisco Nexus 5020 ‘brains’
• FEX 2232 (Fabric Extender) as ToR switch
19. OpenNebula Setup
• Currently running 4.2
• Main user interaction is via Sunstone
• Users authenticate against LDAP
• Default view for users is ‘cloud’
• Ceph RBD as VM block storage (datastore sketch after this list)
• CephFS as the System Datastore
• Open vSwitch
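A minimal sketch of what the Ceph image datastore definition can look like in OpenNebula 4.2; the pool and bridge host names below are hypothetical:

    $ cat ceph_ds.conf
    NAME        = ceph_images
    DS_MAD      = ceph
    TM_MAD      = ceph
    DISK_TYPE   = RBD
    POOL_NAME   = one               # RADOS pool holding the VM images
    BRIDGE_LIST = "ceph-frontend"   # host used to stage images in/out of the pool
    $ onedatastore create ceph_ds.conf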
20. Storage Node / Ceph Setup
“Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability.”
• XFS-based OSDs (not btrfs)
• 12TB per node initially, growing to 24/48TB per node
• Around 1/8th of a petabyte currently
• No SSDs
• Journals on disk
• Deployed using ceph-deploy (much better now)
• RBD writeback caching (writethrough also available)
• OSDs on all nodes, MONs on a small subset, MDS on the inverse (bootstrap sketch below)
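A hedged sketch of the ceph-deploy bootstrap and the RBD cache settings implied above; hostnames and device names are hypothetical, and exact subcommands vary between ceph-deploy releases:

    $ ceph-deploy new mon1 mon2 mon3        # generate an initial ceph.conf with three monitors
    $ ceph-deploy install node1 node2 node3
    $ ceph-deploy mon create mon1 mon2 mon3
    $ ceph-deploy osd create node1:sdb      # XFS OSD, journal co-located on the same disk

    # ceph.conf on the KVM clients: writeback caching for RBD; setting
    # 'rbd cache max dirty = 0' gives writethrough behaviour instead
    [client]
    rbd cache = true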
22. Ceph’s Future
• Can only get better!
• Better REST admin APIs
• 8x speed increase in CRC functions in testing
• OpenZFS to leverage journaling?
• Erasure coding to reduce space requirements
• Multi-site replication
• RBD client-side SSD caching (specifically for OS deployment)
23. Deployment
• Generally Puppet-managed
• VM images generated using VeeWee (usage sketch below)
https://github.com/jedi4ever/veewee
"A great tool for creating and configuring lightweight, reproducible, portable virtual machine environments - often used with the addition of automation tools such as Chef or Puppet."
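A sketch of the VeeWee workflow, assuming its KVM provider and an Ubuntu template name that may differ between releases:

    $ veewee kvm define 'rd-base' 'ubuntu-13.04-server-amd64'   # scaffold a box definition from a template
    $ veewee kvm build 'rd-base'                                # boot the installer and build the image

The definition directory it scaffolds (preseed file, post-install scripts) is what makes the resulting image reproducible and versionable.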
24. Oversubscription
• Not all VMs have CPU-intensive workloads
• Makes financial sense to over-commit resources when applicable
• Shared resources have CPU over-committed by 4x (template sketch below)
• Memory is not over-committed
• Project teams can manage their own level on their own equipment
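In OpenNebula this falls out of the split between VCPU (the cores the guest sees) and CPU (the share of physical CPU the scheduler reserves); a hypothetical shared-platform template might look like:

    NAME   = shared-worker
    MEMORY = 2048      # memory is never over-committed
    VCPU   = 2         # the guest sees two cores...
    CPU    = 0.5       # ...but only half a physical core is reserved: 4x over-commit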
25. Future Work - OpenNebula
• Hypervisor-side SSD caching (bcache, flashcache, EnhanceIO etc.)... possibly
• Better Ceph integration (attach_disk etc.)
• Multiple Ceph pools for tiered storage
• SSD-based local storage
• Leverage more of radosgw for S3-compatible storage
• Integrate VM generator into Sunstone/ONE?
• Move to virtio-scsi (libvirt sketch below)
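For the virtio-scsi move, the libvirt domain XML would gain a virtio-scsi controller with disks targeted at it; the RBD image name below is hypothetical:

    <controller type='scsi' model='virtio-scsi'/>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='rbd' name='one/one-42-disk-0'/>
      <target dev='sda' bus='scsi'/>   <!-- bus='scsi' routes via the virtio-scsi controller -->
    </disk>

Compared with virtio-blk, virtio-scsi supports many more disks per controller and passes through SCSI commands such as discard/TRIM.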
26. Future Work - Hardware Pools
• PCI passthrough pooling
• Mainly used for SR-IOV network adapters
• Allow PCI capture devices to be bound to a VM (hostdev sketch below)
• Drive the SDI matrix to attach a given soft-patch
• Other use cases?
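Binding a PCI capture device (or an SR-IOV virtual function) to a VM comes down to a libvirt hostdev entry like the following; the PCI address is hypothetical:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <!-- PCI address of the capture card / SR-IOV VF on the host -->
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
      </source>
    </hostdev>

Pooling would mean tracking which host holds which device and scheduling the VM onto a host with a free device of the requested type.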