MIT/CSAIL OpenStack Use Cases - Hong Kong 2014
 

Overview of the OpenStack deployment at the MIT Computer Science and Artificial Intelligence Lab, with a focus on specific research use cases. Delivered at the OpenStack Hong Kong Summit, November 2013.


Usage Rights

CC Attribution-ShareAlike License

  • Intro
    Sr. Tech. Architect
    Charged with ensuring researchers have the “Infrastructure of the Future”
    Also ensure today's architecture is deployed properly and yesterday's architecture doesn't collapse...
    OpenStack, running just over 1 yr, fits in as “tomorrow” (between today & future)
  • Approx. 50 groups in diverse areas
    Robotics
    Machine Learning
    Biomedical
    Computer Architecture
    Etc...
    All solving different problems using different methods
    Highly Complex
    Highly Diverse
    Incredibly Open Environment
  • This one is just as read; about 30 sec on what we set up
  • Again, just 30 sec on recent changes
  • Significant change in use since switching to overcommit, though we rarely reach 2:1 on any node and are very nearly 1:1 overall (have seen peaks at 3:1)
    Larger multicore instance types and longer runtimes
    Pre-overcommit:
    Avg runtime 3:45
    Avg vCPU 1.8
    Post-overcommit:
    Avg runtime 39:20
    Avg vCPU 4.2
    It is not clear if this is causal...
  • ALFA: Scalable evolutionary algorithms, machine learning and frameworks for knowledge mining, prediction, analytics and optimization.
  • TREC-KBA: the MIT team organizing the 2013 Text Retrieval Conference's Knowledge Base Acceleration competition, which seeks to help humans expand knowledge bases like Wikipedia by automatically recommending edits based on incoming content streams. This open evaluation measures an automatic system's ability to filter a large stream of text for new knowledge about entities.
  • Ubik (sanchez): HW simulation. What is it? Simulated ~1 quadrillion instructions total, and at its peak OpenStack doubled to tripled the capacity
    Ubik proposes new hardware and software techniques to achieve systems that provide very strict quality of service guarantees for latency-critical workloads, and high throughput for batch workloads. The main motivation is that datacenters burn terawatt-hours, but servers, which make up the bulk of datacenter power, are run at 10-15% of capacity to guarantee QoS for critical services. At the same time, datacenters have a lot of non-critical computing (e.g., MapReduce, overcommitted OpenStack VMs, etc.). In Ubik, we're developing a number of techniques that enable colocating both types of compute in the same system, sharing resources between batch and latency-critical workloads to achieve maximum utilization of CPU, memory, etc., while safely protecting latency-critical workloads from any noticeable degradation.
  • NMS: used the cluster to host a contest for students in 6.829 (MIT's graduate networking class) to develop the best congestion control algorithms, by running the students' algorithms on pre-recorded traces of cellular networks. Also using it for heavy computation, running big machine learning problems to try to get computers to design new congestion control algorithms.
  • LIS: Learning & Intelligent Systems conducts interdisciplinary research aimed at discovering the principles underlying the design of artificially intelligent robots.
  • Julia: Julia is a VHLL (very high level language) for parallel computing and now an MIT-licensed open source project; this is the group at MIT that originally developed it
    https://ijulia.csail.mit.edu:8000 Cert-protected frontend
    Alan's apparently using this in 18.06 & possibly other classes http://web.mit.edu/18.06/www
  • NMS – intelligent placement of VMs to reduce network congestion in the data center
    Scaling Nota Bene for use with EdX (http://nb.mit.edu) and DetectMe, a new project similar to http://labelme.csail.mit.edu
    Consolidation isn't “exciting” but is a metric of stability and manageability
    OCPS (a bit of a play on OLPC): the idea has been around longer than our cloud; in this case each lab member would get a moderate quota allocation, independent of any particular project, just to hack around and do cool stuff.
    Factors preventing some users from taking advantage: need to access low-level hardware, special coprocessing requirements.
    Cloud Desktops: an old idea coming round again, a thin client accessible from anywhere; a suggestion from the community, testing various implementations. Will it be useful? Don't know...
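
The overcommit figures in the notes above (rarely 2:1 on any node, very nearly 1:1 overall, peaks at 3:1) are ratios of allocated vCPUs to physical cores. The following is a minimal Python sketch of that arithmetic, not the lab's actual tooling. The node names and per-node allocations are hypothetical, and 12 cores per node is an assumption derived from the slide figures of 768 physical cores across 64 nodes. The notes do not say whether 3:45 and 39:20 are hours:minutes or minutes:seconds, so the sketch only assumes 60 minor units per major unit and compares the two values as a ratio.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Node:
    name: str             # hypothetical node name
    physical_cores: int
    allocated_vcpus: int  # hypothetical allocation, not measured data

    @property
    def overcommit(self) -> float:
        """Allocated vCPUs per physical core on this node."""
        return self.allocated_vcpus / self.physical_cores


def cluster_overcommit(nodes: Iterable[Node]) -> float:
    """Overall ratio: total allocated vCPUs over total physical cores."""
    nodes = list(nodes)
    total_vcpus = sum(n.allocated_vcpus for n in nodes)
    total_cores = sum(n.physical_cores for n in nodes)
    return total_vcpus / total_cores


def to_minor_units(duration: str) -> int:
    """Convert 'major:minor' (e.g. '3:45') to minor units, 60 per major."""
    major, minor = duration.split(":")
    return int(major) * 60 + int(minor)


# Assumption: cores are spread evenly over the nodes (768 cores / 64 nodes).
CORES_PER_NODE = 768 // 64

# Hypothetical snapshot: one busy node at 2:1, two lightly loaded nodes.
nodes = [
    Node("node-a", CORES_PER_NODE, 24),
    Node("node-b", CORES_PER_NODE, 10),
    Node("node-c", CORES_PER_NODE, 4),
]

per_node = {n.name: n.overcommit for n in nodes}
overall = cluster_overcommit(nodes)

# Averages quoted in the notes, before and after enabling overcommit.
runtime_growth = to_minor_units("39:20") / to_minor_units("3:45")
vcpu_growth = 4.2 / 1.8

print(f"per-node overcommit: {per_node}")
print(f"overall overcommit:  {overall:.2f}:1")
print(f"avg runtime grew {runtime_growth:.1f}x, avg vCPU grew {vcpu_growth:.1f}x")
```

The hypothetical snapshot illustrates how a single node can sit at 2:1 while the cluster as a whole stays close to 1:1, which matches the pattern described in the notes.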

MIT/CSAIL OpenStack Use Cases - Hong Kong 2014 Presentation Transcript

  • CSAIL Computer Science and Artificial Intelligence Laboratory
    ● Largest Lab at MIT
    ● 50 Year Legacy
    ● 107 Principal Investigators
    ● 1,033 Members Total
    ● ??? Active Research Projects
  • OpenStack (beta): Essex 2012/07, Folsom 2012/09
    ● 768 physical cores
    ● 64 nodes
    ● Ubuntu + Puppet
    ● Goal: Rapid Deployment (adapt as we go)
    ● Running 90% by 2013/02
  • OpenStack (ga): Grizzly 2013/08, Havana soon...
    ● Move to Neutron Networking
    ● Split High CPU / Web Apps (host aggregates & instance types)
    ● Coming Soon ...
      – Orchestration (Heat)
      – Metering (Ceilometer)
  • So Who's Using It?
    ● 114 Users
    ● 38 Projects
    ● 468,866 Instances (lifetime total)
    ● 6,303,639 vCPU Hours
    ● As of 25 October ...
    ● ALFA http://groups.csail.mit.edu/EVO-DesignOpt
    ● TREC-KBA http://trec-kba.org
    ● Jigsaw http://people.csail.mit.edu/sanchez
    ● NMS http://nms.csail.mit.edu
    ● LIS http://lis.csail.mit.edu
    ● Julia http://julialang.org/
  • ALFA: Anyscale Learning For All
    ● Scalable Evolutionary Algorithms
    ● Machine Learning
    ● Frameworks for knowledge mining, prediction, analytics and optimization
    http://groups.csail.mit.edu/EVO-DesignOpt/groupWebSite/
  • TREC-KBA: Text Retrieval Conference's Knowledge Base Acceleration competition
    ● MIT is the Organizing Team
    ● Content Stream
      – 462M Texts, 40% English
      – 4,973 hourly chunks of 100k doc/hr stream
      – News, blogs, forums
    http://trec-kba.org
  • Ubik: service guarantees for latency-critical workloads
    ● Hardware & Software System Simulation
    ● ~1 Quadrillion Instructions
    ● Saw 2x - 3x speed up using cloud in addition to dedicated hardware
    http://people.csail.mit.edu/sanchez
  • NMS: Network & Mobile Systems
    ● Student contest for congestion control algorithms
    ● Machine learning to get computers to design congestion control algorithms
    ● Possible future work on intelligent VM placement in clouds
    http://nms.csail.mit.edu
  • LIS: Learning & Intelligent Systems
    ● Artificially Intelligent Robots
    ● Cookie Baking Robot
    ● OpenStack used for design & simulation of object recognition systems
    http://lis.csail.mit.edu
  • Julia
    ● VHLL for Parallel Programming
    ● Now MIT Licensed Community project
    ● Serving IJulia cluster used for research and teaching
    http://julialang.org
  • Future Work
    ● Infrastructure as Research Platform
    ● Internet Facing Applications
    ● Consolidate Existing Virtualization
    ● “One Cloud Per Student” ?
    ● Bare Metal ?
    ● GPUs ?
    ● “Cloud Desktops” ?
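
The lifetime totals on the "So Who's Using It?" slide (114 users, 38 projects, 468,866 instances, 6,303,639 vCPU hours) can be turned into rough per-instance and per-user averages. The sketch below is only that arithmetic in Python. The helper name `per` is introduced here for illustration, and the results are plain averages over the lifetime totals, which say nothing about how usage was actually distributed across users or projects.

```python
# Lifetime totals as quoted on the slide ("As of 25 October").
USERS = 114
PROJECTS = 38
INSTANCES = 468_866
VCPU_HOURS = 6_303_639


def per(total: float, count: int) -> float:
    """Plain average of a total over a positive count (illustrative helper)."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total / count


vcpu_hours_per_instance = per(VCPU_HOURS, INSTANCES)
vcpu_hours_per_user = per(VCPU_HOURS, USERS)
instances_per_user = per(INSTANCES, USERS)
instances_per_project = per(INSTANCES, PROJECTS)

print(f"vCPU hours per instance: {vcpu_hours_per_instance:.1f}")
print(f"vCPU hours per user:     {vcpu_hours_per_user:,.0f}")
print(f"instances per user:      {instances_per_user:,.0f}")
print(f"instances per project:   {instances_per_project:,.0f}")
```

Because a vCPU hour scales with both runtime and instance size, the per-instance figure mixes the two and should be read alongside the average runtime and average vCPU counts given in the speaker notes.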