Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

RISELab: Enabling Intelligent Real-Time Decisions keynote by Ion Stoica

1,514 views

Published on

A long-standing grand challenge in computing is to enable machines to act autonomously and intelligently: to rapidly and repeatedly take appropriate actions based on information in the world around them. To address this challenge, at UC Berkeley we are starting a new five year effort that focuses on the development of data-intensive systems that provide Real-Time Intelligence with Secure Execution (RISE). Following in the footsteps of AMPLab, RISELab is an interdisciplinary effort bringing together researchers across AI, robotics, security, and data systems. In this talk I’ll present our research vision and then discuss some of the applications that will be enabled by RISE technologies.

Published in: Data & Analytics
  • Hello! High Quality And Affordable Essays For You. Starting at $4.99 per page - Check our website! https://vk.cc/82gJD2
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

RISELab: Enabling Intelligent Real-Time Decisions keynote by Ion Stoica

  1. 1. RISELab: Enabling Intelligent Real-time Decisions Ion Stoica February 8, 2017
  2. 2. Berkeley’s AMPLab (2011-2016) 2 Algorithms Machines People Goal: Nextgeneration of open source data analytics stack for industry & academia BerkeleyDataAnalytics Stack (BDAS)
  3. 3. Berkeley’s AMPLab (2011-2016) 3 Algorithms Machines People Goal: Nextgeneration of open source data analytics stack for industry & academia BerkeleyDataAnalytics Stack (BDAS) …
  4. 4. RISE: Real-time Intelligent Secure Execution
  5. 5. From batch data to advanced analytics AMPLab 5 From live data to real-time decisions RISELab
  6. 6. RISE Lab (2017-2022) 12 faculty across AI, systems, security, and architectures 11 Founding sponsors 6
  7. 7. Why? Data only as valuable as the decisions it enables 7
  8. 8. Why? What does this mean? • Faster decisions better than slower decisions • Decisions on fresh data better than decisions on stale data • Decisions on personalized data better than on aggregate data 8 Data only as valuable as the decisions it enables
  9. 9. Goal Real-time decisions on live data with strong security 9 decide in ms the current stateof theenvironment privacy, confidentiality, integrity
  10. 10. Typical decision system Decision System Query Decision Environment + sensors & actuators Observations, Feedback Preprocess Intermediate data Decision Engine
  11. 11. Decision Query Decision Engine Preprocess Intermediate data Environment + sensors & actuators Typical decision system Decision System Observations, Feedback Live Update latency (e.g., ~1 seconds)
  12. 12. Decision Query Decision Engine Preprocess Intermediate data Environment + sensors & actuators Typical decision system Decision System Observations, Feedback Live Update latency (e.g., ~1 seconds) Secure Real-time decision latency (e.g., ~10 ms)
  13. 13. Example of decision systems Decision System Query Decision Training Models (diff.tradeoffs complexity/ accuracy) Model Serving Feedback Observations, Feedback ML Pipeline (e.g., Clipper + Spark/Tensorflow) Decision System Obs. Action Update Policy Policy obs à action Query Policy Observations, Rewards Reinforcement Learning Systems (e.g., Ray) Pre- process Interm- ediate Data Decision Engine
  14. 14. What else do we want from decisions? Intelligent:complexdecisions in uncertain environments Robust: handle complexnoise, unforeseeninputs, failures Explainable:ability to explainnon-obvious decisions
  15. 15. Goal Develop open source platforms, tools, and algorithms for intelligent real-time decisions on live-data
  16. 16. Some Proposed Research Secure Real-time Decisions Stack (SRDS) • Open source platform to develop of RISE apps • Secure from ground up • Reinforcement Learning (RL) as one of key app patterns Learning control hierarchies: speeduplearning, training Shared learning: learn over confidential data 16
  17. 17. Secure Real-time Decisions Stack (SRDS) • Open source platform to develop of RISE apps • Secure from ground up • Reinforcement Learning (RL) as one of key app patterns Learning control hierarchies: speeduplearning, training Shared learning: learn over confidential data 17 Some Proposed Research
  18. 18. Secure Real-time Decision Stack (SRDS) scheduler object store RISE μkernel Ray Clipper … Ground(data contextservice) Time Machine
  19. 19. scheduler object store RISE μkernel Ray Clipper … Ground(data contextservice) Time Machine Minimalist executionengine: • Support both data flow and task-parallel execution models • High-throughput, low-latency: ~ 1M tasks/sec @ ms latency Secure Real-time Decision Stack (SRDS)
  20. 20. scheduler object store RISE μkernel Ray Clipper … Ground(data contextservice) Time Machine Central repositoryfor models,APIs to capture the context in whichdata gets used and produced Status:ongoing project with industry partners Secure Real-time Decision Stack (SRDS)
  21. 21. scheduler object store RISE μkernel Ray Clipper … Ground(data contextservice) Time Machine Replaying of apps at fine granularity • Simplify development, debugging • Robustness: replay against perturbed inputs • Explainability: identify inputs causing decision • Security: confirm vulnerabilities, test security patches, compliance auditing Secure Real-time Decision Stack (SRDS)
  22. 22. scheduler object store RISE μkernel Ray Clipper … Ground(data contextservice) Time Machine Dramatically simplify development of RISE applications • Apache Spark: improve latency and security • Clipper: model serving for Apache Spark, Scikit learn, etc • Ray: framework for RL applications Secure Real-time Decision Stack (SRDS)
  23. 23. ImprovingApache Spark Drizzle • Decrease latency of Structured Streaming and ML algorithms by ~10x • Techniques: group scheduling, shared variables 23
  24. 24. StreamingLatency:YCSB benchmark 24 0 500 1000 1500 2000 1 2 4 8 12 16 20 24 MedianEventLatency (ms) Throughput (Million events/s) Spark Drizzle Flink Drizzle-Opt Drizzle-Opt: Reduce-by on mapper side
  25. 25. StreamingLatency:YCSB benchmark 25 0 500 1000 1500 2000 1 2 4 8 12 16 20 24 MedianEventLatency (ms) Throughput (Million events/s) Spark Drizzle Flink Drizzle-Opt Drizzle-Opt: Reduce-by on mapper side
  26. 26. StreamingLatency:YCSB benchmark 26 0 500 1000 1500 2000 1 2 4 8 12 16 20 24 MedianEventLatency (ms) Throughput (Million events/s) Spark Drizzle Flink Drizzle-Opt Drizzle-Opt: Reduce-by on mapper side
  27. 27. StreamingLatency:YCSB benchmark 27 0 500 1000 1500 2000 1 2 4 8 12 16 20 24 MedianEventLatency (ms) Throughput (Million events/s) Spark Drizzle Flink Drizzle-Opt Drizzle-Opt: Reduce-by on mapper side
  28. 28. StreamingLatency:YCSB benchmark 28 0 500 1000 1500 2000 1 2 4 8 12 16 20 24 MedianEventLatency (ms) Throughput (Million events/s) Spark Drizzle Flink Drizzle-Opt Drizzle-Opt: Reduce-by on mapper side 15x
  29. 29. MLlib: SGD Performance 29 0 20 40 60 4 8 16 32 64 128 Time/iter(ms) Machines Spark Drizzle
  30. 30. MLlib: SGD Performance 30 0 20 40 60 4 8 16 32 64 128 Time/iter(ms) Machines Spark Drizzle 6x
  31. 31. ImprovingApache Spark Drizzle • Decrease latency of Structured Streaming and ML algorithms by ~10x • Techniques: group scheduling, shared variables • Some of these techniques will make their way to Apache Spark Opaque • Full data encryption, authentication, and verification (Intel’s SGX) • Oblivious mode: hide data access pattern • Support most SparkSQL functionality • See Wenting’s talk later 31
  32. 32. RISELab Already promising results Expect muchmore over the next five years! 32 Goal: Develop open source platforms,tools, and algorithms for intelligent real-timedecisions on live- data
  33. 33. Thank you

×