Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Spark-on-Yarn: The Road Ahead-(Marcelo Vanzin, Cloudera)

4,849 views

Published on

Presentation at Spark Summit 2015

Published in: Data & Analytics
  • Hello! Get Your Professional Job-Winning Resume Here - Check our website! https://vk.cc/818RFv
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Spark-on-Yarn: The Road Ahead-(Marcelo Vanzin, Cloudera)

  1. 1. Spark on YARN: The Road Ahead Marcelo Vanzin
  2. 2. What to expect... -  What is YARN? -  Why YARN for Spark? -  Current state and future direction for: -  Resource Management -  Security
  3. 3. What is YARN? Resource management and negotiation -  Manages vcores and memory -  Multiple users -  Multiple apps -  Advanced features
  4. 4. Why YARN for Spark? -  Advanced resource management features -  Dynamic allocation for Spark executors -  Security
  5. 5. Resource Management Lots of knobs to turn, lots of trial and error. yarn.nodemanager.resource.memory-­‐mb   Executor  Container   spark.yarn.executor.memoryOverhead  (10%)   spark.executor.memory   spark.shuffle.memoryFracBon  (0.4)   spark.storage.memoryFracBon  (0.6)  
  6. 6. RM (Cont’d) Dynamic allocation -  No need to specify number of executors! -  Application grows and shrinks based on outstanding task count -  Still need to specify everything else…
  7. 7. SparkContext Job 1 Task Task Executor 1 Task Task Executor 2 Task Dynamic Allocation
  8. 8. Dynamic Allocation (Cont’d) How to make dynamic allocation better? -  Reduce allocation latency -  Data locality hints -  Handle cached RDDs
  9. 9. Data Locality -  Allocate executors close to data -  More predictable performance -  SPARK-4352
  10. 10. Cached RDDs -  Short term: avoid discarding cached data -  Keep executors around. -  Long term: -  Cache rebalance -  Container resizing -  Off-heap caching
  11. 11. Memory Sizing How to make application sizing easier? -  Soft container limits (YARN-3119) -  Overhead memory -  Simplified parameters
  12. 12. Overhead Memory Everything that is not the Java heap. Very opaque, hard to size correctly. -  Better metrics to understand usage -  Better heuristics for automatic sizing
  13. 13. Future: Simplified Allocation Single parameter: “task size”. Spark figures out the rest (vcores, heap, overhead). Container sizes can change to match available resources, data locality, etc. task-size: s Container request: vcores = x memory = y
  14. 14. RM Summary “Dynamic allocation is not for everybody. It can only meet the needs of 90% of applications.”
  15. 15. Security: Kerberos -  User authentication via Kerberos -  Supported in all services (YARN, HDFS, HBase, Hive, etc) -  Uses per-service delegation tokens
  16. 16. Security: Delegation Tokens Delegation tokens allow for authentication without running into KDC limitations.
  17. 17. Tokens (Cont’d) Tokens have a TTL. This is a problem for long-lived applications -  Spark Streaming -  Thrift Server Solution: re-generate tokens periodically.
  18. 18. Tokens (Cont’d) Generating tokens requires a kerberos ticket -  Spark driver handles KDC login -  Need access to user credentials (keytab) Secure if coupled with encryption.
  19. 19. Security: Encryption -  HDFS supports wire encryption -  YARN supports wire encryption -  Spark not so much
  20. 20. Encryption (Cont’d) Where does Spark need encryption? -  Control plane -  File distribution -  Block Manager -  User UI / REST API -  Data-at-rest (shuffle files)
  21. 21. Encryption (Cont’d) Goal: easy to configure encryption. -  SSL for HTTP endpoints -  SASL everywhere else In 1.4: SASL available for shuffle data, SSL available for Akka / file distribution.
  22. 22. Spark UI Support SSL for HTTP UI / API -  SPARK-2750 -  Outstanding issues: -  Certificate distribution (cluster mode) -  Shared certificates
  23. 23. Spark RPC -  Control plane encryption -  Current: Akka-over-SSL -  Future: RPC-over-SASL -  SPARK-6028, SPARK-6230 -  File distribution -  Replace HTTP channel with RPC
  24. 24. Data-at-Rest Local shuffle files, mostly. -  Now: configure “local dirs” on encrypted volumes -  Later: Spark itself encrypts shuffle files (SPARK-5682).
  25. 25. Summary -  YARN provides lots of useful features -  Not always easy to use -  Focus on: -  Usability -  Security -  Performance
  26. 26. Questions?

×