Introducing .NET Bindings for Apache Spark

We present a new, free, open-source framework aimed at making Spark accessible to millions of .NET developers. In this session we will provide a high-level overview of the .NET bindings for Spark effort, demonstrate key capabilities, show how you can use and get involved with the effort, and cover how you can use the .NET bindings for Spark with other .NET frameworks like ML.NET to build end-to-end real-time analytics solutions. This will be one fun session with demos galore, so come join us as we get started on the .NET bindings for Spark journey!

  2. Introducing .NET Bindings for Apache Spark. Rahul Potharaju, Terry Kim, Tyson Condie (Microsoft). #DotNetForSpark #UnifiedAnalytics #SparkAISummit
  3. .NET for Apache Spark OSS Announcement @Rohan’s Keynote in Spark+AI Summit 2019. 30 hours into it… huge thanks! 521 (https://github.com/dotnet/spark). 115,000 Twitter Impressions.
  5. What is .NET? The Big Picture
  6. .NET: a unified platform. Desktop, Web, Cloud, Mobile, Gaming, IoT, AI, built on .NET Standard, libraries, and infrastructure.
  7. • C# is a simple, modern, object-oriented, and type-safe programming language • Its roots in the C family of languages make C# immediately familiar to C, C++, Java, and JavaScript programmers • F# is a cross-platform, open-source, functional programming language for .NET • It also includes object-oriented and imperative programming • Visual Basic is an approachable language with a simple syntax for building type-safe, object-oriented apps
  8. .NET: Open Source & Cross-Platform. .NET Core developers: 750K. New .NET developers in the last year: +1M.
  9. Companies embracing .NET… (dot.net/customers)
  14. .NET Developers 💖 Apache Spark but… they are locked out from Big Data processing due to the lack of .NET support in OSS Big Data solutions, even though a lot of Big Data-usable business logic (millions of lines of code) is written in .NET! In a recently conducted .NET Developer survey (> 1000 developers), more than 70% expressed interest in Apache Spark!
  18. Why should Apache Spark 💖 .NET Developers? More people who learn Apache Spark = Solve harder challenges together = Make the world a better place!
  19. Restating Our Intent… Goal: .NET for Apache Spark aims to give .NET developers a first-class experience when working with Apache Spark. Non-Goal: Converting existing Scala/Python/Java Spark developers.
  20. Who? Team and Commitment
  25. Microsoft is committed… • Interop Layer for .NET (Scala-side) • Potentially Optimizing Python and R interop layers • Technical Documentation, Blogs and Articles • End-to-end scenarios • Performance benchmarking (cluster) • Production workloads • C# (and F#) language extensions using .NET • Performance benchmarking (Interop) • Portability aspects (e.g., cross-platform .NET Standard) • Tooling (e.g., Jupyter, Visual Studio, Visual Studio Code)
  29. … and developing in the open! .NET for Apache Spark was open sourced @Spark+AI Summit 2019 • Website: https://dot.net/spark • GitHub: https://github.com/dotnet/spark. Spark Project Improvement Proposals: • Interop Support for Spark Language Extensions: SPARK-26257 • .NET bindings for Apache Spark: SPARK-27006. Contributions to foundational OSS projects: • Apache Arrow: ARROW-4997, ARROW-5019, ARROW-4839, ARROW-4502, ARROW-4737, ARROW-4543, ARROW-4435 • Pyrolite (Pickling Library): Improve pickling/unpickling performance, Add a Strong Name to Pyrolite
  30. Thanks to everyone who made this possible!
  31. .NET for Apache Spark: Parity, Experience, Architecture, Demo and Performance
  32. .NET provides full-spectrum Spark support • Spark DataFrames (works with Spark v2.3.x/v2.4.[0/1] and includes ~300 SparkSQL functions) • Batch & Streaming (including Spark Structured Streaming and all Spark-supported data sources) • .NET Standard 2.0 (works with .NET Framework v4.6.1+ and .NET Core v2.1+ and includes C#/F# support)
  33. .NET for Apache Spark Programmability
      var spark = SparkSession.Builder().GetOrCreate();
      var df = spark.Read().Json("input.json");
      // Define the UDF before it is used in Select.
      var concat = Udf<int?, string, string>((age, name) => name + age);
      df.Filter(df["age"] > 21)
        .Select(concat(df["age"], df["name"]))
        .Show();
  34. Submitting a Spark Application
      spark-submit (Scala):
          spark-submit `
            --class <user-app-main-class> `
            --master local `
            <path-to-user-jar> <argument(s)-to-your-app>
      spark-submit (.NET):
          spark-submit `
            --class org.apache.spark.deploy.DotnetRunner `
            --master local `
            <path-to-microsoft-spark-jar> `
            <path-to-your-app-exe> <argument(s)-to-your-app>
      <path-to-microsoft-spark-jar>: provided by the .NET for Apache Spark library. <path-to-your-app-exe>: provided by the user and has the business logic.
  35. Demo 1: Getting Started {"name":"Michael"} {"name":"Andy", "age":30} {"name":"Justin", "age":19}
  36. Demo 2: Debugging a Spark .NET App {"name":"Michael"} {"name":"Andy", "age":30} {"name":"Justin", "age":19}
  37. Language Comparison: TPC-H Query 2
      Scala:
          val europe = region.filter($"r_name" === "EUROPE")
            .join(nation, $"r_regionkey" === nation("n_regionkey"))
            .join(supplier, $"n_nationkey" === supplier("s_nationkey"))
            .join(partsupp, supplier("s_suppkey") === partsupp("ps_suppkey"))
          val brass = part.filter(part("p_size") === 15 && part("p_type").endsWith("BRASS"))
            .join(europe, europe("ps_partkey") === $"p_partkey")
          val minCost = brass.groupBy(brass("ps_partkey"))
            .agg(min("ps_supplycost").as("min"))
          brass.join(minCost, brass("ps_partkey") === minCost("ps_partkey"))
            .filter(brass("ps_supplycost") === minCost("min"))
            .select("s_acctbal", "s_name", "n_name", "p_partkey", "p_mfgr", "s_address", "s_phone", "s_comment")
            .sort($"s_acctbal".desc, $"n_name", $"s_name", $"p_partkey")
            .limit(100)
            .show()
      C#:
          var europe = region.Filter(Col("r_name") == "EUROPE")
            .Join(nation, Col("r_regionkey") == nation["n_regionkey"])
            .Join(supplier, Col("n_nationkey") == supplier["s_nationkey"])
            .Join(partsupp, supplier["s_suppkey"] == partsupp["ps_suppkey"]);
          var brass = part.Filter(part["p_size"] == 15 & part["p_type"].EndsWith("BRASS"))
            .Join(europe, europe["ps_partkey"] == Col("p_partkey"));
          var minCost = brass.GroupBy(brass["ps_partkey"])
            .Agg(Min("ps_supplycost").As("min"));
          brass.Join(minCost, brass["ps_partkey"] == minCost["ps_partkey"])
            .Filter(brass["ps_supplycost"] == minCost["min"])
            .Select("s_acctbal", "s_name", "n_name", "p_partkey", "p_mfgr", "s_address", "s_phone", "s_comment")
            .Sort(Col("s_acctbal").Desc(), Col("n_name"), Col("s_name"), Col("p_partkey"))
            .Limit(100)
            .Show();
      Similar syntax: dangerously copy/paste friendly! Differences: $"col_name" vs. Col("col_name"), and capitalization.
  38. Demo 3: Running your .NET App on the Cloud
  39. What is happening when you write .NET Spark code? .NET Program → .NET for Apache Spark → DataFrame → SparkSQL. Did you define a .NET UDF? No: regular execution path (no .NET runtime during execution). Yes: register the .NET UDF and leverage PySpark execution semantics.
  40. Submitting a Spark Application (recap): the same spark-submit commands as on slide 34.
  50. Spark Driver-side Workflow. (1) spark-submit launches DotnetRunner in the JVM. (2) DotnetRunner launches DotnetBackend on Port XYZ. (3) The .NET app is launched in the CLR with config (e.g., Port XYZ). (4) The .NET app sends commands via JVMBridge. CLR side: create and manage .NET proxies for JVM objects (SparkSession, DataFrame, SQL, Streaming, …). JVM side: create and manage JVM objects (SparkSession, DataFrame, SQL, Streaming, …) by mirroring .NET operations. .NET objects hold references to JVM objects.
  51. What happens when you define a .NET UDF?
      User code with UDF:
          var df = spark.Read().Schema(…)
            .Json(@"people.json");
          var addition = Udf<int?, string, string>(
            (age, name) => name + age);
          df.Select(addition(df["age"], df["name"]))
            .Explain(true);
      Steps: registers the UDF with Spark → serialize the .NET UDF → wrap as PythonFunction and set executable=Microsoft.Spark.Worker (in place of Python) → create a UserDefinedPythonFunction → piggyback on the PySpark physical execution operator.
  62. Worker-side Interop. On the Spark worker node: (1) the Spark executor (JVM) runs a task with a UDF; (2) it launches the worker executable, Microsoft.Spark.Worker (CLR); (3) UDFs and data are serialized; (4) the .NET UDF library executes the user-defined operations; (5) serialized result rows are written. Legend: User, Spark, Library, Interop (Scala), Interop (.NET). Challenge: How to serialize data from JVM <-> CLR? Pickling (row-oriented), Apache Arrow (column-oriented). DONE
  69. Performance: Warm Cluster Runs for Pickling Serialization. Takeaway 1: Where UDF performance does not matter, .NET is on par with Python. Takeaway 2: Where UDF performance is critical, .NET is ~2x faster than Python!
  70. Next steps for benchmarking… • Benchmark with Apache Arrow • TPC-DS? Follow the discussion here: https://github.com/dotnet/spark/issues/45 • Work with Community • TPC-H Dataset Generation • Improvise • .NET for Apache Arrow: Chris Hutchinson (Initial .NET Implementation); Eric Erhardt (Performance Optimizations: ARROW-4997, ARROW-5019, ARROW-4839, ARROW-4502, ARROW-4737, ARROW-4543, ARROW-4435)
  71. What’s next after next? • Programming & Idiomatic experiences in .NET (UDAF, UDT support) • Spark data connectors in .NET (e.g., Apache Kafka, Azure Blob Store, Azure Data Lake) • Tooling experiences for .NET Developers (e.g., Jupyter, VS Code, Visual Studio, others?)
  72. Demo 4: Jupyter Notebook for C# (Preview) {"name":"Michael"} {"name":"Andy", "age":30} {"name":"Justin", "age":19}
  73. Conclusion: Let’s recap…
  74. Open Sourced .NET for Apache Spark (https://dot.net/spark | https://github.com/dotnet/spark) to give you a first-class experience in scaling out your .NET code over large amounts of data using Apache Spark.
  75. Works everywhere! Cross Platform: Windows, Ubuntu, macOS. Cross Cloud: Azure & AWS Databricks, AWS EMR Spark, Azure HDI Spark.
  76. Call to Action: Engage, Use & Guide Us! Useful Links: http://github.com/dotnet/spark. Website: https://dot.net/spark. Available out of the box on Azure HDInsight Spark. For other clouds: https://aka.ms/InstallDotNetForSpark
  77. Contribution Model • Play with .NET Bindings • Contribute PRs to close existing issues • Submit a GitHub issue • Verify fixes for bugs • Submit a code fix for a bug • Submit a new feature request • Submit a unit test • Code review pending PRs/bug fixes • Tell others about the .NET Bindings
  78. Thank you! https://github.com/dotnet/spark. Don’t forget to rate and review the sessions (search: Spark + AI Summit).
  79. Have questions about this session? I’ll be at the Microsoft Booth #200 from xx:xxam/pm to xx:xxam/pm. Grab some SWAG! For more details visit: https://databricks.com/sparkaisummit
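
The driver-side workflow slides describe .NET proxy objects that hold references to JVM objects and forward commands through a bridge. The idea can be sketched in plain C# roughly as follows. This is only an in-process illustration: `FakeBackend`, `DataProxy`, and the `filterGreaterThan` command are hypothetical names made up for this sketch, and the real library talks to the JVM over a port rather than calling a local class.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the JVM side: owns the real objects and hands out integer ids.
// Hypothetical; not part of Microsoft.Spark.
public sealed class FakeBackend
{
    private readonly Dictionary<int, object> _objects = new Dictionary<int, object>();
    private int _nextId = 1;

    public int Register(object o)
    {
        int id = _nextId++;
        _objects[id] = o;
        return id;
    }

    // Runs a named operation on the referenced object and returns the id of the result.
    public int Call(int targetId, string method, object arg)
    {
        var target = (List<int>)_objects[targetId];
        switch (method)
        {
            case "filterGreaterThan":
                return Register(target.FindAll(x => x > (int)arg));
            default:
                throw new NotSupportedException(method);
        }
    }

    public int Count(int targetId) => ((List<int>)_objects[targetId]).Count;
}

// Proxy on the .NET side: holds only a reference id, never the data itself.
public sealed class DataProxy
{
    private readonly FakeBackend _backend;
    public int Id { get; }

    public DataProxy(FakeBackend backend, int id)
    {
        _backend = backend;
        Id = id;
    }

    // Each operation is forwarded to the backend; the result is a new proxy.
    public DataProxy FilterGreaterThan(int threshold) =>
        new DataProxy(_backend, _backend.Call(Id, "filterGreaterThan", threshold));

    public int Count() => _backend.Count(Id);
}

public static class ProxySketch
{
    public static void Main()
    {
        var backend = new FakeBackend();
        var ages = new DataProxy(backend, backend.Register(new List<int> { 30, 19, 45 }));
        var adults = ages.FilterGreaterThan(21);
        Console.WriteLine(adults.Count()); // prints 2
    }
}
```

The point of the design is that the data never crosses the boundary unless asked for: the .NET side only passes ids and small arguments, which matches the "no .NET runtime during execution" path for programs without UDFs.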

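The worker-side interop slides list five steps: run a task with a UDF, launch the worker, serialize UDFs and data, execute the user-defined operation, and write serialized result rows. A minimal sketch of steps 3 to 5 is below, assuming an invented wire format (row count, then a has-age flag, the age, and the name per row). `WorkerSketch` and its methods are hypothetical and do not reflect the actual Microsoft.Spark.Worker protocol or the pickling format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class WorkerSketch
{
    // Executor side: serialize the input rows (hypothetical wire format).
    public static byte[] WriteInput((int? Age, string Name)[] rows)
    {
        var buffer = new MemoryStream();
        using (var writer = new BinaryWriter(buffer))
        {
            writer.Write(rows.Length);
            foreach (var (age, name) in rows)
            {
                writer.Write(age.HasValue);
                writer.Write(age ?? 0);
                writer.Write(name);
            }
        }
        return buffer.ToArray(); // ToArray still works after the stream is closed
    }

    // Worker side: deserialize each row, run the UDF, write serialized results.
    public static byte[] RunUdf(byte[] input, Func<int?, string, string> udf)
    {
        var output = new MemoryStream();
        using (var reader = new BinaryReader(new MemoryStream(input)))
        using (var writer = new BinaryWriter(output))
        {
            int count = reader.ReadInt32();
            writer.Write(count);
            for (int i = 0; i < count; i++)
            {
                bool hasAge = reader.ReadBoolean();
                int rawAge = reader.ReadInt32();
                string name = reader.ReadString();
                writer.Write(udf(hasAge ? rawAge : (int?)null, name));
            }
        }
        return output.ToArray();
    }

    // Executor side: read the result rows back.
    public static List<string> ReadOutput(byte[] output)
    {
        var results = new List<string>();
        using (var reader = new BinaryReader(new MemoryStream(output)))
        {
            int count = reader.ReadInt32();
            for (int i = 0; i < count; i++) results.Add(reader.ReadString());
        }
        return results;
    }

    public static void Main()
    {
        var rows = new (int? Age, string Name)[] { (null, "Michael"), (30, "Andy"), (19, "Justin") };
        // Same UDF body as on slide 51: (age, name) => name + age.
        var results = ReadOutput(RunUdf(WriteInput(rows), (age, name) => name + age));
        Console.WriteLine(string.Join(", ", results)); // prints Michael, Andy30, Justin19
    }
}
```

Every row is encoded, decoded, and re-encoded one at a time here, which is why serialization cost dominates UDF-heavy workloads.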
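Slide 62 contrasts row-oriented pickling with column-oriented Apache Arrow. The difference in layout can be shown with plain arrays, as in the sketch below. It uses neither the pickle format nor the Arrow library; `LayoutSketch` and its method names are hypothetical, and the data is the demo's three-person JSON sample.

```csharp
using System;
using System.Linq;

public static class LayoutSketch
{
    // Row-oriented: all fields of one record sit together, record after record.
    public static object[][] ToRows((string Name, int? Age)[] people) =>
        people.Select(p => new object[] { p.Name, p.Age }).ToArray();

    // Column-oriented: one array per field, values of the same type stored together.
    public static (string[] Names, int?[] Ages) ToColumns((string Name, int? Age)[] people) =>
        (people.Select(p => p.Name).ToArray(), people.Select(p => p.Age).ToArray());

    public static void Main()
    {
        var people = new (string Name, int? Age)[] { ("Michael", null), ("Andy", 30), ("Justin", 19) };

        var rows = ToRows(people);
        var (names, ages) = ToColumns(people);

        Console.WriteLine($"{rows.Length} rows of {rows[0].Length} fields");
        Console.WriteLine($"names: {string.Join(", ", names)}");
        // An aggregate over one field only needs to scan a single array.
        Console.WriteLine($"sum of known ages: {ages.Sum()}");
    }
}
```

With the column layout, an operation over one field touches one contiguous array of a single type, whereas the row layout has to visit every record and unbox the field it needs.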