Nilesh Gule
@nileshgule | www.HandsOnArchitect.com
Big Data for .Net Devs with Apache Spark
$whoami
{
"name" : "Nilesh Gule",
"website" : "https://www.HandsOnArchitect.com",
"github" : "https://github.com/NileshGule",
"twitter" : "@nileshgule",
"linkedin" : "https://www.linkedin.com/in/nileshgule",
"likes" : "Technical Evangelism, Cricket",
"co-organizer" : "Azure Singapore UG"
}
What is Apache Spark
https://spark.apache.org/
Apache Spark Data Sources
https://posts.specterops.io/threat-hunting-with-jupyter-notebooks-part-3-querying-elasticsearch-via-apache-spark-670054cd9d47
Benefits of using Apache Spark
• Speed
  • Up to 100x faster than Hadoop MapReduce on some in-memory workloads
• Ease of use
  • Easy-to-use APIs
  • Multi-language support
  • 100+ operators
• Unified engine
  • Higher-level libraries with support for SQL queries, streaming data, machine learning and graph processing
• Runs everywhere
  • Hadoop, standalone, Mesos, Kubernetes, cloud
https://databricks.com/blog/2014/11/05/spark-officially-sets-a-new-record-in-large-scale-sorting.html
Apache Spark Components
• Dataset, DataFrame, RDD
  • Distributed collections of data
• SparkSession
  • Entry point into the Spark API
  • Unifies SparkContext, SQLContext and StreamingContext into one
• Executors
  • Worker processes that handle distributed processing
• Transformations & Actions
  • Transformations – lazy operations that return immutable data structures
  • Actions – apply operations and return a value or write data to external storage
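The difference between lazy transformations and actions can be sketched without a cluster. The snippet below is a minimal illustration in plain Python, not Spark's API: the `LazyDataset` class and the `traced_square` helper are hypothetical names invented for this example. Each transformation returns a new dataset and does no work; only the action (`collect`) runs the chain.

```python
from typing import Callable, Iterable, List


class LazyDataset:
    """Toy stand-in for a Spark dataset (hypothetical, NOT Spark's API)."""

    def __init__(self, source: Callable[[], Iterable]):
        # Store a recipe for producing the data, not the data itself.
        self._source = source

    def map(self, func: Callable) -> "LazyDataset":
        # Transformation: returns a new dataset and does no work yet.
        return LazyDataset(lambda: (func(x) for x in self._source()))

    def filter(self, predicate: Callable) -> "LazyDataset":
        # Transformation: also lazy, and leaves this dataset unchanged.
        return LazyDataset(lambda: (x for x in self._source() if predicate(x)))

    def collect(self) -> List:
        # Action: evaluates the whole chain and returns a value.
        return list(self._source())

    def count(self) -> int:
        # Action: evaluates the chain and returns the number of elements.
        return sum(1 for _ in self._source())


calls = []  # records every element that traced_square actually processes


def traced_square(x):
    calls.append(x)
    return x * x


numbers = LazyDataset(lambda: range(1, 6))
squares = numbers.map(traced_square)
even_squares = squares.filter(lambda x: x % 2 == 0)

calls_before_action = len(calls)  # still 0: nothing has been computed
result = even_squares.collect()   # the action triggers evaluation
calls_after_action = len(calls)   # now every element has been squared
```

Because `numbers` is never modified, it can be reused as the starting point for other chains, which mirrors the immutability point above.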
Spark Common Transformations
• map
• flatMap
• filter
• distinct
• sample(withReplacement, ..)
• union
• intersection
• subtract
• cartesian
• reduceByKey
• groupByKey
• sortByKey
• join
• repartition
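What a few of these transformations compute can be shown on a small local list. This is a rough emulation in plain Python under stated assumptions: `flat_map`, `reduce_by_key` and `sort_by_key` are hypothetical helpers written for this sketch, the input lines are made-up sample data, and nothing here is distributed or lazy as it would be in Spark.

```python
from itertools import chain


def flat_map(func, items):
    # flatMap: each input may yield zero or more outputs, flattened into one list.
    return list(chain.from_iterable(func(x) for x in items))


def reduce_by_key(func, pairs):
    # reduceByKey: merge the values that share a key using a binary function.
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return list(merged.items())


def sort_by_key(pairs):
    # sortByKey: order (key, value) pairs by their key.
    return sorted(pairs, key=lambda kv: kv[0])


lines = ["spark is fast", "spark is unified"]  # hypothetical sample input
words = flat_map(str.split, lines)             # flatMap: lines to words
long_words = [w for w in words if len(w) > 2]  # filter: drop short words
pairs = [(w, 1) for w in long_words]           # map: word to (word, 1)
word_counts = sort_by_key(reduce_by_key(lambda a, b: a + b, pairs))
```

The same word-count shape is a common first exercise on Spark itself, where each step would return a new distributed collection instead of a Python list.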
Spark Common Actions
• collect
• count
• countByValue
• take(num)
• top(num)
• reduce(func)
• fold(zero)(func)
• saveAsTextFile(path)
• saveAsSequenceFile(path)
• countByKey()
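The results these actions return can be approximated locally with the Python standard library. The sketch below is only an analogy on a made-up list, not Spark code. One difference worth noting: on a cluster, the zero value passed to fold is applied once per partition and again when partition results are merged, so it should be a neutral element such as 0 for addition.

```python
from collections import Counter
from functools import reduce
from heapq import nlargest

values = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical sample data

count = len(values)                                      # count
first_three = values[:3]                                 # take(3)
top_two = nlargest(2, values)                            # top(2)
total = reduce(lambda a, b: a + b, values)               # reduce(func)
total_from_zero = reduce(lambda a, b: a + b, values, 0)  # fold(zero)(func)
occurrences = Counter(values)                            # countByValue
```

Unlike the transformations listed earlier, each of these hands a plain value back to the caller, which is why actions are the point where Spark actually executes a job.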
What is .Net for Apache Spark
• .Net bindings for Spark written on the Spark interop layer
• Provides high-performance bindings for C# and F#
• Compliant with .Net Standard
https://devblogs.microsoft.com/dotnet/introducing-net-for-apache-spark/#performance
Demo
• MovieLens dataset
• CSV files in Azure Data Lake Storage
• Spark pools using Azure Synapse Analytics
Summary
• Apache Spark is great for Big Data Analytics
• .Net for Apache Spark provides .Net language bindings to Spark
• Azure Synapse Analytics has native support for C#
• Apache Spark
• .Net for Apache Spark
• MovieLens datasets
• Azure Synapse Analytics
https://youtu.be/KhMKXQkIzKw
https://channel9.msdn.com/Series/NET-for-Apache-Spark-101
Thank you very much
Code with Passion and Strive for Excellence
https://www.slideshare.net/nileshgule/presentations
https://speakerdeck.com/nileshgule/
Nilesh Gule
ARCHITECT | MICROSOFT MVP
Q&A

The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Recently uploaded (20)

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

Apache Spark for .Net Devs

  • 1. Nilesh Gule @nileshgule | www.HandsOnArchitect.com Big Data for .Net Devs with Apache Spark
  • 2. $whoami { "name" : "Nilesh Gule", "website" : "https://www.HandsOnArchitect.com", "github" : "https://github.com/NileshGule", "twitter" : "@nileshgule", "linkedin" : "https://www.linkedin.com/in/nileshgule", "likes" : "Technical Evangelism, Cricket", "co-organizer" : "Azure Singapore UG" }
  • 4. What is Apache Spark https://spark.apache.org/
  • 5. Apache Spark Data Sources https://posts.specterops.io/threat-hunting-with- jupyter-notebooks-part-3-querying-elasticsearch- via-apache-spark-670054cd9d47
  • 6. Benefits of using Apache Spark • Speed • Up to 100x faster compared to MapReduce • Ease of use • Easy-to-use APIs • Multi-language support • 100+ operators • Unified engine • Higher-level libraries & support for SQL queries, streaming data, machine learning and graph processing • Runs everywhere • Hadoop, standalone, Mesos, Kubernetes, cloud https://databricks.com/blog/2014/11/05/spark-officially-sets-a-new-record-in-large-scale-sorting.html
  • 7. Apache Spark Components • Dataset, DataFrame, RDD • Distributed collections of data • SparkSession • Entry point into the Spark API • SparkContext, SQLContext, StreamingContext unified into one • Executors • Handle distributed processing • Transformations & Actions • Transformations – lazy operations that return immutable data structures • Actions – apply operations and return a value or write data to external storage
  • 8. Spark Common Transformations • map • flatMap • filter • distinct • sample(withReplacement, ..) • union • intersection • subtract • cartesian • reduceByKey • groupByKey • sortByKey • join • repartition
  • 9. Spark Common Actions • collect • count • countByValue • take(num) • top(num) • reduce(func) • fold(zero)(func) • saveAsTextFile(path) • saveAsSequenceFile(path) • countByKey()
  • 10. What is .Net for Apache Spark • .Net bindings for Spark written on the Spark interop layer • Provides high-performance bindings for C# and F# • Compliant with .Net Standard https://devblogs.microsoft.com/dotnet/introducing-net-for-apache-spark/#performance
  • 11. Demo • MovieLens dataset • CSV files in Azure Data Lake Storage • Spark pools using Azure Synapse Analytics
  • 12. Summary • Apache Spark is great for Big Data Analytics • .Net for Apache Spark provides .Net language bindings to Spark • Azure Synapse Analytics has native support for C#
  • 13. • Apache Spark • .Net for Apache Spark • MovieLens datasets • Azure Synapse Analytics
  • 16. Thank you very much Code with Passion and Strive for Excellence https://www.slideshare.net/nileshgule/presentations https://speakerdeck.com/nileshgule/
  • 17. Nilesh Gule ARCHITECT | MICROSOFT MVP “Code with Passion and Strive for Excellence” nileshgule @nileshgule Nilesh Gule NileshGule www.handsonarchitect.com
  • 18. Q&A
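
The distinction on slides 7 to 9 between lazy transformations and actions can be illustrated with a small sketch. This is plain Python using only the standard library, not Spark and not .Net for Apache Spark; the `LazyCollection` class, the `double` helper and the sample numbers are hypothetical names made up for illustration. It only shows the idea: transformations record work and return a new object, and nothing executes until an action is called.

```python
from functools import reduce


class LazyCollection:
    """Toy single-machine stand-in for a Spark RDD (hypothetical, not a Spark API)."""

    def __init__(self, source, steps=()):
        self._source = list(source)
        self._steps = tuple(steps)  # recorded transformations, not yet executed

    # Transformations: record the step and return a NEW collection.
    def map(self, func):
        return LazyCollection(self._source, self._steps + (("map", func),))

    def filter(self, pred):
        return LazyCollection(self._source, self._steps + (("filter", pred),))

    def _run(self):
        # Build the pipeline from the recorded steps; items are pulled
        # through it only when an action consumes the result.
        items = iter(self._source)
        for kind, func in self._steps:
            items = map(func, items) if kind == "map" else filter(func, items)
        return items

    # Actions: execute the recorded steps and return a value.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())

    def reduce(self, func):
        return reduce(func, self._run())


calls = []  # records every element that double() actually processes


def double(x):
    calls.append(x)
    return x * 2


numbers = LazyCollection([1, 2, 3, 4, 5])
evens_doubled = numbers.filter(lambda x: x % 2 == 0).map(double)

calls_before_action = list(calls)  # still empty: no transformation has run
result = evens_doubled.collect()   # the action triggers the whole pipeline
```

In this sketch `numbers` is left unchanged by `filter` and `map`, which mirrors the slide's point that transformations return new immutable structures rather than modifying the original data.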