Topic | Apache Spark | Apache Storm | Stream Analytics
Input Sources | Event Hubs, IoT Hubs, Cosmos DB, Blob Storage, ADLS, Kafka, Bring-Your-Own-Source | Event Hubs, IoT Hubs, Kafka, Bring-Your-Own-Source | Event Hubs, IoT Hubs, Blob Storage
Language of Choice | SQL, Scala, Python, Java, R | Java, Python, C#, Bring-Your-Own-Language | SAQL
Customization/UDF Language | Scala, Java, Python | Java, Python, C#, Bring-Your-Own-Language | JavaScript
Output Sinks | Bring-Your-Own-Sink | Bring-Your-Own-Sink | Fixed number of sinks supported
File Formats Supported | JSON, CSV, Parquet, Avro, ORC, Bring-Your-Own-Format | JSON, CSV, Parquet, Avro, ORC, Bring-Your-Own-Format | JSON, CSV, Avro
On-Premises Compatibility | Yes | Yes | No
Data Encoding Formats | Bring-Your-Own-Encoding-Format | Bring-Your-Own-Encoding-Format | UTF-8
Reference Data | Bring-Your-Own-Data-Source | Bring-Your-Own-Data-Source | Blob Storage (up to 100 MB)
Compression Support | Yes | Yes | No
Offering in Azure | SaaS; PaaS; IaaS | PaaS; IaaS | PaaS
For more information, consult the Azure Data Architecture Guide
Comparing the different Spark Offerings
Comparison Topic | Azure HDInsight | Azure Databricks
Type of Offering | Platform as a Service | Software as a Service; Serverless
Microsoft Support | Yes | Yes
Spark Flavor | Hortonworks | Databricks
Hadoop Components Support | Yes | No
On-Premises Compatibility | Yes | Yes
How does it work in Spark Structured Streaming?
The incoming stream is stored as a “DataFrame”
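One way to picture this: Structured Streaming treats a stream as a table that keeps growing, with each newly arrived event becoming a new row. The sketch below imitates that idea in plain Python only; it does not use the Spark API, the device names and readings are made up for illustration, and it recomputes the result from scratch after every batch, whereas Spark updates results incrementally.

```python
from collections import Counter

# Hypothetical micro-batches of (device_id, temperature) events arriving over time.
micro_batches = [
    [("sensor-1", 21.5), ("sensor-2", 30.2)],
    [("sensor-1", 22.0)],
    [("sensor-3", 27.9), ("sensor-2", 31.0)],
]

table = []             # the ever-growing "input table" the stream is stored as
counts_per_batch = []  # snapshot of the query result after each micro-batch

for batch in micro_batches:
    # New events become new rows appended to the table.
    table.extend(batch)
    # Re-evaluate the query over the whole table: events seen so far per device.
    counts_per_batch.append(dict(Counter(device for device, _ in table)))

for batch_number, counts in enumerate(counts_per_batch, start=1):
    print(f"after batch {batch_number}: {counts}")
```

The query itself never changes between batches; only the table it runs over grows, which is why a streaming query can be written much like a query over a static table.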
What is a DataFrame?
DataFrame
= “tabular data”
= a data structure consisting of rows of data
= a data structure that holds a “SQL-like” table
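The definition above can be made concrete with a toy example. This is a minimal sketch in plain Python, not Spark's DataFrame implementation: `MiniFrame`, its `select` and `where` methods, and the sample readings are all hypothetical names chosen for illustration. It only shows what "rows of data under named columns, queried SQL-style" means.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MiniFrame:
    """Toy stand-in for a DataFrame: named columns plus rows of values."""
    columns: list[str]
    rows: list[tuple[Any, ...]]

    def select(self, *names: str) -> "MiniFrame":
        # Like SQL SELECT: keep only the requested columns, in the requested order.
        idx = [self.columns.index(n) for n in names]
        return MiniFrame(list(names), [tuple(r[i] for i in idx) for r in self.rows])

    def where(self, predicate: Callable[[dict[str, Any]], bool]) -> "MiniFrame":
        # Like SQL WHERE: keep only the rows for which the predicate holds.
        kept = [r for r in self.rows if predicate(dict(zip(self.columns, r)))]
        return MiniFrame(self.columns, kept)


readings = MiniFrame(
    columns=["device_id", "temperature"],
    rows=[("sensor-1", 21.5), ("sensor-2", 30.2), ("sensor-3", 27.9)],
)

# Roughly: SELECT device_id FROM readings WHERE temperature > 25
hot = readings.where(lambda row: row["temperature"] > 25).select("device_id")
print(hot.columns, hot.rows)
```

Each operation returns a new table and leaves `readings` unchanged, which mirrors how DataFrame transformations in Spark produce a new DataFrame rather than modifying the original.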