Big Data Processing | Apache Spark

This collection covers Apache Spark and related technologies for big data processing and analytics. It includes workshops, real-time data pipeline implementations, data engineering challenges, and the integration of machine learning with Spark. Other resources discuss converting batch workloads to streaming, automating feature extraction, and migrating from Spark to other frameworks. Readers can explore the scalability, performance, and practical applications of data streaming solutions across industries.

[Open Lakehouse + AI] Declarative Pipelines in Apache Spark

by Open Source Events

Fundamentos de Big Data com Python: Tecnologia e Aplicações Praticas (Big Data Fundamentals with Python: Technology and Practical Applications)

by Vagner Oliveira

Big Data Management and NoSQL Strategies in AI-Driven E-Commerce Systems.pdf

by Erandika Lakmali

Extraction of association rules in a diabetic dataset using parallel FP-growth algorithm under apache spark

by IJICTJOURNAL

Unlock faster data-driven business decisions with Azure Databricks - Infographic

by Principled Technologies

Unlock faster insights with Azure Databricks

by Principled Technologies

BigData - Apache Spark Sqoop Introduce Basic

by luandnh1998

If You Use Databricks, You Definitely Need FME

by Safe Software

Performance Effects of NUMA-Node-Aware Resource Allocation in Kubernetes for Apache Spark (Open Source Conference 2025 Tokyo/Spring presentation materials)

by NTT DATA Technology & Innovation

FEATURE EXTRACTION AND FEATURE SELECTION: REDUCING DATA COMPLEXITY WITH APACHE SPARK

by IJNSA Journal

data pipelines complexity human expertise and LLM era

by Rim Moussa

scalable air quality analytics with apache spark and apache sedona

by Rim Moussa

A quick presentation on Big Data Analytics and Applications

by Abdulaziz Awwad

Top 5 Data Science Tools You Should Be Using in 2024 | IABAC

by IABAC

Amazon's Exabyte-Scale Migration from Spark to Ray

by All Things Open

Introduction to Big Data Engineering.pdf

by jashwanthmuthumula

Real-Time Data Analytics with Apache Kafka and Spark.pptx

by wjcpmnwgqk

A Master Guide To Apache Spark Application And Versatile Uses.pdf

by DataSpace Academy

Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide

by Christina Lin

ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, PostgreSQL, Redpanda, Debezium, and Benthos to master building advanced real-time data pipelines.

by Christina Lin