Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark

CEO of Databricks, Ion Stoica, talks about the impact of Apache Spark in the enterprise.

Revolutionizing Big Data in the Enterprise with Spark
Ion Stoica
October 28, 2015

We Have Seen a Lot
Worked with hundreds of companies to run Spark in production over five years
Collaborate with all major Hadoop and Big Data vendors

How Does Spark Change Enterprise Big Data?
• Unifying data sources
• Unifying data processing

Unifying Data Sources

Traditional Data Warehouses
Need to process data from
• Multiple sources
• Different data stores and locations
• Different formats
Traditional solutions: ETL data into data warehouse, …
[Diagram: ETL into Data Warehouse]
Slow to access and combine data

Just-In-Time (JIT) Data Warehouse

Process data in place or stream it
• No need to wait for data to be ETLed
Cache data in memory or on SSDs
Low latency and easy to combine data: value!
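
As a rough illustration of the just-in-time idea (plain Python, not Spark itself and not taken from the talk), the sketch below reads two sources in their native formats only when a query needs them, and keeps the parsed rows cached in memory. The sample data, the `load` and `total_spend_by_name` functions, and the `READ_COUNTS` counter are all hypothetical names chosen for this example.

```python
import csv
import io
import json
from functools import lru_cache

# Hypothetical stand-ins for two data stores that hold different formats.
ORDERS_CSV = "user_id,amount\n1,30\n2,12\n1,8\n"
USERS_JSONL = '{"user_id": 1, "name": "ana"}\n{"user_id": 2, "name": "bo"}\n'

# Counts how many times each source is actually read (illustration only).
READ_COUNTS = {"orders": 0, "users": 0}


@lru_cache(maxsize=None)
def load(source):
    """Read a source in place, in its native format, and cache the parsed rows."""
    READ_COUNTS[source] += 1
    if source == "orders":
        rows = csv.DictReader(io.StringIO(ORDERS_CSV))
        return tuple((int(r["user_id"]), int(r["amount"])) for r in rows)
    if source == "users":
        records = (json.loads(line) for line in USERS_JSONL.splitlines())
        return tuple((r["user_id"], r["name"]) for r in records)
    raise KeyError(source)


def total_spend_by_name():
    """Join the two sources at query time, with no prior load into a warehouse."""
    names = dict(load("users"))
    totals = {}
    for user_id, amount in load("orders"):
        name = names[user_id]
        totals[name] = totals.get(name, 0) + amount
    return totals


if __name__ == "__main__":
    print(total_spend_by_name())
    print(total_spend_by_name())  # second query is served from the cache
    print(READ_COUNTS)
```

Running the query twice touches each source only once, because the second call is answered from the in-memory cache rather than by re-reading the data.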

Analogy
Download & Play vs. Stream/cache & Play

Analogy
ETL & Query vs. Stream/Cache + Query
[Diagram: Data Source A and Data Source B, with ETL and a Data Warehouse on the ETL & Query side; the same data sources on the Stream/Cache + Query side]

Unifying Data Processing

Spark: Unified Engine
Unified support for
• Batch
• Streaming
• ML/Graphs
• …
[Diagram: Spark stack: Core, SparkSQL, Spark Streaming, MLlib, GraphX, SparkR]
Easy to manage, learn, and combine functionality
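
One way to picture "combine functionality" is a single piece of logic reused for both batch and streaming work. The sketch below is plain Python rather than Spark's API; it assumes a micro-batch model of streaming (the approach Spark Streaming takes), and the function names and event data are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def count_events(events: Iterable[str]) -> Counter:
    """Shared logic: count events by type."""
    return Counter(events)


def run_batch(events: List[str]) -> Counter:
    """Batch mode: apply the shared logic to the whole dataset at once."""
    return count_events(events)


def run_streaming(micro_batches: Iterable[List[str]]) -> Iterator[Counter]:
    """Streaming mode: apply the same logic per micro-batch, yielding running totals."""
    running: Counter = Counter()
    for batch in micro_batches:
        running.update(count_events(batch))  # Counter.update adds counts
        yield Counter(running)  # yield a snapshot, not the mutable state


if __name__ == "__main__":
    events = ["click", "view", "click", "buy"]
    print(run_batch(events))
    for totals in run_streaming([events[:2], events[2:]]):
        print(totals)
```

Because both modes call the same `count_events` function, the final streaming total matches the batch result, which is the kind of reuse a unified engine is meant to make easy.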

Analogy
First cellular phones, specialized devices, unified device (smartphone)
Better Games, Better GPS, Better Phone

Analogy
Batch processing, specialized systems, unified system
Real-time analytics, Instant fraud detection
Better Apps

Large On-line Service Company
Leverages
• Interactive query processing
• ML
and combines data from S3, Redshift, and HBase to provide
• data analytics for the product management team
• advanced predictive analytics to deliver new services (e.g., customized inventory displays tailored to each user)

Demo

Demo Setting
[Diagram: MLlib, Spark Streaming, SparkSQL, Core; data stores: HDFS, Redshift]

More Related Content

What's hot

Insights into Customer Behavior from Clickstream Data by Ronald Nowling

Insights into Customer Behavior from Clickstream Data by Ronald Nowling

Insights into Customer Behavior from Clickstream Data by Ronald Nowling

PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future

PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future

PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future

Spark's Role in the Big Data Ecosystem (Spark Summit 2014)

Spark's Role in the Big Data Ecosystem (Spark Summit 2014)

Spark's Role in the Big Data Ecosystem (Spark Summit 2014)

Big Telco - Yousun Jeong

Big Telco - Yousun Jeong

Big Telco - Yousun Jeong

This session will cover a series of problems that are adequately solved with Apache Spark, as well as those that are require additional technologies to implement correctly. Here’s an example outline of some of the topics that will be covered in the talk: Problems that are perfectly solved with Apache Spark: 1) Analyzing a large set of data files. 2) Doing ETL of a large amount of data. 3) Applying Machine Learning & Data Science to a large dataset. 4) Connecting BI/Visualization tools to Apache Spark to analyze large datasets internally. By Vida Ha at Spark Summit East 2016.

Not Your Father's Database: How to Use Apache Spark Properly in Your Big Data...

Not Your Father's Database: How to Use Apache Spark Properly in Your Big Data...

Not Your Father's Database: How to Use Apache Spark Properly in Your Big Data...

What's new in pandas and the SciPy stack for financial users

What's new in pandas and the SciPy stack for financial users

What's new in pandas and the SciPy stack for financial users

Apache Arrow at DataEngConf Barcelona 2018

Apache Arrow at DataEngConf Barcelona 2018

Apache Arrow at DataEngConf Barcelona 2018

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

Clinical Suspecting at Scale Using PySpark

Clinical Suspecting at Scale Using PySpark

Clinical Suspecting at Scale Using PySpark

Building a Virtual Data Lake with Apache Arrow

Building a Virtual Data Lake with Apache Arrow

Building a Virtual Data Lake with Apache Arrow

Dremio Corporation

Improving data interoperability in Python and R

Improving data interoperability in Python and R

Improving data interoperability in Python and R

This talk discusses integrating common data science tools like Python pandas, scikit-learn, and R with MLlib, Spark’s distributed Machine Learning (ML) library. Integration is simple; migration to distributed ML can be done lazily; and scaling to big data can significantly improve accuracy. We demonstrate integration with a simple data science workflow. Data scientists often encounter scaling bottlenecks with single-machine ML tools. Yet the overhead in migrating to a distributed workflow can seem daunting. In this talk, we demonstrate such a migration, taking advantage of Spark and MLlib’s integration with common ML libraries. We begin with a small dataset which runs on a single machine. Increasing the size, we hit bottlenecks in various parts of the workflow: hyperparameter tuning, then ETL, and eventually the core learning algorithm. As we hit each bottleneck, we parallelize that part of the workflow using Spark and MLlib. As we increase the dataset and model size, we can see significant gains in accuracy. We end with results demonstrating the impressive scalability of MLlib algorithms. With accuracy comparable to traditional ML libraries, combined with state-of-the-art distributed scalability, MLlib is a valuable new tool for the modern data scientist.

Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R

Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R

Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R

Data scientists spend too much of their time collecting, cleaning and wrangling data as well as curating and enriching it. Some of this work is inevitable due to the variety of data sources, but there are tools and frameworks that help automate many of these non-creative tasks. A unifying feature of these tools is support for rich metadata for data sets, jobs, and data policies. In this talk, I will introduce state-of-the-art tools for automating data science and I will show how you can use metadata to help automate common tasks in Data Science. I will also introduce a new architecture for extensible, distributed metadata in Hadoop, called Hops (Hadoop Open Platform-as-a-Service), and show how tinker-friendly metadata (for jobs, files, users, and projects) opens up new ways to build smarter applications.

Data Science with the Help of Metadata

Data Science with the Help of Metadata

Data Science with the Help of Metadata

Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...

Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...

Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...

Big Data Processing with Spark and .NET - Microsoft Ignite 2019

Big Data Processing with Spark and .NET - Microsoft Ignite 2019

Big Data Processing with Spark and .NET - Microsoft Ignite 2019

The Future of Real-Time in Spark

The Future of Real-Time in Spark

The Future of Real-Time in Spark

PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"

PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"

PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"

Spark Meetup at Uber

Spark Meetup at Uber

Spark Meetup at Uber

How we, at eXelate, built an ETL pipeline for Elasticsearch using Spark, including : * Processing the data using Spark. * Indexing the processed data directly into Elasticsearch using elasticsearch-hadoop plugin-in for Spark. * Managing the flow using some of the services provided by AWS (EMR, Data Pipeline, etc.). The presentation includes some tips and discusses some of the pitfalls we encountered while setting-up this process.

Building an ETL pipeline for Elasticsearch using Spark

Building an ETL pipeline for Elasticsearch using Spark

Building an ETL pipeline for Elasticsearch using Spark

What's hot (19)

Insights into Customer Behavior from Clickstream Data by Ronald Nowling

Insights into Customer Behavior from Clickstream Data by Ronald Nowling

Insights into Customer Behavior from Clickstream Data by Ronald Nowling

PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future

PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future

PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future

Spark's Role in the Big Data Ecosystem (Spark Summit 2014)

Spark's Role in the Big Data Ecosystem (Spark Summit 2014)

Spark's Role in the Big Data Ecosystem (Spark Summit 2014)

Big Telco - Yousun Jeong

Big Telco - Yousun Jeong

Big Telco - Yousun Jeong

Not Your Father's Database: How to Use Apache Spark Properly in Your Big Data...

Not Your Father's Database: How to Use Apache Spark Properly in Your Big Data...

Not Your Father's Database: How to Use Apache Spark Properly in Your Big Data...

What's new in pandas and the SciPy stack for financial users

What's new in pandas and the SciPy stack for financial users

What's new in pandas and the SciPy stack for financial users

Apache Arrow at DataEngConf Barcelona 2018

Apache Arrow at DataEngConf Barcelona 2018

Apache Arrow at DataEngConf Barcelona 2018

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

Clinical Suspecting at Scale Using PySpark

Clinical Suspecting at Scale Using PySpark

Clinical Suspecting at Scale Using PySpark

Building a Virtual Data Lake with Apache Arrow

Building a Virtual Data Lake with Apache Arrow

Building a Virtual Data Lake with Apache Arrow

Improving data interoperability in Python and R

Improving data interoperability in Python and R

Improving data interoperability in Python and R

Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R

Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R

Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and R

Data Science with the Help of Metadata

Data Science with the Help of Metadata

Data Science with the Help of Metadata

Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...

Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...

Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...

Big Data Processing with Spark and .NET - Microsoft Ignite 2019

Big Data Processing with Spark and .NET - Microsoft Ignite 2019

Big Data Processing with Spark and .NET - Microsoft Ignite 2019

The Future of Real-Time in Spark

The Future of Real-Time in Spark

The Future of Real-Time in Spark

PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"

PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"

PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"

Spark Meetup at Uber

Spark Meetup at Uber

Spark Meetup at Uber

Building an ETL pipeline for Elasticsearch using Spark

Building an ETL pipeline for Elasticsearch using Spark

Building an ETL pipeline for Elasticsearch using Spark

Similar to Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark

The term "Data Lake" has become almost as overused and undescriptive as "Big Data". Many believe that centralizing datasets in HDFS makes a data lake, but then they struggle to realize any tangible value. This talk will redefine the "Data Lake" by describing four specific, key characteristics that we at Koverse have learned are crucial to successful enterprise data lake deployments. These characteristics are 1) indexing and search across all data sets, 2) interactive access for all users in the enterprise, 3) multi-level access control, and 4) integration with data science tools. These characteristics define a system that lets people realize value from their data versus getting lost in the hype. The talk will go on to provide a technical description of how we have integrated several projects, namely Apache Accumulo, Hadoop, and Spark, to implement an enterprise data lake with these key features.

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

The term "Data Lake" has become almost as overused and undescriptive as "Big Data". Many believe that centralizing datasets in HDFS makes a data lake, but then they struggle to realize any tangible value. This talk will redefine the "Data Lake" by describing four specific, key characteristics that we at Koverse have learned are crucial to successful enterprise data lake deployments. These characteristics are 1) indexing and search across all data sets, 2) interactive access for all users in the enterprise, 3) multi-level access control, and 4) integration with data science tools. These characteristics define a system that lets people realize value from their data versus getting lost in the hype. The talk will go on to provide a technical description of how we have integrated several projects, namely Apache Accumulo, Hadoop, and Spark, to implement an enterprise data lake with these key features.

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

Mark Lewis, Senior MArketing Director EMEA, Cloudera. Hadoop and the Future of Data Management. As Hadoop takes the data management market by storm, organisations are evolving the role it plays in the modern data centre. Explore how this disruptive technology is quickly transforming an industry and how you can leverage it today, in combination with MongoDB, to drive meaningful change in your business.

MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...

MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...

MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...

Nov 2014 talk to SW Data Meetup by Mike Olson, co-founder and chairman of Cloudera. In business, we often deal with hype around trends in society, politics, economy and technology. We know we need to take claims of the next big thing with a grain of salt and that we should be careful not to set expectations too high. However, with Big Data analytics, the opposite is true. The hype that accompanies it actually conceals the enormity of its impact on the way we do business. In this talk I’ll discuss how new 'Data Driven' economies are emerging through relentless innovation across the public and private sectors. Mike (co-founded Cloudera in 2008 and served as its CEO until 2013 when he took on his current role of chief strategy officer (CSO.) As CSO, Mike is responsible for Cloudera’s product strategy, open source leadership, engineering alignment and direct engagement with customers. Prior to Cloudera Mike was CEO of Sleepycat Software, makers of Berkeley DB, the open source embedded database engine. Mike spent two years at Oracle Corporation as vice president for Embedded Technologies after Oracle’s acquisition of Sleepycat in 2006. Prior to joining Sleepycat, Mike held technical and business positions at database vendors Britton Lee, Illustra Information Technologies and Informix Software. Mike has a Bachelor’s and a Master’s Degree in Computer Science from the University of California, Berkeley.

Ask bigger questions

Ask bigger questions

Ask bigger questions

South West Data Meetup

Data Lakehouse, Data Mesh, and Data Fabric (r1)

Data Lakehouse, Data Mesh, and Data Fabric (r1)

Data Lakehouse, Data Mesh, and Data Fabric (r1)

Join Cloudian, Hortonworks and 451 Research for a panel-style Q&A discussion about the latest trends and technology innovations in Big Data and Analytics. Matt Aslett, Data Platforms and Analytics Research Director at 451 Research, John Kreisa, Vice President of Strategic Marketing at Hortonworks, and Paul Turner, Chief Marketing Officer at Cloudian, will answer your toughest questions about data storage, data analytics, log data, sensor data and the Internet of Things. Bring your questions or just come and listen!

Cloudian 451-hortonworks - webinar

Cloudian 451-hortonworks - webinar

Cloudian 451-hortonworks - webinar

Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...

Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...

Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...

Hadoop as a Data Hub

Hadoop as a Data Hub

Hadoop as a Data Hub

Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

This topic describes the use of Spark and SequoiaDB in the Operational Data Lake of China’s financial industry, including how to use SequoiaDB to provide online high concurrent services and how to use Spark for data processing and machine learning. China has the world’s largest population, and also the world’s second largest economy. Many of the best technologies used in the United States and Europe are difficult to play effectively in China. This topic will show you how Spark and SequoiaDB are able to provide online financial services to billions of population.

Building Operational Data Lake using Spark and SequoiaDB with Yang Peng

Building Operational Data Lake using Spark and SequoiaDB with Yang Peng

Building Operational Data Lake using Spark and SequoiaDB with Yang Peng

The Briefing Room with Rick van der Lans and Think Big, a Teradata Company Live Webcast on June 16, 2015 Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=197f8106531874cc5c14081ca214eaff Hadoop is arguably one of the most disruptive technologies of the last decade. Once lauded solely for its ability to transform the speed of batch processing, it has marched steadily forward and promulgated an array of performance-enhancing accessories, notably Spark and YARN. Hadoop has evolved into much more than a file system and batch processor, and it now promises to stand as the data management and analytics backbone for enterprises. Register for this episode of The Briefing Room to learn from veteran Analyst Rick van der Lans, as he discusses the emerging roles of Hadoop within the analytics ecosystem. He’ll be briefed by Ron Bodkin of Think Big, a Teradata Company, who will explore Hadoop’s maturity spectrum, from typical entry use cases all the way up the value chain. He’ll show how enterprises that already use Hadoop in production are finding new ways to exploit its power and build creative, dynamic analytics environments. Visit InsideAnalysis.com for more information.

The Maturity Model: Taking the Growing Pains Out of Hadoop

The Maturity Model: Taking the Growing Pains Out of Hadoop

The Maturity Model: Taking the Growing Pains Out of Hadoop

Inside Analysis

The Briefing Room with Dr. Robin Bloor and Think Big, a Teradata Company Live Webcast April 7, 2015 Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=4114b87441ab7b2b4c52f6b24776e5a1 The more things change in Big Data, the more they stay the same. Indeed, there are many similarities between a Hadoop-based Data Lake and today’s modern Data Warehouse. Regardless of platform, information workers must still be able to turn their assets into action quickly, without taking a hit on governance or downstream performance. Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor as he explains the challenges facing organizations who endeavor on Big Data projects. He’ll be briefed by Rick Stellwagen of Think Big, a Teradata Company, who will outline his company’s approach to handling Big Data implementations. Rick will discuss the role of the data lake, and how timely response of queries is critical for reporting and analysis. Visit InsideAnalysis.com for more information.

The Great Lakes: How to Approach a Big Data Implementation

The Great Lakes: How to Approach a Big Data Implementation

The Great Lakes: How to Approach a Big Data Implementation

Inside Analysis

Learn about the promise of data lakes: - Store all types of data in its raw format - Create refined, standardized, trusted datasets for various use cases - Store data for longer periods of time to enable historical analysis - Query and Access the data using a variety of methods - Manage streaming and batch data in a converged platform - Provide shorter time-to-insight with proper data management and governance

Strata San Jose 2017 - Ben Sharma Presentation

Strata San Jose 2017 - Ben Sharma Presentation

Strata San Jose 2017 - Ben Sharma Presentation

Splice machine-bloor-webinar-data-lakes

Splice machine-bloor-webinar-data-lakes

Splice machine-bloor-webinar-data-lakes

Edgar Alejandro Villegas

Information and data relevance to business

Information and data relevance to business

Information and data relevance to business

Warehousing Your Hits - The Why and How of Owning Your Data

Warehousing Your Hits - The Why and How of Owning Your Data

Warehousing Your Hits - The Why and How of Owning Your Data

Scott Arbeitman

Presentation at Data Summit 2015 in NYC. Elliott Cordo shared real-world insights across a range of topics, including the evolving best practices for building a data warehouse on Hadoop that also coexists with multiple processing frameworks and additional non-Hadoop storage platforms, the place for massively parallel-processing and relational databases in analytic architectures, and the ways in which the cloud offers the ability to quickly and cost-effectively establish a scalable platform for your Big Data warehouse. For more information, visit www.casertaconcepts.com

Hadoop and Your Data Warehouse

Hadoop and Your Data Warehouse

Hadoop and Your Data Warehouse

Dw Concepts

Big data 101

Paresh Motiwala, PMP®

Modern data warehouse presentation

Modern data warehouse presentation

Modern data warehouse presentation

Similar to Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark (20)

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

One Large Data Lake, Hold the Hype

MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...

MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...

MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...

Ask bigger questions

Ask bigger questions

Ask bigger questions

Data Lakehouse, Data Mesh, and Data Fabric (r1)

Data Lakehouse, Data Mesh, and Data Fabric (r1)

Data Lakehouse, Data Mesh, and Data Fabric (r1)

Cloudian 451-hortonworks - webinar

Cloudian 451-hortonworks - webinar

Cloudian 451-hortonworks - webinar

Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...

Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...

Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...

Hadoop as a Data Hub

Hadoop as a Data Hub

Hadoop as a Data Hub

Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Building Operational Data Lake using Spark and SequoiaDB with Yang Peng

Building Operational Data Lake using Spark and SequoiaDB with Yang Peng

Building Operational Data Lake using Spark and SequoiaDB with Yang Peng

The Maturity Model: Taking the Growing Pains Out of Hadoop

The Maturity Model: Taking the Growing Pains Out of Hadoop

The Maturity Model: Taking the Growing Pains Out of Hadoop

The Great Lakes: How to Approach a Big Data Implementation

The Great Lakes: How to Approach a Big Data Implementation

The Great Lakes: How to Approach a Big Data Implementation

Strata San Jose 2017 - Ben Sharma Presentation

Strata San Jose 2017 - Ben Sharma Presentation

Strata San Jose 2017 - Ben Sharma Presentation

Splice machine-bloor-webinar-data-lakes

Splice machine-bloor-webinar-data-lakes

Splice machine-bloor-webinar-data-lakes

Information and data relevance to business

Information and data relevance to business

Information and data relevance to business

Warehousing Your Hits - The Why and How of Owning Your Data

Warehousing Your Hits - The Why and How of Owning Your Data

Warehousing Your Hits - The Why and How of Owning Your Data

Hadoop and Your Data Warehouse

Hadoop and Your Data Warehouse

Hadoop and Your Data Warehouse

Dw Concepts

Big data 101

Modern data warehouse presentation

Modern data warehouse presentation

Modern data warehouse presentation

More from Databricks

DW Migration Webinar-March 2022.pptx

DW Migration Webinar-March 2022.pptx

DW Migration Webinar-March 2022.pptx

The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse. Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today. Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow. This is an educational event. Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.

Data Lakehouse Symposium | Day 1 | Part 1

Data Lakehouse Symposium | Day 1 | Part 1

Data Lakehouse Symposium | Day 1 | Part 1

The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse. Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today. Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow. This is an educational event. Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.

Data Lakehouse Symposium | Day 1 | Part 2

Data Lakehouse Symposium | Day 1 | Part 2

Data Lakehouse Symposium | Day 1 | Part 2

The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse. Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today. Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow. This is an educational event. Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.

Data Lakehouse Symposium | Day 2

Data Lakehouse Symposium | Day 2

Data Lakehouse Symposium | Day 2

The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse. Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today. Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow. This is an educational event. Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.

Data Lakehouse Symposium | Day 4

Data Lakehouse Symposium | Day 4

Data Lakehouse Symposium | Day 4

In this session, learn how to quickly supplement your on-premises Hadoop environment with a simple, open, and collaborative cloud architecture that enables you to generate greater value with scaled application of analytics and AI on all your data. You will also learn five critical steps for a successful migration to the Databricks Lakehouse Platform along with the resources available to help you begin to re-skill your data teams.

5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop

5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop

5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop

Bad data leads to bad decisions and broken customer experiences. Organizations depend on complete and accurate data to power their business, maintain efficiency, and uphold customer trust. With thousands of datasets and pipelines running, how do we ensure that all data meets quality standards, and that expectations are clear between producers and consumers? Investing in shared, flexible components and practices for monitoring data health is crucial for a complex data organization to rapidly and effectively scale. At Zillow, we built a centralized platform to meet our data quality needs across stakeholders. The platform is accessible to engineers, scientists, and analysts, and seamlessly integrates with existing data pipelines and data discovery tools. In this presentation, we will provide an overview of our platform’s capabilities, including: Giving producers and consumers the ability to define and view data quality expectations using a self-service onboarding portal Performing data quality validations using libraries built to work with spark Dynamically generating pipelines that can be abstracted away from users Flagging data that doesn’t meet quality standards at the earliest stage and giving producers the opportunity to resolve issues before use by downstream consumers Exposing data quality metrics alongside each dataset to provide producers and consumers with a comprehensive picture of health over time

Democratizing Data Quality Through a Centralized Platform

Democratizing Data Quality Through a Centralized Platform

Democratizing Data Quality Through a Centralized Platform

Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations continue to become more data-driven, a collaborative environment is more critical than ever — one that provides easier access and visibility into the data, reports and dashboards built against the data, reproducibility, and insights uncovered within the data.. Join us to hear how Databricks’ open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale — all on one unified platform.

Learn to Use Databricks for Data Science

Application performance monitoring (APM) has become the cornerstone of software engineering allowing engineering teams to quickly identify and remedy production issues. However, as the world moves to intelligent software applications that are built using machine learning, traditional APM quickly becomes insufficient to identify and remedy production issues encountered in these modern software applications. As a lead software engineer at NewRelic, my team built high-performance monitoring systems including Insights, Mobile, and SixthSense. As I transitioned to building ML Monitoring software, I found the architectural principles and design choices underlying APM to not be a good fit for this brand new world. In fact, blindly following APM designs led us down paths that would have been better left unexplored. In this talk, I draw upon my (and my team’s) experience building an ML Monitoring system from the ground up and deploying it on customer workloads running large-scale ML training with Spark as well as real-time inference systems. I will highlight how the key principles and architectural choices of APM don’t apply to ML monitoring. You’ll learn why, understand what ML Monitoring can successfully borrow from APM, and hear what is required to build a scalable, robust ML Monitoring architecture.

Why APM Is Not the Same As ML Monitoring

Autonomy and ownership are core to working at Stitch Fix, particularly on the Algorithms team. We enable data scientists to deploy and operate their models independently, with minimal need for handoffs or gatekeeping. By writing a simple function and calling out to an intuitive API, data scientists can harness a suite of platform-provided tooling meant to make ML operations easy. In this talk, we will dive into the abstractions the Data Platform team has built to enable this. We will go over the interface data scientists use to specify a model and what that hooks into, including online deployment, batch execution on Spark, and metrics tracking and visualization.

The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix

In this talk, I will dive into the stage level scheduling feature added to Apache Spark 3.1. Stage level scheduling extends upon Project Hydrogen by improving big data ETL and AI integration and also enables multiple other use cases. It is beneficial any time the user wants to change container resources between stages in a single Apache Spark application, whether those resources are CPU, memory, or GPUs. One of the most popular use cases is enabling end-to-end scalable Deep Learning and AI to efficiently use GPU resources. In this type of use case, users read from a distributed file system, do data manipulation and filtering to get the data into a format that the Deep Learning algorithm needs for training or inference, and then send the data into a Deep Learning algorithm. Using stage level scheduling combined with accelerator aware scheduling enables users to go from ETL to Deep Learning running on the GPU by adjusting the container requirements for different stages in Spark within the same application. This makes writing these applications easier and can help with hardware utilization and costs. There are other ETL use cases where users want to change CPU and memory resources between stages, for instance when there is data skew or when the data size is much larger in certain stages of the application. In this talk, I will go over the feature details, cluster requirements, the API, and use cases. I will demo how the stage level scheduling API can be used by Horovod to go from data preparation to training using the TensorFlow Keras API on GPUs. The talk will also touch on other new Apache Spark 3.1 functionality, such as pluggable caching, which can be used to enable faster DataFrame access when operating from GPUs.

Stage Level Scheduling Improving Big Data and AI Integration

In this talk, I would like to introduce an open-source tool built by our team that simplifies the data conversion from Apache Spark to deep learning frameworks. Imagine you have a large dataset, say 20 GB, and you want to use it to train a TensorFlow model. Before feeding the data to the model, you need to clean and preprocess your data using Spark. Now you have your dataset in a Spark DataFrame. When it comes to the training part, you may have the problem: How can I convert my Spark DataFrame to some format recognized by my TensorFlow model? The existing data conversion process can be tedious. For example, to convert an Apache Spark DataFrame to a TensorFlow Dataset file format, you need to either save the Apache Spark DataFrame on a distributed filesystem in Parquet format and load the converted data with third-party tools such as Petastorm, or save it directly in TFRecord files with spark-tensorflow-connector and load it back using TFRecordDataset. Both approaches take more than 20 lines of code to manage the intermediate data files, rely on different parsing syntax, and require extra attention for handling vector columns in the Spark DataFrames. In short, all these engineering frictions greatly reduce data scientists’ productivity. The Databricks Machine Learning team contributed a new Spark Dataset Converter API to Petastorm to simplify these tedious data conversion steps. With the new API, it takes a few lines of code to convert a Spark DataFrame to a TensorFlow Dataset or a PyTorch DataLoader with default parameters. In the talk, I will use an example to show how to use the Spark Dataset Converter to train a TensorFlow model and how simple it is to go from single-node training to distributed training on Databricks.

Simplify Data Conversion from Spark to TensorFlow and PyTorch

There is no doubt Kubernetes has emerged as the next generation of cloud native infrastructure to support a wide variety of distributed workloads. Apache Spark has evolved to run both machine learning and large-scale analytics workloads. There is growing interest in running Apache Spark natively on Kubernetes. By combining the flexibility of Kubernetes with the scalable data processing of Apache Spark, you can run any data and machine learning pipelines on this infrastructure while effectively utilizing the resources at your disposal. In this talk, Rajesh Thallam and Sougata Biswas will share how to effectively run your Apache Spark applications on Google Kubernetes Engine (GKE) and Google Cloud Dataproc, and orchestrate the data and machine learning pipelines with managed Apache Airflow on GKE (Google Cloud Composer). The following topics will be covered:
• Understanding key traits of Apache Spark on Kubernetes
• Things to know when running Apache Spark on Kubernetes, such as autoscaling
• Demonstrating analytics pipelines running on Apache Spark, orchestrated with Apache Airflow on a Kubernetes cluster

Scaling your Data Pipelines with Apache Spark on Kubernetes

Pipelines have become ubiquitous, as the need for stringing multiple functions to compose applications has gained adoption and popularity. Common pipeline abstractions such as “fit” and “transform” are even shared across divergent platforms such as Python Scikit-Learn and Apache Spark. Scaling pipelines at the level of simple functions is desirable for many AI applications, however is not directly supported by Ray’s parallelism primitives. In this talk, Raghu will describe a pipeline abstraction that takes advantage of Ray’s compute model to efficiently scale arbitrarily complex pipeline workflows. He will demonstrate how this abstraction cleanly unifies pipeline workflows across multiple platforms such as Scikit-Learn and Spark, and achieves nearly optimal scale-out parallelism on pipelined computations. Attendees will learn how pipelined workflows can be mapped to Ray’s compute model and how they can both unify and accelerate their pipelines with Ray.

Scaling and Unifying SciKit Learn and Apache Spark Pipelines

In this talk about Zipline, we will introduce a new type of windowing construct called a sawtooth window. We will describe various properties of sawtooth windows that we utilize to achieve online-offline consistency, while still maintaining high throughput, low read latency, and tunable write latency for serving machine learning features. We will also talk about a simple deployment strategy for correcting feature drift due to operations that are not “abelian groups” operating over change data.

Sawtooth Windows for Feature Aggregations

We want to present multiple anti-patterns that use Redis in unconventional ways to get the maximum out of Apache Spark. All examples presented are tried and tested in production at scale at Adobe. The most common integration is spark-redis, which interfaces with Redis as a DataFrame backing store or as an upstream for Structured Streaming. We deviate from the common use cases to explore where Redis can plug gaps while scaling out high-throughput applications in Spark.

Niche 1: Long-running Spark batch job that dispatches new jobs by polling a Redis queue
• Why? Custom queries on top of a table; we load the data once and query N times
• Why not Structured Streaming
• Working solution using Redis

Niche 2: Distributed counters
• Problems with Spark accumulators
• Utilize Redis hashes as distributed counters
• Precautions for retries and speculative execution
• Pipelining to improve performance
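The abstract does not say which precautions the speakers take for retries and speculative execution. One common approach, sketched here purely as an assumption, is to make each task's write idempotent: every partition overwrites its own hash field instead of incrementing a shared counter, so a retried or speculative attempt cannot inflate the total. `FakeHashStore` is a hypothetical in-memory stand-in that only mimics the `hset` and `hvals` commands of a Redis hash; the function names and the `rows_seen` key are illustrative.

```python
class FakeHashStore:
    """Hypothetical in-memory stand-in for a Redis hash (hset/hvals only)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, field, value):
        # Like a Redis hash set, writing an existing field overwrites it.
        self._hashes.setdefault(name, {})[field] = value

    def hvals(self, name):
        return list(self._hashes.get(name, {}).values())


def record_partition_count(store, counter, partition_id, count):
    # Overwrite the field owned by this partition rather than incrementing
    # a shared value: a retried or speculative attempt of the same
    # partition rewrites the same field with the same number.
    store.hset(counter, str(partition_id), count)


def total(store, counter):
    # The overall count is the sum of the per-partition fields.
    return sum(int(v) for v in store.hvals(counter))


store = FakeHashStore()
record_partition_count(store, "rows_seen", 0, 10)
record_partition_count(store, "rows_seen", 1, 5)
record_partition_count(store, "rows_seen", 1, 5)  # retried attempt of partition 1
print(total(store, "rows_seen"))
```

With a plain increment per attempt, the retried partition would be counted twice; keying the field by partition keeps the sum stable regardless of how many attempts ran.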

Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink

In the era of microservices, decentralized ML architectures, and complex data pipelines, data quality has become a bigger challenge than ever. When data is involved in complex business processes and decisions, bad data can, and will, affect the bottom line. As a result, ensuring data quality across the entire ML pipeline is both costly and cumbersome, while data monitoring is often fragmented and performed ad hoc. To address these challenges, we built whylogs, an open source standard for data logging. It is a lightweight data profiling library that enables end-to-end data profiling across the entire software stack. The library implements a language- and platform-agnostic approach to data quality and data monitoring. It can work with different modes of data operations, including streaming, batch, and IoT data. In this talk, we will provide an overview of the whylogs architecture, including its lightweight statistical data collection approach and various integrations. We will demonstrate how the whylogs integration with Apache Spark achieves large-scale data profiling, and we will show how users can apply this integration to existing data and ML pipelines.

Re-imagine Data Monitoring with whylogs and Spark

Machine learning (ML) models are typically part of prediction queries that consist of a data processing part (e.g., for joining, filtering, cleaning, featurization) and an ML part invoking one or more trained models. In this presentation, we identify significant and unexplored opportunities for optimization. To the best of our knowledge, this is the first effort to look at prediction queries holistically, optimizing across both the ML and SQL components. We will present Raven, an end-to-end optimizer for prediction queries. Raven relies on a unified intermediate representation that captures both data processing and ML operators in a single graph structure. This allows us to introduce optimization rules that (i) reduce unnecessary computations by passing information between the data processing and ML operators (ii) leverage operator transformations (e.g., turning a decision tree to a SQL expression or an equivalent neural network) to map operators to the right execution engine, and (iii) integrate compiler techniques to take advantage of the most efficient hardware backend (e.g., CPU, GPU) for each operator. We have implemented Raven as an extension to Spark’s Catalyst optimizer to enable the optimization of SparkSQL prediction queries. Our implementation also allows the optimization of prediction queries in SQL Server. As we will show, Raven is capable of improving prediction query performance on Apache Spark and SQL Server by up to 13.1x and 330x, respectively. For complex models, where GPU acceleration is beneficial, Raven provides up to 8x speedup compared to state-of-the-art systems. As part of the presentation, we will also give a demo showcasing Raven in action.
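To make the operator transformation mentioned above concrete, the following is a minimal sketch of turning a decision tree into a SQL expression. It is not Raven's implementation: the dictionary-based tree layout, the function names, and the `age` and `income` columns are all assumptions chosen for illustration.

```python
def tree_to_sql(node):
    """Render a (hypothetical) dict-based decision tree as a SQL CASE expression."""
    if "value" in node:
        # Leaf node: emit the predicted value as a literal.
        return str(node["value"])
    left = tree_to_sql(node["left"])
    right = tree_to_sql(node["right"])
    # Internal node: rows satisfying the split go left, all others go right.
    return (
        f"CASE WHEN {node['feature']} <= {node['threshold']} "
        f"THEN {left} ELSE {right} END"
    )


def predict(node, row):
    """Evaluate the same tree directly in Python, for comparison."""
    while "value" not in node:
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["value"]


# Illustrative two-level tree over assumed columns `age` and `income`.
tree = {
    "feature": "age",
    "threshold": 30,
    "left": {"value": 0},
    "right": {
        "feature": "income",
        "threshold": 50000,
        "left": {"value": 0},
        "right": {"value": 1},
    },
}

print(tree_to_sql(tree))
```

Once the model is expressed this way, it can be evaluated inside the SQL engine alongside the joins and filters of the query, which is the kind of cross-component opportunity the abstract describes.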

Raven: End-to-end Optimization of ML Prediction Queries

Semantic segmentation is the classification of every pixel in an image/video. The segmentation partitions a digital image into multiple objects to simplify/change the representation of the image into something that is more meaningful and easier to analyze [1][2]. The technique has a wide variety of applications ranging from perception in autonomous driving scenarios to cancer cell segmentation for medical diagnosis. Exponential growth in the datasets that require such segmentation is driven by improvements in the accuracy and quality of the sensors generating the data extending to 3D point cloud data. This growth is further compounded by exponential advances in cloud technologies enabling the storage and compute available for such applications. The need for semantically segmented datasets is a key requirement to improve the accuracy of inference engines that are built upon them. Streamlining the accuracy and efficiency of these systems directly affects the value of the business outcome for organizations that are developing such functionalities as a part of their AI strategy. This presentation details workflows for labeling, preprocessing, modeling, and evaluating performance/accuracy. Scientists and engineers leverage domain-specific features/tools that support the entire workflow from labeling the ground truth, handling data from a wide variety of sources/formats, developing models and finally deploying these models. Users can scale their deployments optimally on GPU-based cloud infrastructure to build accelerated training and inference pipelines while working with big datasets. These environments are optimized for engineers to develop such functionality with ease and then scale against large datasets with Spark-based clusters on the cloud.

Processing Large Datasets for ADAS Applications using Apache Spark

At Adobe Experience Platform, we ingest TBs of data every day and manage PBs of data for our customers as part of the Unified Profile Offering. At the heart of this is a complex ingestion of a mix of normalized and denormalized data with various linkage scenarios, powered by a central Identity Linking Graph. This helps power various marketing scenarios that are activated in multiple platforms and channels such as email and advertisements. We will go over how we built a cost-effective and scalable data pipeline using Apache Spark and Delta Lake and share our experiences.
• What are we storing?
• Multi Source – Multi Channel Problem
• Data Representation and Nested Schema Evolution
• Performance Trade Offs with Various formats
• Go over anti-patterns used (String FTW)
• Data Manipulation using UDFs
• Writer Worries and How to Wipe them Away
• Staging Tables FTW
• Datalake Replication Lag Tracking
• Performance Time!

Massive Data Processing in Adobe Using Delta Lake

More from Databricks (20)

DW Migration Webinar-March 2022.pptx

Data Lakehouse Symposium | Day 1 | Part 1

Data Lakehouse Symposium | Day 1 | Part 2

Data Lakehouse Symposium | Day 2

Data Lakehouse Symposium | Day 4

5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop

Recently uploaded

ADR, or Architecture Decision Record, is a valuable tool in software development for several reasons. It provides a centralized location for documenting and tracking architectural decisions, aiding both current and future team members. ADRs enhance communication among team members by documenting the rationale behind architectural decisions, especially beneficial during onboarding of new team members or when revisiting decisions. They serve as a knowledge base, enabling teams to learn from past decisions and refine their decision-making process. Additionally, ADRs contribute to transparency by helping stakeholders understand the reasons behind specific architectural choices. As with any other tool or process, introducing them into an organization can face several obstacles, and overcoming these challenges is crucial for successful implementation. In this talk I go through some common problems and our way of solving them.

Architecture decision records - How not to get lost in the past

Papp Krisztián

Craft an AI & Machine Learning Pitch with our Editable Professional PowerPoint Template. Ignite your AI & Machine Learning pitch with our cutting-edge PowerPoint template tailored for the industry. Perfect for AI conferences, investor presentations, sales pitches to tech-focused companies, training sessions, and educational programs. - 20+ editable slides: Get a variety of options to choose from for your presentation. - Time-saving solution: Download, replace text/images with a few clicks. - User-friendly customization: Easy to use and personalize. - Modern and attractive design: Captivating visuals, sleek layout. - Tailored to your requirements: Fully alterable for customization. - Well-organized slides: Complete control over content. - Thematic specificity: Reflects healthcare industry with relevant graphics. - Showcase your business idea: Communicate value proposition effectively.

AI & Machine Learning Presentation Template

Presentation.STUDIO

WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...

WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...

Announcing Codolex 2.0 from GDK Software

Artyushina_Guest lecture_YorkU CS May 2024.pptx

AnnaArtyushina1

WSO2CON 2024 Slides - Open Source to SaaS

WSO2CON 2024 Slides - Open Source to SaaS

WSO2CON 2024 Slides - Open Source to SaaS

Spark Summit EU 2015: Revolutionizing Big Data in the Enterprise with Spark

1. Revolutionizing Big Data in the Enterprise with Spark. Ion Stoica, October 28, 2015

2. We Have Seen a Lot: Worked with 100s of companies to run Spark in production over five years. Collaborate with all major Hadoop and Big Data vendors.

3. How Does Spark Change Enterprise Big Data? • Unifying data sources • Unifying data processing

4. Unifying Data Sources

5. Traditional Data Warehouses: Need to process data from • Multiple sources • Different data stores and locations • Different formats. Traditional solutions: ETL data into data warehouse, … Slow to access and combine data. [Diagram labels: ETL, Data Warehouse]

6. Just-In-Time (JIT) Data Warehouse

7. JIT Data Warehouse: Process data in place or stream it • No need to wait for data to be ETLed. [Diagram labels: ETL, Data Warehouse]

8. JIT Data Warehouse: Process data in place or stream it • No need to wait for data to be ETLed. Cache data in memory or SSDs. Low latency and easy to combine data: value!

9. Analogy: Stream/cache & Play; Download & Play

10. Analogy: ETL & Query; Stream/Cache + Query. [Diagram labels: Data Source A, Data Source B, ETL, Data Warehouse]

11. Top-3 Media Company: Data sources • Traditional data warehouse: Customer transaction and profile data • S3: Clickstream and historical logs • Elasticsearch: User-submitted reviews and comments • Kafka: Streaming online event data. Build Spark-based JIT Data Warehouse to perform real-time analytics.

12. Unifying Data Processing

13. Spark: Unified Engine. Unified support for • Batch • Streaming • ML/Graphs • … Easy to manage, learn, and combine functionality. [Diagram labels: GraphX, MLlib, Core, Spark Streaming, SparkSQL, SparkR]

14. Analogy: First cellular phones; Unified device (smartphone); Specialized devices. Better Games, Better GPS, Better Phone

15. Analogy: Batch processing; Unified system; Specialized systems. Real-time analytics, Instant fraud detection, Better Apps

16. Large On-line Service Company: Leverages • Interactive query processing • ML and combines data from S3, Redshift, and HBase to provide • data analytics for product management team • advanced predictive analytics to deliver new services (e.g., customized inventory displays tailored to each user)

18. Demo Setting. [Diagram labels: MLlib, Core, Spark Streaming, SparkSQL, HDFS, Redshift]