Complex Data Transformations Made Easy
Building Data Apps for Large Scale transformations
Nikita Ogievetsky
Head Of Solutions Engineering
October 25 2020
LA Data Conference
Xcalar Application Platform
Xcalar Value Proposition
• 10x cost savings
• 20x improvement in time-to-value
• 100s of nodes, 1000s of users
• Petabyte scale
• Open standards: SQL + Python
• No vendor lock-in
• Billions of parallel ML ops
• Real-time data
Enterprise Grade Features
• Full memory hierarchy, in-memory performance
• Crash consistent & recoverable
• Snapshots, checkpoints
• ACID transactions
• Point-in-Time rollbacks
Xcalar Inc. Proprietary and Confidential. © 2020, Xcalar Inc. All rights reserved.
“Accelerate development & operationalization of cloud-scale data applications with
unprecedented simplicity, speed and scale.”
Vikram Joshi - Founder’s vision
• What: Modern cloud-scale data compute platform
• Declarative + imperative programming paradigm, open standards, no vendor lock-in
• Industry’s first IDE for data apps
• CI/CD: Fastest Dev -> Test -> Operationalize cycle in industry
• Who: Goal-oriented data engineers
• Good at SQL, some knowledge of Python or other scripting languages
• Working with large volumes of data inside or across large scale data systems
• Why:
• Reduce time to market for analyzing large volumes of data and building enterprise-level complex data compute applications
• Minimize dependency on highly skilled backend system engineers
• Optimize performance SLA in prod and reduce infrastructure and software costs.
Xcalar Application Platform – Who/What/Why
Xcalar Architecture
[Architecture diagram labels]
Editors/IDEs: Jupyter Notebook, VS Code, Zeppelin, PyCharm, Xcalar Design
Connectors & Targets: BI Tools, DWs, Oracle, Teradata, Spark Connector, Datasets for ML, Data Lake
Other labels: Source Tables, User-Generated Data Marts, Data-Driven Business Logic
• Xcalar has created the industry’s first IDE for data apps
• 5x dev productivity
• Iterative, imperative programming with eager execution
• Visual analysis, monitoring and debug for every operation
• High confidence production results as you develop
• Work with your entire datasets (Terabytes)
• SQL + Python programming paradigm (also Scala, Java, JavaScript, C++)
• Focus on Business Logic
• CI/CD
• Fast Dev -> Test -> Operationalize cycles
• Zero Changes for Production Deployment
Xcalar IDE - Better, Stronger, Faster
Case Study
Accelerating Complex, Real-time Insights at a
Top Global Investment Bank
Preexisting Technology Challenges with Spark
Problem Statement
• Very Slow
• Daily batch run: 4.5 hours (2 TB of data)
• Long time to value, long development and troubleshooting cycles
• Last revision: 6 person-months
• Dependency on scarce expert programmers
• 25,000 lines of Java/Spark code
• Lack of dev IDE for distributed applications
• High TCO
Transparent Modular Xcalar IDE
25,000 lines of Java code -> 12 visual dataflows
[Diagram labels: Data Lake, Xcalar Application Platform, Kafka Streaming]
Generic layout: Kafka Broker 1: TopicX (ds1); Kafka Broker 2: TopicX (ds1)
Detailed layout:
Kafka Broker 1: Topic 1 (ds1, ds2), Topic 2 (ds3, ds4)
Kafka Broker 2: Topic 3 (ds5, ds6), Topic 4 (ds7, ds8)
Xcalar Solution replaced Spark
Develop Business Logic
Troubleshoot & Maintain
Operationalize
Develop Distributed Compute
Develop Business Logic
Troubleshoot & Maintain
Develop Distributed Compute
20x improvement in time-to-value
10x TCO reduction & performance boost
5x Developer efficiency: Intuitive IDE, open standards, no niche skills
All code scales naturally, no need to use SDK/API for distributed compute
Zero changes for production deployment and operationalization
Easy troubleshooting, immediate feedback, simple administration
Better, Stronger, Faster
Case Study
Analyzing Xcalar with Xcalar
(at a top Wall Street bank)
Streaming Applications Platform
Complexity that needs to be managed
• Initial sources
• Kafka streams (topics, partitions, offsets, etc.)
• Applications configurations and metadata
• App regions, sources and dependencies
• Batches, batch versions
• Batch stages & microservices
• load, ingest, imd, execute, export, snapshot, recovery
• Table lifecycles (microbatch row counts, sizes, versions)
• Transactions and their metadata
• Errors and warnings, if any
Use Xcalar SaaS to analyze Xcalar logs
Powered by my.xcalar.cloud
1. Register and Login
2. Start my instance
3. Click on Xcalar Load or start working on your Notebook projects
Xcalar Load
Load log files as is, then leverage the power of SQL to parse, analyze and compute
1a. Navigate to my S3 bucket
1b. Upload my log file and select it
1c. Select single column CSV as input format
2. Review schema
3. Load table and navigate to the notebook project
Parserless Parsing
I used SQL and scalar functions to parse row by row (all CPUs work in parallel)
Load 1-column temp table -> Parse each row in parallel -> Extract syntax -> Extract semantics -> Share temp table
No predefined log parsers
Interactive development
Parallel parsing – all CPUs
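The row-by-row parsing flow above can be sketched in plain Python. This is a minimal illustration, not Xcalar's API: the log line layout, the regular expression, and the names `parse_row` and `extract_semantics` are all assumptions made for the example. Each function sees a single row, which is what lets a runtime apply it to many rows at once.

```python
import re
from typing import Optional

# Hypothetical log layout: "<date> <time> <LEVEL> <message>".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$")


def parse_row(line: str) -> Optional[dict]:
    """Scalar function (syntax step): turn one raw log line into named fields."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # rows that do not fit the layout are dropped later
    date, time, level, message = match.groups()
    return {"date": date, "time": time, "level": level, "message": message}


def extract_semantics(row: dict) -> dict:
    """Semantics step: derive meaning from the parsed fields (flag error rows)."""
    return {**row, "is_error": row["level"] in ("ERROR", "FATAL")}


# Stand-in for the one-column temp table produced by the load step.
raw_table = [
    "2020-10-25 10:00:01 INFO batch 42 started",
    "2020-10-25 10:00:07 ERROR export failed for batch 42",
    "not a log line",
]

parsed = [extract_semantics(r) for r in map(parse_row, raw_table) if r is not None]
```

Because no predefined parser is involved, changing the assumed layout only means editing the regular expression and re-running the cell.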
Data Application Development Paradigm
I used a declarative + imperative approach to build a Data Mart
[Diagram labels]
Stages: Load & Parse; Analyze, Profile, Test; Transform; Build App; Data Mart; BI & Visualizations
Building blocks: SQL, Scalar Functions, Table Functions, Modules, Data Model, Business Logic
Paradigms: Declarative Programming, Imperative Programming
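One way to picture the declarative + imperative mix is a SQL query that calls a Python scalar function. The sketch below uses the standard-library `sqlite3` module purely as a stand-in SQL engine. The table `increments`, its columns, and the function `size_mb` are hypothetical and are not taken from the talk.

```python
import sqlite3


def size_mb(size_bytes):
    """Imperative piece: a Python scalar function that SQL can call."""
    return round(size_bytes / (1024 * 1024), 2)


conn = sqlite3.connect(":memory:")
conn.create_function("size_mb", 1, size_mb)  # register under the SQL name size_mb

conn.execute("CREATE TABLE increments (table_name TEXT, size_bytes INTEGER)")
conn.executemany(
    "INSERT INTO increments VALUES (?, ?)",
    [("trades", 1048576), ("trades", 3145728), ("quotes", 524288)],
)

# Declarative piece: SQL groups and sums, then hands each total to Python.
data_mart = conn.execute(
    "SELECT table_name, size_mb(SUM(size_bytes)) AS total_mb "
    "FROM increments GROUP BY table_name ORDER BY table_name"
).fetchall()
```

The SQL states what to compute and the Python function states how one value is transformed, so either side can be changed without touching the other.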
Data Application
End-to-end
1. Log Parsing
2. Data Prep & Enrich
3. Compute & Normalize
Data Mart
4. Visualization
Data Cube
Table Function Example
get_increments() Table Function
Explode list of table-infos into rows
Extract size, row counts and other features
Keep only info about user tables
Input Interface
Output Interface
For log entries containing arrays of JSON dictionaries describing streaming table increments:
• Explode each dictionary into a row
• Extract features
3. Compute
& Normalize
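The explode-and-extract behavior described above can be sketched as a Python generator: one log entry goes in, zero or more rows come out. This is an illustration under assumptions, not the function from the talk. The name `explode_increments`, the JSON keys (`name`, `kind`, `rows`, `size`), and the rule that user tables carry `kind == "user"` are all hypothetical.

```python
import json
from typing import Iterator


def explode_increments(log_entry: str) -> Iterator[dict]:
    """Table function sketch: explode a JSON array of table-infos into rows."""
    for info in json.loads(log_entry):
        # Hypothetical convention: keep only entries marked as user tables.
        if info.get("kind") != "user":
            continue
        # Extract the features of interest into a flat row.
        yield {
            "table": info["name"],
            "rows": info["rows"],
            "size_bytes": info["size"],
        }


# One log entry holding an array of JSON dictionaries (made-up sample data).
entry = json.dumps([
    {"name": "trades", "kind": "user", "rows": 1200, "size": 96000},
    {"name": "_sys_meta", "kind": "system", "rows": 3, "size": 512},
    {"name": "quotes", "kind": "user", "rows": 800, "size": 64000},
])

rows = list(explode_increments(entry))
```

Unlike a scalar function, which returns one value per input row, a table function of this shape can return several rows or none for each input.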
Operationalized Data Application
Schedule DW updates as a Microservice
BI Integration
Tableau, Qlik, Looker, … Jupyter
Xcalar Data Mart
SQL & BI Processing
Jupyter Notebook
Visualization
ODBC
REST
Review
Xcalar Programming Model
• SQL SELECT statements + Python code to develop apps
• Lowers technical barrier to entry for developers
• Enables SQL + Python developers to build distributed apps
• No need to learn scale-out programming or machine architecture
• Xcalar Data Apps naturally scale
• No iterators to worry about
• Users write code to process row/column/matrix only
• Xcalar’s runtime does underlying distribution and scheduling
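The split between user code and runtime in the bullets above can be illustrated with a toy example. The function `enrich` is the kind of row-level code a user would write. `run_distributed` is a hypothetical stand-in for a runtime, built here on a standard-library thread pool, and it says nothing about how Xcalar actually schedules work.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich(row: dict) -> dict:
    """User code: sees exactly one row, with no iterators or partitions."""
    return {**row, "mb": row["size_bytes"] / 1_000_000}


def run_distributed(func, rows, workers=4):
    """Toy runtime: fans rows out to a worker pool and preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, rows))


# Made-up input rows for the illustration.
rows = [{"id": i, "size_bytes": i * 500_000} for i in range(6)]
result = run_distributed(enrich, rows)
```

The point of the design is that `enrich` stays the same whether it runs on six rows in a notebook or on a full dataset; only the piece playing the role of `run_distributed` changes.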
Cloud-Scale Platform for Data Apps
Learn More
xcalar.com
Free Developer Sign Up
my.xcalar.cloud

 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Complex Data Transformations Made Easy

  • 1. Complex Data Transformations Made Easy: Building Data Apps for Large Scale Transformations. Nikita Ogievetsky, Head of Solutions Engineering. October 25, 2020, LA Data Conference
  • 2. Xcalar Application Platform
      Xcalar Value Proposition
        • 10x cost savings
        • 20x improvement in time-to-value
        • 100s of nodes, 1000s of users
        • Petabyte scale
        • Open standards: SQL + Python
        • No vendor lock-in
        • Billions of parallel ML ops
        • Real-time data
      Enterprise Grade Features
        • Full memory hierarchy, in-memory performance
        • Crash consistent & recoverable
        • Snapshots, checkpoints
        • ACID transactions
        • Point-in-Time rollbacks
      Xcalar Inc. Proprietary and Confidential. © 2020, Xcalar Inc. All rights reserved.
      “Accelerate development & operationalization of cloud-scale data applications with unprecedented simplicity, speed and scale.” Vikram Joshi - Founder’s vision
  • 3. Xcalar Application Platform – Who/What/Why
      • What: Modern cloud-scale data compute platform
        • Declarative + imperative programming paradigm, open standards, no vendor lock-in
        • Industry’s first IDE for data apps
        • CI/CD: Fastest Dev -> Test -> Operationalize cycle in industry
      • Who: Goal oriented data engineers
        • Good at SQL, some knowledge of Python or other scripting languages
        • Working with large volumes of data inside or across large scale data systems
      • Why:
        • Reduce time to market for analyzing large volumes of data and building enterprise-level complex data compute applications
        • Minimize dependency on highly skilled backend system engineers
        • Optimize performance SLA in prod and reduce infrastructure and software costs
  • 4. Xcalar Architecture [architecture diagram; labels: Source Tables, User-Generated Data Marts, Connectors & Targets, Editors/IDEs (Jupyter Notebook, VS Code, Zeplin, PyCharm, Xcalar Design), BI Tools, DWs, Oracle, Teradata, Spark Connector, Datasets for ML, Data-Driven, Data Lake, Business Logic]
  • 5. Xcalar IDE - Better, Stronger, Faster
      • Xcalar has created the industry’s first IDE for data apps
      • 5x dev productivity
      • Iterative, imperative programming with eager execution
      • Visual analysis, monitoring and debug for every operation
      • High confidence production results as you develop
      • Work with your entire datasets (Terabytes)
      • SQL + Python programming paradigm (also Scala, Java, JavaScript, C++)
      • Focus on Business Logic
      • CI/CD
      • Fast Dev -> Test -> Operationalize cycles
      • Zero Changes for Production Deployment
  • 6. Case Study: Accelerating Complex, Real-time Insights at a Top Global Investment Bank
  • 7. Preexisting Technology Challenges with Spark (Problem Statement)
      • Very slow
      • Daily batch run: 4.5 hours (2 TB of data)
      • Long time to value, long development and troubleshooting cycles
      • Last revision: 6 person-months
      • Dependency on scarce expert programmers
      • 25,000 lines of Java/Spark code
      • Lack of dev IDE for distributed applications
      • High TCO
  • 8. Transparent Modular Xcalar IDE: 25,000 lines of Java code -> 12 visual dataflows [diagram labels: Data Lake; Xcalar Application Platform; Kafka Streaming with Kafka Broker 1 (TopicX: ds1) and Kafka Broker 2 (TopicX: ds1); Kafka Streaming with Kafka Broker 1 (Topic 1: ds1, ds2; Topic 2: ds3, ds4) and Kafka Broker 2 (Topic 3: ds5, ds6; Topic 4: ds7, ds8)]
  • 9. Xcalar Solution replaced Spark: Better, Stronger, Faster
      [diagram labels: Develop Business Logic; Develop Distributed Compute; Troubleshoot & Maintain; Operationalize]
      • 20x improvement in time-to-value
      • 10x TCO reduction & performance boost
      • 5x developer efficiency: intuitive IDE, open standards, no niche skills
      • All code scales naturally, no need to use SDK/API for distributed compute
      • Zero changes for production deployment and operationalization
      • Easy troubleshooting, immediate feedback, simple administration
  • 10. Case Study: Analyzing Xcalar with Xcalar (at a top Wall Street bank)
  • 11. Streaming Applications Platform: Complexity that needs to be managed
      • Initial sources
      • Kafka streams (topics, partitions, offsets, etc.)
      • Application configurations and metadata
      • App regions, sources and dependencies
      • Batches, batch versions
      • Batch stages & microservices
      • load, ingest, imd, execute, export, snapshot, recovery
      • Table lifecycles (microbatch row counts, sizes, versions)
      • Transactions and their metadata
      • Errors and warnings, if any
  • 12. Use Xcalar SaaS to analyze Xcalar logs (powered by my.xcalar.cloud)
      1. Register and Login
      2. Start my instance
      3. Click on Xcalar Load or start working on your Notebook projects
  • 13. Xcalar Load: Load log files as is, leverage power of SQL to parse, analyze and compute
      1a. Navigate to my S3 bucket
      1b. Upload my log file and select it
      1c. Select single column CSV as input format
      2. Review schema
      3. Load table and navigate to the notebook project
  • 14. Parserless Parsing: I used SQL and Scalar Functions to parse row-by-row (all CPUs work in parallel)
      [diagram steps: Load 1-column temp table; Parse each row in parallel; Extract syntax; Extract semantics; Share temp table]
      • No predefined log parsers
      • Interactive development
      • Parallel parsing – all CPUs
  • 15. Data Application Development Paradigm: I used a Declarative + Imperative approach to build a Data Mart
      [diagram labels: Business Logic; Transform; Build App; Data Mart; BI & Visualizations; SQL Scalar Functions; Analyze; Profile; Test; SQL Table Functions; Modules; Data Model; SQL; Declarative Programming; Imperative Programming; Load & Parse]
  • 16. Data Application End-to-end
      1. Log Parsing
      2. Data Prep & Enrich
      3. Compute & Normalize
      4. Visualization
      [other diagram labels: Data Mart; Data Cube]
  • 17. Table Function Example: get_increments() Table Function (stage 3. Compute & Normalize)
      For log entries containing arrays of JSON dictionaries describing streaming table increments:
      • Explode each dictionary into a row
      • Extract features
      [diagram labels: Input Interface; Explode list of table-infos into rows; Extract size, row counts and other features; Keep only info about user tables; Output Interface]
  • 18. Operationalized Data Application: Schedule DW updates as a Microservice
  • 19. BI Integration: Tableau, Qlik, Looker, …, Jupyter [diagram labels: Xcalar Data Mart; SQL & BI Processing; Jupyter Notebook Visualization; ODBC; REST]
  • 20. Review
  • 21. Xcalar Programming Model
      • SQL SELECT statements + Python code to develop apps
      • Lowers technical barrier to entry for developers
      • Enables SQL + Python developers to build distributed apps
      • No need to learn scale-out programming or machine architecture
      • Xcalar Data Apps naturally scale
      • No iterators to worry about
      • Users write code to process row/column/matrix only
      • Xcalar’s runtime does underlying distribution and scheduling
  • 22. Cloud-Scale Platform for Data Apps. Learn more: xcalar.com. Free developer sign-up: my.xcalar.cloud
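
The row-level style described in slides 14, 17 and 21 (a scalar function that parses one log line, a table function that explodes a JSON array into rows, and a runtime that handles distribution) can be illustrated with a minimal plain-Python sketch. This is not Xcalar's API. The log line layout, the field names (`table`, `rows`, `size`), the `sys_` prefix for non-user tables, and the function names are all assumptions made for illustration, and a standard-library thread pool merely stands in for the platform's runtime.

```python
import json
import re
from concurrent.futures import ThreadPoolExecutor

# Assumed log layout (hypothetical): "<timestamp> <LEVEL> <message>"
LOG_PATTERN = re.compile(r"^(\S+)\s+(\w+)\s+(.*)$")


def parse_row(line):
    """Scalar-function style: one raw log line in, one parsed record out."""
    match = LOG_PATTERN.match(line)
    if match is None:
        # Keep unparseable lines instead of dropping them.
        return {"ts": None, "level": None, "message": line}
    ts, level, message = match.groups()
    return {"ts": ts, "level": level, "message": message}


def get_increments(record):
    """Table-function style: explode a JSON array of table-infos into rows."""
    try:
        infos = json.loads(record["message"])
    except ValueError:
        return  # message is not JSON, so it yields no rows
    if not isinstance(infos, list):
        return
    for info in infos:
        if not isinstance(info, dict):
            continue
        name = info.get("table", "")
        if name.startswith("sys_"):  # hypothetical marker for non-user tables
            continue
        yield {
            "ts": record["ts"],
            "table": name,
            "rows": info.get("rows"),
            "size": info.get("size"),
        }


def run(lines, workers=4):
    """Stand-in runtime: the user code above never iterates or schedules."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parsed = list(pool.map(parse_row, lines))
    return [row for record in parsed for row in get_increments(record)]


if __name__ == "__main__":
    sample = [
        '2020-10-25T10:00:00 INFO [{"table": "trades", "rows": 120, "size": 4096}]',
        "2020-10-25T10:00:01 WARN batch delayed",
    ]
    for row in run(sample):
        print(row)
```

The point of the split is that `parse_row` and `get_increments` only ever see a single row or record; everything about parallelism lives in `run`, which is the part a platform runtime would replace.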