Driving Datascience at scale using Postgresql, Greenplum and Dataiku
PostgresConf 2019
Nicolas GAKRELIDZ
Partner Solution Architect
Dataiku DSS is:
• Collaborative,
• For all profiles,
• Polyglot,
• Production ready
End-to-end Enterprise AI platform
Dataiku DSS
Supporting the Enterprise AI Journey of
Manufacturing, Financial Services, Services, Consumer Goods, Technology,
Consulting, E-Retail, Media, Healthcare, Travel
Global Presence
A WIDE USER BASE
POWERED BY A STRONG ORGANIZATION
Dataikers: 220
BACKED BY MAJOR PARTNERS
Customers: 220+
Users: 20,000
80%+ of customers expand usage after first year
Raised so far: $146M
Customers Across Industries
POWERING INDUSTRY LEADERS
The “Tower of Babel” Effect of Data Projects
The Classic Data Project Silos
Business Analyst
DATA PREPARATION | ML MODELING | ML DEPLOYMENT
Data Preparation
Data Science Notebooks & API Platforms
AutoML Solutions
Data Scientist
Data Engineer
Bring Business Analysts, Engineers, and Scientists Together
Share a common environment to have an impact
DATA PREPARATION | ML MODELING | ML DEPLOYMENT
Business Analyst, Data Engineer, Data Scientist
Single Collaborative, Governable and Auditable Environment
• Leverage existing skills and secure sustained availability
• Maximise usage of most up-to-date technologies
• Extend based on current and future operating requirements
Get Results Today, Build for Tomorrow
Future proof your data effort
Use your current infrastructure and be ready for tomorrow’s
Bokeh
Fortune 500 Customer Rockets through Acceleration Phase
Customer Testimony
Quarterly Evolution of Dataiku Users
10 Project Leaders (Analytics Leader)
• Scale their team to deliver 10x Projects / Briefs / Models / ...
500 Business Analysts
• Leverage large and complex data sources
• Independent to deliver new projects
• Accelerate by leveraging tools packaged by Data Scientists
100 Data Scientists
• Focus on complex data processing
• Deliver code and plugins for reuse
20 Data Engineers
• Ensure availability of data infrastructures
• Operationalize, monitor and maintain data projects
Delivering 1,000s of analyses, insights, models and optimized business processes
Enable Self-Service Analytics and Operationalize ML
The Two Key Modes of Data Innovation
SSA (Self-Service Analytics)
• Quick answers to unformulated questions
• Directly by the end-users
• Pervasive
• Agile and instantaneous
• Limited integration
• High volume
o16n (Operationalization)
• Robust solutions to business challenges
• Organization-driven
• Focused
• Longer term
• Fine integration
• High value projects
How a Major Software Player Auto-Deploys 12,000 Models
Customer Testimony
The Dataiku customer provides a sales management software platform to 4,000 B2B clients
(including several Fortune 100 companies), and has deployed Dataiku in order to:
• Design complex recommendation engines combining price, content and demand logics (the final models actually combine 3 predictive models)
• Automatically generate such recommendation engines based on each of its sellers’ data and data models
• Operate models in real time and update them with no downtime, scaling up on a fully managed platform on top of Kubernetes
An AI-enabled layer on top of an existing product
Powered by Dataiku
Leverage your full stack and skills
Dataiku Solution Overview: Architecture
LINUX SERVER, ON PREMISE OR MANAGED CLOUD
CENTRALIZED OR AD-HOC DATA SOURCES, DATABASES, DATA LAKE
AVAILABLE OR SPUN-UP PROCESSING RESOURCES
Leveraging best storage and compute resources
Dataiku deployment servers for enterprise grade operationalization
PRODUCTION SYSTEMS
Centralized server to facilitate access to data, resources,
Browser based interface
• VISUAL DEVELOPMENT
• COMPLETE CODING ENVIRONMENTS
• VISUALIZATION
• COLLABORATION AND PROJECT MANAGEMENT
• AUDIT, MONITORING AND SCHEDULING
User/task specific interaction modes
4 components
Dataiku DSS Public API
Dataiku DSS components
Data Scientist, Business Analyst, Data Engineer
Machine Learning, Model Deployment, Data Management
• MADlib: In-database machine learning
• Graph: Relationship analytics
• Greenplum: Integrated and cleansed data, parallel SQL processing
• GPText: Fast index, search, text analytics
• PostGIS: Location analytics
Enable In-Database Analytics & Operationalized ML
Dataiku & Pivotal® Greenplum’s Value
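To make "in-database machine learning" concrete, here is a minimal sketch of what a MADlib training statement sent to Greenplum might look like. The slides do not show how Dataiku invokes MADlib, so this is only an illustration: the helper name and the table and column names (`orders`, `orders_linregr`, `amount`, `quantity`, `discount`) are hypothetical, and the snippet only builds the SQL text rather than connecting to a database.

```python
def madlib_linregr_train_sql(source_table, out_table, target, features):
    """Build a MADlib linear-regression training statement as a SQL string.

    Hypothetical helper for illustration; identifiers are interpolated
    directly, so it must not be used with untrusted input.
    """
    # MADlib takes the independent variables as a single array expression;
    # the leading 1 adds an intercept term.
    independent = "ARRAY[1, " + ", ".join(features) + "]"
    return (
        "SELECT madlib.linregr_train("
        f"'{source_table}', '{out_table}', '{target}', '{independent}');"
    )


# Hypothetical table and column names.
sql = madlib_linregr_train_sql(
    "orders", "orders_linregr", "amount", ["quantity", "discount"]
)
print(sql)
```

Run against a Greenplum database with MADlib installed, a statement of this shape trains the model where the data lives and writes the coefficients to the output table, so no training rows leave the database.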
High-Performance Analytics at Petabyte Scale
▪ Dataiku leverages Pivotal® Greenplum for in-database parallel processing of complex queries, visual analysis and charts.
Simplify Collaboration across Data Teams
▪ End-to-end project collaboration for data scientists and engineers
▪ Self-service access to data sources
▪ Visual Development experience for building comprehensive analytics pipelines
Mature Your Data Analytics Operations
▪ Enable self-service analytics of large datasets stored in Pivotal® Greenplum
▪ Enforce data governance between roles and teams
▪ Enable comprehensive of machine learning pipelines and models.
Solution Features
Dataiku & Pivotal® Greenplum’s Value
Dataiku + Postgres and Greenplum (example)
Order
Data Movements (if compatible)
Dataiku Datasets:
● Index definitions
● Incremental
SQL pushback: Charts using SQL
Pushback
…
Storage
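The SQL pushback idea on this slide, where chart data is computed by the database rather than in the client, might be sketched as follows. This is an illustration under stated assumptions, not Dataiku's implementation: `sqlite3` stands in for a PostgreSQL or Greenplum connection so the snippet runs anywhere, and the `orders` table, its columns and the two function names are hypothetical.

```python
import sqlite3

# sqlite3 stands in for a PostgreSQL/Greenplum connection; the table and
# column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0), ("AMER", 200.0)],
)


def chart_data_client_side(conn):
    """Pull every row out of the database, then aggregate in Python."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return sorted(totals.items())


def chart_data_pushed_down(conn):
    """Let the database aggregate; only one row per chart bar comes back."""
    query = (
        "SELECT region, SUM(amount) FROM orders "
        "GROUP BY region ORDER BY region"
    )
    return [tuple(row) for row in conn.execute(query)]


client_side = chart_data_client_side(conn)
pushed_down = chart_data_pushed_down(conn)
print(pushed_down)
```

Both functions return the same bar-chart data, but the pushed-down version transfers one row per region instead of one row per order, which is where the benefit comes from once the table holds far more rows than the chart has bars.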
©2019 dataiku, Inc. | dataiku.com | contact@dataiku.com | @dataiku

Driving Datascience at scale using Postgresql, Greenplum and Dataiku - Greenplum Summit 2019
