Raghu Chakravarthi
Uncovering the next generation of Data Architecture
Insights at the speed of thought
Customer 360
Risk Analytics
Hyper Personalization
Modern use cases need a cross section of data
Behavioral Data
Social Data
Transactional Data
Customer Journey
Hyper Personalization
What do these data demand from data architecture?
Behavioral Data
Social Data
Transactional Data
Quick Access, Analytics
On Prem
Fast inserts, updates
Hybrid
Heavy inserts, Realtime
Cloud
Realtime updates, Fast ingestion, Quick Access, Analytics where data lives
Hybrid Cloud
Key trends in data and analytics
Managed
Performance at scale
Agility
Best TCO
Edge
Hybrid
Key Trends
Legacy enterprise databases hinder business growth
Data siloed across on-prem & cloud
Enterprises require existing on-prem investments to be leveraged
Hard to move data into the cloud
Cloud lock-in
Performance slows with concurrency
Designed to perform at TB scale, not PB, requiring additional compute to handle concurrency
While pricing starts low, production workloads at scale are expensive
Pricing scales up too quickly
© 2019 Actian Corporation
Vs.
We need something better …
Data Integration
Performance to PB scale
Pay for what you use
Easy to manage
Introducing Operational Data Warehouse
Usability
Performance at scale
Agility
Best TCO
Edge Hybrid
Operational Data Warehouse
These have become Table Stakes
Compute Storage Separation
Columnar Store
Data Variety
Fast Data Ingestion
On Prem & Cloud
Elasticity
Hot, Warm, Cold Data Separation
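One way to picture hot, warm, and cold data separation is a rule that assigns each piece of data to a tier by how recently it was accessed. The sketch below is only an illustration: the 7-day and 90-day windows and the `storage_tier` helper are assumptions made for this example, not values or functions from the presentation.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=90)


def storage_tier(last_accessed: datetime, now: datetime) -> str:
    """Classify data as hot, warm, or cold by time since last access."""
    age = now - last_accessed
    if age <= HOT_WINDOW:
        return "hot"
    if age <= WARM_WINDOW:
        return "warm"
    return "cold"


if __name__ == "__main__":
    now = datetime(2019, 6, 1)
    for days in (1, 30, 365):
        print(days, storage_tier(now - timedelta(days=days), now))
```

A real system would likely weigh access frequency and storage cost as well; recency alone keeps the sketch short.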
Analytics to Data
Parallel Updates
Federated Query
Persona-based access
Data Labs
Data Agility from access to analytics
User experience improves Productivity and Collaboration
Usability
SQL, Py, R, Scala, Java
Data Cleansing
Tokenizing
Feature Prep
Atomic Data
Aggregated Data
User Data
Pipelines
Access Rules
Annotations
Sandbox
Versioning
Data lives at the Edge
Zero ETL
SQL & API Access
Zero DBA
Embeddable
Secure
Time Series
Performance at scale… it's in the engines
• Vector Processing
• Leverage Chip Cache
• Realtime Trickle Updates
• Automatic Indexing
• Smart Compression
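The vector processing bullet can be illustrated by contrasting row-at-a-time evaluation with batch-at-a-time evaluation over a column. The sketch below is a simplified model in plain Python, not the engine's actual implementation: the function names, the `amount` column, and the batch size of 1024 are assumptions for this example. A real vectorized engine would size batches to fit the CPU cache and use SIMD instructions.

```python
from typing import Dict, List


def sum_row_at_a_time(rows: List[Dict[str, int]], threshold: int) -> int:
    """Row store style: touch one full row per iteration."""
    total = 0
    for row in rows:
        if row["amount"] > threshold:
            total += row["amount"]
    return total


def sum_vectorized(columns: Dict[str, List[int]], threshold: int,
                   batch_size: int = 1024) -> int:
    """Column store style: process one column in fixed-size batches (vectors)."""
    amounts = columns["amount"]
    total = 0
    for start in range(0, len(amounts), batch_size):
        batch = amounts[start:start + batch_size]  # small enough to stay cache-resident
        total += sum(v for v in batch if v > threshold)
    return total


if __name__ == "__main__":
    values = [i % 100 for i in range(5000)]
    rows = [{"id": i, "amount": v} for i, v in enumerate(values)]
    columns = {"id": list(range(5000)), "amount": values}
    print(sum_row_at_a_time(rows, 50) == sum_vectorized(columns, 50))
```

Both functions compute the same filtered sum; the difference is that the columnar version reads only the one column it needs and works on a contiguous batch at a time.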
Total Cost of Ownership... It's about doing more with less
Linear for Data Size growth
Cost to performance $/workload
Linear for Concurrency Growth
Integrates with current ecosystem
Best TCO
[Chart: Cost Comparison. x-axis: Count of Workloads (50, 100, 200, 500, 1000); y-axis: $ per concurrent workload]
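The "$ per concurrent workload" axis can be made concrete with a small calculation. If total cost grows linearly with the number of workloads, cost per workload stays flat; if it grows faster than linearly, cost per workload rises. The figures below (a unit cost of 10 and an exponent of 1.3) are invented for illustration only and are not taken from the chart; only the workload counts come from the slide.

```python
WORKLOAD_COUNTS = [50, 100, 200, 500, 1000]  # x-axis values from the slide


def linear_total_cost(workloads: int, unit_cost: float = 10.0) -> float:
    """Hypothetical pricing where total cost is proportional to workload count."""
    return unit_cost * workloads


def superlinear_total_cost(workloads: int, unit_cost: float = 10.0,
                           exponent: float = 1.3) -> float:
    """Hypothetical pricing where extra concurrency needs disproportionate compute."""
    return unit_cost * workloads ** exponent


def cost_per_workload(total_cost: float, workloads: int) -> float:
    return total_cost / workloads


if __name__ == "__main__":
    for n in WORKLOAD_COUNTS:
        flat = cost_per_workload(linear_total_cost(n), n)
        rising = cost_per_workload(superlinear_total_cost(n), n)
        print(f"{n:>5} workloads: linear {flat:.2f}, superlinear {rising:.2f}")
```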
Let's not forget the Personas
Data Scientist
• Atomic Data
• R, Python, Scala, SQL
• Data Prep Functions
• Feature Engineering Functions
• Jupyter Hub, R Studio, SAS
Data Engineer
• Atomic and Aggregated Data
• Python, Java, Scala
• Data Manipulation Functions
• Custom Functions
• RESTful, Anaconda, deplyR
Business Analyst
• Aggregated, UI Data
• SQL, Drag and Drop GUI
• Data Visualization Functions
• In Database Functions
• Alteryx, Dataiku, Looker, Qlik
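The persona slide pairs each role with the level of data it works on: atomic data for the Data Scientist, atomic and aggregated data for the Data Engineer, and aggregated and UI data for the Business Analyst. A minimal sketch of persona-based access built on that pairing might look like the following. The dictionary keys and the `can_access` helper are hypothetical names for this example, not part of any product API.

```python
from typing import Dict, Set

# Persona-to-data-level pairing as listed on the slide; identifiers are illustrative.
PERSONA_DATA_LEVELS: Dict[str, Set[str]] = {
    "data_scientist": {"atomic"},
    "data_engineer": {"atomic", "aggregated"},
    "business_analyst": {"aggregated", "ui"},
}


def can_access(persona: str, data_level: str) -> bool:
    """Return True if the persona is mapped to the requested data level."""
    return data_level in PERSONA_DATA_LEVELS.get(persona, set())


if __name__ == "__main__":
    print(can_access("data_engineer", "aggregated"))
    print(can_access("business_analyst", "atomic"))
```

Unknown personas fall through to an empty set, so access is denied by default.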
Operational Data Warehouse Architecture
Operational Data Warehouse
Persistent Storage
Cross-Engine Orchestration
SQL Access
Studio, BI and Visualization tools
Hadoop
S3
ADLS
Local Storage
Cloud Storage
Languages
SQL, Javascript, Python, R, Java, Scala
Tools
Jupyter, RStudio, KNIME, SAS, Dataiku
Column Engine
NoSQL Engine
Row Engine
Spark Engine
Custom Engine
Deep Learning Engine
Gluster
GFS
Data Fabric
Ingestion
Spark, Kafka, ETL
Data Scientist
Data Engineer
Business Analyst
Actian Avalanche Cloud Data Warehouse
Business Intelligence
Advanced Analytics
Artificial Intelligence & Machine Learning
External data source ingestion
On Premise Apps/EDWs
Native Cloud Apps & Data
Mobile & IoT
Batch (e.g. JDBC/ODBC)
Streaming (e.g. Kafka & Spark)
Avalanche FlexPath™ Augmented Data Management
Performance Security Hybrid Extensibility Scale
• Vectorization
• Adv. Columnar
• In-chip processing
• Automated Storage Indexing
• Multi-core parallelization
• Real-time updates
• Petabyte Scale
• High Concurrency
• Extreme Transaction Processing
• Analyze all data – no sampling
• Authentication
• Discretionary Access control
• Security auditing
• Auditing & GDPR compliance
• End-to-end encryption
• Data masking
• Cloud & On-premise
• 100% functionally compatible
• Load and go exchange between query workloads
• Multi-cloud support
• Open architecture
• Premise & SaaS Connectors
• External Table
• Native Spark
• Access to any data source
On-Demand Encrypted Self-tuning Autonomic Standards based
Actian Avalanche QuickStart
Actian Avalanche Migration QuickStart
Actian Avalanche Connector QuickStart
Authentication Service
Provisioning Service
Management Service
Inform Service
Metering Service
Monitoring Service
Avalanche QuickStart Services
Introduction to Actian
Performance at Scale
Hybrid by Design
Spectacular Operational Savings
Fully Managed with Pre-built Connectors
Cloud Data Warehouse
www.actian.com/avalanche
Free 30-day trial with $500 credit
Visit us at Booth 1238
@rchakravarthi1 linkedin.com/in/raghuc

Data Architecture for Modern Applications
