Teradata and Hortonworks 
The Unified Data Architecture (UDA) 
16th October, 2014
Shift from a Single Platform to an Ecosystem 
"Logical" Data Warehouse 
“The hype around replacing the data 
warehouse gives way to the more 
sensible strategy of augmenting it … 
The influence of the logical data 
warehouse has created a situation in 
which multiple repository strategies are 
now expected.” 
“Big Data requirements are solved by 
a range of platforms including 
analytical databases, discovery 
platforms, and NoSQL solutions 
beyond Hadoop.” 
Source: “Big Data Comes of Age”. EMA and 9sight 
Consulting. Nov 2012.
System Conceptual View
[Diagram: Unified Data Architecture, system view. Labels recovered from the figure:]
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
UNIFIED DATA ARCHITECTURE (MOVE, MANAGE, ACCESS): DATA PLATFORM, INTEGRATED DATA WAREHOUSE, INTEGRATED DISCOVERY PLATFORM
ANALYTIC TOOLS & APPS: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats, Languages
USERS: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Business Analysts, Data Scientists, Engineers
Business Conceptual View
[Diagram: Unified Data Architecture, business view. Labels recovered from the figure:]
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
UNIFIED DATA ARCHITECTURE (MOVE, MANAGE, ACCESS): DATA PLATFORM, INTEGRATED DATA WAREHOUSE, INTEGRATED DISCOVERY PLATFORM
Capabilities shown: Business Intelligence; Predictive Analytics; Operational Intelligence; Data Discovery; Path, graph, time-series analysis; Pattern Detection; Fast Data Loading & Availability; Filtering & Processing; Deep History: Online Archival; Fast-Fail Hypothesis Testing; Data Mgmt. (data lake)
ANALYTIC TOOLS & APPS: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats, Languages
USERS: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Business Analysts, Data Scientists, Engineers
Discovering Deep Retail Insights with UDA 
Transforming Web Walks into DNA Sequences 
Situation 
Largest German online retailer, a conglomerate with numerous 
brands and 50 websites. 1 million visitors, viewing 2M 
products. 
Problem 
Needed a better way of analyzing consumer behavior on the 
websites and communicating with category managers 
Solution 
Treat each web visit sequence like a DNA sequence. Built a fast 
query tool so analysts can easily express queries for their 
categories and get deeper insights 
Impact 
• Leverages the Aster platform to generate rapid path insights 
• Drives 15% increase in market baskets through personalization 
• Drives 10-20% increase in conversions by shortening paths 
• Can now see what does and doesn’t lead to sales 
• Widening use across all the Corporate Group websites
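
The idea of treating a web visit like a DNA sequence can be illustrated with a small sketch. This is plain Python, not Aster SQL-MR, and everything in it is an assumption made for illustration: the event tuples, the one-letter page alphabet, and the helper names `visit_sequences` and `converting_paths` are hypothetical.

```python
from collections import Counter

# Hypothetical clickstream: (visitor_id, timestamp, page_type). Values are made up.
EVENTS = [
    ("v1", 1, "home"), ("v1", 2, "search"), ("v1", 3, "product"), ("v1", 4, "checkout"),
    ("v2", 1, "home"), ("v2", 2, "product"), ("v2", 3, "exit"),
    ("v3", 1, "search"), ("v3", 2, "product"), ("v3", 3, "checkout"),
    ("v4", 1, "home"), ("v4", 2, "search"), ("v4", 3, "product"), ("v4", 4, "checkout"),
]

# One-letter alphabet (an assumption) so each visit reads like a sequence string.
ALPHABET = {"home": "H", "search": "S", "product": "P", "checkout": "C", "exit": "X"}


def visit_sequences(events):
    """Group events by visitor, order them by time, and encode each visit as a string."""
    by_visitor = {}
    for visitor, _ts, page in sorted(events, key=lambda e: (e[0], e[1])):
        by_visitor.setdefault(visitor, []).append(ALPHABET[page])
    return {visitor: "".join(symbols) for visitor, symbols in by_visitor.items()}


def converting_paths(sequences, goal="C"):
    """Count the distinct paths that end in the goal symbol (here: checkout)."""
    return Counter(seq for seq in sequences.values() if seq.endswith(goal))


sequences = visit_sequences(EVENTS)
paths = converting_paths(sequences)
print(paths.most_common())
```

Once visits are strings, questions such as "which paths lead to a sale" become pattern matches over those strings, which is the kind of query the slide describes analysts expressing per category.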
Modern Data Architecture: Teradata 
[Diagram: Hadoop-based data architecture with Teradata components. Labels recovered from the figure, grouped by apparent region:]
TVI: Proactive system monitoring tied to Teradata customer support
Viewpoint: Alerts, Services, System Health, Node Health, Space Usage, Capacity Heatmap, Metrics Analysis
Other components: KNOX, AMBARI
SOURCE DATA: Sensor Log Data, Customer/Inventory Data, Clickstream Data, Flat Files, Sentiment Analysis Data
Source interfaces: DB, File, JMS, REST, HTTP, Streaming
LOAD: SQOOP, FLUME, NFS, Web HDFS
Hadoop core: HDFS, YARN, MAPREDUCE
REFINE: HIVE, PIG, ETL, CUSTOM
STRUCTURING: HCATALOG
EXTRACT: BULK COPY, DISTCP, AFS
INTERACTIVE: QueryGrid
EXPORT: SQOOP / HIVE
LOAD / EXTRACT: TDCH (Bidirectional)
Analytical Platforms: Aster Discovery Platform, Teradata IDW
Query/Visualization/Reporting/Analytical Tools and Apps: JDBC/ODBC Compliant Tool
Teradata Portfolio for Hadoop 
“Bringing Hadoop to the Enterprise” 
• Most Trusted and Flexible Hadoop Platforms for Your 
Next-Generation Unified Data Architecture™ 
1. Teradata Aster Big Analytics Appliance 
2. Teradata Appliance for Hadoop 
3. Teradata Commodity Offering with Dell 
4. Hortonworks Data Platform software-only support resell 
• Complete consulting and training capability 
> Big Analytics Services—across the UDA 
> Data Integration Optimization—ETL, ELT across the UDA 
> Hadoop deployment and mentoring 
> Teradata delivering Hortonworks training 
> Hadoop Managed Services—operations and administration 
• Customer Support for Hadoop 
> World-class Teradata customer support, backed by Hortonworks
Teradata Loom® 2.3 
“Integrated metadata management, data lineage 
and data wrangling for Enterprise Hadoop” 
Loom is a platform for profiling, preparing and tracking data lineage for data 
in Hadoop 
• Hadoop Data Governance and Metadata Management 
– Rich information model for capturing and managing the relationships 
– Data dictionary for the big data landscape 
– Support for non-Hadoop sources 
Free version of Loom pre-installed with 
Hortonworks Sandbox 
• Automation (Activescan) 
– Discovering and introspecting new data in the cluster 
– Triggering external processing (e.g. Oozie script for ETL) 
– Automatically collecting metadata about the job - lineage, statistics 
– Polling YARN job history for lineage 
• User Interactivity (Workbench) 
– Advanced user interfaces for data exploration, profiling and preparation 
– Data wrangling for interactively cleaning/reshaping raw data into useable data
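
To make the lineage idea concrete, here is a minimal sketch of how job metadata could be turned into a lineage graph. It is not Loom's information model or API: the `JobRecord` and `LineageGraph` classes, their fields, and the HDFS-style paths are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class JobRecord:
    """Hypothetical record of one finished job: what it read and what it wrote."""
    job_id: str
    inputs: list
    outputs: list


@dataclass
class LineageGraph:
    # Maps each output dataset to the jobs that produced it.
    producers: dict = field(default_factory=dict)

    def record(self, job):
        """Register a job so its outputs can later be traced back to its inputs."""
        for out in job.outputs:
            self.producers.setdefault(out, []).append(job)

    def upstream(self, dataset):
        """Return every dataset that the given dataset was derived from."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for job in self.producers.get(current, []):
                for src in job.inputs:
                    if src not in seen:
                        seen.add(src)
                        stack.append(src)
        return seen


graph = LineageGraph()
graph.record(JobRecord("job_1", ["/raw/clicks"], ["/refined/sessions"]))
graph.record(JobRecord("job_2", ["/refined/sessions", "/raw/customers"], ["/marts/paths"]))
print(sorted(graph.upstream("/marts/paths")))
```

In a real deployment the job records would come from whatever the scanner collects, for example job history; here they are entered by hand.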
Teradata Appliance for Hadoop 
Teradata QueryGrid ® 
Teradata Studio with 
Smart Loader 
Value Added Software from Partners 
Teradata Viewpoint 
Teradata Connector for Hadoop (TDCH) 
Intelligent Start and Stop 
NameNode Failover 
Optimized hardware for Hadoop 
BYNET™ V5 40Gb/s InfiniBand interconnect 
Teradata Vital Infrastructure 
Teradata Distribution for Hadoop 
(Based on Hortonworks HDP) 
Kerberos 
HCatalog 
Teradata Loom® (for data management)
Teradata QueryGrid™ Vision 
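[Diagram: QueryGrid connecting the originating systems to remote systems. Labels recovered from the figure:]
Users: Business users, Data Scientists
Originating systems: IDW (TERADATA DATABASE), Discovery (TERADATA ASTER DATABASE)
Remote systems:
• TERADATA ASTER DATABASE: SQL, SQL-MR, SQL-GR
• TERADATA DATABASE: Multiple Teradata Systems
• HADOOP: Push-down to Hadoop System
• COMPUTE CLUSTER: Run SAS, Perl, Ruby, Python, R
• RDBMS DATABASES: Push-down to Other Database
• MONGODB DATABASE: Push-down to NoSQL Databases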
Teradata QueryGrid™: Teradata - Hadoop 
Give business users on-the-fly access to data in Hadoop 
• Trusted: Use existing tools/skills and enable 
self-service BI with granular security 
• Standard: 100% ANSI SQL access to 
Hadoop data 
• Fast: Queries run on Teradata or Aster, 
data accessed from Hadoop 
• Efficient: Intelligent data access 
leveraging the Hadoop HCatalog 
[Diagram: QueryGrid: Teradata-Hadoop and QueryGrid: Aster-Hadoop connecting to Hadoop. Hadoop components shown: MR, Hive, Pig, HCatalog, with HDFS as the Hadoop layer. Other labels: Data, Data Filtering]
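
The "Efficient" and "Data Filtering" points can be illustrated with a small sketch of catalog-driven pruning: consult table metadata first, then read only the partitions and columns the query needs. This is a conceptual illustration, not QueryGrid or HCatalog code; the `CATALOG` structure, the `remote_scan` and `query` helpers, and the sample rows are all hypothetical.

```python
# Hypothetical catalog: table -> partition value -> rows stored on the remote side.
CATALOG = {
    "weblogs": {
        "2014-10-14": [{"user": "a", "url": "/home", "bytes": 120}],
        "2014-10-15": [{"user": "b", "url": "/cart", "bytes": 300},
                       {"user": "c", "url": "/home", "bytes": 90}],
        "2014-10-16": [{"user": "a", "url": "/cart", "bytes": 210}],
    }
}


def remote_scan(table, partitions, columns, stats):
    """Read only the requested partitions and columns; count rows that are transferred."""
    for part in partitions:
        for row in CATALOG[table][part]:
            stats["rows_transferred"] += 1
            yield {col: row[col] for col in columns}


def query(table, columns, day_from, stats):
    """Prune partitions using catalog metadata before any data is read."""
    wanted = [p for p in sorted(CATALOG[table]) if p >= day_from]
    return list(remote_scan(table, wanted, columns, stats))


stats = {"rows_transferred": 0}
rows = query("weblogs", ["user", "url"], "2014-10-15", stats)
print(rows, stats)
```

Only three of the four stored rows are transferred, and the `bytes` column never leaves the remote side, which is the effect that filtering at the source is meant to achieve.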
Teradata Viewpoint 
Single Operational View (SOV) 
for Teradata, Aster, & Hadoop 
• Hadoop Portlets: 
– Node Monitor (Aster & Hadoop) 
– Hadoop Services 
• Integration into existing: 
– Monitoring: System Health, Metrics 
Analysis, Metrics Graph, Capacity 
Heatmap, Space Usage. 
– Admin: Alert Viewer, Alert Setup, 
Teradata Systems, Role Manager
Teradata Connector for Hadoop (TDCH) 
• Key Features 
– High-speed connector between Teradata and 
Hadoop based on Apache Sqoop framework 
– Imports and exports data between Teradata and 
Hadoop 
– Leverages the JDBC-FastLoad/FastExport mechanism 
from Teradata 
– Import/export Hive rcfile/sequencefile/textfile format 
and Hive partitioned files 
INTEGRATED 
DATA WAREHOUSE 
CAPTURE | STORE | REFINE 
• Available through Hortonworks 
> Hortonworks 
• Teradata Connector for Apache Hadoop (Release v1.2.0) 
• Download link: http://hortonworks.com/download/
Teradata Studio: Smart Loader for Hadoop 
Self-Service Load 
• Hadoop View 
– Browse through tables 
within the Hadoop cluster 
- View table properties 
– Bi-directional table copies 
- Drag and drop interface 
- Maps data types between Hadoop 
and Teradata tables 
– Transfer Status and History 
- Track load status 
• Benefits 
– Simplifies Hadoop browsing 
– Ad hoc data movement between 
Teradata and Hadoop 
– No scripting required 
– Point and click
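
Mapping data types between Hadoop and Teradata tables, as mentioned under Hadoop View, can be sketched as a lookup table. The mapping below is an illustrative assumption, not the table Teradata Studio actually uses, and the `teradata_ddl` helper and the column names are hypothetical.

```python
# Illustrative Hive -> Teradata type mapping; an assumption for this sketch only.
HIVE_TO_TERADATA = {
    "tinyint": "BYTEINT",
    "smallint": "SMALLINT",
    "int": "INTEGER",
    "bigint": "BIGINT",
    "float": "FLOAT",
    "double": "FLOAT",
    "boolean": "BYTEINT",
    "string": "VARCHAR(4000)",
    "timestamp": "TIMESTAMP",
}


def teradata_ddl(table, hive_columns):
    """Build a CREATE TABLE statement from (name, hive_type) pairs."""
    parts = []
    for name, hive_type in hive_columns:
        key = hive_type.lower()
        if key not in HIVE_TO_TERADATA:
            # Fail loudly rather than guess a type for an unmapped column.
            raise ValueError(f"no mapping for Hive type {hive_type!r}")
        parts.append(f"{name} {HIVE_TO_TERADATA[key]}")
    return f"CREATE TABLE {table} ({', '.join(parts)});"


ddl = teradata_ddl(
    "web_sessions",
    [("session_id", "BIGINT"), ("url", "STRING"), ("duration", "DOUBLE")],
)
print(ddl)
```

Raising an error on an unmapped type is a design choice for the sketch: a silent default would hide data-type mismatches until load time.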
Questions and Next Steps 
More about Teradata & Hortonworks 
http://www.hortonworks.com/partner/teradata/ 
Teradata Loom for HDP 
http://www.teradata.com/tryloom 
Find Us 
@Strata 
Booth # 324 
Teradata Hadoop Station

Teradata - Presentation at Hortonworks Booth - Strata 2014
