Teradata - Architecture of Teradata


Parsing Engine (PE),
Request and Response Parcel,
Access Module Processors (AMPs),
Data Access Handling,
TD Config Utilities,
Config and Reconfig

Published in: Technology

  1. 1. Introduction To Teradata
  2. 2. Teradata Company Highlights
     • Founded 1979 – West LA
     • First product to market – 1984
     • First Terabyte system – 1987
     • Acquired by AT&T and merged with acquired NCR – 1992
     • Tri-vested as part of NCR – 1997
     • Teradata Corporation – (re)Launched October 1, 2007
       – Global Leader in Enterprise Data Warehousing
         • EDW/ADW Database Technology
         • Analytic Solutions
       – Positioned in Gartner’s Leaders Quadrant in data warehousing since 1999
     • Top 10 U.S. publicly-traded software company
       – S&P 500 Member
       – Listed NYSE: “TDC”
       – 2007 - $1.7B revenue
  3. 3. Continuous (R)evolution Hardware + Database + Consulting + Data models and reports + Analytic applications
  4. 4. Continuous (R)evolution
     • Sell the HW, give everything else away
     • Sell the SW with some HW to run on
     • Sell solving business problems – and technology to solve them
     • Sell applications with consulting, SW and HW inside
  5. 5. Continuous (R)evolution
     • 80286: 90% R&D, 10% integration
     • i486: 70% R&D, 30% integration
     • Pentium: 20% R&D, 80% integration
     • Xeon Quad Core: 10% R&D, 90% integration
  6. 6. Scale
     • Every dimension of the technology must scale to meet today’s requirements
       – Data, Data model complexity, Users, Performance, queries, Data loading, …
     • What is a big Data Warehouse?
     • Total spinning disk? – 2.5 Petabytes
     • Big table? – 150 billion rows
     • Number of tables? – 300,000
     • Insert/Update per day? – 5 billion records
     • Identified users? – 100,000
     • Queries per day? – 5 million
     • Data Turnover rate? – 1TB per 5 seconds
  7. 7. The Problem
     • Operational Systems: Accts. Payable, Accts. Receivable, Invoicing, Sales/Orders, Finance G/L, Customer Support, HR Payroll, Purchasing, Order Fulfillment, Manufacturing, Inventory, …
     • Decision Makers: Marketing, Supply Chain, Finance, Risk Management, Maintenance, Sales Operations, Inventory, Call Center, …
  8. 8. The EDW Solution
     • Operational Systems: Accts. Payable, Accts. Receivable, Invoicing, Sales/Orders, Finance G/L, Customer Support, HR Payroll, Purchasing, Order Fulfillment, Manufacturing, Inventory, …
     • Enterprise Data Warehouse (EDW)
     • Decision Makers: Marketing, Supply Chain, Finance, Risk Management, Maintenance, Sales Operations, Inventory, Call Center, …
  9. 9. Active Enterprise Intelligence™
     An Obvious Trend: More Speed, More Users
     [Diagram labels: Strategic Intelligence; Operational Intelligence; Enterprise Data Warehouse; BI Tools & reports; Analysis & visualization; Predictive Analytics; EDW; Enterprise Integration; Mixed workload management; SOA, BPMS, IDEs; Portals/composite applications; time scale from Days to Seconds]
  10. 10. Active Enterprise Intelligence™ enabled by an Active Data Warehouse™
      [Diagram labels: Strategic Intelligence; Operational Intelligence; Business Intelligence Tools and Applications; Teradata Warehouse; Workflow & Applications; Active Events; Active Access; Suppliers; Customers; Call Center; Logistics; Marketing; Finance; Product/Services; Executive; Active Enterprise Integration; Active Availability; Active Workload Management; Active Load]
  11. 11. Active Enterprise Intelligence™ in Retail: Detecting Retail Fraud
      Situation: Thieves make copies of cash register receipts, walk into the store, pick up merchandise, and return items for cash.
      Problem: Associates in returns department did not have historical POS receipt retrieval access to verify against previously “returned” receipts or to do returns without receipts.
      Solution: Associates query Teradata to quickly check if a return has already occurred on that receipt number. Also used by analysts to understand and prevent excessive returns.
      Impact (for 500-store chain):
      • 100% ROI in 5 months
      • Stopped a crime ring on the first day of rollout
      • “Cost savings have been huge”
  12. 12. Active Enterprise Intelligence™ in Retail: Single View of the Customer Across All Channels
      Situation: Needed to add Web channel for selling shoes.
      Problem: Too much time and cost to keep multiple customer systems synchronized. Realized they needed just one customer database, not one more for the Web, in addition to Call Center, and POS/Store databases.
      Solution: Adopted an ADW strategy, moved all customer data to one Teradata system, revised data models to cover all channels, added web channel for commerce, used web services, added TASM to handle multiple workload types.
      Impact:
      • 1M tactical hits to the EDW per day from the POS, Call Center, and Web with 0.11 sec response time
      • Runs simultaneously with back-office BI, reports, and ETL workloads
      • Eliminated all other customer data systems
  13. 13. What is the Measure of a Great Architecture? Handle huge changes of underlying technologies and dependent components while continuing to deliver the key value proposition.
  14. 14. Processor Roadmap: CPU power radically increasing
      [Chart: SPECInt2000 performance from 2000 to 2008+, comparing single-core performance with dual/multi-core performance (labeled 5X); milestones marked 2003, 2004, 2005, 2007, 2009, 2011; process nodes 90nm, 65nm, 45nm, 32nm, 22nm; features Hyper-Threading, Dual Core, Multi Core]
  15. 15. What Does Shared Nothing Mean?
      • 1985 – Every hardware part, every line of software – “pure” shared nothing
      • 1995 – Multiple units of parallelism sharing CPU, memory
      • 2004 – Multiple units of parallelism sharing multiple cores, memory
      • 2009 – Multiple units of parallelism sharing same physical spindles – but still not sharing data
      • Future – Multiple units of parallelism in Virtual machines/cloud not even knowing what physical machine it is on or sharing
      09/2009. Copyright Teradata © 2007-2009 – All rights Reserved
  16. 16. Teradata MPP Server Architecture
      • Nodes
        – Incrementally scalable to 1024 nodes
      • Operating System
        – Linux, Windows, Unix
      • Storage
        – Independent I/O
        – Scales per node
      • BYNET Interconnect
        – Fully scalable bandwidth
      • Connectivity
        – Fully scalable
        – Channel – ESCON/FICON
        – LAN, WAN
      • Server Management
        – One console to view the entire system
      [Diagram: four SMP nodes (SMP Node1 to SMP Node4), each with CPU1, CPU2, Memory, and Operating Sys, connected by Dual BYNET Interconnects, with Server Management alongside]
  17. 17. Shared Nothing - Dividing the Work
      • “Virtual processors” (vprocs) do the work
      • Two types
        – AMP: owns and operates on the data
        – PE: handles SQL and external interaction
      • Configure multiple vprocs per hardware node
        – Take full advantage of SMP CPU and memory
      • Each vproc has many threads of execution
        – Many operations executing concurrently
        – Each thread can do work for any user, transaction
      • Software is equivalent regardless of configuration
        – No user changes as system grows from small SMP to huge MPP
  18. 18. Shared Nothing - Dividing the Work
      • Basis of Teradata scalability
        – Each AMP owns an equal slice of the disk
        – Only that AMP reads that slice
      • No single point of control for any operation
        – I/O, Buffers, Locking, Logging, Dictionary
        – Nothing centralized
        – Exponential communication costs avoided
      [Chart: coordination cost versus number of nodes, with a curve labeled Teradata. Diagram labels: AMPs, Logs, Locks, Buffers, I/O]
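The idea that each AMP alone reads its own slice can be illustrated with a small scatter-gather sketch. This is not Teradata code: the `Amp` class, the `sum_across_amps` helper, and the use of threads are all assumptions made for illustration, and the single combining step at the end is a simplification rather than a description of how Teradata merges results.

```python
from concurrent.futures import ThreadPoolExecutor


class Amp:
    """Hypothetical AMP: it holds its own rows and nothing else reads them."""

    def __init__(self, rows: list[dict]):
        self._rows = rows  # private slice, never shared with other AMPs

    def partial_sum(self, column: str) -> int:
        # Each AMP scans only the rows it owns.
        return sum(row[column] for row in self._rows)


def sum_across_amps(amps: list[Amp], column: str) -> int:
    """Fan the same request out to every AMP, then add up the partial results."""
    with ThreadPoolExecutor(max_workers=len(amps)) as pool:
        partials = list(pool.map(lambda amp: amp.partial_sum(column), amps))
    return sum(partials)


if __name__ == "__main__":
    amps = [
        Amp([{"amount": 10}, {"amount": 5}]),
        Amp([{"amount": 7}]),
        Amp([{"amount": 3}, {"amount": 1}]),
    ]
    print(sum_across_amps(amps, "amount"))
```

Because no AMP touches another AMP's rows, the per-AMP work needs no locks or shared buffers in this sketch; only the small partial results cross between units.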
  19. 19. Teradata Data Distribution
      • Rows automatically distributed evenly by hash partitioning
        – Even distribution results in scalable performance
        – Done in real-time as data are loaded, appended, or changed
        – Hash map defined and maintained by the system
          • 2**32 hash codes, 64K buckets distributed to AMPs
        – Primary Index (PI) column(s) are hashed
        – Hash is always the same for the same values
        – No reorgs, repartitioning, space management
      [Diagram: rows of Table A, Table B, and Table C pass through the Teradata Parallel Hash Function on the Primary Index and are spread across AMP1, AMP2, AMP3, AMP4, … AMPn; each stored row carries its RowHash (Hash Bucket) alongside the Data Fields]
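The hash-distribution scheme on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Teradata's implementation: `zlib.crc32` stands in for the real hash function, taking the high-order 16 bits of the 32-bit hash code as the bucket number is an assumption, and the round-robin bucket-to-AMP map is hypothetical.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 64 * 1024  # 64K hash buckets, as on the slide


def row_hash(pi_value: str) -> int:
    """Return a 32-bit hash code for a Primary Index value.

    zlib.crc32 is only a stand-in; it is not Teradata's hash function.
    """
    return zlib.crc32(pi_value.encode("utf-8")) & 0xFFFFFFFF


def bucket_of(hash_code: int) -> int:
    """Assumption: the bucket is the high-order 16 bits of the hash code."""
    return hash_code >> 16


def build_hash_map(num_amps: int) -> list[int]:
    """Assign the 64K buckets to AMPs round-robin (a hypothetical hash map)."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for(pi_value: str, hash_map: list[int]) -> int:
    """Look up which AMP owns the row with this Primary Index value."""
    return hash_map[bucket_of(row_hash(pi_value))]


if __name__ == "__main__":
    hash_map = build_hash_map(num_amps=4)
    rows_per_amp = Counter(
        amp_for(f"customer-{i}", hash_map) for i in range(100_000)
    )
    for amp in sorted(rows_per_amp):
        print(f"AMP{amp + 1}: {rows_per_amp[amp]} rows")
```

The same value always hashes to the same bucket, so a lookup by Primary Index goes straight to one AMP. Adding AMPs only means reassigning buckets in the map; the hash of each row does not change.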
  20. 20. Disk Capacity Exploding with Little Increase in Performance
      Disk drive capacity | Disk drive bandwidth (MB/sec) | Performance per capacity (MB/sec/GB)
      36 GB | 5.5 | .155
      73 GB | 6.0 | .080
      146 GB | 6.4 | .044
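The performance-per-capacity figures on this slide are bandwidth divided by capacity. A short check, using only the capacity and bandwidth numbers from the slide (the function name is made up for illustration), gives ratios close to the slide's .155, .080, and .044:

```python
# (capacity in GB, bandwidth in MB/sec) as read from the slide
drives = [(36, 5.5), (73, 6.0), (146, 6.4)]


def bandwidth_per_gb(capacity_gb: float, bandwidth_mb_s: float) -> float:
    """MB/sec of bandwidth available per GB stored on the drive."""
    return bandwidth_mb_s / capacity_gb


if __name__ == "__main__":
    for capacity_gb, bandwidth in drives:
        ratio = bandwidth_per_gb(capacity_gb, bandwidth)
        print(f"{capacity_gb} GB: {ratio:.3f} MB/sec/GB")
```

Capacity roughly quadruples from 36 GB to 146 GB while bandwidth rises from 5.5 to 6.4 MB/sec, so the bandwidth available per stored gigabyte falls to under a third of its earlier value.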
  21. 21. Platform Change
      • Focus used to be
        – Optimization of expensive CPU cycles
        – Micro-management of precious disk space
      • Now
        – Manage I/O
        – Balance CPU power to the I/O capacity
        – Find new ways to optimize I/O, trading for CPU use as necessary
        – Pulling 2.5GB/sec per node continuous
      • Discontinuity coming
        – SSDs become price competitive and reliable
  22. 22. File System
      • Teradata wrote a new rule book
        – Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata
      • File system built of raw slices
      • Rows stored in blocks
        – Variable length
        – Grow and shrink on demand
        – Rows located dynamically
          • May be moved to reclaim space, defrag
        – Maximum block size is configurable
          • System default or per table
          • 8K to 128K
          • Change dynamically
      • Indexes are just rows in tables
      • Has evolved from direct management of single spindles to completely virtualized storage, not even knowing spindle location
  23. 23. Workload Management Evolution
      • 1984 – pure timeshare
      • 1987 – 4 priorities, defined by user
      • 1995 – multiple priorities in multiple partitions
      • 2000 – weighted workload groups
      • 2004 – queuing, reserved resources, focus on tactical work
      • 2009 – Visualization and detailed workgroup management
      • Future – Set service level goals, our job to deliver
  24. 24. Active Workload Management
      • Manage workloads
        – Reduce server congestion
      • Dynamically adjust in-flight task priority
        – Turn the dial – change priorities
      • Fast active access queries
        – Performance, performance, performance
      • Get maximum throughput
      [Diagram: Active Data Warehouse with speed dials for Active Events, Active Access, Query and Reporting, and Active Load; dial readings shown: Speed 10, Speed 60, Speed 75, Speed 25]
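One simple way to picture "turning the dial" is proportional sharing: each workload gets CPU in proportion to a weight, and changing a weight changes the split for work already running. The sketch below is only an illustration of that idea; it is not TASM's algorithm, and the workload names and weights are hypothetical.

```python
def cpu_shares(weights: dict[str, float]) -> dict[str, float]:
    """Split CPU in proportion to each workload's weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one workload needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical weights, not taken from the slide or any Teradata setting.
    weights = {"tactical": 40, "reporting": 20, "load": 20, "events": 20}
    print(cpu_shares(weights))

    # "Turn the dial": raise the tactical weight while work is in flight.
    weights["tactical"] = 80
    print(cpu_shares(weights))
```

Raising one weight lowers every other workload's share without stopping or restarting any of them, which is the effect the slide describes as adjusting in-flight task priority.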
  25. 25. TASM Reporting/Monitoring - 13.10
  26. 26. Availability Requirements
      [Chart: number of users on a scale from 10 to 1,000,000, spanning Strategic Intelligence to Operational Intelligence, with availability levels labeled Mission Critical and Dual Active. User groups shown: IT, Finance, Planners, Power Users, Data Miners; Executives, Middle Managers, Marketing; Category Mgr, Line Managers, Service Managers; Operational Employees; Consumers, Suppliers, B2B]
  27. 27. “Always ON” – An Elusive Challenge
      • Unplanned downtime
        – Hardware faults
        – Software faults
        – Hangs
      • Planned downtime
        – Software upgrade
        – Hardware upgrade
        – Data center maintenance
      • “Disasters”
        – Multi-component failures
        – Building disasters
        – Area disasters
      • And optimize resource value to the business
      • And avoid hidden costs and surprises
        – E.g., major performance variations
      • Major opportunity for research – but must be holistic
        – Reaches far beyond core database
  28. 28. Real time Operational Actions
      [Diagram labels: Strategic Intelligence; Operational Intelligence; “Active” Enterprise Data Warehouse; WebSphere MQ, Oracle AQ, Microsoft MSMQ]
      1. Customer makes multi-segment travel reservation.
      2. Flight rerouted, causing missed connections.
      3. What are the customers’ flying histories?
      4. How profitable is each customer?
      5. Which customers experienced delays or other problems in last 6 months?
      6. Customer re-booked and notified.
      7. Airport operations adjusted.
  29. 29. Real Time Customer Management
      [Diagram labels: Strategic Intelligence; Operational Intelligence; “Active” Enterprise Data Warehouse; TIBCO]
      1. Customer inserts Total Rewards Card at Slot Machine.
      2. What is the customer’s past spending history in all our casinos?
      3. What is a significant loss for this person based on market segment, past and predicted behavior?
      4. Is this customer approaching the predicted loss rate for their segment?
      5. What offers are available for this customer?
      6. Message sent to floor Luck Ambassador with customer offer to prevent additional losses.
  30. 30. That’s a Wrap!
      • Business requires a new level of decision making
        – Many more decisions by many more people much faster
        – Current representation of the state of the enterprise
      • Data Warehouse must evolve to support the requirements of Active Enterprise Intelligence
      • Technology must evolve to deal with the new requirements
        – Rich area for research and innovation
        – Change view of what data warehouse/BI means
      • Teradata driving an aggressive roadmap to meet real business requirements
  31. 31. Thank You!