2. Agenda
• Company Background and EDW Configuration
• Why Virtualize?
• Virtual Environment Details
• Benefits
• Challenges
• Results
• Next Steps
• Questions
3. About O.co
• Initially focused on liquidating excess inventory; now
focuses on offering brand-name merchandise at discount prices
• Product offering has grown from fewer than 100 products in 1999 to
over 700,000 currently
• Customer Service – Currently ranked #4 by National Retail
Federation and American Express
• O.co, also known as Overstock.com, is Your Savings Engine offering
brand-name products. The company offers its customers an
opportunity to shop for bargains conveniently, while offering its
suppliers an alternative inventory distribution channel. O.co,
headquartered in Salt Lake City, is a publicly traded company listed on
the NASDAQ Global Market System and can be found online at
www.overstock.com and www.o.co.
6. Abstract & Motives
• Test system currently functions as both development and test
environments
• Creates inconsistent object names between environments
• Evaluating the limits and boundaries of new technology
offerings
• Enable Test system to more closely resemble production
• Functional testing and development
7. Other Considerations for Virtualization
• Performance
• Decreased load and demands on EDW systems
• Availability
• Additional flexibility for scheduling system maintenance
• Isolation & Sandbox
• Budget
• Commodity hardware
9. VMware ESXi
• Installs directly on physical servers and partitions into
multiple virtual machines that can be run simultaneously
• Currently utilizing ESXi 4.0 cluster
• Can clone and deploy additional images from master
templates
• Requires administrator interaction to clone and perform
basic configuration steps
• Provides workload management and automatic migration
between hosts on cluster
• “Plug & Play” install of new TD Express images
• SAN Storage
• Tier 2 SAN Storage (10K RPM SAS)
10. TD Express
• Free developer version of the Teradata Database
• 1 TB Teradata Express 13.10 for VMware Player
• Provided configuration includes
• SLES 10 64-bit Linux
• Teradata Database
• Teradata Tools and Utilities (TTU)
• Load & Unload Tools
• Teradata’s EasyLoader tool
• SQL Assistant Java Edition
• Provided AS IS (Unsupported)
12. Benefits
• Secure
• Data never outside corporate datacenters
• Testing
• Uncertified software ahead of upgrades
• New features
• BAR/Disaster Recovery
• Able to connect externally using existing TTU
• Developer Isolation & Sandboxes
13. Additional Benefits
• Educational playground
• Additional exposure to internal processes and programs
• New features & functionality
• Growth & capacity planning
• Additional hardware (Nodes)
• Additional storage
14. Environment Isolation
• Allows for greater flexibility in testing and development
• Reduces contention on frequently updated and accessed
objects
• Experimentation
• New processes, programs & functionality
• UDFs
• Safe
• Snapshot functionality
• Regression testing
16. Database Administration Challenges
• Synchronizing and Migrating DDL
• TSET
• Data Movement
• BAR (Arcmain)
• Data Mover
• ETL (FastLoad, TPT, MultiLoad, etc.)
• Performance
• Limited to 2 AMPs
• Hardware configuration
• Shared hardware
• Data Distribution/Skew
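The skew concern on a two-AMP image can be illustrated with a small sketch. It assumes a stand-in hash (Python's zlib.crc32), which is not Teradata's row-hash algorithm, and it uses one commonly quoted skew measure, (1 - average/max) * 100. The function names are hypothetical and chosen only for illustration.

```python
import zlib
from collections import Counter


def rows_per_amp(primary_index_values, amp_count=2):
    """Count rows per AMP using a stand-in hash (NOT Teradata's row hash)."""
    counts = Counter({amp: 0 for amp in range(amp_count)})
    for value in primary_index_values:
        amp = zlib.crc32(str(value).encode("utf-8")) % amp_count
        counts[amp] += 1
    return counts


def skew_percent(counts):
    """Skew as (1 - average/max) * 100; 0 means rows are spread evenly."""
    per_amp = list(counts.values())
    biggest = max(per_amp)
    if biggest == 0:
        return 0.0
    average = sum(per_amp) / len(per_amp)
    return (1 - average / biggest) * 100
```

With only two AMPs, a primary index column holding a single repeated value sends every row to one AMP, which this measure reports as 50 percent, the worst case for a two-AMP system.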
17. Security & Access
• Maintain an environment representative of the enterprise
Teradata systems
• Simplify roles and rights to limit maintenance overhead on
multiple virtual machines
• Consolidate users on virtual machines
• Corporate security considerations (SSO, etc.)
18. 1 TB Perm Space Limitation
• Try to minimize base image size to limit disk space overhead
• Allow developers to load larger datasets as needed
• Varying projects require vastly different data sets
19. Perm Space Limitation – Solution 1
Limited Data Set
• Seed smaller base objects (Lookup/Dimensions) in full
• Larger tables can be seeded partially or left empty (DDL only)
• Maintaining referential integrity can be challenging
• Still may not be able to provide enough data for functional
testing in very large environments
• May enable limited integration testing in DEV environment
• Larger storage footprint
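One way to keep referential integrity while seeding large tables partially is to sample parent rows by key and then keep only the child rows that reference them. The sketch below works on in-memory rows and uses a stand-in hash bucket for sampling; the function names and the parent/child layout are assumptions for illustration, not the presenters' method.

```python
import zlib


def keep_key(key, percent):
    """Deterministically keep roughly `percent` percent of keys."""
    return zlib.crc32(str(key).encode("utf-8")) % 100 < percent


def seed_subset(parents, children, parent_key, child_fk, percent):
    """Sample parent rows by key, then keep only children that reference them."""
    kept_parents = [row for row in parents if keep_key(row[parent_key], percent)]
    kept_keys = {row[parent_key] for row in kept_parents}
    kept_children = [row for row in children if row[child_fk] in kept_keys]
    return kept_parents, kept_children
```

Because the sampling decision depends only on the key value, the same keys are selected on every refresh, which keeps partially seeded tables consistent with each other across image rebuilds.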
20. Perm Space Limitation – Solution 2
Break Virtual Images into Subject Areas
• Can be difficult to identify and maintain objects
• Duplicate Objects
• Same objects required across multiple subject areas (3NF)
• Not feasible for all subject areas
• Referential integrity across subject areas
• Merging DDL changes back to trunk
• Integration testing may not be possible
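The duplicate-object problem can be made visible before images are split. The following sketch assumes a hypothetical mapping from subject area to the set of tables it needs and reports every table required by more than one area; the subject-area and table names in the usage note are invented for illustration.

```python
from collections import defaultdict


def shared_objects(subject_areas):
    """Return tables needed by more than one subject area, with their areas."""
    owners = defaultdict(set)
    for area, tables in subject_areas.items():
        for table in tables:
            owners[table].add(area)
    return {
        table: sorted(areas)
        for table, areas in owners.items()
        if len(areas) > 1
    }
```

For example, if both a "sales" and a "marketing" area need a "customer" table, the result lists "customer" with both areas, flagging an object that would have to be duplicated and kept in sync across images.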
21. Perm Space Limitation – Solutions 3 & 4
Ad hoc – Self Service
• Allows for greater flexibility in data size
• Requires self service options to limit administrative overhead
• Additional work involved for developers to prepare environment
• Unit testing only - Integration testing not possible
Materialize Queries – BI/Reporting Only
• Limited to report development only
• Better performance for multiple query executions during
reporting development (Cube, Grid & Report refreshes)
• SQL would not be consistent between environments
• Requires self service options to be implemented
22. Other Challenges
• Development & Deployment Lifecycle
• Image Management
• Version Control
• Library (Check in/out)
• Refresh Interval
• Migration and Project Planning
23. Support Considerations
• DBA
• Architecture
• Self Service
• External Groups
• VMware systems
• Storage
• Network
25. Achievements
• Testing new features and functionality ahead of 13.10 upgrade
• Conduct proof of concept testing with minimal impact to
enterprise systems
• Ability to perform preliminary process validation
• Currently rolling out to ETL Developers
• Endpoint Testing
• Integration into ETL infrastructure
• TTU Tools Validation
26. Successful Projects
• Oracle Data Integrator (ODI/Sunopsis)
• Hadoop UDFs – Full Development Lifecycle
• Proof of Concept
• Development
• Testing & Validation
• GoldenGate Testing & Validation
27. Lessons Learned and Takeaways
• Security & Access Controls
• Decide early on how to simplify roles and consolidate users
• Mixed response from end users and developers
• Perform limited Beta testing with mixed selection of users
• Data seeding challenges
• Perm Space Limitations
• Refresh Method
• Throughput
• Refresh Intervals
• System Requirements
• IP Addresses
• Hosts file entries
• Keep it simple!
28. Hardware Resource Allocation
• 2 Core Minimum (x64 CPU)
• 2 GB Memory Minimum
• 4 GB suggested for acceptable performance
• Static IP Addresses
• Encountered issues with VMware dynamically assigning IP
addresses as images migrated between hosts
• Physical Network Connections
• Ensure adequate bandwidth for all Virtual Machines on the
physical host
• Storage
• Tier 2 SAN (10K Serial Attached SCSI)
• Workload Management
• Disk I/O
• Concurrent Users
• Dedicated Virtual Environment
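Static addressing is easier to maintain when hosts file entries are generated rather than typed by hand. The sketch below assigns consecutive addresses to a list of image names and renders one hosts line per image. The image names, the starting address, and the "cop1" alias suffix are assumptions for illustration; confirm the alias convention your TTU clients expect before using anything like this.

```python
import ipaddress


def hosts_entries(image_names, first_ip, alias_suffix="cop1"):
    """Render hosts-file lines, assigning consecutive static IPs to images."""
    start = ipaddress.ip_address(first_ip)
    lines = []
    for offset, name in enumerate(image_names):
        ip = start + offset  # next address in sequence (assumed to be free)
        lines.append(f"{ip} {name} {name}{alias_suffix}")
    return lines
```

Calling hosts_entries(["tdx01", "tdx02"], "10.0.0.10") yields the lines "10.0.0.10 tdx01 tdx01cop1" and "10.0.0.11 tdx02 tdx02cop1", which can be reviewed and appended to client hosts files.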
29. End User/Developer Considerations
• Deployment procedures and processes
• Well defined image life cycle & refresh interval
• Training
• Best Practices Guide
• Constraints
• Perm space
• Limited Data Set
• Storage
• Performance
• 2 AMPs
• Shared Hardware
• Skew
• Unit Testing Only
31. Development Lifecycle
[Figure: development lifecycle cycle diagram; stage labels as shown on the slide]
• Data Discovery (Source)
• Refresh Development Images (Virtual Machine)
• Checkout and Configure Virtual Machine
• Development & Unit Testing (Virtual Machine)
• Integration Testing & QA (Enterprise Systems)
• Production Deployment (Enterprise Systems)
33. Next Steps
• Developer and End User Access Levels
• System Level
• Database Level
• Automation
• Master Image Updates
• Viewpoint
• Data Mover -> Self Service
• Full Deployment
• Decommission existing DEV environment
34. Next Steps – VMware Environment
• SAN Storage
• NetApp Storage Solution
• Enables the ability to quickly clone Virtual Machines
without consuming additional space
• Capacity Planning
• Thin Provisioning
• VMware Lab Manager
• Reduced administration
• Self service portal
• Better support for high turnover machines
• Linked machines (de-duplication)
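The storage argument for linked machines can be roughed out with simple arithmetic. This is a simplified model that assumes each full clone consumes the whole image size, while each linked clone shares one base image and stores only its own changes. The function names and any sizes passed in are hypothetical, not measurements from this environment.

```python
def full_clone_storage_gb(image_gb, clone_count):
    """Every full clone carries its own complete copy of the image."""
    return image_gb * clone_count


def linked_clone_storage_gb(image_gb, clone_count, avg_delta_gb):
    """Linked clones share one base image and store only per-clone changes."""
    return image_gb + clone_count * avg_delta_gb
```

Under these assumptions, ten clones of a 100 GB image need 1,000 GB as full clones but 150 GB as linked clones with an average 5 GB of changes each, which is why high-turnover developer machines benefit most.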
36. Virtual Self Service Environment
[Figure: self-service workflow between Developers, an Image Library (Images 1 to 6), and the VMware Cluster]
• Administrator creates and maintains the image library (Snapshot)
• Users select an available image from the library
• Image is copied from the library and placed in an available slot on
the VMware Cluster
• Image is started in the virtual environment and is ready for use
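The self-service flow on this slide can be sketched as a small in-memory model: a library of images, a fixed number of cluster slots, and a checkout that copies an image into the first free slot. The class and method names are hypothetical, and the sketch does not call any VMware API; it only models the bookkeeping.

```python
class SelfServiceCluster:
    """Toy model of an image library plus a fixed number of cluster slots."""

    def __init__(self, library, slot_count):
        self.library = set(library)
        self.slots = [None] * slot_count  # None marks a free slot

    def checkout(self, image_name, developer):
        """Copy an image from the library into the first free slot."""
        if image_name not in self.library:
            raise KeyError(f"unknown image: {image_name}")
        for index, occupant in enumerate(self.slots):
            if occupant is None:
                # The library copy is untouched, so others can still select it.
                self.slots[index] = (image_name, developer)
                return index
        raise RuntimeError("no free slot on the cluster")

    def release(self, index):
        """Free a slot when the developer is finished with the image."""
        self.slots[index] = None
```

Because a checkout copies rather than moves the image, the same library image can run in several slots at once, matching the slide where one image appears in more than one cluster slot.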
37. Summary
• Provides isolated sandbox for testing and development
• Safe and secure environment
• Provided by Teradata
• Supports Teradata Tools and Utilities (TTU)
• Works with existing Teradata infrastructure
• Scales easily with additional users & projects
• 1 TB perm space constraint can be challenging
• Cannot be used for performance testing or tuning
• Decreases loads on enterprise systems
• Budget considerations
• Runs on commodity hardware & disk