Slide 1 - McGill University
  • To move forward, we should have these meetings more often. Going to focus on significant changes in 10g, mostly those that have an impact on infrastructure and budget
  • Put the problems we face into context
  • Too many differences stress staffing resources
  • Going to cover these in more detail in later slides, but for now trying to put into context which new feature solves which problem
  • Previous database upgrades were simple, basically DBAs just ran scripts to upgrade the database and developers tested to make sure the upgrade didn’t break anything. 10g will involve changes at all levels of the product stack, and will involve a strong relationship and commitment between teams in NCS and ISR to be successful.
  • All developers;
  • Ask Colin about the single sign-on statement; would we consider Kerberos?
  • Mention that this follows Oracle’s best practices/recommendations

Slide 1 - McGill University Presentation Transcript

  • 1. Evolving the Enterprise’s Database Infrastructure “Move to the Grid”
  • 2. Agenda
    • Problems
    • Introduction to Oracle 10g features
    • Demonstrate impact on the Enterprise
    • Propose Phase I Project
      • Consolidation
      • Scalable Grid Architecture
  • 3. Top 7 Problems for DBAs
    • Growth in the number and size of databases does not match staffing levels
    • Root causes of performance bottlenecks are not obvious or easily diagnosed
    • After a session ends, statistics and troubleshooting information are not always available
    • Databases are shoehorned onto servers without consideration of correct layout, leading to I/O bottlenecks
  • 4. Top 7 Problems for DBAs
    • Impossible to manually monitor and tune all databases
    • Managing storage correctly is very time-consuming
    • Database tuning is part experience, part science, part art and part intuition.
  • 5. Top Problems for Sysadmins
    • Many different servers, different architectures
    • High number of databases per single node – complex to schedule maintenance windows
    • Grey area between DBA and sysadmin responsibilities
  • 6. New in 10g
    • The vision for the grid
    • 10g not a regular database upgrade
    • RAC enhancements
    • Backup strategy
    • ASM (Automatic Storage Management)
    • ADDM & Advisors
    • DataGuard
  • 7. Problems Solved in 10g for DBAs
    • Some tedious and time-consuming DBA tasks are now managed by Oracle
    • Oracle will identify root causes of performance issues and rank the effectiveness of fixing them
    • Oracle stores statistics about every session in its repository
    • ASM will rebalance hot spots making it easier to have many databases on a server
  • 8. Problems Solved in 10g for DBAs
    • 10g metrics and out-of-the-box alerts will allow the DBAs to be more proactive
    • ASM will allow Oracle to manage storage, reducing this very time-consuming task
    • Oracle 10g provides advisors for tuning
  • 9. The vision for the Grid
    • The “g” in 10g
    • Grid is not RAC, RAC is not Grid
    • Treat all computing resources like a utility in all layers of the product stack
    • Clustered application servers (ias cluster)
    • Clustered database (RAC)
    • Automatic Storage management (ASM) for provisioning Storage
  • 10. The vision for the Grid
    • Scalability – Easily add more resources
    • Management, monitoring and provisioning with “Grid Control”
    • Virtualization of resources – Applications are not tied to specific hardware but rather see one large pool of resources
  • 11. 10g NOT a regular database upgrade
    • Big learning curve
    • Changes at all levels of the hardware stack
    • Good opportunity to define job responsibilities in relation to the hardware stack
  • 12. The grid hardware stack
    • Application servers (ISR/NCS depending on application)
    • Databases (DBA Team / ISR)
    • Load balancers/Interconnects/Network Infrastructure (NCS)
    • Servers (NCS Sysadmins)
    • Storage Architect (NCS)
    • Cluster (Sysadmins/Storage Architects)
    • Firewall appliances (NCS)
    • Backups (DBA / NBU Admins)
  • 13. RAC Enhancements
    • FAN – Fast Application Notification
    • Smarter load balancing across nodes
      • Can now mix different classes of servers in your cluster; this gives the ability to leverage existing hardware
      • Before grid, some servers were almost always idle and some were never idle; grid makes the best use of resources
    • Assign % of CPU usage to a Service
    • Better management of workload
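The idea of spreading work across servers of different classes can be illustrated with a toy weighted dispatcher. This is only a sketch of the general technique, not Oracle's RAC load-balancing algorithm; the node names and capacity weights are hypothetical.

```python
def pick_node(capacity, active):
    """Return the node with the lowest load relative to its capacity.

    Ties are broken by node name so the result is deterministic.
    """
    return min(capacity, key=lambda n: (active[n] / capacity[n], n))


def dispatch(capacity, sessions):
    """Place `sessions` new sessions one at a time on the least-loaded node."""
    active = {node: 0 for node in capacity}
    for _ in range(sessions):
        active[pick_node(capacity, active)] += 1
    return active


# Hypothetical cluster: one large server with four times the capacity
# of an older, smaller server that is being reused.
capacity = {"big_node": 4, "small_node": 1}
placement = dispatch(capacity, 10)
print(placement)
```

With these weights the larger server ends up with four times as many sessions as the smaller one, so the older hardware still contributes without being overloaded.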
  • 14. Backup Philosophy in 10g
    • Backups go to disk not tape
    • Flashback logs
      • Supports flashback database and recovery through resetlogs
    • Flash recovery area
      • On disk
      • Holds one full backup
      • Holds all Incrementals
      • Archive & flashback logs
      • Backed up and managed by RMAN
      • Flash recovery area backed up to Tape
      • Best practice: Use ASM for this area
      • Shared by all instances on server
  • 15. Backup philosophy in 10g
    • Benefits
      • Most failures now are due to NBU, at a rate of 5 or 6 per day. Each requires operations to resubmit the backup and DBA time to follow up.
      • Time of backup now at 4-6 hours (for MCGP)
      • Lots of time spent waiting on tape
      • Recovery from tape is slow; new features help minimize downtime
      • All files to recover are in same location
      • Having this on ASM minimizes work to maintain archivelog free space (avoid database hang)
  • 16. Automatic Storage management
    • Oracle’s “Smart” Filesystem
    • DBAs only have to deal with a few diskgroups rather than trying to fit datafiles on fixed-size mountpoints.
    • Raw partitions have always been recommended for performance but before ASM were very difficult to manage
    • ASM can stripe and mirror your storage (Optional)
    • ASM can rebalance to avoid hot spots
    • Managing storage is very time-consuming to do right; ASM does the tedious tasks for you.
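A toy model can make the rebalancing idea concrete: when a disk is added to a diskgroup, extents are moved until every disk holds roughly the same share. The slides do not describe ASM's actual algorithm, so the greedy approach, the disk names, and the extent counts below are all assumptions for illustration.

```python
def rebalance(disks):
    """Move extents from the fullest disk to the emptiest disk until the
    extent counts differ by at most one. Returns the number of moves."""
    moves = 0
    while True:
        fullest = max(disks, key=lambda d: len(disks[d]))
        emptiest = min(disks, key=lambda d: len(disks[d]))
        if len(disks[fullest]) - len(disks[emptiest]) <= 1:
            return moves
        disks[emptiest].append(disks[fullest].pop())
        moves += 1


# Hypothetical diskgroup: two full disks plus one newly added empty disk.
disks = {
    "disk1": list(range(0, 6)),
    "disk2": list(range(6, 12)),
    "disk3": [],
}
moved = rebalance(disks)
print(moved, {name: len(extents) for name, extents in disks.items()})
```

Only the extents needed to even out the new disk are moved; the rest stay where they are, which is the property that makes this kind of rebalancing cheaper than laying the data out again by hand.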
  • 17. ADDM & Advisors
    • Oracle has internalized metric collection in 10g
    • ADDM runs and looks for problems
    • ADDM will recommend the use of advisors to further investigate the problem
    • Will help the DBA (and developer) by providing tuning advice.
  • 18. DataGuard
    • What is redo
    • RAC = Instance availability
    • DataGuard = Database availability
    • Logical and Physical standby
    • Protect database vs. Provide service
    • All enterprise systems should have Dataguard
    • Imagine losing an hour of committed transactions in Banner or Vista?
    • Time to rebuild an enterprise system?
    • Uses for DWH
  • 19. Phase I Project scope
    • Bring in required infrastructure
    • Consolidate
      • Tempest/Squall replaced with scalable grid technology
      • Migrate DORACs/ORACs into this architecture
  • 20. Phase I Project scope
    • Current grid control implementation not highly available
      • Migrate Grid Control repository database to RAC.
      • Cluster application server, Norad2
      • Leverage virtualization
  • 21. Required Infrastructure (Grid Control)
    • Have been using grid control for the past two years since it was beta
    • Not optional in 10g*
    • Has helped us to develop standards and be proactive
    • Upgrade to release 2 in progress
    • Release 2 improves on provisioning and RAC management
    • Will be used by developers as well as DBAs when we go to 10g
    • In release 2, Oracle has partnered with third parties to deploy agents on non-Oracle software and appliances, including SQL Server, WebLogic, and F5 load balancers
  • 22. Losing Grid Control
    • No monitoring and alerts for databases
    • No GUI to manage 10g databases
    • Loss of tools for programmers and DBAs
    • Scheduled DBA jobs would not run
  • 23. Required Infrastructure (OID)
    • Oracle Internet Directory
    • ONAMES is deprecated in 10g. ONAMES is a central naming service used to translate a name to a connect string and is needed for connectivity.
    • Bridge from Oracle products to Active Directory for single sign-on and authentication
    • Could have many other uses to manage and simplify security in Oracle products (Needs more research)
    • Should be highly available or risk users not being able to connect to databases
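The role of a central naming service, translating a short name into a connect string, can be sketched as an ordered lookup across naming sources. This is an illustration only and does not use any Oracle client API; the aliases, host names, and the `resolve` helper are hypothetical.

```python
def resolve(alias, sources):
    """Try each naming source in order and return (source_name, connect_string)
    for the first source that knows the alias."""
    for source_name, entries in sources:
        connect_string = entries.get(alias.lower())
        if connect_string is not None:
            return source_name, connect_string
    raise LookupError(f"no naming source could resolve {alias!r}")


# Hypothetical entries: a central directory and a local fallback file.
directory = {
    "hrprod": "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=db1.example.edu)"
              "(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=hrprod)))",
}
local_file = {
    "hrtest": "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=db2.example.edu)"
              "(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=hrtest)))",
}
sources = [("directory", directory), ("local_file", local_file)]

print(resolve("HRPROD", sources)[0])
```

If the central source is unavailable and no fallback holds the alias, the lookup fails and the client cannot connect, which is why the slide asks for the directory to be highly available.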
  • 24. Required Infrastructure (OID)
    • Establish a two node OID, objectives:
    • Replace ONAMES and shared TNSNAMES files as a standard naming method
    • Clean up of all names as well as investigate the use of global_names
    • Replace infra1.portal.mcgill.ca for managing authentication. (Migrate asdb instance on infra1 to RAC - solely for Portal metadata)
  • 25. Infrastructure (worth investigating)
    • WebCache
    • Part of Oracle application server install
    • Used by Portal (but not currently installed in HA config)
      • Should be made highly available
    • Should have a better understanding of how it works
    • Can it benefit more than just the portal? (Improve Registration?)
    • Investigate “TimesTen” data cache
  • 26. Consolidation (Tempest/Squall)
    • Tempest and Squall are servers funded by NCS as per a Tony Masi initiative to consolidate disparate databases from across campus.
    • Tempest is a test server containing 12 databases.
    • Squall is a production server containing 20 databases
    • Databases serve mostly E-business group’s clients, ICS (HEAT) and ARR (Scheduling)
    • On-going demand for new databases
    • Difficult to estimate capacity and resource needs
    • Not scalable and not highly available
    • Best candidate for new architecture
  • 27. Consolidation (Tempest/Squall)
    • Set up a 10g test grid to replace Tempest
    • Set up a 10g production grid to replace Squall
    • Migrate to the 10g grid any applications on Tempest/Squall for which 10g is supported, as well as all McGill-developed applications currently residing on Tempest/Squall.
    • Migrate NCS databases
    • Production Grid will provide a location for any 10g database that needs to be highly available (Grid Control repository, Portal repository)
    • Project should include consultant from Oracle to review plan, discuss best practices and guide in initial setup of test environment.
    • Good learning experience before restructuring large Enterprise systems (Vista, Banner)
  • 28. Risks of non-action
    • Not a Tony Masi “Top 5” project, but if we do not get Phase I accomplished and gain the needed knowledge, we will not meet next year’s objectives (i.e. Vista upgrade, Banner upgrade)
    • Staff resources continue to be stressed
    • Advantages of new best practices for RMAN and backups of flash recovery area
    • Development of a methodology for migrating to the cost-based optimizer
    • Learning best practices for ASM on Hitachi SAN
    • Benefiting from new features in OEM (monitoring, tuning and provisioning)
    • New failover and load balancing features on RAC (FAN – Fast Application Notification)
    • Setup and configuration of 10g RAC
  • 29. Key Skills to Develop
    • Best practice to migrate 9i RAC to 10g RAC
    • Correct use of WebCache
    • Understand implications of global_names=true
    • Get developers up to speed on writing good code and performance tuning as well as trained on using new 10g tools
    • Oracle Internet Directory
  • 30. Summary
    • Big learning curve
    • Need to move forward or future projects will be in jeopardy of failure
    • All levels of hardware stack are implicated