Clustering of IT components

Introduction to the clustering of IT components

Usage Rights

© All Rights Reserved

    Clustering of IT components: Presentation Transcript

    • Clustering: presentation for SNS Bank, Datacenter Services department, TADS-DBS, 12 July 2007
    • Agenda
      • Clustering in general
        • (Web) applications
        • (Transaction-processing) databases
      • Oracle clustering
      • Conclusion
    • What is a Cluster?
      • A formal definition
      • “Clustering is a configuration of a group of autonomous server machines that work together to behave as a single system.”
        • www.dbmsmag.com
      • In simple terms
      • “A computer cluster is a group of loosely coupled computers that work together closely so that in many respects it can be viewed as though it were a single computer.”
        • wikipedia.org
    • Why Clusters?
      • It provides high scalability
        • You can “scale out” your system cluster according to your requirements
        • It is more cost-effective
          • Redundant Array of Inexpensive Servers (RAIS)
      • It provides high availability
        • It eliminates the single point of failure
        • Redundant nodes can be used to automatically recover data
    • (Web) Application Clustering Architecture: Clustering for Scalability
      Example: BEA WebLogic Cluster Architecture
      • Tiers: Browsers, Web Servers, Servlet Engines (JSP), Object Servers (JMS/EJB), Databases (JDBC)
      • Load Balancing & Failover Points: #1, #2, #3, #4
      • Legend: load balancing of incoming connections
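
The load balancing and failover points named on this slide can be illustrated with a minimal Python sketch. It is not how BEA WebLogic implements them; the `RoundRobinBalancer` class and the server names `web1` to `web3` are hypothetical and only show the idea of spreading incoming connections over redundant nodes and skipping a failed one.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: rotates over servers and skips those marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def pick(self):
        # Try each server at most once per call; fail if none is available.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server not in self.down:
                return server
        raise RuntimeError("no server available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [balancer.pick() for _ in range(3)]    # one connection per server
balancer.mark_down("web2")                           # simulate a node failure
after_failure = [balancer.pick() for _ in range(4)]  # web2 is skipped
```

In this sketch failover is simply "pick the next healthy node"; a real cluster also has to detect the failure and, for stateful services, recover the session state.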
    • (Web) Application Clustering Architecture: Clustering for Availability
      Example: BEA WebLogic Cluster Architecture
      • Session state replication for stateful services
      • Tiers: Browser, Web Servers, Servlet Engines
      • [Diagram: session copies labelled A, B and C distributed over the servlet engines; points #1 and #2]
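
Session state replication can be sketched as follows, assuming a simple scheme in which every session is kept on a primary and one secondary servlet engine. This is an illustration only, not the WebLogic implementation; `ServletEngine`, `ReplicatedSessionStore` and the placement rule are hypothetical.

```python
class ServletEngine:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.sessions = {}  # session id -> state dict


class ReplicatedSessionStore:
    """Keeps each session on a primary engine and one secondary (backup) engine."""

    def __init__(self, engines):
        self.engines = engines
        self.placement = {}  # session id -> (primary, secondary)

    def create(self, session_id):
        # Assumed placement rule: spread sessions over the engines in turn,
        # with the next engine in the list acting as the backup.
        i = len(self.placement) % len(self.engines)
        primary = self.engines[i]
        secondary = self.engines[(i + 1) % len(self.engines)]
        self.placement[session_id] = (primary, secondary)
        primary.sessions[session_id] = {}
        secondary.sessions[session_id] = {}

    def write(self, session_id, key, value):
        # Replicate every update to all live copies of the session.
        for engine in self.placement[session_id]:
            if engine.alive:
                engine.sessions[session_id][key] = value

    def read(self, session_id, key):
        # Fail over to the secondary copy when the primary is down.
        for engine in self.placement[session_id]:
            if engine.alive:
                return engine.sessions[session_id][key]
        raise RuntimeError("session lost: both copies are down")


engines = [ServletEngine(name) for name in ("A", "B", "C")]
store = ReplicatedSessionStore(engines)
store.create("s1")                       # primary A, secondary B
store.write("s1", "cart", ["book"])
engines[0].alive = False                 # engine A fails
recovered = store.read("s1", "cart")     # served from the copy on B
```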
    • Three dominant themes in building high transaction rate multiprocessor systems
      Transaction processing is designed to maintain a database in a known, consistent state by ensuring that interdependent operations on the database are either all completed successfully or all cancelled successfully.
      • Shared Memory (SM): multiple processors share a common central memory
        • Easy to program, expensive to build, difficult to scale
        • Examples: Sequent, SGI, Sun
      • Shared Disk (SD): multiple processors with private memory share a common collection of disks
        • Use affinity routing to approximate SN-like non-contention
        • Examples: VMScluster, IBM Parallel Sysplex, Oracle RAC
      • Shared Nothing (SN): neither memory nor peripheral storage is shared among processors
        • Hard to program, cheap to build, easy to scale
        • Examples: Tandem, Teradata, IBM SP2, IBM DB2 UDB
      • [Diagram: clients connected to the three processor and memory configurations]
    • Three dominant themes in building high transaction rate multiprocessor systems
      Ranking per system feature: 1 = best, 2 = second best, 3 = third best

      System feature                                          SM  SD  SN
       1) Difficulty of concurrency control                    2   3   2
       2) Difficulty of crash recovery                         1   3   2
       3) Difficulty of database design                        2   2   3
       4) Difficulty of load balancing                         1   2   3
       5) Difficulty of high availability                      3   2   1
       6) Number of messages                                   1   2   3
       7) Required bandwidth                                   3   2   1
       8) Ability to scale to large number of machines         3   2   1
       9) Ability to have large distances between machines     3   2   1
      10) Susceptibility to critical sections                  3   2   1
      11) Number of system images                              1   3   3
      12) Susceptibility to hot spots                          3   3   3

      SM = Shared Memory, SD = Shared Disk, SN = Shared Nothing
      Source: The Case for Shared Nothing, Michael Stonebraker, Database Engineering Bulletin, 1986.
    • Clustering for scalability ≈ parallel processing
      Two execution models for parallel processing; both are natural in a DBMS.
      • 1) Pipelined parallelism: many machines each doing one step in a multi-step process.
      • 2) Partitioned parallelism: many machines doing the same thing to different pieces of data.
      • [Diagrams: "Pipeline", a chain of sequential programs; "Partition", several copies of a sequential program side by side]
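
The two execution models can be sketched in Python as below. This is a structural illustration only: the function names are hypothetical, and threads in CPython do not give real CPU parallelism, so the sketch shows how the work is divided rather than an actual speed-up.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def partitioned_sum_of_squares(data, workers=4):
    """Partitioned parallelism: every worker runs the same program on its own slice."""
    chunks = [data[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partials)


def pipelined_sum_of_squares(data):
    """Pipelined parallelism: each stage does one step and hands results to the next."""
    done = object()          # sentinel that marks the end of the stream
    squared = queue.Queue()
    result = []

    def square_stage():
        for x in data:
            squared.put(x * x)
        squared.put(done)

    def sum_stage():
        total = 0
        while True:
            item = squared.get()
            if item is done:
                break
            total += item
        result.append(total)

    stages = [threading.Thread(target=square_stage),
              threading.Thread(target=sum_stage)]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return result[0]


numbers = list(range(1, 101))
partitioned_total = partitioned_sum_of_squares(numbers)
pipelined_total = pipelined_sum_of_squares(numbers)
```

Both functions compute the same result; they differ only in whether the data (partition) or the steps (pipeline) are divided over the workers.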
    • Shared Nothing DBMS
      • Partitioned parallelism: many machines doing the same thing to different pieces of data.
      • Uses the parallel execution model
      • Transaction management requires a distributed deadlock detector and a multi-phase commit protocol
      • [Diagram components: User Application, Frontend, SQL Compiler, Catalog, Query Plan/code, Scheduler, Coordinator, four Executors, Backend]
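
The slide mentions a multi-phase commit protocol. A common instance is two-phase commit, sketched below under simplifying assumptions (no network, no timeouts, no coordinator crash). The `Participant` class and `two_phase_commit` function are hypothetical names used for illustration only.

```python
class Participant:
    """One shared-nothing node holding its own partition of the data."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.committed = {}
        self.pending = None

    def prepare(self, changes):
        # Phase 1: stage the changes and vote yes or no.
        if not self.can_commit:
            return False
        self.pending = dict(changes)
        return True

    def commit(self):
        # Phase 2, all voted yes: make the staged changes permanent.
        self.committed.update(self.pending)
        self.pending = None

    def rollback(self):
        # Phase 2, at least one no vote: discard the staged changes.
        self.pending = None


def two_phase_commit(work):
    """work is a list of (participant, changes); returns True if committed everywhere."""
    votes = [node.prepare(changes) for node, changes in work]
    if all(votes):
        for node, _ in work:
            node.commit()
        return True
    for node, _ in work:
        node.rollback()
    return False


n1, n2 = Participant("node1"), Participant("node2")
ok = two_phase_commit([(n1, {"a": 1}), (n2, {"b": 2})])

n3 = Participant("node3", can_commit=False)
failed = two_phase_commit([(n1, {"a": 99}), (n3, {"c": 3})])
```

The second transaction is cancelled on every node because one participant votes no, which is the "all completed or all cancelled" behaviour that transaction processing requires.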
    • Shared Nothing DBMS: Why Parallel Access to Data?
      • At 10 MB/s it takes 1.2 days to scan 1 terabyte.
      • 1,000 x parallel: 1.5 minutes to scan.
      • Parallelism: divide a big problem into many smaller ones to be solved in parallel.
      • [Diagram labels: 1 Terabyte, 10 MB/s, Bandwidth]
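
The scan times on this slide can be checked with a short calculation. The sketch assumes a decimal terabyte (10^12 bytes) and perfectly linear speed-up over 1,000 parallel scanners; under those assumptions the results come out at roughly 1.16 days and 1.67 minutes, which the slide rounds to 1.2 days and 1.5 minutes.

```python
TERABYTE = 10**12        # bytes; decimal terabyte (assumption)
RATE = 10 * 10**6        # 10 MB/s expressed in bytes per second


def scan_seconds(size_bytes, rate_bytes_per_s, parallel=1):
    """Time to scan the data when it is split evenly over `parallel` scanners."""
    return size_bytes / (rate_bytes_per_s * parallel)


serial_days = scan_seconds(TERABYTE, RATE) / 86_400
parallel_minutes = scan_seconds(TERABYTE, RATE, parallel=1_000) / 60

print(f"serial scan:   {serial_days:.2f} days")
print(f"1,000 x scan:  {parallel_minutes:.2f} minutes")
```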
    • Oracle HA
      • Concepts
        • Design for (not all 4)
            • Performance
            • Cost
            • Scalability
            • Availability
            • 99.999% availability is about 5 minutes of downtime a year
      • Solutions
        • Data Guard
        • Fail Safe
        • Real Application Clusters (RAC)
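
The "five nines" figure follows from simple arithmetic, sketched here assuming a 365-day year. The helper name `downtime_minutes_per_year` is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent):
    """Maximum downtime per year that still meets the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = downtime_minutes_per_year(99.999)
three_nines = downtime_minutes_per_year(99.9)

print(f"99.999%: {five_nines:.2f} minutes per year")
print(f"99.9%:   {three_nines:.1f} minutes per year")
```

At 99.999% the budget is a little over 5 minutes a year, while 99.9% already allows almost 9 hours.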
    • 1) Oracle Data Guard: physical standby and logical standby
      • Oracle Data Guard
        • Database-level replication feature
        • Allows off-site data replication
        • Managed as a single configuration
        • Primary and standby databases can be Real Application Clusters or single-instance Oracle
        • Up to nine standby databases supported in a single configuration
      • Physical standby: log data from the primary database is shipped as redo logs to the physical standby database
      • Logical standby: archive log files are transformed into SQL statements, which are applied to the logical standby database
      • [Diagram: primary server (process :Ora1, primary database storage), physical standby server (process :Ora2, physical standby database storage), logical standby server (process :Ora3, logical standby database storage)]
    • 2) Oracle Fail Safe / HA
      • Normal operation: two servers, with database processes :Ora1 and :Ora2, agents (:Ora1 Agent, :Ora2 Agent) and data storage on primary disks Ora1 and primary disks Ora2
      • Takeover
      • Operation after takeover: processes on the failed side are marked "failed"; processes :Ora1 and :Ora2 and primary disks Ora1 and Ora2 remain in the picture
      • [Diagram: the two situations side by side, connected by the takeover arrow]
    • 3) Oracle RAC
      • RAC is database clustering
        • Shared disk solution
        • One physical database serviced by multiple cluster nodes/instances
        • Cluster consists of database nodes, fast cluster interconnect, shared disk subsystem
        • Oracle provides integrated clusterware and storage management
      • [Diagram: three servers (processes :Ora1, :Ora2, :Ora3) with a shared cache; storage services and data storage network; database network and application network; annotations {FC, PPRC}, {TCP/IP}, {InfiniBand or Gigabit Ethernet}]
    • Oracle HA concepts compared

      Criterion           Data Guard      HA Oracle       Oracle RAC
      Performance         Low             Normal          High
      Availability        Normal          High            Very high
      Data loss           High            Low             Very low
      Manageability       Complex         Easy            Easy
      Cost                Low             Normal          High
      Failover            8-10 minutes    2-3 minutes     < 1 minute
      Disaster recovery   Supported       Possible        Not supported

      1) Maximum Availability Architectures use Oracle RAC and Oracle Data Guard together
    • Oracle RAC Availability
      How much availability do you need?
      • RAC failover is not instant; it can take 20+ seconds
      • RAC failover requires most apps to reconnect (downtime)
      • Storage Foundation VCS can fail over within seconds/minutes
      • “Most likely, you don’t need RAC. Alternatives will usually be cheaper, easier to manage and quite sufficient.” (Oracle User Group Select Journal)
      • [Chart: time to failover, with scale labels RAC, 5 min, 15 min, 25 min, 35 min, manual; series: Oracle with Symantec Virtual Cluster Server]
      Source: Clustering Choices for Oracle RAC, Stefan Kwiatkowski, Symantec, Upstate NY Oracle Users Group
    • Oracle RAC Scalability
      How well does RAC scale?
      • Global Cache Service consumes resources
      • Performance per node degrades with more nodes
      • RAC is an ineffective answer to scalability
      • [Chart labels: cost, performance, server capacity, available capacity]
      Source: Clustering Choices for Oracle RAC, Stefan Kwiatkowski, Symantec, Upstate NY Oracle Users Group
      The chart is in line with the content of the report "Database Scale-Out, Server Infrastructure Strategies", Philips Dawson, 20 August 2002, MetaGroup. The scalability of a proprietary cluster (for example HP TrueCluster in combination with Oracle RAC) is less than 75-85% per node.
    • Oracle RAC Scalability: Be Aware of Costs
      • Example: enterprise environment needing 4 databases

      Cost item          Single instance with HA ($40K/CPU)          RAC ($60K/CPU)
      Hardware           USD 70,000 (20 CPUs in an SF-HA cluster)    USD 96,000 (32 CPUs in four RAC cluster nodes, 8 CPUs each)
      Oracle software    USD 800,000 (20 CPUs)                       USD 1,920,000 (32 CPUs)
      Storage software   USD 50,000 (20 CPUs)                        USD 192,000 (32 CPUs)
      Total              USD 920,000                                 USD 2,208,000

      Does not even include maintenance, management, implementation, etc.
      SF-HA = Veritas Storage Foundation for Databases
      Source: Clustering Choices for Oracle RAC, Stefan Kwiatkowski, Symantec, Upstate NY Oracle Users Group
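
The totals in this cost comparison can be reproduced from the figures on the slide. The sketch assumes that the per-CPU prices ($40K and $60K) are the Oracle software cost per CPU, which is consistent with the Oracle software amounts shown; the dictionary keys are illustrative names.

```python
# Figures taken from the slide; amounts in USD.
costs = {
    "Single instance with HA": {
        "cpus": 20, "oracle_per_cpu": 40_000,
        "hardware": 70_000, "storage_software": 50_000,
    },
    "RAC": {
        "cpus": 32, "oracle_per_cpu": 60_000,
        "hardware": 96_000, "storage_software": 192_000,
    },
}


def total_cost(option):
    """Hardware plus Oracle licences (per CPU) plus storage software."""
    oracle_software = option["cpus"] * option["oracle_per_cpu"]
    return option["hardware"] + oracle_software + option["storage_software"]


totals = {name: total_cost(option) for name, option in costs.items()}
ratio = totals["RAC"] / totals["Single instance with HA"]

for name, amount in totals.items():
    print(f"{name}: USD {amount:,}")
print(f"RAC costs {ratio:.1f} times as much in this example")
```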
    • Conclusions
      • Clustering in general
        • It is difficult to achieve both high availability and disaster recovery, combined with high scalability, by means of a single type of clustering.
          • Shared Nothing architectures offer the best possibilities.
      • Oracle clustering
        • Oracle Real Application Clusters (RAC) offers advantages on a number of points compared with alternative Oracle cluster configurations.
          • Whether these advantages outweigh the consequences depends strongly on the specific environment and the requirements set.
          • Because of the limited scalability, the feasibility of a cost-saving Redundant Array of Inexpensive Servers (RAIS) appears small.