Parallelism in SQL Server

Technical Leader at @SolidQ and Microsoft Data Platform MVP
Mar. 20, 2013

Editor's Notes

  1. There are a lot of topics in this area, and I tried to concentrate on some of the most important parts in an hour-long session.
  2. A quick example: if I have 200 different coins and I want to know how much money I have, I can add them up one by one, I can give 100 coins to my partner to get a partial result, or I can split my coins between 10 partners to get 10 partial results and then combine them to obtain how much money I have… after some “special fee”, you know. The real time spent getting the result will not be the same in each case, and obviously the more partners I use to get partial results, the more quickly I will get the result… but this is not always true.
  3. The typical benefit: the more CPUs, the more performance… but is that true? It is typical on index REBUILDs, aggregations, table scans, … CHART, GRAPHIC
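
     A minimal T-SQL sketch of the kind of operations where this shows up; dbo.SomeBigTable and the MAXDOP value are purely illustrative:

     -- Rebuild all indexes of a (hypothetical) table, letting up to 8 CPUs work on it
     ALTER INDEX ALL ON dbo.SomeBigTable
     REBUILD WITH (MAXDOP = 8);

     -- A large aggregation or table scan is another typical candidate for a parallel plan
     SELECT COUNT_BIG(*) FROM dbo.SomeBigTable;
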
  4. The server is composed of multiple NUMA nodes, typically 2-4 in standard configurations. Each NUMA node has its own CPUs and memory. The server sees the sum of the CPUs and memory, and all of them are accessible from SQL Server.
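
     One way to see how SQL Server maps this hardware is through the SQLOS DMVs; a sketch (exact columns may vary by version):

     -- NUMA nodes as seen by SQLOS (node 64 is normally the hidden DAC node)
     SELECT node_id, node_state_desc, memory_node_id,
            online_scheduler_count, active_worker_count
     FROM sys.dm_os_nodes;

     -- Memory nodes and the memory committed on each of them
     SELECT memory_node_id, virtual_address_space_committed_kb
     FROM sys.dm_os_memory_nodes;
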
  5. The images show the detection of three-node NUMA hardware by SQL Server and the three lazy writer threads (one per NUMA node). SQL Server is able to get the best performance on NUMA hardware by doing some special automatic configuration, such as having dedicated threads for some internal components on each NUMA node. Note: Mention that, as a common rule, you must configure the MAXDOP value lower than the number of physical cores in each NUMA node. With this configuration, if a query is executed in parallel, all of its threads will be in the same node.
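
     A sketch of how both points could be verified on a live instance; the MAXDOP value of 4 assumes 4 physical cores per NUMA node and is only an example:

     -- One lazy writer request should show up per NUMA node
     SELECT session_id, scheduler_id, command
     FROM sys.dm_exec_requests
     WHERE command = 'LAZY WRITER';

     -- Cap MAXDOP at the number of physical cores per NUMA node (4 here as an example)
     EXEC sp_configure 'show advanced options', 1;
     RECONFIGURE;
     EXEC sp_configure 'max degree of parallelism', 4;
     RECONFIGURE;
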
  6. SQLOS is a thin user-mode layer that sits between SQL Server and Windows. It is used for low-level operations such as scheduling, I/O completion, memory management, and resource management. When an execution request is made within a session, SQL Server divides the work into one or more tasks and then associates a worker thread with each task for its duration. Key points: it runs in user mode; it reduces context switching; it gives better resource usage; multiprocessing is enhanced; a task uses the same scheduler most of the time; multiple tasks can be executed at the same time; data locality is enforced, giving better scalability on NUMA hardware; and SQLOS works the same on each OS host (w2k3, w2k8r2, w2k12, etc.).
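
     The scheduler-per-CPU model is visible through the SQLOS DMVs; a minimal sketch:

     -- One visible scheduler per CPU, grouped under its parent NUMA node
     SELECT parent_node_id, scheduler_id, cpu_id, status,
            current_tasks_count, runnable_tasks_count, current_workers_count
     FROM sys.dm_os_schedulers
     WHERE status = 'VISIBLE ONLINE';

     -- Tasks currently bound to workers and schedulers
     SELECT task_address, task_state, scheduler_id, session_id
     FROM sys.dm_os_tasks
     WHERE session_id IS NOT NULL;
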
  7. Why would I do that?
  8. This configuration is mainly for: systems with more than one SQL Server instance, and systems with more than 32 heavily used CPUs on which you have detected specific I/O congestion problems. When you don't use I/O affinity, the SQL Server worker posts the I/O and takes care of the I/O completion on the scheduler the worker was assigned to. The SQL Server GUI in SQL Server 2012 doesn't let you make this mistake. QUESTION: Why is the "Bad" configuration wrong? REASON: By setting both on the same scheduler, they will compete for resources, which is exactly what you want to avoid.
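
     A hedged sketch of the idea behind the "good" configuration: keep the CPU affinity mask and the I/O affinity mask on different schedulers. The masks below are illustrative values for an 8-CPU box, not a recommendation:

     -- Example only: CPUs 1-7 for workers, CPU 0 reserved for I/O completion
     EXEC sp_configure 'show advanced options', 1;
     RECONFIGURE;
     EXEC sp_configure 'affinity mask', 254;    -- binary 11111110 -> CPUs 1..7
     EXEC sp_configure 'affinity I/O mask', 1;  -- binary 00000001 -> CPU 0
     RECONFIGURE;
     -- The "bad" configuration sets both masks over the same CPUs,
     -- so workers and I/O completion compete for the same scheduler.
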
  9. ENHANCE DATA LOCALITY. On large systems, by doing this kind of affinity you can obtain a performance gain of 20%. QUESTION: Why? ANSWER: Because statistically, when a scheduler "touches" a data page, the page is stored in the memory of NUMA node X. If a scheduler coming from another NUMA node needs to read that specific page, it takes 3-4 times longer to get that page from outside its own NUMA node. So by doing this, we can force specific applications to work with specific NUMA nodes and thereby increase the chance of reading and writing data pages on the same NUMA node.
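
     On SQL Server 2008 R2 and later the same idea can be expressed with ALTER SERVER CONFIGURATION; a sketch where the node numbers are purely illustrative:

     -- Pin this instance's schedulers to NUMA node 0 only
     ALTER SERVER CONFIGURATION SET PROCESS AFFINITY NUMANODE = 0;

     -- Or to a range of nodes
     ALTER SERVER CONFIGURATION SET PROCESS AFFINITY NUMANODE = 0 TO 1;
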
  10. 26’
  11. Degree of parallelism (DOP) is assigned at each parallel step of the execution plan. All CPUs can be used by the schedulers, so threads can use all available CPUs. There is no special consideration for hyperthreaded CPUs. By limiting DOP, you can limit the number of threads available to solve a query. The DOP is determined when the execution plan is retrieved from the plan cache.
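
     A quick illustration of the instance-wide setting versus a per-query limit; dbo.SomeBigTable is hypothetical:

     -- Instance-wide default (0 = let SQL Server use all available CPUs)
     SELECT name, value_in_use
     FROM sys.configurations
     WHERE name = 'max degree of parallelism';

     -- Per-query limit: at most 2 threads per parallel operator for this query
     SELECT COUNT_BIG(*)
     FROM dbo.SomeBigTable
     OPTION (MAXDOP 2);
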
  12. 28
  13. This operator takes a single input stream of records and produces multiple output streams. The record contents and format are not changed. Each record from the input stream appears in one of the output streams. This operator automatically preserves the relative order of the input records in the output streams. Usually, hashing is used to decide to which output stream a particular input record belongs.
  14. This operator consumes multiple streams and produces multiple streams of records. The record contents and format are not changed. If the query optimizer uses a bitmap filter, the number of rows in the output stream is reduced. Each record from an input stream is placed into one output stream. If this operator is order preserving, all input streams must be ordered and merged into several ordered output streams. If you look at the execution plan in detail, the calculation is given by an expression that can already be obtained at the hash match; at the moment of the gather, the value 6 is obtained. This can be examined closely in the demo 2-exchange_operators.sql.
  15. This operator consumes several input streams and produces a single output stream of records by combining the input streams. The PARALLEL PAGE SUPPLIER is used to divide rows across threads in batches.
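
     A sketch of a query that typically produces these exchange operators once the tables are large enough; dbo.Orders and dbo.Customers are hypothetical, and MAXDOP 4 is only there to allow a parallel plan:

     -- With "Include Actual Execution Plan" enabled, a plan like this usually shows
     -- Repartition Streams around the hash join and Gather Streams at the root
     SELECT c.CustomerID, COUNT_BIG(*) AS OrderCount
     FROM dbo.Orders AS o
     INNER JOIN dbo.Customers AS c
         ON c.CustomerID = o.CustomerID
     GROUP BY c.CustomerID
     OPTION (MAXDOP 4);
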
  16. 45
  17. There are some enemies of parallelism, and these are some of them.
  18. Parallel operations must be synchronized before serializing. So if one worker ends its execution while another is still executing, it throws a CXPACKET signal to SQL Server, announcing that it has finished its execution and is waiting. CXPACKET is not a problem in itself, but it is an indicator of a bad parallel SQL Server configuration if we see lots of wait signals of this type.
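
     A minimal sketch of how to check whether this wait type dominates on an instance:

     -- CXPACKET compared against the top waits accumulated since the last restart
     -- (ignore benign system waits such as SLEEP_TASK or LAZYWRITER_SLEEP when reading the list)
     SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
     FROM sys.dm_os_wait_stats
     ORDER BY wait_time_ms DESC;
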
  19. 50
  20. Here are typical scenarios involving CXPACKET wait statistics. Note: It is very unusual to have a pure OLTP system, because most customers use their SQL Server instances for applications, reports, BI data-loading solutions, and more. In the example at the bottom, note that 9 days of CPU time is wasted by CXPACKET (786556034 ms = 13109 minutes = 218 hours = 9 days) in thread synchronization due to a bad configuration. (This is a real example from one of SolidQ's customers.) Important: It is very important that the students really understand the degree of parallelism setting. It is very common for students to confuse MAXDOP with CPU AFFINITY. Furthermore, make sure that students understand what a pure OLTP system is.
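
     The arithmetic from the example, expressed as a query that can be run against any instance (the 9-day figure comes from 786,556,034 ms of accumulated CXPACKET wait):

     -- Accumulated CXPACKET wait converted to minutes, hours and days
     SELECT wait_type,
            wait_time_ms,
            wait_time_ms / 60000.0    AS wait_minutes,
            wait_time_ms / 3600000.0  AS wait_hours,
            wait_time_ms / 86400000.0 AS wait_days
     FROM sys.dm_os_wait_stats
     WHERE wait_type = 'CXPACKET';
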