The document discusses column store indexes in SQL Server 2012. Some key points:
- Column store indexes provide significantly faster query performance compared to row-based indexes by storing and processing data by column rather than by row. Performance gains of 10x to 1000x are possible.
- xVelocity is Microsoft's term for next-generation technologies in SQL Server that take advantage of modern hardware through vector processing and in-memory optimizations.
- Column stores compress data more efficiently than row stores by storing identical values only once across columns rather than repeating in each row.
- Fact tables are good candidates for column store indexes due to their large size and aggregation query patterns in data warehouses.
- It also outlines considerations for designing column store indexes.
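The column-wise storage ideas above can be illustrated with a minimal, self-contained Python sketch (illustrative only; this is not how SQL Server's columnstore is implemented internally): identical values within a column collapse under run-length encoding, and an aggregate query reads only the one column it needs, never touching the others.

```python
from itertools import groupby

# Row-wise table: every row repeats the region string.
rows = [
    ("north", 10), ("north", 12), ("north", 9),
    ("south", 7), ("south", 11),
]

# Column-wise layout: one list per column.
regions = [r for r, _ in rows]
amounts = [a for _, a in rows]

# Run-length encode the clustered region column:
# identical adjacent values collapse to (value, count) pairs,
# which is why column stores compress repeated values so well.
rle = [(value, sum(1 for _ in grp)) for value, grp in groupby(regions)]
print(rle)            # [('north', 3), ('south', 2)]

# An aggregate like SUM(amount) scans only the amounts column,
# never touching the region strings at all.
print(sum(amounts))   # 49
```

In a real columnstore the same effect is what lets a fact-table aggregation skip most of the table's bytes entirely.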
This document summarizes several winning applications of R from a 2012 business contest. The applications include using R for direct marketing in-flight forecasting, mining Twitter for airline sentiment analysis, predicting steel temperatures in a plant, time series analysis for order prediction, predicting clinical trial durations, estimating uncertainty in IT projects, and dynamic report generation. The applications demonstrate how R can be used effectively across different industries for analytics.
Collaborate 2011 – Leveraging and Enriching the Capabilities of Oracle Databas... (djkucera)
The document discusses leveraging capabilities in Oracle Database 11g for managing a large-scale data warehouse efficiently. It describes how partitioning can enable efficient data retention by allowing instant dropping of partitions to delete large amounts of old data. It also explains how advanced compression can significantly reduce storage space and I/O time by compressing flat files by over 50%. The document further details how loading a Java library into the database allows supplementing Oracle's functionality, such as parsing Excel files, by using open-source Java classes from within the database.
Seminar: Data Modeling for Optimization with MPL - Oct 2012 (Bjarni Kristjánsson)
1) The document discusses how MPL allows importing index and data values directly from databases to define indexes and data vectors in a model.
2) It can import from various database types and specifies indexes by referring to database tables and columns.
3) The document also explains how MPL enables exporting optimized variable values back to databases for reporting solutions.
In this talk we discuss our product Oracle TimesTen, a relational database that runs entirely in memory and is therefore very fast. We will cover its fundamentals and see how Oracle designs and builds this product on Mexican soil with Mexican talent.
This document provides a quick reference guide for Oracle Server 9i, summarizing key concepts, statements, and features. It covers Oracle architecture, instances, databases, utilities, tablespaces, logs, tables, views, indexes, users, auditing, networking, recovery, distributed databases, replication, queues, data warehousing, Real Application Clusters, globalization, SQL*Plus, data types, SQL, PL/SQL, Java, performance tuning, deprecated features, and more - all in a compact format to enable quick lookups. The guide is not meant to replace Oracle's official documentation, but rather serve as a high-level cheat sheet for commonly used elements.
We are leading manufacturers and exporters of service station and automotive garage equipment such as automotive nitrogen generators, drilling machines, and air grinders. We ensure quality on par with international standards.
This study guide is intended to provide those pursuing the CCNA certification with a framework of what concepts need to be studied. This is not a comprehensive document containing all the secrets of the CCNA, nor is it a “braindump” of questions and answers.
I sincerely hope that this document provides some assistance and clarity in your studies.
The document provides an overview of the Tata Group, one of India's largest conglomerates. It discusses the group's founding in 1868, current chairman Cyrus Mistry, operating companies such as Tata Steel and Tata Motors, acquisitions made, products and services offered, social activities, and approximate financial information for 2011-2012 with over $100 billion in revenue and over 450,000 employees. The Tata Group encompasses many business sectors and has operations in over 80 countries worldwide.
Architects in bangalore: 1000 emails database (Gaurav Tripathi)
This document contains a list of 93 registered architects with the Surat Municipal Corporation. It provides their license number, name, address, contact information and email. The architects are based in various cities including Surat, Mumbai, Vadodara, Ahmedabad and others.
The document outlines tasks for a SharePoint site collection administrator and farm administrator, including monitoring list limits, security settings, recycle bins, and sub-sites for the site collection admin and daily, weekly, and monthly tasks like checking logs and backups for the farm admin. It provides guidance on best practices for maintaining a secure and optimized SharePoint environment.
This document summarizes several light armored tracked vehicles developed by the Defence Research and Development Organisation (DRDO) for the Indian Army. It discusses the Armored Engineer Recce Vehicle (AERV), which provides combat engineering support. It also discusses the Armored Amphibious Dozer (AAD), which facilitates construction of bridges and tracks. It then discusses the NBC Recce Vehicle for detecting nuclear, biological, and chemical agents. It also summarizes the 105 mm light tank developed on a BMP-II chassis. Finally, it provides an overview of the technologies developed for the indigenous Infantry Combat Vehicle (ICV) Abhay, including the armored hull, composite armor, automotive systems, power pack, suspension
1. ATC (Clearing & Shipping) Pvt. Ltd. is a logistics company established in 1957 that provides global logistics solutions through air, sea, road, and rail transportation.
2. The company has a network across India and globally through partnerships. It provides various services including customs clearance, freight forwarding, warehousing, and supply chain consulting.
3. ATC has over 850 customers ranging from large corporations to SMEs across industries like automotive, engineering, chemicals, and consumer goods. It handles millions of kilograms of air cargo and billions of rupees in goods annually.
This document provides the marks secured by 56 students in the 1st mid term test of the subject "Mould Manufacturing Techniques" for the DPT-V course (2010-2013 academic year) at CIPET-HALDIA. The total marks for the test were 20. The highest marks secured were 12.5 and the lowest marks were 1. Most students scored between 5-8 marks.
DYAUSH AVIATION is an aviation company established by Capt. Udit Panwar to capitalize on opportunities in India's growing aviation market, projected to be over $20 billion by 2020. The company has assembled a team with extensive experience in the Indian aviation industry to launch and operate new aviation ventures profitably and with a high level of governance and regulatory compliance. The team has identified several niche focus opportunities, including an express cargo airline, regional airline, pilot training, and aerosports, and has in-depth experience and qualifications to successfully execute these opportunities.
1) JXTA is an open-source peer-to-peer computing platform that provides protocols and APIs for distributed applications.
2) The presentation discusses JXTA's capabilities for virtual networking, security, discovery, and integration with web services.
3) Examples of applications using JXTA include distributed computing platforms like Triana and projects within the Global Grid Forum.
Bharat Biotech is an Indian biotechnology company that specializes in developing and manufacturing vaccines and biotherapeutics focused on public health in developing nations. It supplies vaccines to the Indian government and over 65 other countries. Bharat Biotech has large manufacturing facilities in Hyderabad, India that meet stringent international standards. The company is a leader in innovation and research and development, with a focus on diseases prevalent in developing market populations.
Metaltech Motor Bodies Pvt. Ltd. is a vehicle body fabricator involved in vehicle body customization, vehicle bulletproofing, and special-purpose vehicle body fabrication.
Tata Communications is an Indian multinational telecommunications company headquartered in Mumbai, India. It provides a range of network and managed services, including telepresence, hosting, data centers, security, and enterprise voice. Key partners include Telefonica, Cisco, Dimension Data, Glowpoint, Starwood Hotels, Rendezvous Hotel Group, Neotel, and Taj Hotels. The presentation covers Tata Communications' mission, subsidiaries, leadership team, services provided to consumers and companies, new developments, awards received, and future outlook. Financial information from 2009-2013 shows increasing annual net sales. Voice services account for 48% of revenues, data for
American Megatrends (AMI) is an embedded software company focused on embedded OS porting and validation services for platform bring-up. We are currently expanding these services to support customers across the various embedded OSes they need.
I would appreciate it if you could help us grow our services by referring any of your requirements that would need them.
Our experience in embedded devices:
(1) MeeGo, Chrome OS, WinCE 6.0 & Embedded Linux, Home Server Porting and Validation Services for “Intel’s PineView Platform”.
(2) Windows* 7 Embedded Standard, Windows* XP Embedded, Meego, Android, Chromium Porting and Validation Services for “Intel’s Cougar Point Platform”.
(3) Driver Development for USB, Ethernet, PCI, PCI-X, RAID, Modem & Network Interface Cards
(4) UEFI compliant BIOS/Boot loader for the ARM Platform
(5) Linux and Windows Driver (2000, XP, Vista) Development Services
(6) Firmware Development Services (ARM, MIPS, TI OMAP, Xilinx, …)
(7) Android Porting and Application Development for Mobile & Embedded Devices
(8) PC and Server Diagnostics for UEFI and Legacy environment (http://www.amiindia.co.in/diagnostic.html )
(9) Embedded XP (XPe) Development Services
(10) WinCE / Windows Mobile Embedded & Applications Software Development Services
Services specific to the computing domain:
(2) Pre-boot Applications
(3) Option ROM Customisation & Development
(4) IPMI, Service Processor and Remote Management (http://www.ami.com/serviceprocessors/)
(5) Extensions to Custom Drivers in Windows and Linux
American Megatrends
Email: sunilp@amiindia.co.in
Mobile: +91 96000 10071
Tel: +91 44 66540922 Extn: 112
This document provides information about Majestic HR, a recruitment agency located in Coimbatore, Tamil Nadu, India. It lists the agency's contact details and describes various job opportunities for candidates with different educational qualifications, including software engineering, mechanical engineering, management, and vocational roles. Stipend and salary ranges are provided for freshers and experienced candidates. The last section lists some of the client companies where candidates may be placed.
Gandhi Ankit provides a curriculum vitae summarizing his personal and professional details. He has an MBA in materials management and supply chain management from the National Institute of Management. He has over 10 years of work experience in production engineering and quality control roles for glass manufacturing companies. Currently he works as a senior executive for technical services at Can Pack India Pvt Ltd, overseeing new product development, customer relations, production, quality, and warehouse operations. He is seeking new opportunities and has experience leading projects in areas like six sigma, production improvement, and implementing new systems.
The document discusses Bharat Electronics Limited (BEL), an Indian public sector undertaking established in 1954 to meet the specialized electronic needs of the Indian defense services. It has several units across India producing electronics products for defense including radars, communications systems, naval systems, and more. Two such products are described in more detail - the LUP 291 secure UHF handheld radio and the TIDEX Time Division Modular Exchange, a microprocessor-controlled telephone exchange used in forward military areas.
This session will explore the Sybase database embedded in Novell ZENworks 10 Configuration Management. We'll discuss topics such as backup, recovery, basic maintenance, tuning and troubleshooting techniques for the database components that underpin Novell ZENworks Configuration Management.
Chris Asano has over 25 years of experience as a senior database architect and DBA with expertise in Oracle, SQL Server, Sybase, DB2, MySQL, and other database technologies. He has worked for various companies across different industries performing tasks such as database migrations, backups/restores, performance tuning, disaster recovery implementation, and project management. Currently, he is seeking new opportunities as a senior database professional.
The document provides an overview of tuning the Oracle E-Business Suite environment. It discusses tuning the applications tier, concurrent manager, client tier and network, database tier, and applications. Specific tips are provided for each area, such as upgrading technology stacks, minimizing network traffic, using specialized managers, enabling SQL tracing and profiling, and isolating the database and applications tiers on a private network.
(ATS3-PLAT01) Recent developments in Pipeline Pilot (BIOVIA)
This session will review significant enhancements to Pipeline Pilot in recent releases. Areas covered are: Professional client, administration, security, integration, databases, and collections (chemistry, next gen sequencing, documents and text, statistics, and imaging).
Oracle Database In-Memory will be generally available in July 2014 and can be used with all hardware platforms on which Oracle Database 12c is supported. It aims to:
- Accelerate database performance by orders of magnitude for analytics, data warehousing, and reporting while also speeding up online transaction processing (OLTP).
- Allow any existing Oracle Database-compatible application to automatically and transparently take advantage of columnar in-memory processing, without additional programming or application changes.
The document discusses optimizing Oracle and Siebel applications on the Sun UltraSPARC T1 platform. It describes how Siebel's multi-threaded architecture is well-suited to the T1 processor's ability to run multiple threads in parallel. It provides examples of consolidating Siebel environments and optimizing performance through Solaris, Siebel, and Oracle database tuning. Metrics show Siebel performing well with low CPU utilization on T1 systems.
Databases in the Cloud discusses AWS database services for moving workloads to the cloud. It describes Amazon Relational Database Service (RDS) which provides several fully managed relational database options including MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora. It also discusses non-relational database services like DynamoDB, ElastiCache, and Redshift for analytics workloads. The document provides guidance on choosing between SQL and NoSQL databases and discusses benefits of managed database services over hosting databases on-premises or in EC2 instances.
October Rules Fest 2008 - Distributed Data Processing with ILOG JRules (Dan Selman)
UBS Bank operates in over 50 countries and employs more than 80,000 people. Learn how UBS generates internal and regulatory reports that consolidate the financial performance of the bank using ILOG JRules and a distributed grid architecture. 20 billion records are processed every night, with 20 million records passing through the rule engine. Stringent performance objectives are in place to ensure the bank meets its regulatory requirements and the financial reports are in place before the trading day starts.
The document discusses optimizing Oracle and Siebel applications on Sun Microsystems' UltraSPARC T1 (Niagara) platform. It provides an overview of Siebel architecture and its suitability for the T1 processor. Performance benchmarks show Siebel scaling well by taking advantage of the T1 processor's multithreading capabilities. The document also discusses various optimizations that can be done at the application, database, storage, and operating system levels to further improve performance.
This document provides an overview of Amazon Aurora and discusses its performance advantages over traditional databases. Aurora delivers the performance and availability of commercial databases at 1/10th the cost by leveraging simple open source architecture. The document describes how Aurora achieves high performance through its distributed, asynchronous architecture and integration with other AWS services. It also discusses how Aurora provides high availability through its quorum-based storage system and ability to handle failures without stopping writes or restarting the database. Finally, the document shares benchmark results and customer use cases that demonstrate Aurora's ability to scale to large workloads and datasets at significantly lower costs than alternative solutions.
1. The document discusses various optimizations that can be made to an ASP.NET MVC application to improve performance, including compiled LINQ queries, URL caching, and data caching.
2. Benchmark results show that optimizing partial view rendering, LINQ queries, and URL generation improved performance from 8 requests/second to 61.5 requests/second. Additional caching of URLs, statistics, and content improved performance to over 400 requests/second.
3. Turning off ASP.NET debug mode also provided a significant performance boost, showing the importance of running production sites in release mode.
This document summarizes architectural lessons learned from refactoring a Solr-based API application at Shopping24. Key strategies discussed include sharding the index to improve performance as the data size grew from 500k to 7m documents, using caching and separate Solr cores to optimize access for different clients, and automating infrastructure to dynamically scale hardware resources. The document stresses examining business requirements to simplify queries and indexes by removing unsearchable data and duplicating frequently accessed content.
Lessons Learned: Refactoring a Solr-Based API App - Torsten Koester (lucenerevolution)
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
In this case study I'll discuss architectural lessons learned from refactoring an existing REST-API backed by Apache Solr. The initial goal of the refactoring was to speed up data access while scaling from 5m documents to 20-50m documents stored in Solr. Under consideration was the hosting infrastructure, the REST API Java code and the Solr documents and configuration. In this talk I'll give a brief review of the results.
"Pimping" the Solr configuration, the client access and the document structure achieved better results. But the elementary lesson learned was, that a significant increase of data access speed can only be realized with a functional redesign and a simplification of the REST API. NO CAPS ON CORES & SHARDS) I'll explain how this led us directly to distinct Solr cores and why we dropped the introduction of Solr shards or a breathing cloud infrastructure.
Sloupcové uložení dat a použití in-memory technologií u řešení ExadataMarketingArrowECS_CZ
Oracle Database 12c provides in-memory capabilities that allow real-time analytics on operational systems without requiring changes to applications. The in-memory columnar format improves performance of analytic queries by up to 100 times compared to traditional storage. The in-memory architecture supports both row and column formats simultaneously for the same table, enabling both analytics and transactions without tradeoffs.
Amazon Aurora is a MySQL and PostgreSQL compatible relational database built for the cloud, that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. In this session, we explore features of Amazon Aurora and demonstrate database migration using the AWS Database Migration Service.
This portfolio provides examples of Brian Hertzke's work with SQL Server technologies through projects in the SetFocus SQL Server Master's Program. It includes samples demonstrating his skills in database design, T-SQL coding, XML querying and manipulation, SQL Server Integration Services, and SQL Server Reporting Services. Specifically, it outlines a movie rental database design, stored procedures, triggers, XML creation and import/export, using SSIS for BI solutions, and reports on sales, members, and employees created in SSRS.
VertiPaq-ing With SQL 2012 Columnstore Indexes – xVelocity
Abstract
Significant work on the column store has changed the storage paradigm for data warehousing. The column store, supported by vector-based query execution and substantial progress in data compression, has edged the technology toward being a potential game changer. Microsoft is targeting next-generation technologies that make increasing use of in-memory and memory-optimized techniques in SQL 2012, and one of the most anticipated implementations is the xVelocity Column Store Index. This paper contains a detailed discussion of the column store, touching on the basics, limitations, and design considerations, with demonstration examples. "Columnar" and "column store" will be used interchangeably in the discussion below.
www.aditi.com
Audiences
This paper targets IT Planners, Architects, BI Users, CTOs and CIOs evaluating SQL 2012 to answer their large-data-grounded business needs. It also targets enthusiasts of SQL 2012, providing new dimensions and out-of-the-box thinking to organizations that maintain data using SQL 2012.
Overview
Data is growing exponentially, and performance is becoming a recurring cost for organizations. Performance impact can be broadly factorized as (1) I/O-based operations, (2) memory-based operations, and (3) operations to transfer data over the N/W or other peripherals. I/O and memory are highly dependent on the method of storage and querying. Queries can be seen as (1) read-only workloads, which are mostly reporting and DW systems, and (2) read-write workloads, mostly OLTP systems. The potential game changer in read-only workloads is a storage method that minimizes I/O and memory-based operations, whereas a conventional RDBMS stores data in a row-based storage system. Based on the columnar design, a gain or speedup in queries from 10X up to 1000X can be seen.
For instance, suppose the employee information looks like …
ID     Name          Street Address
32498  Diamond John  Crouse Manson
45298  Mary Anglos   Wilson Street
Acronyms
RDBMS – Relational Database Management System
HCC – Hybrid Columnar Compression
I/O – Input/Output
FCC – Full Columnar Compression
N/W – Network
LOB – Large Objects
OLTP – Online Transaction Processing
OLAP – Online Analytical Processing
DW – Data Warehouse
CSI – Column Store Indexes
ETL – Extract, Transform & Load
Then in a conventional RDBMS (e.g. SQL Server 2008) it will be stored in row-wise fashion as shown in Diagram-1 below, whereas columnar storage (e.g. SQL Server 2012 using a column store index) will store it in columnar fashion as shown in Diagram-2 below.
Diagram-1  Diagram-2
VertiPaq-ing with xVelocity Columnar Indexes
Why Columnar Indexes
There is a great debate around the columnar structure. Below are the benefits of using columnar indexes, specifically in SQL Server 2012.
Astonishing Results
The thought is to start with a result-driven discussion. Below are the graphs for the query performance results. We started with 12.5 million rows and doubled the count every time, up to 400 million records, to get total sales across products. The comparison of query times with Column Store Indexes vs. conventional indexes is exceptionally revealing in favor of column store indexes. Graph-3 and Graph-4 show a gap of hundreds of seconds. The results show that the warm cache takes very little time comparatively. Graph-5 below shows the gain, as a number of times ("X"), for warm and cold cache. The results really encourage the use of CSI.
Graph-3: Query Execution – Cold Cache (query execution time in seconds vs. number of rows in millions; Column Store Indexes vs. Conventional Indexes)
Graph-4: Query Execution – Warm Cache (query execution time in seconds vs. number of rows in millions; Column Store Indexes vs. Conventional Indexes)
Graph-5: Gain in query performance – Warm vs. Cold Cache (performance gain in number of times vs. number of rows in millions)
VertiPaq-ing & Apollo
VertiPaq-ing is vertical partitioning of the data, or in other words storing the data in column-wise fashion. Diagram-3 shows the difference between the row and column store data layouts in terms of pages, which are the basic unit of storage. For a detailed discussion refer to the Basics Behind the Scenes section below. The goal behind it is to accelerate common DW queries.
Apollo is the code name in SQL 2012 for the new feature, targeting:
•
•
•
•
xVelocity
xVelocity is a term used by the SQL Server family to define next-generation technologies. These technologies target surprisingly high query performance on modern hardware. They are optimized to use multiple cores and high memory. These techniques are further utilized in Analysis Services and PowerPivot. Portions of data are moved in and out of memory based on the memory available on the machine. Concisely, they are highly optimized in-memory operations. Below is a screenshot of CPU utilization by the xVelocity technologies, taken during column store index creation.
Graph-7
Basics Behind the Scenes
Full Column Store & Hybrid Column Store
SQL 2012 is full columnar storage, where each column is compressed and stored together. This technique has its own advantages, but it may negatively impact performance when accessing more columns or performing a small number of updates (although SQL 2012 has read-only indexes). Refer to Diagram-2 and Diagram-3 for details. On the other hand, the hybrid column store uses both rows and columns to store data. The hybrid technique creates a column vector for each column, compresses it, and stores it in data blocks. The compression unit contains more than one data block and contains all columns for a row. The rows span multiple data blocks. Diagram-5 shows the detail of the concept. This way a large amount of compression is achieved, and the performance issues of full columnar databases are also mitigated.
Graph-8
For the warehousing scenario the HCC approach is many times less performant because of:
•
•
•
Segments & Dictionaries
The columnar indexes are physically stored in the form of segments. Typically the data per column is broken into one million rows per segment (a.k.a. row groups) for each column. The segments are stored as LOBs and can contain multiple pages. The index build process runs in parallel and creates as many full segments as possible, but some of the segments can have a comparatively small size. These segments store highly compressed values because of the identical data types within a segment. Even with largely repeated data the compression is even better, as a unique small symbol is stored for the duplicate value, which saves size to a large degree. Segments also have header records containing max_data_id, min_data_id, etc. This header information is used to omit the complete partition, commonly known as segment estimation. The Anti-Patterns part details even more about segments.
A value reference links to an entry in one of up to two hash dictionaries. Dictionaries are kept in memory, and the data value ids from the segment are resolved from these dictionaries, but this process is deferred as long as possible for performance reasons. Simply put, for a table with one partition, every column added to the column store index will be added as a row in the segment metadata.
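The segment layout described above can be inspected from the catalog views. A minimal sketch, assuming the FactOnlineSales table used later in this paper (the table name is the only assumption; the catalog views are standard):

```sql
-- Sketch: per-column segment count, row count and compressed size
-- for a columnstore index (table name is an assumption).
SELECT S.column_id,
       COUNT(*)            AS segment_count,
       SUM(S.row_count)    AS total_rows,
       SUM(S.on_disk_size) AS size_in_bytes
FROM sys.column_store_segments S
JOIN sys.partitions P ON P.hobt_id = S.hobt_id
WHERE P.object_id = OBJECT_ID('dbo.FactOnlineSales')
GROUP BY S.column_id
ORDER BY S.column_id;
```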
Batch Mode Processing & Row Processing
Query processing can be done either in row mode or in batch mode. Taking a join as an example: the physical join operation takes two sets as input parameters and produces the output set based on the join conditions. In row processing, each of these sets is processed in row-by-row mode (e.g. nested loop join), and a large amount of CPU is used. Most of the time, operating on a large amount of data also spills over to disk (mostly in hash joins), which can be checked by tempdb usage; it also increases the memory used for processing.
Vector processing was one of the biggest revolutions that brought the fundamentals of batch processing. These physical operators for query processing take a batch of rows in the form of an array (of the same type) and process the data. Here a batch typically consists of 1000 rows of data, and each column within the batch is stored as a vector in memory, which is known as vector-based query processing. It uses the latest algorithms to utilize multicore CPUs and the latest hardware. Batch processing works on the compressed data when possible and thus reduces the CPU overhead on join operations, filters, etc. (only for some of the operators).
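Whether an operator actually ran in batch mode can be confirmed from the actual execution plan; the sketch below (a hypothetical aggregation over the FactOnlineSales table used later) captures the plan XML, whose operator properties include the actual execution mode:

```sql
-- Sketch: capture the actual plan to inspect each operator's
-- execution mode (Batch vs. Row). Table name is an assumption.
SET STATISTICS XML ON;

SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM dbo.FactOnlineSales
GROUP BY ProductKey;

SET STATISTICS XML OFF;
```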
Demonstration Example
For demonstration purposes, Contoso Retail DW, made available by Microsoft, is being used.
Creation of Columnar Index
A column store index can be created either through T-SQL or through the index creation wizard; ultimately they are both the same. The basic T-SQL syntax is below, and details can be captured from the 'Column Store Indexes Vs. Conventional Indexes' query.
Creation of Columnar Index – Code Block 1
CREATE [ NONCLUSTERED ] COLUMNSTORE INDEX index_name ON <tablename> ( column [ ,...n ] )
[ WITH ( <column_index_option> [ ,...n ] ) ]
[ ON {
{ partition_scheme_name ( column_name ) }
| filegroup_name
| "default"
}
]
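As a concrete sketch of the syntax above (the index name and column list are assumptions, chosen to match the Contoso fact table used throughout this paper):

```sql
-- Sketch: a nonclustered columnstore index over the dimension keys
-- and measures of the Contoso fact table.
CREATE NONCLUSTERED COLUMNSTORE INDEX csiFactOnlineSales
ON dbo.FactOnlineSales
(
    OnlineSalesKey, DateKey, StoreKey, ProductKey,
    CustomerKey, SalesQuantity, SalesAmount
);
```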
The steps below can be followed to create the column store index from the index creation wizard.
•
•
Performance Observations
The performance check was done on the table dbo.FactOnlineSales from ContosoRetailDW, which has 12.6 million records. More facts and limitations are detailed with examples in the Anti-Patterns section below.
•
•
•
•
•
Graph-10
Design Considerations
Candidates for Column Store Index
DW scenarios most commonly fall into the pattern of having read-only data, where data is appended periodically, commonly using a sliding window pattern. They seldom have updates. Data is retained for a long time, at least 8 to 10 years, resulting in a huge volume of data: gigabytes, terabytes or even petabytes for some scenarios. The DW data is mostly divided into either a star or snowflake pattern, where the fact table contains millions or billions of records ready to be aggregated in different fashions. All these schemas are typically queried using star join queries for grouping aggregations. Column store indexes are designed to accelerate queries satisfying the above criteria. This makes CSI an absolutely perfect fit for DW scenarios, so the rule of thumb says large fact tables are the candidates for CSI. Security of the data is not a big concern, because CSI also supports Transparent Data Encryption (TDE).
Another question is which columns need to be added to the CSI. The answer seems fairly easy: all columns can be included as long as they follow the prerequisites quoted in the Anti-Patterns section. This decision can be true for a small or medium scale DW, because audit columns or some of the text columns in the fact tables do not take considerably large space. Although the algorithm is designed to compress at large scale, as a best practice we should only include the dimension keys and measures from the table.
Fact-less fact tables and multivalued dimensions are not always a perfect fit, because they will not gain the benefit of batch processing, but the advantages of compression, parallel reads and segment estimation will definitely be there. Below is the example of choosing candidate tables for CSI. This selection is mostly based upon the number of rows, and mostly the candidates will be fact tables only.
Candidates for Column Store Index – Code Block 1 (see Diagram-9)
--Choose candidate tables for CSI
SELECT O.name TableName ,SUM(P.rows) CountOfRows
FROM sys.partitions P
JOIN sys.objects O ON P.object_id = O.object_id
WHERE O.type = 'U' --user tables
GROUP BY O.name
ORDER BY 2 DESC
Graph-11
Below is the example of choosing candidate columns for the fact table, using FactOnlineSales. The mark is used to show the selection of the columns for the dimensions. Along with them, all the measures will also be included while creating the CSI. We'll ignore the audit and degenerated dimension columns here, e.g. SalesOrderNumber, SalesOrderLineNumber, ETLLoadID, LoadDate, UpdateDate. OnlineSalesKey has the primary key defined on it, so it will automatically be added to the CSI even if not mentioned in the column list. Corresponding SQL code refers to 'Candidates for Column Store Index – Code Block 2' in the SQL file.
Graph-12
SQL code 'Candidates for Column Store Index – Code Block 3' in the corresponding SQL file contains the example of a star join query whose results are accelerated to within a second. Both star and snowflake schema queries are benefited by CSI. Snowflake may have issues if any of the primary or secondary snowflake dimensions is too large to support batch mode.
Anti-Patterns
Design considerations always live within defined boundaries. Anti-patterns and limitations provide the foundations to decide on those boundaries, making them the leading voice of any design decision.
• Only one CSI can be created on a table. It returns the below error.
Msg 35339, Level 16, State 1, Line 1
Multiple nonclustered columnstore indexes are not supported.
• The key column concept is not relevant in CSI, because data is stored in columnar fashion and hence each column is stored in its own way. Having a clustered key makes a difference only while creating the column store index, in terms of reads and order, but there is no impact on query performance.
• The base table on which the index is created becomes read-only, i.e. it can't be updated or altered. Managing updates is covered below.
• Interestingly, the order of the columns in the create index statement has no impact either on creating the index or on query performance.
• Only limited data types are allowed for CSI, i.e.
int, bigint, smallint, tinyint, money, smallmoney, bit, float, real, char(n), varchar(n), nchar(n), nvarchar(n), date, datetime, datetime2, smalldatetime, time, datetimeoffset with precision <= 2, decimal/numeric with precision <= 18
• CSI can have at most 1024 columns and doesn't support:
- Sparse & computed columns
- Indexed views or views
- Filtered & clustered indexes
- The INCLUDE, ASC, DESC, FORCESEEK keywords
- Page and row compression, and the vardecimal storage format
- Replication, change tracking, change data capture & Filestream
• CSI can simply be ignored using the 'IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX' hint. This option saves the user from having to know the other index names. It is even more helpful when the other index names are left to automatic naming by SQL Server and it is difficult to know the name while writing queries.
Anti-Patterns - Code Block 1
SELECT P.BrandName Product ,SUM(SalesAmount) Sales
FROM dbo.FactOnlineSales S
JOIN dbo.DimProduct P ON S.ProductKey = P.ProductKey
GROUP BY P.BrandName
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX)
• Only the operators below support batch mode processing, and therefore full use of CSI.
- Filter
- Project
- Scan
- Local hash (partial) aggregation
- Hash inner join
- (Batch) hash table build
• Outer joins, MAXDOP 1, NOT IN and UNION ALL are not supported for batch mode execution. Instead, we can tweak the existing query to the same effect. Some examples are below.
• Although filters on CSI are pushed down to the segments to benefit from segment estimation, string columns do not have max or min values in the segment headers and hence cannot utilize these filters. So string filters or joins should be avoided on CSI.
• It was observed during the above investigation that after partition switching, the first query compilation takes a lot of time. Similar behavior was not found after insertion or deletion of data from the table. It may be because of estimation changes due to the partition switching, mainly in large data scenarios. It is recommended to warm the cache after partition switching.
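A minimal sketch of warming the cache after a switch (the table name is an assumption; any cheap query that scans the new partition's segments would do):

```sql
-- Sketch: a cheap full scan pulls the freshly switched-in
-- columnstore segments into memory before real queries arrive.
SELECT COUNT_BIG(*) FROM dbo.FactOnlineSales;
```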
Managing Column Store Indexes
Memory Considerations
Column Store Indexes (CSI) is a technology created with modern hardware in mind: multiple CPUs and high memory operating on large amounts of data, especially terabytes. For CSI, memory is used at both creation and execution time; we'll discuss them separately.
Creating a CSI is a parallel operation. It is dependent on the available CPUs and the MaxDOP setting restrictions. For large data, creation of a CSI takes comparatively more time than B-Tree indexes. Before the creation, a memory estimate is made for the query execution, and that memory grant is provided. There may be cases where the initial memory request is not granted and error code 8657 or 8658 comes up. This can be resolved by granting enough memory to the server and the corresponding workgroup. There can be requests for more memory at a later point of execution, and if enough memory is not available, an insufficient memory error can flash, i.e. error code 701 or 802.
The latter error codes come during execution at run time, whereas the former come at the start of query execution. The solution is to change the memory grant for the workgroup or increase the memory in the server. Errors 8657 or 8658 can sometimes occur because of the SQL Server configuration of 'min server memory' and 'max server memory'. Suppose the minimum memory needed for the CSI is 3GB and SQL Server has only taken 1GB of memory due to the min server memory configuration; then it can happen. The resolution can be to either run a COUNT(*) query on any of the large tables before the index creation, or set the min and max server memory values to the same number. This helps SQL Server take the required memory at startup time. The resolution for the other two errors, 701 and 802, is as above. As a concluding remark, the CSI can't be created if enough memory is not in the system. One of the easiest solutions for such memory considerations is vertical partitioning of the existing table, i.e. breaking the existing table into two or more tables.
CSI uses batch mode processing for execution. Typically a batch consists of 1000 rows stored in a vector. This type of processing is optimized to use modern hardware and provides better parallelism. Batch operators can work on compressed data, resulting in a high degree of processing in a small amount of memory. A considerable amount of memory is needed to execute batch mode query processing. If the memory is not present, the optimizer changes the query plan to use row mode. We can check the use of batch mode processing in the query plan as below. Batch mode processing always uses memory, so whenever there is a disk spill due to large data, row-by-row processing replaces the batches; this is mostly seen during hash joins on large tables. Another reason for row-by-row processing is an incorrect statistics update, which in turn spills the data to disk, resulting in row-by-row operation. To check for this, the extended event 'batch_hash_table_build_bailout' can be configured. The warning 'Operator used tempdb to spill data during execution' also flashes for this kind of behavior.
Graph-13
Add & Modify Data in Column Store Index
A table with a CSI is read-only, i.e. we can't perform operations like INSERT, UPDATE, DELETE or MERGE. These operations fail with an error message, e.g.
Msg 35330, Level 15, State 1, Line 1
UPDATE statement failed because data cannot be updated in a table with a columnstore index. Consider disabling the
columnstore index before issuing the UPDATE statement, then rebuilding the columnstore index after UPDATE is complete.
Considering this, we have the below options or workarounds for these operations.
• Have staging/work tables without CSI (in most cases these are drop-and-recreate tables). Create the CSI and switch it into an empty partition of the table. We have to make sure that we have the empty partition, because if there is data in the partition and a CSI is created on the table, we can't split it. Below is the example code segment for the same. Corresponding SQL code refers to 'Add & Modify Data – Code Block 2' in the SQL file.
Add & Modify Data - Code Block 1
ALTER INDEX csiFactOnlineSales ON dbo.FactOnlineSales DISABLE
GO
UPDATE dbo.FactOnlineSales
SET SalesAmount = SalesAmount * 2
GO
ALTER INDEX csiFactOnlineSales ON dbo.FactOnlineSales REBUILD
• Switch a partition from the table to an empty staging table. Drop the CSI from the staging table, perform updates, inserts etc., rebuild the CSI, and switch the staging table back into the (now empty from the previous switch) partition. Corresponding SQL code refers to 'Add & Modify Data – Code Block 3' in the SQL file.
• We can choose to create different underlying tables to represent one fact table and access all of them through UNION ALL views. Just disable the index on the most recent table, which will receive the updates, and rebuild/recreate the CSI. We can always get the data from those UNION ALL views.
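A minimal sketch of that UNION ALL pattern (all table and view names here are assumptions): the historical table keeps its CSI, while the current table stays writable.

```sql
-- Sketch: one logical fact table backed by two physical tables.
CREATE VIEW dbo.vFactOnlineSales
AS
SELECT * FROM dbo.FactOnlineSales_History  -- large, has a columnstore index
UNION ALL
SELECT * FROM dbo.FactOnlineSales_Current; -- small, no CSI, receives writes
```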
• Put the data into a staging table, create the CSI on the staging table, then just drop the existing table and rename the staging table to the original name (better to do both operations in a transaction; note that both of them are metadata-only operations). This takes more processing time but ensures high availability. This option should be chosen only when there is a relatively small or medium scale of data in the table.
Size of Column Store Index
The size of the CSI is based on the size of the segments and dictionaries. Most of the space is used by the segments. We can get the size in a simplified manner; here are the simple and the actual size estimation queries.
Statistics are another valuable consideration. We have statistics for the base table having the CSI, but not for the CSI in particular. The statistics object is created for the CSI, but SHOW_STATISTICS shows null for the CSI while showing values for the clustered index. The statistics object for the CSI is used for database cloning (a DB clone is a statistics-only copy of the database for investigating query plan issues). Corresponding SQL code refers to 'Size of Column Store Index – Code Block 1' in the SQL file.
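A sketch of such a size estimation, summing segment and dictionary bytes from the catalog views (the table name is an assumption; the paper's own version lives in its companion SQL file):

```sql
-- Sketch: estimated columnstore index size = segment bytes + dictionary bytes.
SELECT SUM(T.sz) AS csi_size_in_bytes
FROM (
    SELECT SUM(S.on_disk_size) AS sz
    FROM sys.column_store_segments S
    JOIN sys.partitions P ON P.hobt_id = S.hobt_id
    WHERE P.object_id = OBJECT_ID('dbo.FactOnlineSales')
    UNION ALL
    SELECT SUM(D.on_disk_size)
    FROM sys.column_store_dictionaries D
    JOIN sys.partitions P ON P.hobt_id = D.hobt_id
    WHERE P.object_id = OBJECT_ID('dbo.FactOnlineSales')
) AS T;
```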
Column Store Indexes Vs. Conventional Indexes
Column Store Index vs. Clustered Indexes
CSI is different from all other conventional indexes; they are utilities for different types of scenarios. Till now we have seen that CSIs are a lot faster than conventional indexes. Below is an example where the CSI takes almost 99% of the relative query plan cost. Here we are using a highly selective query, i.e. only a few records are being queried, using both of the indexes. The comparison among indexes is always based on the nature of use of the data, i.e. the queries. SQL Server automatically figures out the highly utilized query; moreover, plan guides can also be pinned for abnormal behavior of queries. For an apples-to-apples comparison we are using only the columns used to create the CSI.
Column Store Index vs. Clustered Indexes – Code Block 1
SELECT SalesAmount ,ProductKey FROM dbo.FactOnlineSales S WITH (INDEX(PK_FactOnlineSales_SalesKey))
WHERE OnlineSalesKey IN (32188091,23560484,31560484,27560484)
SELECT SalesAmount ,ProductKey FROM dbo.FactOnlineSales S WITH (INDEX(csiFactOnlineSales))
WHERE OnlineSalesKey IN (32188091,23560484,31560484,27560484)
Graph-14
Column Store Index vs. Covering Indexes vs. One Index per Column
Covering index is the commonly used terminology for achieving high-performing queries. Creation of a covering index is always a cautious decision. It is very difficult to build indexes which cover all the queries, particularly in data warehousing scenarios where users are open to using any kind of query. A covering index can be achieved either by adding the columns into the index, i.e. a composite index, or by pinning them to the B-Tree using the INCLUDE keyword. A very detailed description can be referred to from here. Selecting one more column can make the covering index ineffective, which is not the case with a normal index. Creating one index per column will not be useful when selecting multiple columns. Moreover, all the covering or other indexes capture a relatively larger footprint on disk, which means multiple copies of the same data, resulting in more maintenance and sometimes adding downtime to the application.
CSI or the covering index: again, the decision depends on the amount of data, the query and the memory. On the same notes, CSI uses compression as well as batch mode processing, hence faster scans. If we have an entire star schema for our DW, the CSI is best for aggregative queries. It also reduces index design and maintenance time, and one index shows all of the magic.
On the other hand, here is another example which shows that CSI does not benefit query execution time as much, because of large hash joins and batch execution falling back to row-by-row execution. Here we'll just create an example table joined with FactOnlineSales; both tables will have the same cardinality. We can easily see a warning message and warning icon in the actual query plan. Corresponding SQL code refers to 'Column store index Vs. Covering Indexes – Code Block 1' in the SQL file.
Graph-15
Graph-16
Performance Tuning Considerations
Analyzing TempDB Uses
TempDB is the core of all the temp operations for which memory is not granted. SQL Server uses tempdb extensively, and even users who have read-only permissions on any of the databases still have read-write permissions on tempdb. The point of analysis is tempdb usage while creating and querying the CSI. To our surprise, tempdb was not used either during creation or at query time. Tempdb will be used when the execution is done using row-by-row operation instead of batch, meaning data is spilled to disk, i.e. tempdb is used; above is the example showing this behavior. Corresponding SQL code refers to 'Analyzing TempDB Uses – Code Block 1' in the SQL file.
Maximizing Segment Estimation
The data of a CSI is divided into segments, and this information is stored in the sys.column_store_segments system view. The columns relevant to understanding segment estimation are in the query below.
Maximizing Segment Estimation - Code Block 1
SELECT S.column_id ,S.segment_id ,S.min_data_id ,S.max_data_id
FROM sys.column_store_segments S
Each segment stores its min and max values, and if the filter value does not belong to a segment, the scan for that segment is skipped; this is called segment estimation. E.g. if we write a query with the predicate 'OnlineSalesKey > 30000000', the second segment will be ignored.
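A sketch of such a query (the measure column is an assumption, matching the fact table used throughout): the predicate's range is compared against each segment's min_data_id/max_data_id header, and non-qualifying segments are never scanned.

```sql
-- Sketch: segments whose [min_data_id, max_data_id] range for
-- OnlineSalesKey falls entirely below the constant are skipped.
SELECT SUM(SalesAmount) AS TotalSales
FROM dbo.FactOnlineSales
WHERE OnlineSalesKey > 30000000;
```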
Graph-16
In the example here we are seeing that the min and max values are skewed. This is not ideal for segment estimation, because only one segment is eliminated. Here we need to find how to arrange the data so that we have the maximum number of partitions, with the values aligned properly to the segments. We can use the techniques below.
•
•
Maximizing Segment Estimation - Code Block 2
SELECT G.ContinentName ,SUM(S.SalesAmount) TotalSales
FROM dbo.FactOnlineSales S
JOIN dbo.DimCustomer C ON C.CustomerKey = S.CustomerKey
JOIN dbo.DimGeography G ON G.GeographyKey = C.GeographyKey
WHERE S.DateKey BETWEEN '2012-01-01' AND '2012-12-30'
GROUP BY G.ContinentName
Graph-17  Graph-18
On running the above query again, we can find that the scan will skip the crossed partitions, and thus segment estimation is maximized. It is nice to use this approach, but it is very hard to manage these kinds of partitions, and it may end up becoming another tool. Moreover, adding other multiple dimensions will add similar complexity to the partitions. We also should have enough data in each partition so that the segments are utilized; if we have fewer than 1 million records, we may end up crash-landing, and queries may not help as expected.
Ensuring Batch Mode Execution
Batch mode, vector-based execution helps the query a lot. The MAXDOP configuration helps to check this behavior, by ensuring that none of the queries are using the MAXDOP 1 option. Moreover, the cost of the CSI scan is also higher with MAXDOP 1. The example below shows the difference in the execution plan; the query plan below shows that there is no use of parallel and batch operations.
Graph-19
Batch mode processing is not supported for outer joins in this release of SQL Server. To get the benefit of batch processing, we need to change the queries a bit. One typical example of changing a query is below, where we first get the inner join values and then join them back to the dimension table for the outer join records. The query plan shows all the different results, where batch mode and row mode are used along with parallelism. It also shows that the alternate query takes just 12% of the relative cost. These examples show that we need to redesign our conventional queries to take advantage of batch mode. The bottom line is that we have to take time and keep a close eye on each query being written against a CSI. Query plans should be monitored closely for further changes, not only in development but also in production environments. Corresponding SQL code refers to 'Ensuring Batch Mode Execution – Code Block 1' in the SQL file.
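A sketch of that rewrite (table and column names are assumptions, following the Contoso schema used earlier): the aggregation runs over the inner join, where batch mode is eligible, and the small aggregate is then outer-joined back to the dimension.

```sql
-- Sketch: aggregate on the inner join first (batch mode eligible),
-- then LEFT JOIN the small result back to pick up unmatched brands.
;WITH InnerSales AS
(
    SELECT S.ProductKey, SUM(S.SalesAmount) AS Sales
    FROM dbo.FactOnlineSales S
    JOIN dbo.DimProduct P ON S.ProductKey = P.ProductKey
    GROUP BY S.ProductKey
)
SELECT P.BrandName, SUM(ISNULL(I.Sales, 0)) AS Sales
FROM dbo.DimProduct P
LEFT JOIN InnerSales I ON I.ProductKey = P.ProductKey
GROUP BY P.BrandName;
```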
www.aditi.com
22. Using Partitioning
Partitions are another key performance factor that still works with columnar indexes; note that every non-partitioned table always has one physical partition. We can create partitions on the table and then create the CSI. The CSI must be partition-aligned with the table: it uses the same partition scheme as the base table, and an equal number of partitions is created. We can also switch a partition in or out, and that partitions the corresponding CSI as well. Segments are created on the partitions, so if the partitions are skewed we can see a larger number of segments. Modifying a CSI using partitioning is detailed here. Every partition is compressed separately and has its own dictionaries; these dictionaries are shared across the segments within the same partition. We can easily find which segments and dictionaries belong to any specific partition, which is why partition switching is still a metadata operation. We already created partitions on the table while exploring CSI management above and will use the same ones to explore further. The example below shows that we have 2 dictionaries for each partition irrespective of the number of segments, except partition 5, which has no dictionaries. Exploring this behavior is left to the reader.
Using Partitioning - Code Block 1
/*Exploring Partitions*/
SELECT * FROM sys.partition_schemes
SELECT * FROM sys.partition_functions
SELECT * FROM sys.partition_range_values
SELECT * FROM sys.partition_parameters
/*Segments per partition*/
SELECT P.partition_number, COUNT(*) Segment#
FROM sys.column_store_segments S
JOIN sys.partitions P ON P.hobt_id = S.hobt_id
WHERE P.object_id = OBJECT_ID('[dbo].[FactOnlineSales]')
GROUP BY P.partition_number
/*Dictionaries per partition*/
SELECT P.partition_number, COUNT(*) Dictionary#
FROM sys.column_store_dictionaries D
JOIN sys.partitions P ON P.hobt_id = D.hobt_id
WHERE P.object_id = OBJECT_ID('[dbo].[FactOnlineSales]')
GROUP BY P.partition_number
Diagram-21
Conclusions
Columnar indexes are a breakthrough innovation with the capability to push the envelope and improve the overall performance of ETL and DW workloads. The current SQL Server 2012 version has some limitations, with improvements expected in future versions, especially with respect to the addition and modification of data. It is recommended to follow the best practices, understand the case studies, and fit the defined business problem to the columnar solution pattern.
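As a closing sketch of the metadata-only partition switch mentioned in the partitioning section: switching a partition out to a staging table moves only metadata, so it is near-instant even on a large fact table. The staging table name and partition number below are assumptions; the staging table must have an identical structure, including a matching partition-aligned CSI.

```sql
/* Switch partition 5 of the fact table out to an empty staging
   table; only metadata changes hands, so this completes instantly
   regardless of how many rows the partition holds. */
ALTER TABLE dbo.FactOnlineSales
SWITCH PARTITION 5 TO dbo.FactOnlineSales_Staging;

/* The staged rows can now be archived or purged, and a freshly
   loaded partition can later be switched back in the same way. */
```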
About Aditi
Aditi helps product companies, web businesses and enterprises leverage the power of cloud, social and mobile to drive competitive advantage. We are one of the top 3 Platform-as-a-Service solution providers globally and one of the top 5 Microsoft technology partners in the US.
We are passionate about emerging technologies and are focused on custom development. We provide innovative solutions in 4 domains:
Digital Marketing solutions that enable online businesses to increase customer acquisition
Cloud Solutions that help companies build for traffic and computation surges
Enterprise Social that enables enterprises to enhance collaboration and productivity
Product Engineering services that help ISVs accelerate time-to-market
www.aditi.com
https://www.facebook.com/AditiTechnologies
http://www.linkedin.com/company/aditi-technologies
http://adititechnologiesblog.blogspot.in/
https://twitter.com/WeAreAditi