Cloud vs On-prem Data Warehouse
By Siddharth Jothimani | July 20, 2023
Introduction
A data warehouse is a central repository of data that businesses can analyze to make informed
decisions. The data typically arrives at regular intervals from in-house applications and databases and is
accessed by different people depending on their requirements. Business intelligence systems and
analytic solutions query this data to give decision-makers meaningful insights. While transactional
database (OLTP) systems enable the real-time execution of many transactions across multiple
databases, they are not well suited to large-scale analytical processing. Data warehouses are best suited
for business intelligence and reporting use cases because they offload analytical processing from
transactional databases and process large volumes of data faster through various data storage
modes.
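To make the difference between transactional and analytical access concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for both kinds of system. The orders table, its columns, and the sample rows are invented for illustration; in practice the second query would run on a dedicated warehouse engine rather than on the transactional database.

```python
import sqlite3

# In-memory database standing in for a transactional system (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0), (4, "US", 60.0)],
)

# OLTP-style access: fetch a single row by its key.
single_order = conn.execute(
    "SELECT order_id, region, amount FROM orders WHERE order_id = ?", (3,)
).fetchone()

# Analytical access: scan and aggregate every row in the table.
revenue_by_region = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    ).fetchall()
)

print(single_order)
print(revenue_by_region)
```

The point lookup touches one row, while the aggregate reads the whole table. On large tables, that second pattern is the workload a warehouse takes off the transactional database.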
[Figure: A simple architecture of a data warehouse]
Businesses deploy data warehouses in one of two ways:
1. On-premises (on-prem) deployment
2. Cloud-based (SaaS) deployment
On-prem data warehouses
In an on-prem data warehouse model, the customer is entirely responsible for purchasing, deploying, and
maintaining the hardware and software. The on-prem data warehouse is aptly named: it resides within the
customer's data center, on their own premises. The customer has complete control over the security of
the warehouse, from perimeter security that prevents physical damage to encryption of the data stored
in the warehouse.
Some commonly used on-prem data warehouse products are:
1. IBM Integrated Analytics System
2. Pivotal Greenplum
3. Teradata
Benefits
Some of the common advantages of on-prem systems include:
1. Control – The organization controls the entire data warehouse operation. This includes:
• The end-to-end software technology stack, including all customizations and configurations that suit its
requirements.
• The type of hardware to buy: commodity servers or purpose-built storage and networking
components, power supply, backup options, etc.
• Physical access to the data warehouse for inspection and troubleshooting of any failed
component, whether at the hardware or software layer.
2. Performance – Network latency between the components of an on-prem data warehouse is typically
lower, which can yield a marginal increase in application speed. However, other factors also affect
latency, and being on-prem does not by itself guarantee good performance.
3. Governance – Since the data warehouse is located on the customer's premises, requirements around data
governance and compliance are easier to meet. Regulatory requirements like GDPR or CCPA are
easier to implement because you know precisely where the data is located.
Challenges
Some of the drawbacks of on-prem systems include:
1. High upfront cost – Setting up an on-prem warehouse requires a large initial investment in hardware,
software, and a physical facility with all the systems needed to keep the data center running. Hardware
depreciation and recurring costs for maintenance and support personnel add to this.
2. Need for a dedicated support team – A support team must always be available to keep the systems
running efficiently. This team includes administrators and engineers at
different levels – network, system, database, and application.
3. Cannot quickly scale up or down based on business need – Rapidly adjusting resources to
accommodate unexpected surges in activity is a significant challenge for on-prem data warehouses.
Cloud data warehouse
Businesses are moving towards a Cloud-based data warehousing model to leverage Cloud providers'
advantages, leading to a new service model called Data Warehouse-as-a-Service (DWaaS).
Some of the facts that corroborate this include:
• Research Nester published a report titled "Data Warehouse as a Service (DWaaS) Market: Global
Demand Analysis & Opportunity Outlook 2031" – which states that the DWaaS market is estimated to
grow at a CAGR of ~22% during the forecast period 2022 – 2031.
• The DWaaS market size was valued at USD 4.26 Billion in 2021 and is projected to reach USD 29.52
Billion by 2030.
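As a quick arithmetic check on these figures, the sketch below computes the compound annual growth rate (CAGR) implied by the two market-size values and, separately, what 22% annual growth would give from the 2021 value. It assumes nothing beyond the numbers quoted above. The article does not say whether both bullets come from the same report, and the ~22% figure covers 2022 – 2031 rather than 2021 – 2030, so the two results are not expected to match exactly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values that are `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


years = 2030 - 2021  # nine compounding periods

# Growth rate implied by USD 4.26 billion (2021) reaching USD 29.52 billion (2030).
implied_rate = cagr(4.26, 29.52, years)

# Market size in 2030 if the 2021 value grew at 22% per year instead.
projected_2030_at_22pct = project(4.26, 0.22, years)

print(f"Implied CAGR 2021-2030: {implied_rate:.1%}")
print(f"2030 size at 22% per year: USD {projected_2030_at_22pct:.2f} billion")
```

The implied rate comes out at roughly 24% per year, slightly above the ~22% estimate, while 22% growth from the 2021 base would reach about USD 25.5 billion by 2030.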
Some of the popular Cloud data warehouses include:
1. Snowflake – Can operate across multiple Cloud providers
2. Google BigQuery
3. Amazon Redshift
4. Microsoft Azure Synapse Analytics
Benefits
Some of the essential benefits offered by Cloud warehouses include:
1. Pay-as-you-go pricing model – There is no upfront cost involved in setting up a data warehouse, and
pricing almost always depends on service usage. Organizations incur no capital expenses, only
running operational costs, which can reduce the total cost of ownership (TCO).
2. High scalability – Cloud solutions scale with business needs, so handling very large volumes of data
is rarely an issue. Resources are added and removed based on the load, and organizations can
customize the rules. For example, an organization can set up one rule to increase
computational resources by 80% over the weekend to account for a promotional sale on its website, and
another rule to remove the newly added resources once the sale is completed.
3. Quick time to market – Since there is no infrastructure to build upfront, organizations can quickly
deploy their applications and gather business insights, reducing their time to market.
4. High availability – Most services provide at least 99.9% data availability. This is coupled with high
durability and reliability, as the data can be stored in multiple data centers across different regions.
5. Security – While security is a common concern when moving to the Cloud, most providers invest
heavily in security and have various mechanisms to keep the data safe. These include
encryption of data stored on disk (encryption-at-rest) using multiple options, encryption in transit using
SSL/TLS, and compliance with security frameworks such as SOC 2.
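The weekend scaling rule from the High scalability item above can be sketched as a small function. This is an illustration only: the function name, the node-based capacity model, and the `sale_active` flag are assumptions, not any provider's API. Real services expose scaling through their own policies and schedulers.

```python
from datetime import date

WEEKEND_INCREASE_PERCENT = 80  # the 80% figure from the example in the text


def target_nodes(baseline_nodes: int, day: date, sale_active: bool) -> int:
    """Return the compute node count a hypothetical scheduler should request."""
    # date.weekday() returns 0 for Monday, so 5 and 6 are Saturday and Sunday.
    is_weekend = day.weekday() >= 5
    if sale_active and is_weekend:
        # Integer ceiling of baseline * 1.8, avoiding floating-point rounding.
        return (baseline_nodes * (100 + WEEKEND_INCREASE_PERCENT) + 99) // 100
    # Outside the sale window the extra resources are released.
    return baseline_nodes


print(target_nodes(10, date(2023, 7, 22), sale_active=True))  # a Saturday
print(target_nodes(10, date(2023, 7, 24), sale_active=True))  # a Monday
```

With a baseline of 10 nodes, the rule requests 18 nodes on the Saturday of the sale and falls back to 10 on Monday. This covers both halves of the example: adding resources and removing them afterwards.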
Challenges
Despite all the advantages offered by migrating to the Cloud, organizations face specific challenges, some
of which are listed below:
1. Most Cloud services expect the users to be aware of their responsibility in using them. Some critical
features like security in the Cloud, cost management, and user access control depend entirely on how the
organizations configure and use such features. Simply put, the security of the Cloud is the Cloud provider's
responsibility, and security in the Cloud is the customers' responsibility.
2. The usage-based, dynamic pricing model of Cloud services makes costs harder to predict, which is a
challenge for organizations until they adapt to this flexible cost structure.
3. Deploying hybrid architectures that need high interoperability and customization is difficult. Some
architectures rely on highly customized or legacy software. Cloud services often cannot provide
such fine-grained customization, so organizations must find other approaches to deploy these
architectures.
4. Changing Cloud providers, if required, involves contractual obligations and technical challenges.
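The first challenge above, that security in the Cloud is the customer's responsibility, often comes down to checking that the organization's own configuration matches its policy. The sketch below shows one way such a check could look. The setting names and required values are hypothetical and do not correspond to any specific provider's configuration schema.

```python
from typing import Dict, List

# Hypothetical settings the organization, not the provider, is expected to enforce.
REQUIRED_SETTINGS: Dict[str, object] = {
    "encryption_at_rest": True,
    "public_network_access": False,
    "mfa_required": True,
}


def audit(config: Dict[str, object]) -> List[str]:
    """Return the names of settings that are missing or differ from the required value."""
    return sorted(
        name
        for name, required in REQUIRED_SETTINGS.items()
        if config.get(name) != required
    )


# Example: encryption is on, but public access is open and MFA was never configured.
findings = audit({"encryption_at_rest": True, "public_network_access": True})
print(findings)
```

A setting that was never configured is reported in the same way as one set to the wrong value, since both leave the customer's side of the shared responsibility unmet.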
Key differences
Here are some key differences between on-prem and DWaaS data warehouses based on common criteria.
| Criteria | On-prem data warehouse | Cloud data warehouse |
|---|---|---|
| Cost | Upfront capital expense required. | No upfront capital cost, but operating expenses are incurred. |
| | Cost depreciation over time. | Variable monthly cost depending on usage. |
| | Regular maintenance of hardware. | No maintenance overhead. |
| | Dedicated support personnel needed. | Premium support, at extra cost, is usually required for critical applications. |
| | No monthly usage fees. | |
| Scalability | Highly rigid. Any change to hardware or software requires heavy IT time and effort to execute. | Highly elastic. Resources can be added or removed on the fly, without manual intervention, based on the application load. |
| Time to market | Longer time to market, as the infrastructure needs to be built first. | Shorter time to market. Businesses can build and deploy their applications quickly to get user feedback without worrying about the infrastructure. |
| Built-in ecosystem | In an on-prem environment, the organization must build all applications for security, user management, monitoring, notifications, analytics, etc. | Cloud providers generally offer services for security, user management, analytics, etc., so the organization need not adopt different software for each use case. |
| Security | Security is the sole responsibility of the organization and the in-house IT team that maintains the on-premises data warehouse. | Security is a shared responsibility between cloud service providers and organizations. The necessary security features are available in the cloud and should be implemented by the organization. |
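The cost trade-off in the comparison above (upfront capital expense versus variable monthly cost) can be explored with a simple break-even calculation. This is a rough sketch under stated assumptions: all amounts are invented for illustration, costs are treated as constant per month, and depreciation, discounts, and usage spikes are ignored.

```python
from typing import Optional


def break_even_month(
    upfront: float,
    onprem_monthly: float,
    cloud_monthly: float,
    horizon_months: int = 120,
) -> Optional[int]:
    """First month in which cumulative cloud spend exceeds cumulative on-prem spend.

    Returns None if that never happens within the horizon.
    """
    for month in range(1, horizon_months + 1):
        onprem_total = upfront + onprem_monthly * month  # hardware plus staff and upkeep
        cloud_total = cloud_monthly * month  # usage-based fees only
        if cloud_total > onprem_total:
            return month
    return None


# Hypothetical figures: 500,000 upfront and 10,000 per month on-prem, 25,000 per month in the Cloud.
month = break_even_month(upfront=500_000, onprem_monthly=10_000, cloud_monthly=25_000)
print(month)
```

With these made-up numbers the Cloud option stays cheaper for the first 33 months and becomes more expensive in month 34. If the monthly Cloud cost is at or below the on-prem running cost, the function returns None because the upfront investment is never recovered.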
Conclusion
Most organizations consider DWaaS an integral step in their architecture landscape, given the savings
in cost and effort that Cloud solutions offer. However, on-premises data warehouses still have their place. Some
industries have highly customized or niche legacy use cases that have run on-premises for decades and for
which Cloud support may be limited. So, it is up to organizations to assess their technology
landscape and roadmap and identify which model best suits their interests.
Mastech helps customers understand their business requirements and guides them in building the right data
warehouse, on-premises or in the Cloud, to unlock important insights from their data.
Siddharth Jothimani
Director - Data In Motion
Technology leader with 18+ years of expertise in application architecture, design and Cloud adoption
transforming businesses through innovative data solutions.