Scaling up data access and storage without scaling up costs
Leading-edge tech capabilities
Futureproofing institutions for the big data revolution
Challenges faced by institutions
• Inaccurate data extraction from legacy systems
• Isolated data sources
• Resource-intensive reporting
• Lack of data storage infrastructure
• Advanced analytics processing requirements and complexities
• Data segregation and security issues

Case study
Context: A top Indian bank faced a large increase in its data warehouse volumes for several reasons: an acquisition, organic growth in its customer base and increasing data from online channels. Data volumes were set to double over the next few years, and data compression within a ‘data lake’ was an option for cost-effective archival. However, the bank wanted fast access to active warehouse data while retaining secure access to the compressed data in parallel.
Recommended configuration:
• To demonstrate the platform’s capabilities, data was loaded to a Hive staging table. The vendor solution was deployed to join the Hive staging table with existing data warehouse tables using sample queries.
• A sample interface was created to generate account statements by consuming data from both data sources.
• The solution included advanced data capabilities such as compression, encryption through customized functions and in-memory parallel processing.
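
The join-and-statement step described above can be illustrated with a minimal sketch. It is not the vendor solution: a single in-memory SQLite database stands in for both the Hive staging table and the data warehouse, and every table name, column name and data row below is hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create hypothetical stand-ins for the two data sources.

    warehouse_accounts   : plays the role of an active data warehouse table
    staging_transactions : plays the role of data loaded to a Hive staging table
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE warehouse_accounts (
            account_id  TEXT PRIMARY KEY,
            holder_name TEXT NOT NULL
        );
        CREATE TABLE staging_transactions (
            account_id TEXT NOT NULL,
            txn_date   TEXT NOT NULL,
            amount     REAL NOT NULL
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO warehouse_accounts VALUES (?, ?)",
        [("A1", "Asha"), ("A2", "Ravi")],
    )
    conn.executemany(
        "INSERT INTO staging_transactions VALUES (?, ?, ?)",
        [
            ("A1", "2017-01-05", 2500.0),
            ("A1", "2017-01-09", -400.0),
            ("A2", "2017-01-07", 1200.0),
        ],
    )
    conn.commit()
    return conn


def account_statement(conn: sqlite3.Connection, account_id: str) -> list:
    """Join the warehouse account record with its staged transactions."""
    return conn.execute(
        """
        SELECT a.holder_name, t.txn_date, t.amount
        FROM warehouse_accounts AS a
        JOIN staging_transactions AS t ON t.account_id = a.account_id
        WHERE a.account_id = ?
        ORDER BY t.txn_date
        """,
        (account_id,),
    ).fetchall()


if __name__ == "__main__":
    for row in account_statement(build_demo_db(), "A1"):
        print(row)
```

In a real deployment the two tables would live in separate systems and a query layer would federate them; the sketch only shows the shape of a statement query that draws on both sources.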
• Better dashboards
• In-the-moment offers
• Faster KYC
• Cost savings
• Data security
• Advanced analytics

• Data unification
• Real time
• High-speed processing
• Storage optimization
• Minimum data replication
• Machine learning and AI
Client impact:
• Migration to a data lake plus data compression enabled an 80% reduction in data storage.
• Query response times ranged from 2 to 30 seconds, depending on query complexity.
• The solution delivered ease of deployment, fast performance and scalability.
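
To put the 80% figure in perspective, the arithmetic can be sketched as below. The 100 TB starting volume is a hypothetical number chosen for illustration, and applying the same reduction to all future data is an assumption that the case study does not state.

```python
def reduction_percent(original_size: float, compressed_size: float) -> float:
    """Percentage of storage saved when original_size shrinks to compressed_size."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (1.0 - compressed_size / original_size)


def projected_footprint(current_size: float, growth_factor: float,
                        reduction_pct: float) -> float:
    """Storage needed after growth, assuming (hypothetically) that the same
    reduction applies uniformly to all data."""
    return current_size * growth_factor * (1.0 - reduction_pct / 100.0)


if __name__ == "__main__":
    # Hypothetical 100 TB estate shrinking to 20 TB is an 80% reduction (5:1).
    print(round(reduction_percent(100.0, 20.0), 1))
    # If volumes double and the 80% reduction still holds, the footprint is
    # about 40 TB, which is below the original uncompressed 100 TB.
    print(round(projected_footprint(100.0, 2.0, 80.0), 1))
```

Actual savings depend on how compressible the data is and on how much of it is moved to the data lake, so the projection is a rough planning aid rather than a result from the case study.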
Vendor solution (diagram):
• Data sources: transaction system or CRM data; REST API; online reviews or social media data; comma-separated values (CSV) files
• Paths: with data lake (data lake for structured and unstructured data); skip data lake; warehouse
• Outputs: single view; data visualization; data analytics; regulatory reporting; online applications
• Deployment: on cloud; on premise
Contact us
Varun Mittal
Global Emerging Markets FinTech lead
varun.mittal@sg.ey.com
FinTech Hub
www.ey.com/sg/FinTechHub
