Evolving Role of Enterprise Data Warehouse Department in Big Data World
1. The Evolving Role of Data (Warehousing) Department
Almere Data Capital/CIONET Meet, 18 Dec 2013
Anurag Shrivastava
2. About Me
• Anurag Shrivastava
• Manager SODC (Customer & Business Intelligence)
• At ING Retail Bank, Amsterdam
• Deliver solutions for Marketing, Mortgages, KIM, Mobile, CRM etc.
• Deliver solutions for inbound and outbound marketing
• SODC (C&BI)
  • Was set up in 2000 during the Postbank era to support marketing and sales
  • Information Analysts, ETL Developers, BI Specialists and Team Managers (about 30 internal + 20 external)
  • Oracle/Business Objects based DWH platform; IBM Unica and SAS for marketing
  • Development and Operations fall under separate line management
3. Transformation from CMM to Agile
• Pre 2012
  • Many roles and tollgates
  • Release cycles of 3-9 months
  • Focus upon processes in CMMI
• Post 2012
  • Implementation of Scrum
  • Reduction in number of roles in IT organization
  • Implementation of DevOps teams
  • Introduction of new roles that require new skills and mindset
  • Engineering and Craftsmanship Culture
✓ Customer Centricity
✓ Operational Excellence
✓ New Revenue Streams
4. The Way Forward
Challenges
• Batch processes affect both stability and response times
• Solid skills in the present stack
• Business unprepared for agile
• Infra unprepared for agile
• Pressure from business and senior management to deliver
The Way Forward
• Change mindset and behaviour
• Improve skills and competence
• Speed up renewal of data platform
• Deliver faster and often
• Automate, automate, automate
5. Big Bang?
Timeline (Jan to Dec), milestones as listed on the slide:
• New roles are announced and selection started
• First Hadoop cluster
• Migration to Git
• Introduction of Dev+Ops teams
• First predictive service built using R and Java
• JIRA & Confluence
• Test automated deployment with Nolio
• IBM Netezza in production
• SAS is live
• IBM DataStage pilot
• IBM Cognos in production
• IBM Unica Interact is live
Capabilities contribute to our goals, but:
• The link between the capabilities and goals is not direct and obvious
• The change is too much and too fast
6. Old versus New World
Traditional DWH Stack
• A single or two vendor stack
• Built for durability
• Limited choices
• Proven Technology
• Knowledge retention
• Is this really fun?
7. Challenges of Big Data World
• Too many choices: Hadoop, Hive, Sqoop, Flume, Oozie, Pig etc.
• No clear leader in the vendor space
• Open source and Java focused community
• New technology: first-mover disadvantage
• Steep learning curve, no experienced people available
• Attention and hype from CXO (read: pressure to deliver)
• Do we really have a big data problem?
• Many alternatives to Hadoop are challenging Hadoop
8. Traditional BI to Big Data
• Traditional BI and DWH will continue to be mainstream for some time, but Big Data technologies may reach an inflection point 3 years from now
• Customer centricity will be a key driver, coupled with lower costs
• Adoption among traditional DWH developers will resemble the technology adoption curve, but acceleration is possible by injecting new team members who are early adopters
Source: http://setandbma.wordpress.com/2012/05/28/technology-adoption-shift/
9. Hadoop is Getting SQL Friendly
• SQL or SQL-like languages (HiveQL or CQL) are making fast inroads in the Hadoop world
• SQL is getting faster on Hadoop by bypassing the overhead of Map/Reduce
• Vendors are making Hadoop simpler to learn by offering free sandboxes
• Admin tools are still far behind and require you to learn a plethora of tools
• Knowledge of Java, scripting and deployment tools becomes essential for admin people
• Learning Hadoop is getting simpler for application programmers, but deployment will still need different skillsets
• SQL skills will be useful, but the pace of innovation will force developers to acquire new skills other than SQL
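To illustrate why SQL-like languages lower the entry barrier, here is a minimal sketch that computes the same per-customer total twice: once as hand-written map, shuffle and reduce steps, and once as a single GROUP BY query. It uses Python's built-in sqlite3 module purely as a stand-in for a SQL-on-Hadoop engine; the table name, column names and sample rows are hypothetical, and nothing here is specific to Hive.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer_id, amount) pairs.
TRANSACTIONS = [("c1", 10.0), ("c2", 5.0), ("c1", 2.5), ("c3", 7.0), ("c2", 1.0)]


def totals_map_reduce(rows):
    """Per-customer totals via explicit map, shuffle and reduce steps."""
    # Map: emit (key, value) pairs.
    mapped = [(customer, amount) for customer, amount in rows]
    # Shuffle: group the values by key.
    grouped = defaultdict(list)
    for customer, amount in mapped:
        grouped[customer].append(amount)
    # Reduce: sum the values of each key.
    return {customer: sum(amounts) for customer, amounts in grouped.items()}


def totals_sql(rows):
    """Same aggregation as one declarative query (sqlite3 as a stand-in engine)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE transactions (customer_id TEXT, amount REAL)")
        conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer_id, SUM(amount) FROM transactions GROUP BY customer_id"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_map_reduce(TRANSACTIONS))
    print(totals_sql(TRANSACTIONS))
```

Both functions return the same totals; the SQL version leaves grouping and summing to the engine, which is the part a traditional DWH developer already knows.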
10. Trends
• Hadoop will be made simpler to run and develop upon
• SQL is the way forward to ensure large scale adoption
• Support for enterprise admin tools
• Integration with other enterprise tools
• You do not have to be an open source geek to work on Hadoop
• Skills in data collection, cleansing and processing will be reusable
11. Waterfall to Scrum Transition
• Scrum practices such as Sprint Planning, Short Iterations, Sprint Review and Planning Board get implemented quickly
• Implementing engineering practices in the traditional data warehousing world is hard. For example: TDD, Continuous Integration, Continuous Deployment, Automated Build
• Agile coaching and mentorship are useful for managers, whose mindset can otherwise become a major obstacle to Agile adoption
• Empowering the team to decide on designs and tools takes time before it starts behaving like an empowered team
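As an illustration of what TDD can look like for warehouse code, the sketch below states the expectations for a small cleansing step as unittest cases next to the function they drive. The function name `clean_customer_row`, the field names and the cleansing rules are hypothetical and chosen only for this example; a real ETL step would have its own rules.

```python
import unittest


def clean_customer_row(row):
    """Hypothetical cleansing step for one source row (a dict).

    Trims the id and name, title-cases the name, upper-cases the
    country code, and rejects rows that have no customer_id.
    """
    if not row.get("customer_id"):
        raise ValueError("customer_id is required")
    return {
        "customer_id": str(row["customer_id"]).strip(),
        "name": (row.get("name") or "").strip().title(),
        # An empty or missing country becomes None.
        "country": (row.get("country") or "").strip().upper() or None,
    }


class CleanCustomerRowTest(unittest.TestCase):
    def test_trims_and_normalises(self):
        raw = {"customer_id": " 42 ", "name": "  jan jansen ", "country": "nl"}
        self.assertEqual(
            clean_customer_row(raw),
            {"customer_id": "42", "name": "Jan Jansen", "country": "NL"},
        )

    def test_missing_country_becomes_none(self):
        cleaned = clean_customer_row({"customer_id": "7", "name": "Ann"})
        self.assertIsNone(cleaned["country"])

    def test_missing_id_is_rejected(self):
        with self.assertRaises(ValueError):
            clean_customer_row({"name": "No Id"})
```

Saved in a file, the cases can be run with `python -m unittest`. In a test-first workflow the three cases would be written before the function body, and the same suite can then run on every commit as part of Continuous Integration.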
12. What have we tried that worked?
• Start by visiting industry events for knowledge and inspiration
• Let people experiment in a small group
• Build a community of practice and attend meet-ups
• Start your first assignment with a combination of external and internal people
• Train people through vendors' certification programs or use platforms such as Coursera
• Getting people out of their comfort zone is tough but worth a try
13. “Big Data is nothing but old DWH concepts in a new wrapper.”