
Northwestern Mutual Journey – Transform BI Space to Cloud



The volume of available data is growing by the second (to an estimated 175 zettabytes by 2025), and it is becoming increasingly granular. With that change, every organization is moving toward building a data-driven culture. At Northwestern Mutual we share a similar story of driving toward data-driven decisions to improve both efficiency and effectiveness. Analysis of our legacy systems revealed bottlenecks, excesses, and duplication. Given the ever-growing need to analyze more data, our BI team decided to move to a more modern, scalable, cost-effective data platform. As a financial company, data security is as important to us as data ingestion: in addition to fast ingestion and compute, we needed a solution that supports column-level encryption and role-based access for the different teams using our data lake.

In this talk we describe our journey to move hundreds of ELT jobs from our existing MSBI stack to Databricks and to build a data lake (using the Lakehouse architecture), and how we reduced our daily data load time from 7 hours to 2 hours while gaining the capability to ingest more data. We share the experience, challenges, lessons learned, architecture, and design patterns from this large migration effort, along with the tools and frameworks our engineers built to ease the learning curve for engineers new to Apache Spark. You will leave this session with a better understanding of what migrating to Apache Spark/Databricks would mean for you and your organization.




  1. Northwestern Mutual Journey – Transform BI Space to Cloud
     Madhu Kotian – Vice President of Engineering
     Keyuri Shah – Lead Engineer
  2. Agenda
     • Introduction
     • Before and After Migration
     • Migration Approach
     • Frameworks Built
     • Challenges
  3. For 160+ years, Northwestern Mutual has been helping families and businesses achieve financial security
     • Revenue: $31.1 billion
     • #102 on FORTUNE 500
     • 4.6+ million clients
     • 10,500+ financial professionals
     • 6,700+ employees
     • Headquartered in Milwaukee, Wisconsin
     Figures as of December 31, 2020.
  4. • Commitment to mutuality
     • Financial strength
     • Exclusive career distribution
     • Long-term product value
  5. Our Team – Insights (Book of Business)
     • Build and manage the reporting platform
     • Curate aggregated content to provide insights to our field and home-office users
     • Generate canned reports and dashboards
     • Enable our business partners to perform ad hoc analysis
  6. Our World Before Migration
     • Number of ETL jobs: 300
     • Batch cycle time: 7 hours
     • Time to market: 5–6 weeks
  7. Pain Points
     • Increased data volume
     • Increased latency in our data loads
     • Inconsistent data due to data sprawl
     • Integrated data not available to analysts
     • Challenge to manage costs
  8. Key Architecture Pillars
     • Performance
     • Easy to maintain, use, and learn (config driven)
     • Scale compute and storage as needed
     • Ability to manage complicated dependencies between jobs
     • Metadata governance
     • Data lake on Databricks Delta, with support for ACID operations
     • ELT/scheduling
     • Effective cluster management
     • Security: column-level encryption and role-based access to databases/tables/views
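The talk does not publish Northwestern Mutual's encryption code, but the column-level-protection pillar can be illustrated with a small stdlib-only sketch. It uses keyed HMAC-SHA256 pseudonymization as a stand-in for real encryption (deterministic, so joins and group-bys on the protected column still work); a production system would instead use reversible AES encryption with keys held in a secret manager. All names here (`SECRET_KEY`, `protect_row`, the sample columns) are illustrative assumptions, not the actual framework.

```python
import hmac
import hashlib

# Hypothetical key for illustration only. In production this would be
# fetched from a managed secret store, never hard-coded.
SECRET_KEY = b"replace-with-managed-key"

def encrypt_column(value: str) -> str:
    """Deterministically pseudonymize one column value.

    Keyed HMAC-SHA256 hides the raw value while keeping equality
    intact, so downstream joins/aggregations still line up.
    """
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def protect_row(row: dict, sensitive_cols: set) -> dict:
    """Apply column-level protection only to the configured columns."""
    return {
        col: encrypt_column(val) if col in sensitive_cols else val
        for col, val in row.items()
    }

# Example: only the sensitive column is transformed.
row = {"client_id": "C-1001", "ssn": "123-45-6789", "state": "WI"}
protected = protect_row(row, sensitive_cols={"ssn"})
```

Role-based access would then be layered on top by granting teams views that either include or exclude the decrypted columns.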
  9. Our World After Migration
     • Config files: 500
     • Batch cycle time: 2 hours
     • Time to market: 1–2 weeks
  10. Migration Approach
      • Team building
        - Start with a small core group
        - Learn – train – transform – repeat
        - Ease the learning curve by building abstraction layers
      • Code migration
        - Not lift and shift – redoing all code (no accelerators)
        - Build small shippable pieces
        - Keep it simple
        - No change to the end-user experience
      • Production support
        - Run both environments in parallel
        - Continuously push to the new environment for faster feedback
  11. Challenges
      • Bringing business, product, and security on board
        - Walk through current pain points
        - Explain long-term benefits
        - Think security first
      • Balance business priority vs. innovation
      • Show and prove progress
        - Take an incremental approach: learn – build – test – repeat
        - Put small chunks into production
      • Open communication with all interested parties
  12. Frameworks Built
      • ELT Framework
        - Config driven (JSON file)
        - CI/CD with approvals
        - Column-level encryption
        - Exec commands
        - Talk scheduled 5/25, 3:50–4:20 PM: Modern Config Driven ELT Framework for Building a Data Lake
      • Metadata Framework
        - Config driven (YML file)
        - CI/CD with approvals
        - Schema management
        - Access management
        - Talk scheduled 5/26, 5:00–5:30 PM: Automated Metadata Management in Data Lake – A CI/CD Driven Approach
      • Airflow Framework
        - Config driven (YML file)
        - CI/CD with approvals
        - Automatic DAG management
        - Dependency management
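The common thread across the three frameworks is a config file (JSON or YML) that declares jobs and their dependencies, from which the framework derives an execution order or an Airflow DAG. A minimal sketch of that idea, assuming an illustrative JSON schema (the `jobs`/`depends_on` field names are not the actual framework's):

```python
import json
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical config in the spirit of the config-driven frameworks
# described above; job names and fields are made up for illustration.
CONFIG = json.loads("""
{
  "jobs": [
    {"name": "load_policies", "depends_on": []},
    {"name": "load_clients",  "depends_on": []},
    {"name": "curate_book",   "depends_on": ["load_policies", "load_clients"]},
    {"name": "publish_marts", "depends_on": ["curate_book"]}
  ]
}
""")

def run_order(config: dict) -> list:
    """Resolve declared job dependencies into a valid execution order.

    A topological sort guarantees every job runs only after all of its
    prerequisites; a real framework would hand this graph to Airflow
    instead of returning a flat list.
    """
    ts = TopologicalSorter(
        {job["name"]: set(job["depends_on"]) for job in config["jobs"]}
    )
    return list(ts.static_order())

order = run_order(CONFIG)
```

Because engineers only edit the config file, non-Spark engineers can add or reorder jobs without touching scheduler code, which is the learning-curve benefit the slide describes.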
  13. Feedback
      Your feedback is important to us. Don’t forget to rate and review the sessions.
      Madhu Kotian: https://www.linkedin.com/in/imkotian
      Keyuri Shah: https://www.linkedin.com/in/keyuri-shah
