Data Con LA 2022 - Moving Data at Scale to AWS


George Mansoor, Chief Information Systems Officer at California State University
Overview of the CSU data architecture for moving on-prem ERP data to the AWS Cloud at scale, using Delphix for data replication/virtualization and AWS Database Migration Service (DMS) for data extracts


  1. Getting Data Cloud Ready at Scale: Foundational Element in Supporting University and Campus Goals
     George Mansoor, Chief Information Systems Officer
  2. Agenda
     • A Focus On Why We Are Here
     • Modernizing Strategically
     • The Path Forward
  3. A Focus on Students
     Graduation Initiative 2025: Launched in 2015, it is an ambitious plan to increase graduation rates, eliminate equity gaps in degree completion, and meet California’s workforce needs.
     Being strategic: Each campus and the Chancellor’s Office use this vision as the litmus test when spending scarce resources on strategic initiatives.
  4. Background on our ERP system
     • PeopleSoft ERP supporting our Finance, Student and HR data since 2001
     • 23 Campus Solutions
     • 23 Human Resources, migrating to Single HR starting 2022
     • Single Finance since 2011
     • 23 campuses and eight off-campus centers
     • Supports the largest public higher education institution - 485,500 students with 56,000 faculty and staff
  5. Challenges of Accessing ERP Data
     • Students and faculty need improved engagement and expect a stellar and seamless experience
     • Data from the ERP provides critical information about our students (financial aid, enrollment, graduation requirements, hiring, etc.), and campuses need that data
     • The current PeopleSoft ERP is highly customized, making integrations costly and time-consuming
     • The ERP is aging and we have grown!
       • 2001: 370,000 students with 40,000 faculty and staff
       • Today: 485,000 students with 56,000 faculty and staff
     • To accommodate ERP gaps we have an assortment of systems. This is especially true of the Student Information Systems, where student engagement systems have grown.
     • Limited resources
  6. Overview
     • Objective: CSU was looking to modernize how we process data and wanted to get our data “cloud ready” and “report ready” to take advantage of new capabilities that the cloud offered.
     • Problem: CSU data resides in on-prem legacy ERP (PeopleSoft/Oracle) systems
  7. Our Data Lakes: Cloud Ready Repository (CMS Data Lake)
     • Repository of raw copies of source system data
     • Today that is largely CMS data sources, with plans to include other sources over time (CSULearn/SumTotal, Person Data Management (PDM))
     • Data is stored in Apache Parquet format
     • Intended to facilitate the use of cloud-based data tools using CMS data
     • Built to support the CSU Data Lake
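Slide 7 notes that the repository stores data in Apache Parquet, which is a columnar format. The following is a minimal sketch of what "columnar" means, in plain Python: row-oriented records are pivoted into one list per column, so an aggregate over a single column touches only that column. It does not reproduce Parquet's actual encoding or file layout, and the field names and values are hypothetical, not CMS columns.

```python
from typing import Any

# Hypothetical sample rows; field names and values are illustrative only.
rows = [
    {"student_id": 1, "term": "2022FA", "units": 12},
    {"student_id": 2, "term": "2022FA", "units": 15},
    {"student_id": 3, "term": "2023SP", "units": 9},
]


def to_columns(records: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into one list per column.

    Assumes every record has the same keys.
    """
    columns: dict[str, list[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column reads only that column's values.
total_units = sum(columns["units"])
```

Analytical queries usually read a few columns across many rows, which is the access pattern a columnar layout favors over the row-oriented layout of an OLTP database.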
  8. Our Data Lakes: Report Ready Repository (CSU Data Lake)
     • Fairly comprehensive data and reporting solution
     • Uses source data and transforms data into rich data collections
     • Data collections targeted towards reporting
  9. CMS Data Lake - Data as a Service (diagram)
     Diagram labels: On-Prem Prod; AWS; CMS Data Lake; CSU Data Lake; VPC Peering
     Data collections shown: Students, Students by Term, Students by Class, Students by Degree, Classes by Section, Applications by Applicant
     © 2017 Unisys Corporation. All rights reserved.
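Slides 8 and 9 describe transforming raw source data into report-ready collections such as "Students by Term". As a rough sketch of that kind of transformation (not the CSU implementation), the code below groups raw enrollment rows by term. The row shape, student IDs, and term codes are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw rows of (student_id, term); not actual CMS data.
raw_enrollments = [
    ("S001", "2022FA"),
    ("S002", "2022FA"),
    ("S001", "2023SP"),
    ("S002", "2022FA"),  # duplicate row, collapsed by the set below
]


def students_by_term(enrollments):
    """Group student IDs by term, de-duplicated and sorted for stable output."""
    grouped = defaultdict(set)
    for student_id, term in enrollments:
        grouped[term].add(student_id)
    return {term: sorted(ids) for term, ids in sorted(grouped.items())}


collection = students_by_term(raw_enrollments)
```

The raw copies stay untouched in the cloud-ready repository, while derived collections like this one are shaped for reporting.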
  10. First Big Problem
     • Our biggest source of data is our ERP. It is on-prem and will stay there.
     • 47 production instances
     • Approx. 1 TB aggregate of “interesting” data
     • Oracle RDBMS
     • OLTP-optimized
     How do we get this to the cloud?
  11. Delphix Data Virtualization Platform
     Data virtualization decouples the database layer that sits between the storage and application layers in the application stack. Just as a hypervisor sits between the server and the OS to create a virtual server, database virtualization software sits between the database and the OS to abstract/virtualize the data store resources. Because database resources are virtualized, they require a much smaller storage footprint than the source database. Instead of making and moving new blocks of data, virtual data copies use pointers to data blocks, providing high-performance access to data already in place.
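The idea on slide 11, virtual copies that point at existing blocks instead of duplicating them, can be illustrated with a small copy-on-write sketch. This is a toy model under stated assumptions, not Delphix's implementation: `BlockStore` and `VirtualCopy` are hypothetical names, and real systems track blocks at the storage layer.

```python
class BlockStore:
    """Shared pool of immutable data blocks, addressed by integer id."""

    def __init__(self) -> None:
        self._blocks: dict[int, bytes] = {}
        self._next_id = 0

    def put(self, data: bytes) -> int:
        """Store a new block and return its id."""
        block_id = self._next_id
        self._blocks[block_id] = data
        self._next_id += 1
        return block_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]

    def __len__(self) -> int:
        return len(self._blocks)


class VirtualCopy:
    """A dataset represented as a list of pointers into a BlockStore."""

    def __init__(self, store: BlockStore, pointers: list[int]) -> None:
        self._store = store
        self._pointers = list(pointers)  # each copy owns its pointer list

    def clone(self) -> "VirtualCopy":
        """Create a new copy by duplicating pointers only, not block data."""
        return VirtualCopy(self._store, self._pointers)

    def read(self, index: int) -> bytes:
        return self._store.get(self._pointers[index])

    def write(self, index: int, data: bytes) -> None:
        """Copy-on-write: store a new block and repoint only this copy."""
        self._pointers[index] = self._store.put(data)


store = BlockStore()
source = VirtualCopy(store, [store.put(b"block-%d" % i) for i in range(4)])

clone = source.clone()        # no new blocks are stored
clone.write(2, b"changed")    # exactly one new block is stored
```

After the write, the store holds five blocks rather than eight: the clone shares three blocks with the source and owns one changed block, which is why a virtual copy needs far less storage than a full physical copy.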
  12. (No text on this slide.)
  13. AWS Database Migration Service (DMS)
     AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. AWS DMS can migrate your data to and from the most widely used commercial and open-source databases.
  14. CMS Data Lake - Data as a Service (diagram)
     Diagram labels: On-Premise Production (CS, HR, FIN); AWS Cloud; Oracle DB on EC2; DMS Instances; DMS Tasks; S3 Copy; Campus-Specific S3 Buckets
     • Continuous replication over secure VPN
     • Data gets copied into encrypted S3 buckets
     • Historical data is stored in date-wise S3 bucket folders
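Slide 14 mentions campus-specific, encrypted S3 buckets with historical data in date-wise folders. The sketch below shows one way such a layout could be expressed. The bucket naming scheme, table name, and file name are hypothetical. The settings dictionary uses field names from the `S3Settings` structure of an AWS DMS S3 target endpoint (for example, as passed to boto3's `create_endpoint`), but whether CSU configured DMS this way, including Parquet output and SSE-S3 encryption, is an assumption; the slides do not say.

```python
from datetime import date


def campus_bucket(campus: str) -> str:
    """Hypothetical naming scheme: one bucket per campus."""
    return f"example-cms-datalake-{campus.lower()}"


def datewise_key(table: str, day: date, filename: str) -> str:
    """Build an object key with year/month/day folders under the table name."""
    return f"{table}/{day:%Y/%m/%d}/{filename}"


def s3_target_settings(bucket: str, role_arn: str) -> dict:
    """Assumed S3Settings for a DMS S3 target endpoint (not from the slides)."""
    return {
        "BucketName": bucket,
        "ServiceAccessRoleArn": role_arn,
        "DataFormat": "parquet",            # assumption, based on slide 7
        "EncryptionMode": "sse-s3",         # assumption: which mode is unstated
        "DatePartitionEnabled": True,       # date-wise folders
        "DatePartitionSequence": "YYYYMMDD",
        "DatePartitionDelimiter": "SLASH",
    }


bucket = campus_bucket("Fullerton")
key = datewise_key("EXAMPLE_TABLE", date(2022, 8, 13), "part-0001.parquet")
settings = s3_target_settings(bucket, "arn:aws:iam::123456789012:role/example-dms-role")
```

Keeping each campus in its own bucket lets access be granted per campus, while date-based prefixes make it straightforward to locate or expire a given day's extract.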
  15. (No text on this slide.)
  16. Questions
