Getting Data Cloud Ready at Scale
Foundational Element in Supporting University and Campus Goals
George Mansoor, Chief Information Systems Officer
Agenda
• A Focus On Why We Are Here
• Modernizing Strategically
• The Path Forward
A Focus on Students
Graduation Initiative 2025: Launched in 2015, it is an
ambitious plan to increase graduation rates, eliminate equity gaps in degree
completion and meet California’s workforce needs.
Being strategic: Each campus and the Chancellor’s Office
uses this vision as the litmus test when spending scarce
resources on strategic initiatives.
Background on our ERP system
• PeopleSoft ERP supporting our Finance,
Student and HR data since 2001
• 23 Campus Solutions
• 23 Human Resources, migrating to Single
HR starting 2022
• Single Finance since 2011
• 23 campuses and eight off-campus
centers
• Supporting the largest public higher
education institution: 485,500 students
with 56,000 faculty and staff
Challenges of Accessing ERP Data
• Students and faculty need improved engagement and expect a stellar
and seamless experience
• Data from the ERP provides critical information about our students
(financial aid, enrollment, graduation requirements, hiring, etc.)
• Campuses need data from our ERP
• Current PeopleSoft ERP is highly customized, making integrations
costly and time-consuming
• ERP is aging and we have grown!
• 2001: 370,000 students with 40,000 faculty and staff
• Today: 485,000 students with 56,000 faculty and staff
• To fill ERP gaps we have an assortment of systems. This is
especially true of the Student Information Systems, where student
engagement systems have grown.
• Limited resources
Overview
• Objective: CSU was looking to modernize how we process data and wanted
to get our data “cloud ready” and “report ready” to take advantage of new
capabilities that the cloud offered.
• Problem: CSU data resides in on-prem legacy ERP (PeopleSoft/Oracle)
systems
Cloud Ready Repository (CMS Data Lake)
• Repository of raw copies of source system data
• Today that is largely CMS data sources with plans to include other sources
over time (CSULearn/SumTotal, Person Data Management (PDM))
• Data is stored in Apache Parquet format
• Intended to facilitate the use of cloud-based data tools using CMS data
• Built to support the CSU Data Lake
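As a rough illustration of why a columnar format such as Apache Parquet suits cloud-based data tools, the sketch below pivots row-oriented records into per-column lists using only the Python standard library. It is not Parquet itself and not the CMS Data Lake's actual pipeline; the field names (emplid, term, units) are hypothetical, and every row is assumed to carry the same fields.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into a column-oriented layout.

    Assumes every row has the same keys, so column positions stay aligned.
    """
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical raw rows, shaped the way an OLTP table would return them.
rows = [
    {"emplid": "001", "term": "2224", "units": 12},
    {"emplid": "002", "term": "2224", "units": 15},
    {"emplid": "003", "term": "2227", "units": 9},
]

columns = rows_to_columns(rows)

# An analytic query that needs one field reads one column, not every row.
total_units = sum(columns["units"])
```

A reporting query that touches a single field only has to scan that one list, which is the property columnar storage offers to analytic workloads.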
Our Data Lakes
Report Ready Repository (CSU Data Lake)
• Fairly comprehensive data and reporting solution
• Uses source data and transforms data into rich data collections
• Data collections targeted towards reporting
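To make "transforms data into rich data collections" concrete, here is a minimal sketch of how raw source rows might be reshaped into a reporting collection such as Students by Term. The row layout and field names (student_id, term, class) are assumptions made for illustration, not the CSU Data Lake's actual schema.

```python
from collections import defaultdict


def students_by_term(enrollments):
    """Group raw enrollment rows into a term -> sorted student IDs collection."""
    by_term = defaultdict(set)
    for row in enrollments:
        # A set removes duplicates when a student has several classes in a term.
        by_term[row["term"]].add(row["student_id"])
    return {term: sorted(ids) for term, ids in by_term.items()}


# Hypothetical raw enrollment rows: one row per student per class.
raw = [
    {"student_id": "S1", "term": "Fall 2021", "class": "MATH 101"},
    {"student_id": "S1", "term": "Fall 2021", "class": "ENGL 100"},
    {"student_id": "S2", "term": "Fall 2021", "class": "MATH 101"},
    {"student_id": "S1", "term": "Spring 2022", "class": "MATH 102"},
]

collection = students_by_term(raw)
```

The raw rows stay untouched in the cloud-ready repository; the collection is a derived, report-ready view built from them.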
© 2017 Unisys Corporation. All rights reserved. 9
CMS Data Lake - Data as a Service
[Diagram labels: On-prem PROD; AWS: CMS Data Lake, CSU Data Lake, VPC peering. Data collections: Students, Students by Term, Students by Class, Students by Degree, Classes by Section, Applications by Applicant.]
First Big Problem
• Our biggest source of data is our ERP. The instances are on-prem and will stay there.
• 47 production instances
• Approx. 1 TB aggregate of “interesting” data
• Oracle RDBMS
• OLTP-optimized
How do we get this to the cloud?
Delphix Data Virtualization Platform
Data virtualization decouples the database layer that sits between the storage and
application layers in the application stack. Just like a hypervisor sits between the server
and the OS to create a virtual server, database virtualization software sits between the
database and the OS to abstract/virtualize the data store resources. Because database
resources are virtualized, they require a much smaller storage footprint than the source
database. Instead of making and moving new blocks of data, virtual data (virtual data
copies) use pointers to data blocks, providing high-performance access to data already in
place.
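The pointer idea can be sketched in a few lines of Python. This is a toy copy-on-write model written for illustration only; it is not Delphix's implementation, and the class names BlockStore and VirtualCopy are hypothetical.

```python
class BlockStore:
    """Shared pool of immutable data blocks, addressed by integer ID."""

    def __init__(self):
        self._blocks = {}
        self._next_id = 0

    def put(self, data: bytes) -> int:
        block_id = self._next_id
        self._blocks[block_id] = data
        self._next_id += 1
        return block_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]

    def __len__(self) -> int:
        return len(self._blocks)


class VirtualCopy:
    """A virtual data copy: a list of pointers into the shared store."""

    def __init__(self, store: BlockStore, pointers):
        self.store = store
        self.pointers = list(pointers)

    def clone(self) -> "VirtualCopy":
        # Cloning duplicates only the pointers, never the blocks themselves.
        return VirtualCopy(self.store, self.pointers)

    def read(self, index: int) -> bytes:
        return self.store.get(self.pointers[index])

    def write(self, index: int, data: bytes) -> None:
        # Copy-on-write: store a new block and repoint this copy only.
        self.pointers[index] = self.store.put(data)


store = BlockStore()
source = VirtualCopy(store, [store.put(b"students"), store.put(b"classes")])
clone = source.clone()
clone.write(1, b"classes-v2")
```

After the write, the store holds three blocks rather than four: the clone shares the unchanged block with the source and owns only the block it modified, which is where the smaller storage footprint comes from.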
AWS Database Migration Service (DMS)
AWS Database Migration Service (AWS DMS) helps you migrate databases
to AWS quickly and securely. The source database remains fully operational
during the migration, minimizing downtime to applications that rely on the
database. The AWS Database Migration Service can migrate your data to
and from the most widely used commercial and open-source databases.
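AWS DMS selects the tables a task migrates through a JSON table-mapping document made of selection rules. The sketch below builds such a document with the Python standard library. The schema and table names passed in are placeholders chosen for illustration; the talk does not say which tables CSU replicates.

```python
import json


def selection_rules(schema: str, tables: list[str]) -> str:
    """Build a DMS table-mapping document that includes the given tables."""
    rules = []
    for i, table in enumerate(tables, start=1):
        rules.append(
            {
                "rule-type": "selection",
                "rule-id": str(i),  # must be unique within the document
                "rule-name": str(i),  # must be unique within the document
                "object-locator": {
                    "schema-name": schema,
                    "table-name": table,
                },
                "rule-action": "include",
            }
        )
    return json.dumps({"rules": rules}, indent=2)


# Placeholder schema and table names, assumed for this example only.
mapping = selection_rules("SYSADM", ["PS_STDNT_ENRL", "PS_CLASS_TBL"])
```

The resulting string is the kind of value supplied as the table mappings of a replication task; endpoints, replication instances and task settings are configured separately.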
CMS Data Lake - Data as a Service
[Diagram labels: On-premise production (CS, HR, FIN); continuous replication over secure VPN; AWS cloud: Oracle DB on EC2, DMS instances, DMS tasks, S3 copy, campus-specific S3 buckets.]
• Data gets copied into encrypted S3 buckets
• Historical data is stored in date-wise S3 bucket folders
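One way to picture campus-specific buckets with date-wise folders is a small key-building helper. The bucket naming scheme, the system/table/date folder order and the file name below are all assumptions made for illustration; the slides do not show the actual layout.

```python
from datetime import date


def campus_bucket(campus: str) -> str:
    """Return a hypothetical per-campus bucket name."""
    return f"cms-datalake-{campus.lower()}"


def dated_key(system: str, table: str, day: date, filename: str) -> str:
    """Build an object key whose folders are partitioned by load date."""
    return f"{system}/{table}/{day:%Y/%m/%d}/{filename}"


# Hypothetical campus, source system, table and file names.
bucket = campus_bucket("Fullerton")
key = dated_key("CS", "PS_STDNT_ENRL", date(2022, 8, 13), "LOAD00000001.parquet")
```

Keeping the date in the key path lets historical snapshots sit side by side, so a reader can list one day's folder without scanning the whole bucket.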
Questions

Data Con LA 2022 - Moving Data at Scale to AWS
