1. Architecting Data for Integration and Longevity
Jonathan Hamilton Solórzano, UCLA ORIS
2. Today’s Objectives
Provide background context for data integration projects within UCLA Research Administration, and in administration more generally
Illustrate business problems driving the need for these integrations
Provide a high level overview of lessons learned and considerations when implementing data services
Specifically, discuss the architecture and utility of a federated data services approach
3. Information Flow: ORA
[Diagram: information flows from Campus (raw info.) through ORA (research rules) into UCLA central & shared data.]
ORA applications relate to research
Learn what PIs are doing to assist & govern
Campus-facing systems gather data in a “friendly” way
Internal systems standardize and determine actions
Cross-reference other campus data sources
Cross-reference internally against other systems
4. The Need
Why can’t the systems just talk to each other?
Why does it take so long when they do?
Why are we locked into this old vendor system?
[Diagram: Systems 1, 2, and 3 each connected separately to the same source data.]
5. Establishing the Data Architecture
Identify Systems of Record
Learn “business rules” for the systems of record
Streamline and update business processes
Design data architecture
Degree of Normalization/Modeling Approach
Requirements for Data Versioning
Required Remote Source Data (internal/external)
Assess and Plan for Data Quality
Downstream Data Flow
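The data-versioning requirement above can be sketched as an effective-dated record, where each change closes the prior version and opens a new one. This is a minimal illustration; class and field names are hypothetical, not from the ORA implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Version:
    value: dict
    valid_from: date
    valid_to: Optional[date] = None  # open-ended until superseded

class VersionedRecord:
    """Type-2-style versioning: full history retained, queryable as-of a date."""
    def __init__(self) -> None:
        self.versions: list[Version] = []

    def update(self, value: dict, as_of: date) -> None:
        if self.versions:
            self.versions[-1].valid_to = as_of  # close the current version
        self.versions.append(Version(value, as_of))

    def as_of(self, when: date) -> Optional[dict]:
        # Return the version in effect on the given date, if any.
        for v in self.versions:
            if v.valid_from <= when and (v.valid_to is None or when < v.valid_to):
                return v.value
        return None
```

For example, after `update({"status": "draft"}, date(2024, 1, 1))` and `update({"status": "final"}, date(2024, 3, 1))`, querying `as_of(date(2024, 2, 1))` still returns the draft version.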
6. Business Rules
Understand Business Logic vs. Application Logic. Customized off-the-shelf vendor applications typically bring their own “business” logic from their prior customers or target use case. Work with business users to separate application behavior that they “work around” versus what they actually “work with.”
Document Business Logic. In this way the application documentation is also the business documentation, and vice versa.
Implement Business Logic. Where possible, pull specific business logic upstream from the vendor implementation by leveraging vendor APIs. Operate the business logic against your data domain.
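Operating business logic against your own data domain might look like the sketch below: the rule lives in a domain layer rather than inside the vendor application. The `Proposal` type, the five-day threshold, and the rule itself are illustrative assumptions, not actual ORA business rules.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical domain object; field names are illustrative.
@dataclass
class Proposal:
    pi_id: str
    sponsor: str
    deadline: date
    total_budget: float

def requires_expedited_review(p: Proposal, today: date) -> bool:
    """Business rule kept in our domain layer, not in the vendor app:
    proposals due within 5 days need expedited routing."""
    return (p.deadline - today).days <= 5

p = Proposal("pi-001", "NSF", date(2024, 6, 10), 250_000.0)
print(requires_expedited_review(p, date(2024, 6, 7)))  # True
```

Because the rule operates on the domain type rather than vendor records, it survives a vendor swap unchanged.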
7. Architected Data Delivery
[Diagram: Current vs. Target Architecture. Both stacks include transactional databases, application APIs, application UIs, service wrappers, business logic services, an operational data store, and the external data warehouse; the target architecture adds an ORA Data Warehouse and a dedicated data services layer.]
Widely varying administrative processes demand unique transactional systems
Organic application deployment results in a hodgepodge approach to data delivery
Implementing a consistent N-tier approach will streamline the architecture and facilitate future development
8. User Experience Under the Hood
Campus users interact with their transactional system and the cross-cutting data access system
Data access interfaces consume a “smart” management service
Management service implements interfaces against API or periodic external snapshot depending on need
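A "smart" management service of this kind can be sketched as a strategy pattern: one accessor interface, with a live-API backend and a snapshot backend chosen per need. All class and function names below are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Optional

class DataAccessor(ABC):
    """Single interface the management service exposes; callers never
    know whether data came from a live API or a periodic snapshot."""
    @abstractmethod
    def get_record(self, record_id: str) -> dict: ...

class LiveApiAccessor(DataAccessor):
    def __init__(self, client):
        self.client = client  # wraps the vendor API client
    def get_record(self, record_id: str) -> dict:
        return self.client.fetch(record_id)

class SnapshotAccessor(DataAccessor):
    def __init__(self, snapshot: dict):
        self.snapshot = snapshot  # loaded from the periodic extract
    def get_record(self, record_id: str) -> dict:
        return self.snapshot[record_id]

def make_accessor(need_live: bool, client=None,
                  snapshot: Optional[dict] = None) -> DataAccessor:
    # The management service picks the backend by freshness requirement.
    return LiveApiAccessor(client) if need_live else SnapshotAccessor(snapshot)
```

Downstream interfaces depend only on `DataAccessor`, so the freshness decision stays inside the management service.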
9. Serving Federated Data
[Diagram: source system data flows from the transactional DB, via the application API or a periodic refresh into the operational data store, to the data service, which accesses the source of choice, transforms it to the canonical model, and presents an XML/JSON data representation through a data API.]
Presentation of data in JSON or XML for downstream interfaces provides performance and reusability
Data service transforms all source data to a consistent canonical model regardless of the source data structure
Data accessors implement a single interface against source of choice returning data in the canonical type
Business logic becomes decoupled from source data schema structures, improving reusability and longevity
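The canonical-model idea can be sketched with two hypothetical source schemas mapped to one shared shape; the field names and the large-award rule are illustrative assumptions, not real vendor schemas.

```python
# One canonical award shape, regardless of source structure.
CANONICAL_FIELDS = ("award_id", "pi_name", "sponsor", "amount")

def from_vendor_a(row: dict) -> dict:
    # Hypothetical vendor A: split PI name, amount stored in cents.
    return {
        "award_id": row["AWD_NO"],
        "pi_name": f"{row['PI_FIRST']} {row['PI_LAST']}",
        "sponsor": row["SPONSOR_CD"],
        "amount": row["AMT_CENTS"] / 100,
    }

def from_vendor_b(row: dict) -> dict:
    # Hypothetical vendor B: different names, amount as a string.
    return {
        "award_id": row["id"],
        "pi_name": row["investigator"],
        "sponsor": row["funder"],
        "amount": float(row["total"]),
    }

def is_large_award(award: dict) -> bool:
    # Business logic sees only the canonical shape, never a source schema.
    return award["amount"] >= 1_000_000
```

Swapping a source system means writing one new transform; `is_large_award` and everything downstream is untouched.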
10. Investing in Longevity
Significant investment to implement canonical data service
Define the canonical data model
Implement transforms to (and potentially from) the canonical model against the transactional system
Implement transforms from the canonical model to other transactional or representational data models
Significant savings for downstream development efforts
All data consumption becomes an iterative effort: just add another representation of the canonical model
All business logic can be implemented against the canonical model
Allows source transactional systems to be swapped out much more easily, reducing vendor lock-in
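"Just add another representation" can be as small as one new serializer over the canonical record. The record and field names below are illustrative.

```python
import csv
import io
import json

# Hypothetical canonical award record.
canonical = {"award_id": "A-1", "pi_name": "Ada Lovelace",
             "sponsor": "NSF", "amount": 500000.0}

def to_json(rec: dict) -> str:
    """JSON representation for downstream APIs."""
    return json.dumps(rec, sort_keys=True)

def to_csv_row(rec: dict) -> str:
    """Flat row (fields in sorted key order) for reporting extracts."""
    buf = io.StringIO()
    csv.writer(buf).writerow([rec[k] for k in sorted(rec)])
    return buf.getvalue().strip()
```

Each new downstream consumer gets its own small function; the canonical model and the transforms that feed it are unchanged.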
11. Additional Technical Considerations
Implementation Details
Iterative, phased approach
Cross-pollination in project implementation teams
Connection Architecture
Connection hardening
Data authorization and access control
Underlying Infrastructure
Server and Storage Stack
Cloud services? (Data Security)
The Future
iPaaS and iSaaS
12. Beyond Technical Architecture
People and Organization
Defining business canonical data model as a collaboration
Agreement on downstream data usage
Communicating change to system consumers (i.e. campus users)
Processes
Information Security Compliance
Data Change SLA
Increased governance on source system changes
Data Dictionary updates
13. Key Take-Aways
1. Understand your users and how they think about data
2. Build your internal data structure against only that understanding
3. Actively determine the degree of normalization and versioning in that data structure
4. Bridge your specific implementations to your internal data structure
5. Serve your data from this internal, consistent structure
14. Further Reading
Information Security Office and Plan at https://www.itsecurity.ucla.edu/plan/
Campus Data Warehouse information at https://www.it.ucla.edu/accounts/get-access/qdb-access
Gartner Articles:
Altman, Ross et al. “Gartner G00212138: MDM, SOA, and BPM: Alphabet Soup or a Toolkit to Address Critical Data Management Issues?” Gartner Technical Professional Advice. 27 May 2011; refreshed October 2013.
Selvage, Mei. “Gartner G00250365: Data Integration Decision Point,” Gartner Technical Professional Advice. 4 April 2013.