Building The Agile Database
Presentation Transcript

  • Building the Agile Database. Larry Burns, Consultant, PACCAR Data Services
  • What does “agility” mean?
    • The ability to respond quickly and effectively to changes in business requirements and new technology.
    • The agile approach is characterized by an emphasis on personal interaction and collaboration, determining people’s needs, and working quickly to meet those needs.
  • How does AD work?
    • The goal of Agile Development (AD) is to quickly produce solutions that are “good enough” (meeting 80% of the requirements).
    • Software development occurs continuously and iteratively, with new releases taking place every 2-6 weeks.
    • Continuous testing is built into the development process.
  • Essential concepts of AD
    • Agile Development emphasizes:
      • Individuals and interactions over processes and tools (collaboration and teamwork).
      • Working software over comprehensive documentation (“just enough” process and documentation to get the job done).
      • Customer collaboration over contract negotiation
      • Flexible response to change over fixed plans.
  • Benefits of AD
    • Allows you to speed up “time to market”, and take advantage of narrower “windows of opportunity”.
    • Allows requirements to be adjusted as the product is developed
    • Eliminates the waste of developing features that aren’t needed
  • Benefits of AD
    • Reduces the risk of project failure
    • Problems and risks can be surfaced early
    • Testing is integrated into the development process
    • Final product is more in-line with current user requirements
    • Reduces the risk of outsourcing
  • What does “agility” imply?
    • The ability to reuse application components (and data) is essential to AD.
    • The ability to design and build loosely-coupled systems is essential to AD.
    • The ability to automate routine tasks is essential to AD.
  • What does “agility” imply?
    • The ability to create policy-based (rule-based) components is essential to AD.
    • The ability to enhance processes based on experience is essential to AD.
    • The ability to recognize that problems and exceptions will occur, and to empower people to handle them is essential to AD.
  • What does “agility” imply?
    • The ability to generalize (i.e., understand areas outside your particular domain) as well as specialize (within your domain) is essential to AD.
    • A customer service mindset and positive (“can do”) attitude are essential to AD.
  • Critical Issues for AD
    • Reusability. Coding for reuse takes 50-100% longer, and most developers don’t do this. But DBAs have to – it’s our job!
    • Quality. Application errors are easier to detect and fix (or work around) than data errors. Quality directly affects reusability. It also affects ROI (“time to money”)!
    • Waste. AD methods can generate large amounts of “scrap and rework”.
  • Critical Issues for AD
    • Resources. AD works best when resources are 100% dedicated to a single project, but DBAs have to support multiple projects. Also, AD is more resource-intensive than other methodologies.
    • Focus. Development focus is on a single application; DBA focus is on designing and building an infrastructure that meets current and future data needs.
  • Critical Issues for AD
    • Maintainability. Somebody has to maintain the application after it’s written; maintenance expense far exceeds development expense.
    • Personnel. AD projects involve long hours, frequent requirements changes, intensive collaboration and lots of stress. This may be a difficult adjustment for data professionals not used to this sort of work environment.
  • Principles of data management
    • Reusability – Ability to reuse data for multiple applications and multiple business purposes (e.g., quality improvement, business process improvement, customer relationship management, strategic planning, etc.).
      • Important: Organizations have data needs outside of the data requirements of particular applications!
  • Principles of data management
    • Integrity – ensuring that data always has a valid business meaning and value, and always reflects a valid state of the business. Data should also be, as much as possible, self-monitoring and self-correcting.
  • Principles of data management
    • Security – Ensure true and accurate data is always available to authorized persons, but only to authorized persons. We also want to make sure that the privacy concerns of all our stakeholders – including our customers, partners, and government regulators – are met.
  • Principles of data management
    • Performance and Ease of Use: ensuring quick and easy access to data by approved users in a usable and business-relevant form, maximizing the business value of both our applications and our data, and improving our relationships with our customers and business users.
  • Principles of data management
    • Low Cost of Maintenance: ensure that all data work is done at a cost that yields value; that the cost of creating, using, and disposing of data doesn’t exceed its value to the business. We also want to ensure the fastest possible response to changes in business processes and new business requirements.
  • Principles of data management
    • Performance and Ease of Use
    • Reusability
    • Integrity
    • Security
    • Maintainability
  • Agile Data Management
    • Design and build highly-cohesive, loosely coupled (i.e., normalized) data structures
    • Make data available in application-friendly, non-normalized forms (e.g., views)
    • Abstract and encapsulate database functionality – eliminate coupling
    • Refactor at a virtual level, not at the database schema level
  • Agile Data Management
    • Learn to manage non-relational data
    • Make the database do the data work (the n-tier approach)
    • Automate as much of the database development process as possible
    • Learn to collaborate
    • Learn to work iteratively (within reason!)
    • Develop a customer service mindset
  • Abstraction and Encapsulation
    • Abstraction
      • Identifying the “what”
    • Encapsulation
      • Packaging the “how”
    • Present an easy-to-use interface that enables the “what” and hides the “how”
    • Examples: light switch, car dashboard
  • Database Abstraction
      • Fundamental Stored Procedures (FSPs)
      • Data Access Components / Layers
      • Data Integration Web Services
      • Views
      • Work Tables
  • Database Abstraction
      • ADO.NET Datasets
      • Stored Procedures
      • Triggers
      • User-defined datatypes (UDTs)
      • User-defined functions (UDFs)
  • Database Issues
    • Performance
    • Maintainability
    • Portability
    • Refactoring
  • Database Issues
    • Performance
      • Limit the scope of views
      • Wrap views in parameterized procedures or functions
      • Use work tables in the database for denormalization (read-only)
      • Use replication as necessary
      • Create reporting databases or data marts
  • Database Issues
    • Maintainability
      • Views and wrapper procedures/functions don’t require much (if any) maintenance
      • New code can be written as needed
      • Good idea to document all code, including the application it was written for, and all known applications that use it
      • Make sure all procedure code is testable
  • Database Issues
    • Portability
      • Usually not an issue (except for commercial software packages)
      • The database code has to go somewhere!
      • Database code performs better in the DBMS
      • Migration is usually not that difficult (and lots of help is available from vendors and user groups!)
      • Cost of migrating is far exceeded by the economic benefit of data reuse and data quality
  • Database Issues
    • Refactoring
      • It is much easier (and less costly) to refactor at the virtual level than at the base schema level.
      • Denormalizing too early can mask key data and lead to data corruption, making future refactoring impossible!
      • Denormalizing can complicate queries and lead to performance problems!
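A minimal sketch of refactoring at the virtual level, using Python's built-in sqlite3 module as a stand-in for the DBMS (the slides target SQL Server). The view name EmployeeList, its column names, and the sample row are hypothetical. The application's requirement changes between iterations; only the view is redefined, and the base table is left untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeIdentifier INTEGER PRIMARY KEY,
        EmployeeFirstName  TEXT NOT NULL,
        EmployeeLastName   TEXT NOT NULL
    );
    INSERT INTO Employee (EmployeeFirstName, EmployeeLastName)
    VALUES ('Ada', 'Lovelace');

    -- Iteration 1: the application asks for separate name columns.
    CREATE VIEW EmployeeList AS
        SELECT EmployeeIdentifier AS EmpId,
               EmployeeFirstName  AS FirstName,
               EmployeeLastName   AS LastName
        FROM Employee;
""")


def base_schema(connection):
    """Return the CREATE TABLE statements, i.e. the base schema only."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]


schema_before = base_schema(conn)

# Iteration 2: the application now wants a single display name.
# Only the view is refactored; the base table is not touched.
conn.executescript("""
    DROP VIEW EmployeeList;
    CREATE VIEW EmployeeList AS
        SELECT EmployeeIdentifier AS EmpId,
               EmployeeLastName || ', ' || EmployeeFirstName AS EmpName
        FROM Employee;
""")

schema_after = base_schema(conn)
rows = conn.execute("SELECT EmpId, EmpName FROM EmployeeList").fetchall()
print(rows)                           # rows as the application now sees them
print(schema_before == schema_after)  # base schema is unchanged
```

Other applications that read the Employee table, or their own views over it, are unaffected by this change, which is the point of keeping the refactoring in the virtual layer.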
  • Tasks (table definition)
    • TaskID IDENTITY not null
    • TaskTitle varchar null
    • TaskDesc varchar null
    • ProjectNo int null
    • ProjectMgr varchar null
    • EmployeeNo int null
    • EmployeeName varchar null
    • WeekNo tinyint null
    • EstHours1 smallint null
    • ActHours1 smallint null
    • EstHours2 smallint null
    • ActHours2 smallint null
    • etc, etc, etc…
    • EstHours7 smallint null
    • ActHours7 smallint null
    • OverTimeHours smallint null
  • Event (table definition)
    • EventID IDENTITY
    • EventTypeCode smallint
    • EventType1Key int
    • EventType2Key int
    • EventType3Key int
    • … etc. etc. etc.
    • EventDateTime datetime
    • EventDesc varchar
  • The parameter list for your access procedure will have to look like this:
      CREATE PROCEDURE csEventProcedure
        (@EventTypeCode smallint,
         @EventType1Key int = null,
         @EventType2Key int = null,
         @EventType3Key int = null…)
    And the WHERE clause for the SELECT will have to look something like this:
      WHERE (@EventType1Key IS NOT NULL AND @EventType1Key = Event.EventType1Key)
         OR (@EventType2Key IS NOT NULL AND @EventType2Key = Event.EventType2Key)
         OR (@EventType3Key IS NOT NULL AND @EventType3Key = Event.EventType3Key)
         …
    Or perhaps like this:
      WHERE (@EventTypeCode = 1 AND @EventType1Key = Event.EventType1Key)
         OR (@EventTypeCode = 2 AND @EventType2Key = Event.EventType2Key)
         OR (@EventTypeCode = 3 AND @EventType3Key = Event.EventType3Key)
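The second WHERE form above can be exercised as a rough sketch with Python's sqlite3 module standing in for SQL Server. The function get_events and the sample rows are hypothetical; only the Event columns and the shape of the predicate come from the slide. The sketch shows that every event type needs its own branch in the predicate, so each new type means another column and another OR term.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event (
        EventID       INTEGER PRIMARY KEY,
        EventTypeCode INTEGER NOT NULL,
        EventType1Key INTEGER,
        EventType2Key INTEGER,
        EventType3Key INTEGER,
        EventDesc     TEXT
    );
    -- Hypothetical sample rows: only the key matching the type code is filled in.
    INSERT INTO Event (EventTypeCode, EventType1Key, EventType2Key,
                       EventType3Key, EventDesc)
    VALUES (1, 10, NULL, NULL, 'type 1 event for key 10'),
           (2, NULL, 10, NULL, 'type 2 event for key 10'),
           (3, NULL, NULL, 7,  'type 3 event for key 7');
""")


def get_events(event_type_code, key1=None, key2=None, key3=None):
    """Python stand-in for the access procedure's SELECT (hypothetical)."""
    sql = """
        SELECT EventDesc
        FROM Event
        WHERE (:code = 1 AND :key1 = Event.EventType1Key)
           OR (:code = 2 AND :key2 = Event.EventType2Key)
           OR (:code = 3 AND :key3 = Event.EventType3Key)
        ORDER BY EventID
    """
    params = {"code": event_type_code, "key1": key1, "key2": key2, "key3": key3}
    return [row[0] for row in conn.execute(sql, params)]


print(get_events(2, key2=10))
```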
  • Non-Database Issues
    • Preferred approach (Engineer vs. Artist)
    • Perspective (Enterprise vs. Application)
      • It’s about the business!
    • Architectural Myopia
      • Process-only view
      • Data-only view
      • Information Systems view
  • Resolving the Conflicts
    • Have a system of checks and balances in place:
      • Architecture group
      • Project Management group
      • Quality group
      • Data group
      • Application groups (dev. & maint.)
      • Business and IT management
  • Resolving the Conflicts
    • Commit to finding a workable solution:
      • Understand each group’s concerns
      • Accept the inevitable “trial and error”
      • Maintain an “agile attitude”
      • Focus on maintaining positive working relationships
  • Resolving the Conflicts
    • Negotiate compromises:
      • Data group involvement in requirements gathering and analysis
      • Physical database design review (to promote opportunities for data virtualization)
      • Data group commits to supporting an iterative approach
  • Bonus Slides
    • Approaches for data virtualization
    • Examples
    • Developing an “Agile Attitude”
  • Fundamental Stored Procedures
      • Handle transaction control
      • Perform error handling
      • Enforce security
      • Maintain supertype/subtype relationships
  • Fundamental Stored Procedures
      • Handle concurrency control via timestamp checking
      • Provide multi-language text support
      • Automatically generated from DB schema
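The timestamp check mentioned above can be sketched as follows. This is an illustration only, not the FSP code from the presentation: Python's sqlite3 stands in for SQL Server, an integer RowVersion column stands in for the timestamp column, and the names update_account, read_account, and ConcurrencyError are hypothetical. The update succeeds only if the row still carries the version the caller originally read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        AccountingCode     TEXT PRIMARY KEY,
        AccountDescription TEXT NOT NULL,
        RowVersion         INTEGER NOT NULL DEFAULT 1  -- stand-in for a timestamp column
    );
    INSERT INTO Account (AccountingCode, AccountDescription)
    VALUES ('A100', 'Engineering');
""")


class ConcurrencyError(Exception):
    """Raised when the row changed after the caller read it."""


def read_account(code):
    """Return (description, row_version) for one account."""
    return conn.execute(
        "SELECT AccountDescription, RowVersion FROM Account WHERE AccountingCode = ?",
        (code,),
    ).fetchone()


def update_account(code, new_description, expected_version):
    """Update only if the row still carries the version the caller read."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            """
            UPDATE Account
            SET AccountDescription = ?, RowVersion = RowVersion + 1
            WHERE AccountingCode = ? AND RowVersion = ?
            """,
            (new_description, code, expected_version),
        )
        if cursor.rowcount != 1:
            raise ConcurrencyError(f"Account {code} was modified by another user")


# Two users read the same row, then both try to update it.
_, version_seen_by_user1 = read_account("A100")
_, version_seen_by_user2 = read_account("A100")

update_account("A100", "Engineering (renamed)", version_seen_by_user1)
try:
    update_account("A100", "Eng", version_seen_by_user2)
    second_update_succeeded = True
except ConcurrencyError:
    second_update_succeeded = False

print(read_account("A100"), second_update_succeeded)
```

Putting the check, the transaction, and the error handling in one routine means every caller gets the same behavior without repeating it in application code.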
  • Fundamental Stored Procedures
    • FSP Example – Table Definition
    • Examples of FSPs
  • Data Access Component
      • Automatically handles dataset and datatable updates using FSPs
      • Creates, populates and executes ADO.NET objects for queries, procedures, typed datasets, connections, transactions, etc.
  • Data Access Component
      • Maintains database timestamps and uses them for updates
      • Uses a few simple overloaded functions (CreateConnection, GetData, UpdateData, etc.)
      • Supports parameterized SQL queries
      • Works with any RDBMS
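As a rough illustration of the idea, here is a much-reduced data access component in Python. The presentation's component is built on ADO.NET; this sketch only borrows the method names CreateConnection, GetData, and UpdateData (rendered in Python style), and everything else, including the class name and the use of sqlite3, is an assumption.

```python
import sqlite3


class DataAccessComponent:
    """Hypothetical, much-reduced DAC: a few methods hide all connection handling."""

    def __init__(self, database=":memory:"):
        self._connection = self.create_connection(database)

    @staticmethod
    def create_connection(database):
        """Open a connection whose rows can be addressed by column name."""
        connection = sqlite3.connect(database)
        connection.row_factory = sqlite3.Row
        return connection

    def get_data(self, sql, params=()):
        """Run a parameterized query and return the rows as dictionaries."""
        cursor = self._connection.execute(sql, params)
        return [dict(row) for row in cursor.fetchall()]

    def update_data(self, sql, params=()):
        """Run a parameterized statement that changes data or schema."""
        with self._connection:  # one transaction per call
            return self._connection.execute(sql, params).rowcount


dac = DataAccessComponent()
dac.update_data(
    "CREATE TABLE Account (AccountingCode TEXT PRIMARY KEY, AccountDescription TEXT)"
)
dac.update_data("INSERT INTO Account VALUES (?, ?)", ("A100", "Engineering"))
accounts = dac.get_data(
    "SELECT AccountDescription FROM Account WHERE AccountingCode = ?", ("A100",)
)
print(accounts)
```

Application code never builds SQL strings from user input or manages transactions itself; it passes a statement and a parameter tuple to one of two methods.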
  • Data Access Component
    • DAC Methods
  • Data Integration Web Services
      • Abstracts data combined from multiple sources
      • Decouples applications from data sources
      • Makes data more easily transportable and consumable
  • Data Synchronization Integration Diagram (includes event queue tables). [Diagram not reproduced. Labeled components: Mainframe Databases, CICS, CICS Trans, CICS Client, CICS Listener, Reformat as XML, TCP to MSMQ, MSMQ, Message Broker, Integration Server, Application Server, Web Service 1, Web Service 2, Web Methods A through J, SQL Database, Metadata.]
  • Joined Views
      • Create an application-specific view of data, enabling a database to support multiple applications
      • Developers don’t have to code complex SQL joins
      • Results in greatly improved performance
      • Views can be optimized and indexed
  • Joined Views
      • Views can map directly to application objects
      • Can join relational and XML data
      • Can enforce security
      • Allows different users to have different views of the data
      • Can be used to support encryption
  • Joined Views
      • Can support customized application trigger code via “Instead-Of” triggers
      • Data fields can be given user-friendly names
      • Column widths and datatypes can be changed from standard classword format
      • Supports data conversion and reformatting
  • Joined Views
      • Encapsulates data access without introducing coupling (denormalization) or diminishing cohesion in the database
      • Decouples application from database schema
      • Can be developed incrementally as the application develops, supporting a true “agile” approach!
  • Normalized application tables
    • Task
      • TaskIdentifier [IDENTITY]
      • TaskDescription [varchar(2000)]
      • ProjectIdentifier [int – FK]
      • AccountingCode [char(4) – FK]
      • OvertimeApprovedIndicator [bit]
      • TaskEnteredDateTime [datetime]
      • Timestamp [timestamp]
      • TaskStartDateTime [datetime]
      • TaskEndDateTime [datetime]
    • Account
      • AccountingCode [char(4)]
      • AccountDescription [varchar(75)]
      • Timestamp [timestamp]
    • Employee
      • EmployeeIdentifier [IDENTITY]
      • EmployeeLastName [varchar(75)]
      • EmployeeFirstName [varchar(75)]
      • Timestamp [timestamp]
      • EmployeePhoneNo [varchar(12)]
      • EmployeeEmail [varchar(255)]
      • OTHoursToDate [decimal]
      • ProjectDescription [varchar(2000)]
    • Project
      • ProjectIdentifier [IDENTITY]
      • ProjectMgrEmployeeID [int – FK]
      • Timestamp [timestamp]
      • ProjectDescription [varchar(75)]
    • TaskAssignment
      • AssignmentStartDate [datetime]
      • Timestamp [timestamp]
      • TaskIdentifier [int – FK]
      • EmployeeIdentifier [int – FK]
      • ScheduledEndDate [datetime]
      • HoursWorkedToDate [decimal]
      • OTHoursToDate [decimal]
  • Customized application view
    • EmployeeTasks
      • Account [varchar(75)]
      • OTApproved [char(3)]
      • HoursToDate [decimal]
      • StartDate [char(10)]
      • EndDate [char(10)]
      • EmpName [varchar(120)]
      • Project [varchar(75)]
      • ProjectMgr [varchar(120)]
      • Task [varchar(75)]
      • OverTime [decimal]
  • SQL code to create the view
      CREATE VIEW EmployeeTasks
        (EmpName, Project, ProjectMgr, Task, Account, OTApproved,
         StartDate, EndDate, HoursToDate, OverTime)
      AS
      SELECT CONVERT(varchar(120), emp.EmployeeFirstName + ' ' + emp.EmployeeLastName),
             proj.ProjectDescription,
             CONVERT(varchar(120), emp2.EmployeeFirstName + ' ' + emp2.EmployeeLastName),
             CONVERT(varchar(75), task.TaskDescription),
             acct.AccountDescription,
             CASE task.OvertimeApprovedIndicator WHEN 1 THEN 'Yes' ELSE 'No' END,
             CONVERT(varchar, ta.AssignmentStartDate, 101),
             -- EndDate: ScheduledEndDate, per the TaskAssignment table definition
             CONVERT(varchar, ta.ScheduledEndDate, 101),
             -- HoursToDate: assumed to be HoursWorkedToDate; needed so that the
             -- SELECT list supplies all ten view columns
             ta.HoursWorkedToDate,
             ta.OTHoursToDate
      FROM TaskAssignment ta
        INNER JOIN Task task ON ta.TaskIdentifier = task.TaskIdentifier
        INNER JOIN Employee emp ON ta.EmployeeIdentifier = emp.EmployeeIdentifier
        INNER JOIN Project proj ON task.ProjectIdentifier = proj.ProjectIdentifier
        INNER JOIN Employee emp2 ON proj.ProjectMgrEmployeeID = emp2.EmployeeIdentifier
        INNER JOIN Account acct ON task.AccountingCode = acct.AccountingCode
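For readers who want to run the idea, here is a sketch of the same view adapted to SQLite through Python's sqlite3 module. It is an adaptation, not the presenter's code: string concatenation uses ||, dates are formatted with strftime instead of CONVERT style 101, the tables are cut down to the columns the view needs, the end date is taken from ScheduledEndDate and the hours from HoursWorkedToDate (as in the TaskAssignment table definition), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account  (AccountingCode TEXT PRIMARY KEY, AccountDescription TEXT);
    CREATE TABLE Employee (EmployeeIdentifier INTEGER PRIMARY KEY,
                           EmployeeFirstName TEXT, EmployeeLastName TEXT);
    CREATE TABLE Project  (ProjectIdentifier INTEGER PRIMARY KEY,
                           ProjectDescription TEXT,
                           ProjectMgrEmployeeID INTEGER REFERENCES Employee);
    CREATE TABLE Task     (TaskIdentifier INTEGER PRIMARY KEY, TaskDescription TEXT,
                           ProjectIdentifier INTEGER REFERENCES Project,
                           AccountingCode TEXT REFERENCES Account,
                           OvertimeApprovedIndicator INTEGER);
    CREATE TABLE TaskAssignment (
                           TaskIdentifier INTEGER REFERENCES Task,
                           EmployeeIdentifier INTEGER REFERENCES Employee,
                           AssignmentStartDate TEXT, ScheduledEndDate TEXT,
                           HoursWorkedToDate REAL, OTHoursToDate REAL);

    CREATE VIEW EmployeeTasks AS
    SELECT emp.EmployeeFirstName || ' ' || emp.EmployeeLastName AS EmpName,
           proj.ProjectDescription                              AS Project,
           mgr.EmployeeFirstName || ' ' || mgr.EmployeeLastName AS ProjectMgr,
           task.TaskDescription                                 AS Task,
           acct.AccountDescription                              AS Account,
           CASE task.OvertimeApprovedIndicator WHEN 1 THEN 'Yes' ELSE 'No' END
                                                                AS OTApproved,
           strftime('%m/%d/%Y', ta.AssignmentStartDate)         AS StartDate,
           strftime('%m/%d/%Y', ta.ScheduledEndDate)            AS EndDate,
           ta.HoursWorkedToDate                                 AS HoursToDate,
           ta.OTHoursToDate                                     AS OverTime
    FROM TaskAssignment ta
    JOIN Task task     ON ta.TaskIdentifier = task.TaskIdentifier
    JOIN Employee emp  ON ta.EmployeeIdentifier = emp.EmployeeIdentifier
    JOIN Project proj  ON task.ProjectIdentifier = proj.ProjectIdentifier
    JOIN Employee mgr  ON proj.ProjectMgrEmployeeID = mgr.EmployeeIdentifier
    JOIN Account acct  ON task.AccountingCode = acct.AccountingCode;

    -- Invented sample data.
    INSERT INTO Account  VALUES ('A100', 'Engineering');
    INSERT INTO Employee VALUES (1, 'Ada', 'Lovelace'), (2, 'Grace', 'Hopper');
    INSERT INTO Project  VALUES (1, 'Data warehouse', 2);
    INSERT INTO Task     VALUES (1, 'Load staging tables', 1, 'A100', 1);
    INSERT INTO TaskAssignment VALUES (1, 1, '2024-01-15', '2024-02-01', 12.5, 2.0);
""")

# The application issues a single-table query; the five-way join lives in the view.
row = conn.execute(
    "SELECT EmpName, ProjectMgr, OTApproved, StartDate FROM EmployeeTasks"
).fetchone()
print(row)
```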
  • Mapping Object to View
    • Object: EmployeeTask
      • Methods: AssignTask (EmpName, Project, Task, StartDate, EndDate); CompleteTask (EmpName, Project, Task); ApproveOT (EmpName, Project, Task, ProjectMgr)
      • Attributes: Account [varchar(75)], OTApproved [char(3)], HoursToDate [decimal], StartDate [char(10)], EndDate [char(10)], EmpName [varchar(120)], Project [varchar(75)], ProjectMgr [varchar(120)], Task [varchar(75)], OverTime [decimal]
    • View: EmployeeTasks
      • Columns: Account [varchar(75)], OTApproved [char(3)], HoursToDate [decimal], StartDate [char(10)], EndDate [char(10)], EmpName [varchar(120)], Project [varchar(75)], ProjectMgr [varchar(120)], Task [varchar(75)], OverTime [decimal]
  • Work Tables
      • Allow pre-joining of data without denormalizing base tables
      • Can improve application performance
      • Useful for unpacking recursive data to support applications (and views)
      • Impose a maintenance burden, so use sparingly and carefully!
  • Work Tables
      • Are generally application-specific
      • Need to manage redundancy; base tables contain the “data of record”
      • Can be updated transactionally (from application procedure call) or periodically (via a scheduled process)
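A small sketch of a work table that unpacks recursive data, again using Python's sqlite3 as a stand-in for the DBMS. The ParentProjectIdentifier column, the ProjectTreeWork table, and the refresh function are all hypothetical. The base table remains the data of record; the work table is rebuilt from it in one transaction and could be refreshed either from an application call or by a scheduled job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table holds the "data of record": a recursive parent/child structure.
    CREATE TABLE Project (
        ProjectIdentifier       INTEGER PRIMARY KEY,
        ProjectDescription      TEXT NOT NULL,
        ParentProjectIdentifier INTEGER REFERENCES Project
    );
    INSERT INTO Project VALUES (1, 'Data platform', NULL),
                               (2, 'Warehouse', 1),
                               (3, 'Staging loads', 2);

    -- Work table: one row per (ancestor, descendant) pair, read-only for the app.
    CREATE TABLE ProjectTreeWork (
        AncestorIdentifier   INTEGER NOT NULL,
        DescendantIdentifier INTEGER NOT NULL,
        Depth                INTEGER NOT NULL
    );
""")


def refresh_project_tree_work(connection):
    """Rebuild the work table from the base table in a single transaction."""
    with connection:
        connection.execute("DELETE FROM ProjectTreeWork")
        connection.execute("""
            WITH RECURSIVE tree(ancestor, descendant, depth) AS (
                SELECT ParentProjectIdentifier, ProjectIdentifier, 1
                FROM Project
                WHERE ParentProjectIdentifier IS NOT NULL
                UNION ALL
                SELECT tree.ancestor, p.ProjectIdentifier, tree.depth + 1
                FROM tree
                JOIN Project p ON p.ParentProjectIdentifier = tree.descendant
            )
            INSERT INTO ProjectTreeWork
            SELECT ancestor, descendant, depth FROM tree
        """)


refresh_project_tree_work(conn)
descendants_of_1 = conn.execute(
    "SELECT DescendantIdentifier, Depth FROM ProjectTreeWork "
    "WHERE AncestorIdentifier = 1 ORDER BY Depth"
).fetchall()
print(descendants_of_1)
```

Because the refresh deletes and rebuilds the whole table, the redundancy is easy to reason about, at the cost of the maintenance burden the slide warns about.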
  • ADO.NET
      • ADO.NET datasets can be updated using stored procedures
      • XML can easily be converted to dataset form for updating
      • LINQ will provide the ability to create updateable encapsulation objects in .NET
  • ADO.NET
    • Example of .NET Dataset Updating
  • Stored Procedures
      • Can encapsulate data-specific application or business processes
      • Results in greatly improved performance
      • Reduce network traffic
  • Stored Procedures
      • Makes debugging, performance tuning and maintenance easier
      • Can be used to enforce security
      • Should be testable and reusable!
  • Stored Procedures
    • Sample Application View
    • Sample Wrapper Procedure
  • Triggers
      • Can encapsulate data-specific application or business processes
      • Useful for complex and cross-database RI checking and updating
      • “Instead Of” triggers can be used to map updates on views to underlying base tables
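The mapping of view updates onto base tables can be sketched with SQLite, which also supports INSTEAD OF triggers on views. This is an assumption-laden illustration, not the presenter's SQL Server code: the view EmployeeContact, the trigger name, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeIdentifier INTEGER PRIMARY KEY,
        EmployeeFirstName  TEXT NOT NULL,
        EmployeeLastName   TEXT NOT NULL,
        EmployeeEmail      TEXT
    );
    INSERT INTO Employee VALUES (1, 'Ada', 'Lovelace', 'ada@example.com');

    -- Application view with user-friendly column names.
    CREATE VIEW EmployeeContact AS
        SELECT EmployeeIdentifier AS EmpId,
               EmployeeFirstName || ' ' || EmployeeLastName AS EmpName,
               EmployeeEmail AS Email
        FROM Employee;

    -- The view cannot be updated directly, so the trigger maps the update
    -- onto the base table.
    CREATE TRIGGER EmployeeContact_Update
    INSTEAD OF UPDATE ON EmployeeContact
    BEGIN
        UPDATE Employee
        SET EmployeeEmail = NEW.Email
        WHERE EmployeeIdentifier = OLD.EmpId;
    END;
""")

# The application updates the view as if it were a table.
with conn:
    conn.execute(
        "UPDATE EmployeeContact SET Email = ? WHERE EmpId = ?",
        ("ada.lovelace@example.com", 1),
    )

base_row = conn.execute(
    "SELECT EmployeeEmail FROM Employee WHERE EmployeeIdentifier = 1"
).fetchone()
print(base_row)
```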
  • Triggers
      • Can be used to support auditing of database updates
      • Can send messages to applications and invoke application objects
      • Can use CLR code in triggers to replace extended stored procedures and OLE Automation
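Auditing of database updates via a trigger can be sketched as below, using SQLite through Python as a stand-in. The AccountAudit table, its columns, and the trigger name are hypothetical; the slide's other points (messaging applications, CLR code) are specific to SQL Server and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        AccountingCode     TEXT PRIMARY KEY,
        AccountDescription TEXT NOT NULL
    );

    -- Hypothetical audit table: one row per change to Account.
    CREATE TABLE AccountAudit (
        AuditID        INTEGER PRIMARY KEY,
        AccountingCode TEXT NOT NULL,
        OldDescription TEXT,
        NewDescription TEXT,
        ChangedAt      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    -- The trigger records the old and new values of every update.
    CREATE TRIGGER Account_AuditUpdate
    AFTER UPDATE ON Account
    BEGIN
        INSERT INTO AccountAudit (AccountingCode, OldDescription, NewDescription)
        VALUES (OLD.AccountingCode, OLD.AccountDescription, NEW.AccountDescription);
    END;

    INSERT INTO Account VALUES ('A100', 'Engineering');
    UPDATE Account SET AccountDescription = 'Engineering and QA'
    WHERE AccountingCode = 'A100';
""")

audit_rows = conn.execute(
    "SELECT AccountingCode, OldDescription, NewDescription FROM AccountAudit"
).fetchall()
print(audit_rows)
```

The audit row is written regardless of which application performed the update, which is why this logic belongs in the database rather than in each application.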
  • Triggers
    • Example: sample application view
    • Instead-Of trigger on application view
    • Example: database audit trigger
  • User-defined Datatypes
      • Scalar UDTs can be used to help enforce domain constraints
      • Object UDTs can be used to create complex data structures that map more readily to application objects (Oracle Jpublisher; MS LINQ)
      • XML UDTs support hierarchical data and can enable relational data to be more easily accessed by web services
  • User-defined Functions
      • Useful for managing UDTs
      • Useful for datatype conversion and data reformatting
      • Are a useful wrapper for views
      • Cannot be used for updating
      • Cannot display or print in functions
  • User-defined Functions
      • Can make relational data look like XML (and vice-versa)
      • Can only call other functions and extended stored procedures from functions
      • SQL code in functions is NOT optimized; may cause performance problems in joins
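To illustrate the reformatting use of a function, the sketch below registers a Python function with SQLite so that it can be called from SQL. This is SQLite's application-defined function mechanism, not a T-SQL user-defined function, so the restrictions listed in the slides do not carry over directly. The function name FormatPhone and the sample data are hypothetical.

```python
import sqlite3


def format_phone(raw):
    """Reformat a 10-digit phone number as 206-555-0100; pass other values through."""
    if raw is None:
        return None
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) != 10:
        return raw
    return f"{digits[0:3]}-{digits[3:6]}-{digits[6:]}"


conn = sqlite3.connect(":memory:")
# Register the Python function as a scalar SQL function taking one argument.
conn.create_function("FormatPhone", 1, format_phone)

conn.executescript("""
    CREATE TABLE Employee (
        EmployeeIdentifier INTEGER PRIMARY KEY,
        EmployeeLastName   TEXT NOT NULL,
        EmployeePhoneNo    TEXT
    );
    INSERT INTO Employee VALUES (1, 'Lovelace', '(206) 555-0100'),
                                (2, 'Hopper', NULL);
""")

# The function is applied row by row, which is also why such functions can
# become a performance problem when used in joins over large tables.
phones = conn.execute(
    "SELECT EmployeeLastName, FormatPhone(EmployeePhoneNo) "
    "FROM Employee ORDER BY EmployeeIdentifier"
).fetchall()
print(phones)
```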
  • User-defined Functions
    • Sample application view
    • Function to get application data
    • Procedure to update application data
    • Sample execution
    • Output from sample execution
  • User-defined Functions (cont’d)
    • Function to consolidate data records
    • Function to return consolidated records
    • Function to parse character map
    • Sample execution of parsing functions
    • Output from execution
  • User-defined Functions (cont’d)
    • Function to return data as XML
    • Procedure to parse XML to table
    • Sample execution
    • Output from sample execution
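The sample code for these slides is not part of the transcript. As a stand-in, the following sketch shows the general round trip of returning relational rows as XML and parsing XML back into a table, using Python's sqlite3 and xml.etree.ElementTree. The element names, function names, and sample data are all hypothetical, and in SQL Server this work would be done inside the database rather than in application code.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (AccountingCode TEXT PRIMARY KEY, AccountDescription TEXT);
    INSERT INTO Account VALUES ('A100', 'Engineering'), ('B200', 'Finance');
""")


def accounts_as_xml(connection):
    """Return all Account rows as one XML document (element names are hypothetical)."""
    root = ET.Element("Accounts")
    query = (
        "SELECT AccountingCode, AccountDescription "
        "FROM Account ORDER BY AccountingCode"
    )
    for code, description in connection.execute(query):
        account = ET.SubElement(root, "Account", AccountingCode=code)
        account.text = description
    return ET.tostring(root, encoding="unicode")


def load_accounts_from_xml(connection, xml_text):
    """Parse the XML produced above and insert its rows into Account."""
    rows = [
        (element.get("AccountingCode"), element.text)
        for element in ET.fromstring(xml_text).findall("Account")
    ]
    with connection:
        connection.executemany("INSERT INTO Account VALUES (?, ?)", rows)
    return len(rows)


xml_text = accounts_as_xml(conn)
print(xml_text)

# Round trip: load the XML into a second, empty database.
other = sqlite3.connect(":memory:")
other.execute(
    "CREATE TABLE Account (AccountingCode TEXT PRIMARY KEY, AccountDescription TEXT)"
)
loaded = load_accounts_from_xml(other, xml_text)
print(loaded)
```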
  • Merging SQL and XML
    • Merging SQL and XML Example 1
    • Output from Example 1
    • Merging SQL and XML Example 2
    • Output from Example 2
  • Developing an “Agile Attitude”
    • Make using the database (and developing applications for databases) as quick, easy, and painless as possible.
    • Stay business-focused; the objective is meeting the business requirements and deriving the maximum business value from the project.
  • Developing an “Agile Attitude”
    • Adopt a “can do” attitude, and be as helpful as possible.
    • Don’t let database standards become a threat to the success of a project. Accept any defeats and failures encountered during a project as “lessons learned”, that can be applied to future projects.
  • Developing an “Agile Attitude”
    • Communicate with people on their level, and in their terms.
    • Concentrate on solving other people’s problems, not your own.
  • Developing an “Agile Attitude”
    • Learn as much as possible about what your developers and business users do, and how and why they do it. Learn to become more of a “generalist”; this adds to your value.
    • Be flexible, and open to new ideas and new ways of doing things. But make sure that the things that need doing get done.
  • Bio and Contact Information
    • Larry Burns has worked in IT for more than 25 years as a database administrator, application developer, consultant and teacher. He holds a B.S. in Mathematics from the University of Washington and a Master’s degree in Software Engineering from Seattle University. He currently works for Paccar ITD Data Services as a database consultant on numerous application development projects, and teaches a series of data management classes for application developers. He has been an instructor and advisor in the certificate program for Data Resource Management at the University of Washington in Seattle. You can contact him at [email_address] .