Database Refactoring Sreeni Ananthakrishna 2006 Nov

    Database Refactoring Sreeni Ananthakrishna 2006 Nov - Presentation Transcript

    1. Database Refactoring: An introduction to Refactoring Databases & Evolutionary Database Design (Ambler and Sadalage)
    2. Agenda
      • What is database refactoring about?
      • Evolutionary database development techniques
      • Refactoring Strategies
      • Classification of refactorings and examples
    3. What is database refactoring about?
        • Improving database design
        • Making small, incremental changes to the schema
        • Maintaining existing information and behaviour
        • Not adding or removing functionality
        • Covering not just the database but also the applications that use it
    4. A simple example…
      • Tables: Customer (customerId <<PK>>, name, balance) and Account (accountId <<PK>>, customerId <<FK>>, balance); Customer accesses Account
      • Transition-period triggers: SynchronizeAccountBalance and SynchronizeCustomerBalance, each {event = on update | on delete | on insert, drop date = <date>}
      • One balance column is marked {drop date = <date>}
      • App A and App B each call maintainbalance()
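The slide's diagram appears to show a balance column kept in step between Customer and Account while two applications are migrated. A minimal sketch of that idea follows, using Python's sqlite3 module. The column types, the one-account-per-customer data, and the choice to handle only updates (the slide also lists insert and delete events) are assumptions made for illustration, not details from the deck.

```python
import sqlite3

def build_transition_schema():
    """Create both tables plus two transition-period sync triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer (
            customerId INTEGER PRIMARY KEY,
            name       TEXT,
            balance    NUMERIC
        );
        CREATE TABLE Account (
            accountId  INTEGER PRIMARY KEY,
            customerId INTEGER REFERENCES Customer(customerId),
            balance    NUMERIC
        );
        -- Writes to Customer.balance are copied to Account.balance.
        CREATE TRIGGER SynchronizeAccountBalance
        AFTER UPDATE OF balance ON Customer
        BEGIN
            UPDATE Account SET balance = NEW.balance
            WHERE customerId = NEW.customerId
              AND balance IS NOT NEW.balance;  -- no-op when already equal
        END;
        -- Writes to Account.balance are copied back to Customer.balance.
        CREATE TRIGGER SynchronizeCustomerBalance
        AFTER UPDATE OF balance ON Account
        BEGIN
            UPDATE Customer SET balance = NEW.balance
            WHERE customerId = NEW.customerId
              AND balance IS NOT NEW.balance;  -- no-op when already equal
        END;
    """)
    return conn
```

The `IS NOT` guards mean the second trigger touches no rows once the values agree, so the two triggers cannot keep firing each other. Both triggers would be dropped on the agreed drop date.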
    5. Why refactor?
      • Data models built upfront tend to be complex and need cleaning
      • Maintain consistency between application domain and data model
      • Address performance requirements
      • Identify and eliminate db smells
    6. Database Smells
      • Multipurpose Column – e.g. one column holding a customer's date of birth or an employee's start date
      • Multipurpose Table – e.g. a Customer table holding both persons and corporations
      • Redundant Data – same information in different tables
      • Table with too many columns – e.g. Customer with many addresses
      • Table with too many rows
      • Smart columns – e.g. data has positional context
      • Fear of change – if changing the schema feels too risky, it is time to refactor!
    7. Evolutionary Database Development
      • Evolve data models vs upfront design
      • Database regression testing
      • Configuration management of database artifacts
      • Developer Sandboxes
    8. Database regression testing
      • Test the schema
        • Check logic in stored procedures and triggers
        • Test check and referential constraints
        • View definitions
        • Default Values and Invariants
      • Test application code
        • Unit tests around application code which queries the db.
      • Test data migration
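As an illustration of testing the schema itself, the sketch below checks a default value, a check constraint and a referential constraint against an in-memory SQLite database. The table definitions and the helper name `violates_constraint` are hypothetical, chosen only for the example.

```python
import sqlite3

def make_db():
    """Build a small schema with a default, a CHECK and a foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE Customer (
            customerId INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE Account (
            accountId  INTEGER PRIMARY KEY,
            customerId INTEGER NOT NULL REFERENCES Customer(customerId),
            balance    NUMERIC NOT NULL DEFAULT 0 CHECK (balance >= 0)
        );
    """)
    return conn

def violates_constraint(conn, sql, params=()):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False
```

A regression suite would run checks like these after every refactoring, so a dropped constraint or changed default is noticed immediately.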
    9. Config management of DB Artifacts
      • Schema creation scripts
      • Data loading/migration scripts
      • Reference data
      • Stored procedures
      • View definitions
      • Test data
      • Regression Tests
    10. Developer Sandboxes
    11. Database Refactoring Strategies
      • Apply small changes
        • Small changes allow easy/early detection of errors
      • Identify Individual Refactorings
        • Instead of doing “move column” and “rename column” in one go, version each individually.
      • Create database configuration table
        • Helps identify current version of the database and can be used in migrations.
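One possible shape for a database configuration table is sketched below: each small refactoring gets its own version number, and a runner applies only those newer than the recorded version. The table name `DatabaseConfiguration`, the column `schemaVersion` and the three sample migrations are assumptions for illustration.

```python
import sqlite3

# Hypothetical, individually versioned refactorings, applied in order.
MIGRATIONS = [
    (1, "CREATE TABLE Customer (customerId INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE Customer ADD COLUMN balance NUMERIC DEFAULT 0"),
    (3, "CREATE INDEX idx_customer_name ON Customer(name)"),
]

def current_version(conn):
    """Read the schema version, creating the configuration table if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS DatabaseConfiguration "
        "(schemaVersion INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT schemaVersion FROM DatabaseConfiguration").fetchone()
    if row is None:
        conn.execute("INSERT INTO DatabaseConfiguration VALUES (0)")
        conn.commit()
        return 0
    return row[0]

def migrate(conn, migrations=MIGRATIONS):
    """Apply every migration newer than the recorded version, one at a time."""
    version = current_version(conn)
    for number, sql in migrations:
        if number > version:
            conn.execute(sql)
            conn.execute(
                "UPDATE DatabaseConfiguration SET schemaVersion = ?", (number,)
            )
            conn.commit()
            version = number
    return version
```

Running `migrate` twice is harmless: the second run finds nothing newer than the stored version and changes nothing.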
    12. Database Refactoring Strategies (contd.)
      • Determine synchronization strategies during transition period
        • Triggers update in real time but might have a performance impact.
        • Views might not support updates but do not move data.
        • Batch synchronization can run during off-peak periods but might have to deal with multiple updates to the same data.
      • Encapsulate Database Access
        • Abstract database access, e.g. by using persistence frameworks
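To illustrate the view option, here is a minimal sketch in which a column has been renamed and a view keeps offering the old name to applications that have not yet migrated. The names `firstName`, `fname` and `CustomerLegacy` are hypothetical.

```python
import sqlite3

def create_schema_with_legacy_view(conn):
    """New table uses firstName; the view presents it under the old name."""
    conn.executescript("""
        CREATE TABLE Customer (
            customerId INTEGER PRIMARY KEY,
            firstName  TEXT
        );
        -- No data is copied: the view reads straight from Customer.
        CREATE VIEW CustomerLegacy AS
            SELECT customerId, firstName AS fname FROM Customer;
    """)
```

In SQLite a plain view like this is read-only, which matches the tradeoff on the slide; writes through the view would need INSTEAD OF triggers.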
    13. Database Refactoring Classification
      • Structural
      • Data Quality
      • Referential
      • Architectural
      • Method
    14. Structural Refactorings
      • Related to structure of Tables, Views
      • e.g. Move Column, Rename Table, Split Table, Merge Column
      • Issues to consider when implementing:
        • Cyclic Triggers
        • Broken Views, Procedures, Triggers
        • Transition period in multi-application setup
    15. Introduce Surrogate Key
      • Motivations
        • Reduce coupling between schema and business domain
        • Increase consistency by having a uniform key strategy
        • Improve performance by having index based on simpler key
      • Potential Tradeoffs
        • Surrogate keys are not suitable for all situations
        • Introducing a new key might require further key consolidation and more effort
      “Replace an existing natural key with a surrogate key”
    16. Introduce Surrogate Key (contd.)
      • Order: customerNumber <<PK>> <<FK>> <<Natural>>, storeId <<PK>> <<Natural>>, orderId <<PK>> <<surrogate>>
      • OrderItem: customerNumber <<PK>> <<FK>> <<Natural>>, storeId <<PK>> <<Natural>>, orderItemNumber <<PK>>, orderId <<FK>> <<surrogate>>
      • Relationship: Order contains OrderItem
      • Transition-period trigger: PopulateOrderId {event = on insert, drop date = <date>}
      • A further {drop date = <date>} annotation and a “balance” label also appear on the diagram
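A rough sketch of this refactoring in SQLite follows. The starting schema, the use of `rowid` to generate surrogate values and the index name are assumptions. SQLite cannot change a primary key in place, so the sketch adds `orderId` as a uniquely indexed column rather than a true primary key, and it leaves out the new foreign key declaration.

```python
import sqlite3

def create_natural_key_schema(conn):
    """Starting point: both tables keyed on the natural key."""
    conn.executescript("""
        CREATE TABLE "Order" (
            customerNumber INTEGER NOT NULL,
            storeId        INTEGER NOT NULL,
            PRIMARY KEY (customerNumber, storeId)
        );
        CREATE TABLE OrderItem (
            customerNumber  INTEGER NOT NULL,
            storeId         INTEGER NOT NULL,
            orderItemNumber INTEGER NOT NULL,
            PRIMARY KEY (customerNumber, storeId, orderItemNumber),
            FOREIGN KEY (customerNumber, storeId)
                REFERENCES "Order"(customerNumber, storeId)
        );
    """)

def introduce_surrogate_key(conn):
    conn.executescript("""
        -- Add the surrogate column to the parent and fill it from rowid.
        ALTER TABLE "Order" ADD COLUMN orderId INTEGER;
        UPDATE "Order" SET orderId = rowid;
        CREATE UNIQUE INDEX idx_order_orderId ON "Order"(orderId);

        -- Add the matching column to the child and back-fill it.
        ALTER TABLE OrderItem ADD COLUMN orderId INTEGER;
        UPDATE OrderItem SET orderId = (
            SELECT o.orderId FROM "Order" AS o
            WHERE o.customerNumber = OrderItem.customerNumber
              AND o.storeId = OrderItem.storeId);

        -- Transition trigger: fill orderId for rows inserted by
        -- applications that still supply only the natural key.
        CREATE TRIGGER PopulateOrderId
        AFTER INSERT ON OrderItem
        WHEN NEW.orderId IS NULL
        BEGIN
            UPDATE OrderItem SET orderId = (
                SELECT o.orderId FROM "Order" AS o
                WHERE o.customerNumber = NEW.customerNumber
                  AND o.storeId = NEW.storeId)
            WHERE customerNumber = NEW.customerNumber
              AND storeId = NEW.storeId
              AND orderItemNumber = NEW.orderItemNumber;
        END;
    """)
```

New rows added to "Order" during the transition would need similar handling, which this sketch does not cover.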
    17. Data Quality Refactorings
      • Related to improving quality of information in db
      • e.g. Add Lookup Table, Introduce Column Constraint, Introduce Common Format
      • Issues to consider when implementing:
        • Constraint violations
        • Broken logic in procedures
        • Broken where clauses in Views
        • Updating large amounts of data
    18. Add Lookup Table
      • Motivations
        • Introduce referential integrity for a column
        • Provide code lookup (move enum to the db)
        • Replace column constraint with set of expected values in lookup table
      • Potential Tradeoffs
        • Identifying the data to populate (especially for multiple apps)
        • Possible performance impact due to additional joins
      “Create a lookup table for an existing column”
    19. Add Lookup Table (contd.)
      • Address: Street, State <<FK>>, PostCode
      • State: State <<PK>>, Name
      • 1. Identify the column
      • 2. Create Lookup Table
      • 3. Populate Data
      • 4. Introduce FK constraint
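The four steps on this slide could look roughly like the following in SQLite. The column types and the temporary name `Address_new` are assumptions, and the Name column is left empty because the deck does not say where state names come from. SQLite cannot add a foreign key to an existing table, so step 4 is done by rebuilding Address; other databases would typically use ALTER TABLE ... ADD CONSTRAINT.

```python
import sqlite3

def add_state_lookup(conn):
    """Add a State lookup table for the existing Address.State column."""
    conn.executescript("""
        -- 1. The identified column is Address.State.
        -- 2. Create the lookup table.
        CREATE TABLE State (
            State TEXT PRIMARY KEY,
            Name  TEXT
        );
        -- 3. Populate it from the values already in use.
        INSERT INTO State (State)
            SELECT DISTINCT State FROM Address WHERE State IS NOT NULL;
        -- 4. Introduce the FK constraint by rebuilding Address.
        CREATE TABLE Address_new (
            Street   TEXT,
            State    TEXT REFERENCES State(State),
            PostCode TEXT
        );
        INSERT INTO Address_new SELECT Street, State, PostCode FROM Address;
        DROP TABLE Address;
        ALTER TABLE Address_new RENAME TO Address;
    """)
```

With several applications writing to Address, the distinct values found in step 3 would need reviewing before they become the official list.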
    20. Referential Integrity Refactorings
      • Changes that improve referential integrity of data
      • e.g. Add Foreign Key Constraint, Introduce Cascading Delete, Introduce Trigger for History
      • Issues to consider when implementing:
        • Fix broken CRUD logic in procedure
        • Data cleansing to make new constraints work
    21. Introduce Cascading Delete
      • Motivations
        • Preserve referential integrity of the parent/child rows
        • Remove responsibility for child deletion in the application
      • Potential Tradeoffs
        • Possible deadlocks
        • Accidental mass deletion when a root row is deleted
        • Duplicate functionality is introduced when using persistence frameworks like Hibernate/Toplink
      “Delete the child record(s) when the parent is deleted”
    22. Introduce Cascading Delete (contd.)
      • Policy: PolicyId <<PK>>
      • Claim: ClaimId <<PK>>, PolicyId <<FK>>
      • Trigger: DeleteClaim {event = on delete}
      • 1. Identify the column
      • 2. Choose cascading mechanism (triggers or using cascade clause during constraint creation)
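Of the two mechanisms named on this slide, the cascade clause is the shorter to show. A minimal sketch with SQLite follows; the column types are assumed, and a DeleteClaim trigger would be the alternative where a cascade clause is not available.

```python
import sqlite3

def make_policy_db():
    """Policy/Claim schema where deleting a policy deletes its claims."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for cascades in SQLite
    conn.executescript("""
        CREATE TABLE Policy (
            PolicyId INTEGER PRIMARY KEY
        );
        CREATE TABLE Claim (
            ClaimId  INTEGER PRIMARY KEY,
            PolicyId INTEGER NOT NULL
                     REFERENCES Policy(PolicyId) ON DELETE CASCADE
        );
    """)
    return conn
```

Once this is in place, application code or a persistence framework that also deletes child rows is doing duplicate work, which is the tradeoff noted on the previous slide.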
    23. Architectural Refactorings
      • Changes that improve performance, portability and define the architecture within the database
      • e.g. Encapsulate Table with View, Introduce Calculation Method, Replace Method(s) with View, Introduce Trigger for History
      • Issues to consider when implementing:
        • Performance vs Data redundancy
        • Keeping business logic in the application vs database
    24. Introduce Index
      • Motivations
        • Increase performance of read queries
      • Potential Tradeoffs
        • Too many indexes degrade performance of inserts, updates and deletes
        • Existing data containing duplicates might need cleansing when introducing unique indexes
      “Introduce a unique or non-unique index”
    25. Introduce Index (contd.)
      • Customer: CustomerId <<PK>>, TFN <<AK>> <<index>>, Name
      • 1. Determine type of index – unique vs non-unique
      • 2. Eliminate duplicate rows when using unique index
      • 3. Add a new index
      • 4. Add more disk space for index maintenance
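Steps 2 and 3 for a unique index on TFN might be sketched as below. Keeping the row with the lowest CustomerId for each duplicated TFN is an assumed policy chosen only to make the example concrete; real duplicate data usually needs a business decision about which row survives. The index name is also hypothetical.

```python
import sqlite3

def introduce_unique_tfn_index(conn):
    """Remove duplicate TFNs, then add a unique index on Customer.TFN."""
    # Step 2: eliminate duplicates, keeping the lowest CustomerId per TFN.
    # Rows with a NULL TFN are left alone.
    conn.execute("""
        DELETE FROM Customer
        WHERE TFN IS NOT NULL
          AND CustomerId NOT IN (
              SELECT MIN(CustomerId) FROM Customer
              WHERE TFN IS NOT NULL
              GROUP BY TFN)
    """)
    # Step 3: add the new index; this fails if duplicates remain.
    conn.execute("CREATE UNIQUE INDEX idx_customer_tfn ON Customer(TFN)")
    conn.commit()
```

For a non-unique index the cleansing step is unnecessary and only the CREATE INDEX statement (without UNIQUE) is needed.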
    26. Method Refactorings
      • Changes that improve code representing stored procedures, functions and triggers
      • e.g. Rename Method, Reorder Parameters, Replace Literal with Table Lookup
      • Issues to consider when implementing:
        • Broken triggers, procedures, functions
        • Tool support
    27. Refactoring Tools
      • Schema Migration – Rails Migrations, Sundog
      • Unit Testing – JUnit, DBUnit
      • Refactor Stored Procedures – SQL Refactor (SQL Server only)

    melbournepatterns, 3 years ago
