Evolutionary db development

Topic from the OpenParty “清雨榕香” event

  • Changing from waterfall to Agile. Waterfall: at an early stage, identify requirements, reconcile them, design, and begin coding; minimize later changes through extensive preliminary work.
  • automation
  • explain
  • Classify and prioritize data. Have the client list the most important data. Try to figure out data patterns (data cleaning). Work only on meaningful data. Verify data with system logic (data references, record counts, …). Spot-check it randomly.

    1. 1. Evolutionary Database Development
    2. 2. Software Development Process is Changing (© ThoughtWorks 2009)
        [Figure labels, waterfall side: Project Plan/Estimation, Requirements Gathering, Use Cases / Functional Specs, Design Specifications, Code, Test, Fix / Integrate, with "$" marks. Agile side: Vision & High-Level Stories, Release, Iteration.]
    3. 4. Team Collaboration
        Traditional team (DB, Apps, BackEnd):
          • Devs: "I'm responsible for web design."
          • DBA: "I'm responsible for the database."
          • Domain expert: "I'm responsible for the backend apps."
        Agile team (DBA, Devs, Domain Expert):
          • "We are responsible for business value."
          • "It's our responsibility."
        In an Agile team, the DBA:
          • Role != Person
          • Has knowledge of the functionality
          • Acknowledges the team in interaction
    4. 5. Best Practices
          • The DBA sits close to all the roles
          • Helps make decisions
          • Educates developers to write better SQL
          • How to make the DBA redundant
    5. 6. Refactoring
        “A disciplined way to make changes to your source code to improve its design, making it easier to work with” (Martin Fowler)
    6. 7. Database Refactoring
        A small change to a database schema that improves its design while retaining both its behavioral and informational semantics.
          • Behavioral semantics: change structure
          • Informational semantics: data migration, verify data quality
        Before: Customer (…, Balance); Account (AccountID (PK))
        After: Customer (…); Account (AccountID (PK), Balance)
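The Balance move on this slide can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 instead of the project's database; the CustomerID link on Account, the Name column, the helper names, and the sample rows are assumptions that do not appear in the slides. The old Customer.Balance column is left in place so both schemas can coexist for a while.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create the 'before' schema with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, Balance INTEGER);
        -- Assumption: each Account row points at its Customer.
        CREATE TABLE Account (AccountID INTEGER PRIMARY KEY,
                              CustomerID INTEGER REFERENCES Customer(CustomerID));
        INSERT INTO Customer VALUES (1, 'Ann', 100), (2, 'Bob', 250);
        INSERT INTO Account VALUES (10, 1), (20, 2);
        """
    )
    return conn


def move_balance_to_account(conn: sqlite3.Connection) -> None:
    """Structural change plus data migration; Customer.Balance is not dropped."""
    conn.execute("ALTER TABLE Account ADD COLUMN Balance INTEGER")
    with conn:  # commit the copied data on success, roll back on error
        conn.execute(
            "UPDATE Account SET Balance = "
            "(SELECT c.Balance FROM Customer c WHERE c.CustomerID = Account.CustomerID)"
        )


def balances_preserved(conn: sqlite3.Connection) -> bool:
    """Informational semantics check: every copied balance equals its source."""
    mismatches = conn.execute(
        "SELECT COUNT(*) FROM Account a JOIN Customer c ON c.CustomerID = a.CustomerID "
        "WHERE a.Balance IS NOT c.Balance"
    ).fetchone()[0]
    return mismatches == 0
```

Calling `move_balance_to_account(build_demo_db())` and then `balances_preserved(...)` on the same connection gives a quick before/after comparison; dropping the old column would be a separate, later step.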
    7. 8. What will be changed in DB Refactoring?
          • Structural
          • Data Quality
          • Referential Integrity
          • Architectural
          • Method
          • Non-Refactoring Transformation
        Principles
          • Small Steps
          • Frequent Changes
          • Test First
    8. 9. Test-Driven is also important for DB design
          • Test your database schema
          • Test the way apps use the schema
          • Validate your data migration
          • Test external program code
          • Check data quality
          • …
        TEST: Fail Fast, Fail Often
    9. 10. Evolutionary Modeling Design
          • Not trying to "Get it Right up Front"
          • Build the simplest thing that can possibly work
          • Treat changes as database refactoring ... every DAY
          • Functionality added in increments
    10. 11. A Story
    11. 12. Background
          • CRM system
              • Transactional operations
              • Reporting/statistics functions for managers
              • Based on a legacy system
              • 24x7, a very busy system
          • Client's expectation
              • Improvement of usability and performance
          • Legacy database
              • SQL Server 2005
              • 200 GB
          • Team
              • Distributed Agile team (Beijing + Hong Kong)
    12. 13. Refactoring the database schema: introduce a transition period for SAFETY
    13. 14. Never get rid of the old schema immediately. Sync data in real time.
    14. 15. What we did in this project
        [Figure: Legacy DB → Trigger → FlatFile → NEW DB, with "Read Data" steps and a "Near real time" label. Change record fields: SystemNameID, Newvalue, OldValue, columnName.]
        Tables shown:
          • systemName (SystemNameID (PK), SystemLogin, …)
          • systemName_log (SystemNameID (PK), OperDatetime)
          • Account (AccountID (PK), AccountName, SystemNameID, Isdelete)
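A trigger-based change log of the kind pictured here might look roughly like the sketch below. It uses Python's sqlite3 as a stand-in for the legacy SQL Server database and skips the flat-file hop entirely. The LogID surrogate key, the trigger name, the `replay_changes` helper, and the mapping of SystemLogin to AccountName are all assumptions made for illustration, not details taken from the slides.

```python
import sqlite3

LEGACY_SCHEMA = """
CREATE TABLE systemName (SystemNameID INTEGER PRIMARY KEY, SystemLogin TEXT);
CREATE TABLE systemName_log (
    LogID INTEGER PRIMARY KEY AUTOINCREMENT,  -- assumption: surrogate key to order changes
    SystemNameID INTEGER, OperDatetime TEXT,
    columnName TEXT, OldValue TEXT, NewValue TEXT
);
-- Record every change of SystemLogin in the log table.
CREATE TRIGGER trg_systemName_login AFTER UPDATE OF SystemLogin ON systemName
BEGIN
    INSERT INTO systemName_log (SystemNameID, OperDatetime, columnName, OldValue, NewValue)
    VALUES (OLD.SystemNameID, datetime('now'), 'SystemLogin', OLD.SystemLogin, NEW.SystemLogin);
END;
"""

NEW_SCHEMA = """
CREATE TABLE Account (AccountID INTEGER PRIMARY KEY, AccountName TEXT,
                      SystemNameID INTEGER, Isdelete INTEGER DEFAULT 0);
"""


def replay_changes(legacy: sqlite3.Connection, new: sqlite3.Connection,
                   last_log_id: int = 0) -> int:
    """Apply log rows newer than last_log_id to the new DB; return the new high-water mark."""
    rows = legacy.execute(
        "SELECT LogID, SystemNameID, NewValue FROM systemName_log "
        "WHERE LogID > ? ORDER BY LogID",
        (last_log_id,),
    ).fetchall()
    with new:  # commit all replayed changes together
        for log_id, system_name_id, new_value in rows:
            # Assumption: SystemLogin in the legacy DB maps to AccountName in the new DB.
            new.execute(
                "UPDATE Account SET AccountName = ? WHERE SystemNameID = ?",
                (new_value, system_name_id),
            )
            last_log_id = log_id
    return last_log_id
```

A scheduler could call `replay_changes` every few seconds with the last returned value, which approximates the near-real-time sync without touching the legacy application code.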
    15. 16. One-time Migration
          • Need to figure out the logical mapping
          • Problems:
              • There are over 400 procs, over 1000 tables, and over 150 views
              • Data concepts and logic have changed
          • How we resolved them:
              • Communication is very important
              • Split each function into small parts: the smaller the part, the easier and safer the migration
        [Figure labels: Person, Organisation, Person, User, contact]
    16. 17.   The process of database refactoring
    17. 18. Verify data quality by domain logic after the one-time migration
        Tables: User (UserID (PK), FirstName, LastName, …); Telephone (TeleID (PK), OfficeTele, HomeTele, UserID (FK))
        1. One user should not have 2 office telephones
            SELECT a.UserID
            FROM User a LEFT JOIN Telephone b ON a.UserID = b.UserID
            GROUP BY a.UserID
            HAVING COUNT(b.OfficeTele) >= 2
        2. A telephone should be assigned to a user
            SELECT TeleID
            FROM Telephone
            WHERE UserID IS NULL
        3. A telephone should not be assigned to 2 or more users
            SELECT OfficeTele
            FROM Telephone
            GROUP BY OfficeTele
            HAVING COUNT(DISTINCT UserID) >= 2
        … add more verification SQL
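The three rules above can be bundled into a small check runner, sketched here with Python's sqlite3. The check names, the `run_checks` and `build_sample_db` helpers, and the sample rows are hypothetical; only the table and column names come from the slide. Each query returns the offending rows, so an empty list means the rule holds.

```python
import sqlite3

# One query per domain rule; each returns the rows that violate the rule.
CHECKS = {
    "user_with_two_office_phones":
        "SELECT UserID FROM Telephone "
        "WHERE UserID IS NOT NULL AND OfficeTele IS NOT NULL "
        "GROUP BY UserID HAVING COUNT(*) >= 2",
    "telephone_without_user":
        "SELECT TeleID FROM Telephone WHERE UserID IS NULL",
    "office_phone_shared_by_users":
        "SELECT OfficeTele FROM Telephone WHERE OfficeTele IS NOT NULL "
        "GROUP BY OfficeTele HAVING COUNT(DISTINCT UserID) >= 2",
}


def run_checks(conn: sqlite3.Connection) -> dict:
    """Run every verification query and map its name to the violating rows."""
    return {name: conn.execute(sql).fetchall() for name, sql in CHECKS.items()}


def build_sample_db() -> sqlite3.Connection:
    """Hypothetical data that breaks each rule exactly once."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "User" (UserID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
        CREATE TABLE Telephone (TeleID INTEGER PRIMARY KEY, OfficeTele TEXT,
                                HomeTele TEXT, UserID INTEGER REFERENCES "User"(UserID));
        INSERT INTO "User" VALUES (1, 'A', 'One'), (2, 'B', 'Two'), (3, 'C', 'Three');
        INSERT INTO Telephone VALUES
            (1, '555-0001', NULL, 1),
            (2, '555-0002', NULL, 1),     -- user 1 has two office phones
            (3, '555-0003', NULL, NULL),  -- phone not assigned to anyone
            (4, '555-0004', NULL, 2),
            (5, '555-0004', NULL, 3);     -- same office number for users 2 and 3
        """
    )
    return conn
```

Returning the violating rows rather than a pass/fail flag makes the same script usable both as a CI gate and as a report against production data.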
    18. 19. Sanity Check
          • Brainstorming
          • Feedback from CI/Production
        Don't let the same issue happen twice.
    19. 20. Put test scripts into CI. Put verification scripts into CI. Run verification scripts in production often, and stay aware of production data.
    20. 21. Our workflow
        [Figure labels: Devs' App DB (highly iterative development); project-level integration testing; Pre-Production App (system and acceptance testing); Production App (operations and support). Frequent deployment between the earlier stages, highly controlled deployment to production, bug reports flowing back. CVS holds Tests and SanityCheck. Roles: Devs, QA, Production Users. Brainstorming.]
    21. 22. The strategy for migration performance
          • An online system cannot afford a migration that runs for too long.
        Migration as a single statement (takes too long):
            UPDATE ContactNote a
            SET ContactNote = b.C1, ContactId = b.C2
            FROM ContactInfo b
            WHERE a.MessId = b.MessId
        Tables: ContactInfo (Messid (PK), C1, C2, …); ContactNote (ContactNoteId (PK), ContactNote, ContactId, MessId)
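    22. 23. Migration in small committed batches
            DECLARE
              row_num NUMBER := 0;
            BEGIN
              -- Walk only the rows that have not been migrated yet (flag = 0).
              FOR c_CN IN (SELECT a.MessId, a.C1, a.C2
                           FROM ContactInfo a
                           LEFT JOIN CI_Mig_log b ON a.MessId = b.MessId
                           WHERE b.Flag = 0)
              LOOP
                UPDATE ContactNote
                SET ContactNote = c_CN.C1, ContactId = c_CN.C2
                WHERE MessId = c_CN.MessId;
                -- Mark the row as migrated so a restart skips it.
                UPDATE CI_Mig_log f SET f.Flag = 1 WHERE f.MessId = c_CN.MessId;
                row_num := row_num + 1;
                -- Commit every 1000 rows to keep transactions short.
                IF MOD(row_num, 1000) = 0 THEN
                  COMMIT;
                END IF;
              END LOOP;
              COMMIT;  -- commit the final partial batch
            END;
        Cancel it if there are not enough resources for the migration.
        Tables: ContactInfo (Messid (PK), C1, C2, …); ContactNote (ContactNoteId (PK), ContactNote, ContactId, MessId); CI_Mig_log (MessID (PK), Flag)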
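The same idea, migrating in small committed batches and flagging finished rows so the job can be stopped and resumed, can be sketched in Python with sqlite3. This is a variation and not the slide's PL/SQL: it fetches a limited batch per round instead of committing inside one cursor loop. The `migrate_in_batches` and `build_sample_db` helpers and the sample values are hypothetical.

```python
import sqlite3


def migrate_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Copy C1/C2 into ContactNote one batch at a time; return rows migrated."""
    migrated = 0
    while True:
        # Only rows still flagged 0 are pending, so a restart resumes where it stopped.
        batch = conn.execute(
            "SELECT a.MessId, a.C1, a.C2 FROM ContactInfo a "
            "JOIN CI_Mig_log b ON a.MessId = b.MessId "
            "WHERE b.Flag = 0 LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not batch:
            return migrated
        with conn:  # one short transaction per batch
            for mess_id, c1, c2 in batch:
                conn.execute(
                    "UPDATE ContactNote SET ContactNote = ?, ContactId = ? WHERE MessId = ?",
                    (c1, c2, mess_id),
                )
                conn.execute("UPDATE CI_Mig_log SET Flag = 1 WHERE MessId = ?", (mess_id,))
        migrated += len(batch)


def build_sample_db(rows: int = 5) -> sqlite3.Connection:
    """Hypothetical data: one ContactInfo, ContactNote, and log row per MessId."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ContactInfo (MessId INTEGER PRIMARY KEY, C1 TEXT, C2 INTEGER);
        CREATE TABLE ContactNote (ContactNoteId INTEGER PRIMARY KEY, ContactNote TEXT,
                                  ContactId INTEGER, MessId INTEGER);
        CREATE TABLE CI_Mig_log (MessId INTEGER PRIMARY KEY, Flag INTEGER DEFAULT 0);
        """
    )
    with conn:
        for i in range(1, rows + 1):
            conn.execute("INSERT INTO ContactInfo VALUES (?, ?, ?)", (i, f"note-{i}", i * 10))
            conn.execute("INSERT INTO ContactNote VALUES (?, NULL, NULL, ?)", (i, i))
            conn.execute("INSERT INTO CI_Mig_log VALUES (?, 0)", (i,))
    return conn
```

Because progress lives in CI_Mig_log, the job can be cancelled when the server is busy and rerun later; a second call simply finds no pending rows.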
    23. 24. Versioning Database
          • All database schema changes can be thought of as DB refactorings
          • As updates are applied to a database, the changes will be recorded in a table similar to the following:
            Change                                   Date
            1_Create_Customer_Table.sql              4-15-07
            2_Add_e-mail_address_column.sql          4-17-07
            3_Add_fax_number_column.sql              4-18-07
            4_Add_transaction_table.sql              4-21-07
            5_Add_transaction_status_column.sql      4-24-07
            6_Add_customer-transaction_view.sql      4-27-07
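A change table like the one above can drive upgrades: apply only the numbered scripts that are not yet recorded, then record them. The sketch below shows the idea with Python's sqlite3. The `changelog` table layout, the `apply_pending` helper, and the SQL inside each delta are assumptions for illustration; this is not dbdeploy's actual schema or code.

```python
import sqlite3

# Hypothetical delta scripts; the file names follow the slide, the SQL bodies are invented.
DELTAS = [
    ("1_Create_Customer_Table.sql",
     "CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);"),
    ("2_Add_e-mail_address_column.sql", "ALTER TABLE Customer ADD COLUMN Email TEXT;"),
    ("3_Add_fax_number_column.sql", "ALTER TABLE Customer ADD COLUMN Fax TEXT;"),
]


def change_number(filename: str) -> int:
    """Extract the leading number from a name such as '2_Add_x.sql'."""
    return int(filename.split("_", 1)[0])


def apply_pending(conn: sqlite3.Connection, deltas) -> list:
    """Apply deltas missing from the changelog, in numeric order; return their names."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog ("
        "change_number INTEGER PRIMARY KEY, description TEXT NOT NULL, applied_at TEXT NOT NULL)"
    )
    applied = {row[0] for row in conn.execute("SELECT change_number FROM changelog")}
    newly_applied = []
    for filename, sql in sorted(deltas, key=lambda d: change_number(d[0])):
        number = change_number(filename)
        if number in applied:
            continue  # already recorded, skip
        conn.executescript(sql)
        with conn:  # record the change only after the script ran
            conn.execute(
                "INSERT INTO changelog VALUES (?, ?, datetime('now'))", (number, filename)
            )
        newly_applied.append(filename)
    return newly_applied
```

Running `apply_pending` against any copy of the database brings it to the latest version regardless of where it started, which is what makes developer, test, and production copies converge.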
    24. 25. Put them under configuration management control
        [Figure: kinds of DB scripts kept under version control]
          • DDL: CREATE TABLE money ( eek NUMBER );
          • Meta data: Insert into AA(mydata) Values(11);
          • DML: Delete from ..
          • Optimization: Create index
          • Data migration: Merge into
          • Tests: //Test for…
          • Installation scripts
    25. 26. Configuration of DB project
        Database project under version control:
          • Tiny db backup
          • Delta scripts of dbdeploy
          • Scripts for defining db objects that depend on the schema
          • Scripts for one-time data migration
          • Scripts for data sync
          • Scripts for checking dirty data
          • Tools exclusively for the database project
          • Other scripts…
    26. 27. Database Deployment
        An automated process is needed for upgrading out-of-date databases.
    27. 28. Our Practices
          • Nothing is used only once
          • Automate tasks such as
              • Physical table deployment
              • Usage statistics
              • Schema verification
              • Data migration verification
          • Introduce tools, like
              • Ant, dbdeploy
    28. 29. Managing DB deployment: DB Deploy - http://dbdeploy.com/
    29. 30. DBDeploy
        http://dbdeploy.com/documentation/getting-started/rules-for-using-dbdeploy/
          • Naming convention for delta scripts:
              • NUMBER COMMENT.SQL
              • e.g. 1_Create_Customer_Table.sql
              • …
          • Undo section, marked by comments:
            CREATE TABLE FOO (
              FOO_ID INTEGER NOT NULL,
              FOO_VALUE VARCHAR(30)
            );
            ALTER TABLE FOO ADD CONSTRAINT PK_FOO PRIMARY KEY (FOO_ID);
            --//@UNDO
            DROP TABLE FOO;
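To make the undo convention concrete, here is a small sketch of how a delta script could be split at the `--//@UNDO` marker into its apply and rollback parts. It is written for illustration only and is not dbdeploy's own implementation; the `split_delta` helper and the `DELTA` constant are hypothetical names, while the script text is the one from the slide.

```python
from typing import Tuple

UNDO_MARKER = "--//@UNDO"

# The delta script from the slide, including its undo section.
DELTA = """\
CREATE TABLE FOO (
  FOO_ID INTEGER NOT NULL,
  FOO_VALUE VARCHAR(30)
);
ALTER TABLE FOO ADD CONSTRAINT PK_FOO PRIMARY KEY (FOO_ID);
--//@UNDO
DROP TABLE FOO;
"""


def split_delta(script: str) -> Tuple[str, str]:
    """Return (apply_sql, undo_sql); undo_sql is empty when no marker is present."""
    apply_part, _marker, undo_part = script.partition(UNDO_MARKER)
    return apply_part.strip(), undo_part.strip()


apply_sql, undo_sql = split_delta(DELTA)
```

Keeping both halves in one file means each change travels with its own rollback, so an undo script for a range of changes can be assembled from the same deltas in reverse order.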
    30. 31. Ant
        <target name="gen-and-exec-delta-script">
          <dbdeploy
            driver="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@localhost:1521:XE"
            userid="dylan"
            password="nalyd"
            dir="./sql/deltas/"
            outputfile="./build_output/db-deltas.sql"
            dbms="ora"/>
          <sql
            driver="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@localhost:1521:XE"
            userid="dylan"
            password="nalyd"
            src="./build_output/db-deltas.sql"
            onerror="abort"/>
        </target>
    31. 32. DBDeploy
          • Go to the directory with the SQL files:
              • "1 create_customer_table.sql"
              • "2 add_customer_id_constraint.sql"
          • Run "ant"
        Output:
            gen-and-exec-delta-script:
            [dbdeploy] dbdeploy v2.11
            [dbdeploy] Reading change scripts from directory C:\Projects\dbdeploy-demo\sql\deltas...
            [dbdeploy] Changes currently applied to database:
            [dbdeploy] 1, 2
            [dbdeploy] Scripts available:
            [dbdeploy] 1, 2, 3, 4
            [dbdeploy] To be applied:
            [dbdeploy] 3, 4
            [sql] Executing file: C:\Projects\dbdeploy-demo\build_output\db-deltas.sql
            [sql] 8 of 8 SQL statements executed successfully
    32. 33. Automate Tasks
          • CreateNewTestDB
          • upgradeDB
        <target name="-parseDbScripts">…</target>
        …
        <target name="-upgradeDB"
                depends="-parseDbScripts, -dbdeploy, -runDeltaScript, -dropDbLogic, -createDbLogic"
                description="Upgrade specified database to latest version" />
        <target name="rebuildDB"
                depends="-parseDbScripts, -dropDb, -createDb, -initialiseDb, -dbdeploy, -runDeltaScript, -createDbLogic"
                description="drop, recreate and initialise the Connect database" />
    33. 34. Batch file, shared with devs:
            library ant ant.exe -buildfile:evovle.build -D:rebuildDB -logfile:build.txt
        Do what we want with just one command.
    34. 35. References:
          • Evolutionary Database Design, http://martinfowler.com/articles/evodb.html
          • Refactoring Databases: Evolutionary Database Design
    35. 36. Q&A
