Causality-Based Versioning
    Kiran-Kumar Muniswamy-Reddy and David A. Holland
    Slides By Authors And Aleatha Parker-Wood




Tuesday, June 1, 2010
Versioning


    •    Already popular

    •    Saves backup “versions” of files as they change

    •    Two flavors: versioning (event based) and snapshotting (time based)

    •    Snapshots: WAFL, Venti...

    •    Versioning: Elephant, VersionFS...



Why Version/Snapshot?

    •    Disaster recovery is baked into the file system

    •    “Oops, I needed that...”

    •    “Oops, I didn’t mean to click that virus...”

    •    “Oops, that new driver patch broke everything...”

    •    Maintains backup files to which you can recover (without going
         offsite)


Causality

    •    Depends on time (to cause Y, X must be before it)

    •    Unidirectional (if X causes Y, Y cannot cause X)

    •    Defined in terms of data flow

          •    A reads B ⇒ B causes A

          •    A writes B ⇒ A causes B

    •    PASS, Intrusion Detection Systems (BackTracker, Taser...)
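
The data-flow definition above can be sketched in a few lines. This is only an illustration: the event format `(actor, op, target)` and the function name `causal_edges` are assumptions made for this example, not something taken from PASS or the other systems named on the slide.

```python
def causal_edges(events):
    """Turn (actor, op, target) events into (cause, effect) pairs.

    Hypothetical event format, chosen for illustration:
      'read':  actor reads target  => target causes actor
      'write': actor writes target => actor causes target
    """
    edges = []
    for actor, op, target in events:
        if op == "read":
            edges.append((target, actor))
        elif op == "write":
            edges.append((actor, target))
        else:
            raise ValueError(f"unknown operation: {op}")
    return edges


events = [
    ("cat", "read", "notes.txt"),   # notes.txt causes cat
    ("cat", "write", "copy.txt"),   # cat causes copy.txt
]
print(causal_edges(events))
# [('notes.txt', 'cat'), ('cat', 'copy.txt')]
```

Following the edges forward from `notes.txt` reaches `copy.txt`, which is the kind of data propagation the next slide is about.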


Why Causality?



    •    Track propagation of data

    •    Find out what files were modified by what processes

    •    Reconstruct the scene of the crime




Causality-Based Versioning

    •    Decide when to version using causal relationships between two files

    •    Keeps the advantages of versioning file systems and snapshots

    •    Eases recovery from corruption, viruses, and user mistakes

    •    In addition, creates causal links between files

    •    Easier to decide what to restore

    •    Sort of like transactions on steroids


Applications


    •    Intrusion Recovery

    •    System configuration management

    •    IP compliance

    •    Reproduction of research results




A Scenario...


          •    Apache split-logfile Vulnerability

          •    Vulnerability in Apache 1.3

          •    Vulnerability allows attacker to overwrite any file with a .log
               extension

          •    Let’s look at the current versioning options...




The Goal



    •    One of these has too much information

    •    The other not enough

    •    Can we leverage causality to create just enough versions?




Creating Just Enough Versions


    •    Building on top of the Provenance Aware Storage System (PASS)

    •    Two options

          •    Cycle Avoidance

          •    Graph Finesse




How PASS works


    •    Translates system calls to provenance records (read/write become
         edges in a dependency graph)

    •    Maintains provenance for transient objects such as pipes and
         processes, and creates virtual objects as needed

    •    Analyzes the records to ensure there are no cyclic dependencies between objects

    •    Causality-based versioning extends the analysis phase



The big idea



    •    Cycles are violations of causality

    •    The creation of a cycle is an indicator that this is an interesting event

    •    We can prevent cycles by creating a new version every time a cycle is
         about to occur
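
A minimal sketch of this idea, assuming a dependency graph stored as a plain dictionary (the names `has_cycle`, `P`, `A`, `A1`, and `A2` are illustrative and not taken from PASS): a process that reads a file and then writes it forms a cycle, and the cycle disappears once the write goes to a new version.

```python
def has_cycle(deps):
    """Return True if the dependency graph contains a cycle.

    `deps` maps each object to the set of objects it depends on.
    Uses a depth-first search with the usual three-colour marking.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GREY
        for parent in deps.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GREY:
                return True            # back edge: a cycle
            if state == WHITE and visit(parent):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(deps))


# Unversioned: process P reads file A, then writes A.
unversioned = {"P": {"A"}, "A": {"P"}}
# Versioned: the write produces a new version A2 instead of changing A1.
versioned = {"P": {"A1"}, "A2": {"P"}}

print(has_cycle(unversioned))  # True
print(has_cycle(versioned))    # False
```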




Version-On-Write?

    •    We could remove cycles using Version-On-Write

    •    Every read creates a new version of the process

    •    Every write creates a new version of the file

    •    But this results in 8 versions

    •    Huge management overhead

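
The Version-On-Write policy can be sketched as a replay over an event list. The event format and the three-event trace below are hypothetical and are not the example from the slides' figure, so the counts here are not the "8 versions" mentioned above.

```python
def version_on_write(events):
    """Replay (process, op, file) events under Version-On-Write.

    Returns the current version number of every object seen.
    Every object starts at version 1 (an assumption of this sketch).
    """
    versions = {}
    for process, op, file in events:
        versions.setdefault(process, 1)
        versions.setdefault(file, 1)
        if op == "read":
            versions[process] += 1   # every read: new version of the process
        elif op == "write":
            versions[file] += 1      # every write: new version of the file
        else:
            raise ValueError(f"unknown operation: {op}")
    return versions


trace = [("P", "read", "A"), ("P", "write", "B"), ("P", "write", "B")]
print(version_on_write(trace))
# {'P': 2, 'A': 1, 'B': 3}
```

Even in this tiny trace the second write to `B` creates a version that adds no new causal information, which is the overhead the following algorithms try to avoid.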
Cycle Avoidance Algorithm

    •    Uses local information about the object

    •    Create a new version of an object whenever a new ancestor is added

    •    Different versions are considered to be “new” ancestors

    •    Not every write causes a new version

The Algorithm

    •    Assume new data: A1 depends on B2

    •    If B is not in A’s dependencies, create a new version of A

    •    Else if B is already in A’s dependencies:

          •    If B2 is in dependencies, discard (no new information)

          •    If B3 is in dependencies, discard (no new causality)

          •    If B1 is in dependencies, create new version of A

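
The case analysis above can be written down as a small class. This is a sketch under stated assumptions, not the authors' implementation: the class and method names are invented for illustration, and it assumes a new version of A keeps the ancestor set of the previous version and simply records the newer version of B.

```python
class CycleAvoidance:
    """Per-object ancestor tracking, using only local information."""

    def __init__(self):
        self.version = {}    # object name -> current version number
        self.ancestors = {}  # object name -> {ancestor name: ancestor version}

    def add_dependency(self, a, b, b_version):
        """Record that the current version of `a` depends on `b` at `b_version`.

        Returns 'new_version' or 'discard'.
        """
        self.version.setdefault(a, 1)
        known = self.ancestors.setdefault(a, {})
        seen = known.get(b)
        if seen is not None and seen >= b_version:
            # Same version: no new information.
            # Newer version already recorded: no new causality.
            return "discard"
        # B is unknown, or only an older version of B is known:
        # treat it as a new ancestor and create a new version of A.
        self.version[a] += 1
        known[b] = b_version
        return "new_version"


ca = CycleAvoidance()
print(ca.add_dependency("A", "B", 2))  # new_version (B was not an ancestor)
print(ca.add_dependency("A", "B", 2))  # discard     (same version of B)
print(ca.add_dependency("A", "B", 1))  # discard     (a newer B is already known)
print(ca.add_dependency("A", "B", 3))  # new_version (only an older B was known)
```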
Graph Finesse

    •    As before: A1 depends on B2

    •    If B2 is already in A’s history, discard

    •    Otherwise, check for a path from B2 to A1

          •    If yes, we have a cycle. Make a new version of A

          •    Otherwise, add A1 → B2 to the dependency graph

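
A rough sketch of these steps, using a global dependency graph over `(name, version)` nodes. The class and method names are hypothetical. Two details are assumptions of this sketch rather than statements from the slides: "A's history" is read as the direct dependencies of every version of A, and a newly created version is made to depend on both its predecessor and B.

```python
class GraphFinesse:
    """Create a new version only when an edge would close a cycle."""

    def __init__(self):
        self.current = {}  # name -> current version number
        self.deps = {}     # (name, version) -> set of nodes it depends on

    def _node(self, name):
        return (name, self.current.setdefault(name, 1))

    def _reaches(self, start, goal):
        """True if `goal` is reachable from `start` along dependency edges."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.deps.get(node, ()))
        return False

    def add_dependency(self, a, b):
        """Record that the current version of `a` depends on the current `b`."""
        a_node, b_node = self._node(a), self._node(b)
        # Already in a's history: nothing new to record.
        for v in range(1, a_node[1] + 1):
            if b_node in self.deps.get((a, v), ()):
                return "discard"
        # A path from b to a means the new edge would close a cycle.
        if self._reaches(b_node, a_node):
            self.current[a] += 1
            self.deps[(a, self.current[a])] = {a_node, b_node}
            return "new_version"
        self.deps.setdefault(a_node, set()).add(b_node)
        return "added"


gf = GraphFinesse()
print(gf.add_dependency("P", "A"))  # added        (P reads A)
print(gf.add_dependency("A", "P"))  # new_version  (P writes A: would be a cycle)
print(gf.add_dependency("A", "P"))  # discard      (already in A's history)
```

The path check walks the graph, so it costs more per dependency than the local check in Cycle Avoidance, but it only creates a version when a cycle would actually form.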
Evaluation

    •    Run-time overhead

    •    Space overhead

    •    Recovery costs

    •    All results are average of 5 runs

    •    Less than 5% standard deviation

Workloads used

    •    Linux compile (CPU intensive)

    •    Postmark (I/O intensive)

    •    Applying patches with Mercurial (developer workload)

    •    BLAST protein sequencing (scientific workload)

Algorithms used

    •    Without causal data:

          •    Ext2: Baseline (Lasagna, Harvard’s versioning FS, on top of ext2)

          •    VER: Plain open-close versioning

    •    With causal data:

          •    OC: Open-close

          •    CA: Cycle-Avoidance

          •    GF: Graph Finesse

          •    ALL: version on every write

Conclusions

    •    Both algorithms require less time and space than Version-On-Write

    •    Both algorithms offer finer grained control than Open-Close

    •    Graph-Finesse creates fewer unnecessary versions

    •    Cycle-Avoidance has overhead comparable to Open-Close

Expanding on it

    •    Not just good for disaster recovery

    •    Search

    •    Social network analysis