MERGING FILES
INTRODUCTION

   Merging in revision control is a fundamental operation that
    reconciles multiple changes made to a revision-controlled collection
    of files.

   Most often, it is necessary when a file is modified by two people on
    two different computers at the same time.

   When two branches are merged, the result is a single collection of
    files that contains both sets of changes.

   Merging is also the final step of external sorting, where sorted
    sub-files are combined into one file.
EXTERNAL SORTING
   External sorting is a term for a class of sorting algorithms that can
    handle massive amounts of data.

   External sorting is required when the data being sorted does not fit
    into the main memory of a computing device and must instead reside
    in the slower external memory.

   External sorting typically uses a sort-merge strategy.

   In the sort phase, chunks of data small enough to fit in main memory
    are read, sorted, and written out as temporary sub-files.

   In the merge phase, the sorted sub-files are combined into a single
    larger file.
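
   The two phases can be sketched as follows. This is a minimal
    illustration, not a tuned implementation: the function name
    external_sort and the chunk_size parameter are hypothetical, and
    the input is assumed to be a text file with one integer per line
    and no blank lines.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(input_path, output_path, chunk_size=1000):
    """Sort a file of integers (one per line) using sorted runs on disk."""
    run_paths = []

    # Sort phase: read a chunk that fits in memory, sort it,
    # and write it out as a temporary sorted sub-file (a "run").
    with open(input_path) as src:
        while True:
            chunk = [int(line) for line in islice(src, chunk_size)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(suffix=".run", text=True)
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{value}\n" for value in chunk)
            run_paths.append(path)

    # Merge phase: combine all sorted runs into a single larger file.
    # heapq.merge reads lazily, so only one item per run is held in memory.
    runs = [open(path) for path in run_paths]
    try:
        with open(output_path, "w") as out:
            merged = heapq.merge(*(map(int, run) for run in runs))
            out.writelines(f"{value}\n" for value in merged)
    finally:
        for run in runs:
            run.close()
        for path in run_paths:
            os.remove(path)
```

   With a small chunk_size, the sketch creates several runs even for
    a short input, which makes the merge phase easy to observe.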
MERGE TECHNIQUES

   Two Way Merge

   Three Way Merge

   K-Way Merge
TWO WAY MERGE

   A two-way merge performs an automated difference
    analysis between a file 'A' and a file 'B'.

   This method considers the differences between the two
    files alone to conduct the merge and makes a "best-guess"
    analysis to generate the resulting merge.
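
   The difference analysis between 'A' and 'B' can be illustrated
    with Python's standard difflib module. This is only a sketch of
    the analysis step; the helper name two_way_differences is
    hypothetical, and no particular merge tool is claimed to work
    this way.

```python
import difflib


def two_way_differences(a_lines, b_lines):
    """Return the regions where two lists of lines differ."""
    matcher = difflib.SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        # 'equal' regions need no decision; everything else does.
        if tag != "equal":
            differences.append(
                (tag, a_lines[a_start:a_end], b_lines[b_start:b_end])
            )
    return differences


file_a = ["alpha", "beta", "gamma"]
file_b = ["alpha", "BETA", "gamma", "delta"]
for difference in two_way_differences(file_a, file_b):
    print(difference)
```

   Each reported region shows that the files differ, but not which
    side holds the intended change. That missing information is why
    a two-way merge has to guess.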
TWO WAY MERGE ALGORITHM
DISADVANTAGES OF TWO WAY MERGE

   This type of merge is usually the most error-prone.

   It requires user intervention to verify, and sometimes
    correct, the result of the merge.
THREE WAY MERGE
   A three-way merge is performed after an automated difference
    analysis between a file 'A' and a file 'B' while also considering
    the origin, or parent, of both files.

   This type of merge is more likely to be usable in revision
    control systems, which can guarantee that such a parent exists
    and is known.

   The merge tool examines the differences and patterns
    appearing in the changes between both files as well as the
    parent.
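
   The role of the parent can be shown with a deliberately
    simplified sketch. It assumes that the parent, 'A', and 'B' all
    have the same number of lines, so lines can be compared position
    by position; real merge tools also handle inserted and deleted
    lines. The function name three_way_merge is hypothetical.

```python
def three_way_merge(base, a, b):
    """Merge line lists a and b against their common parent, line by line.

    Simplifying assumption: all three lists have the same length.
    Returns (merged_lines, conflict_indexes).
    """
    if not (len(base) == len(a) == len(b)):
        raise ValueError("this sketch only handles line-aligned inputs")

    merged = []
    conflicts = []
    for index, (parent, left, right) in enumerate(zip(base, a, b)):
        if left == right:
            merged.append(left)      # both sides agree
        elif left == parent:
            merged.append(right)     # only B changed this line
        elif right == parent:
            merged.append(left)      # only A changed this line
        else:
            conflicts.append(index)  # both changed it differently
            merged.append((left, right))  # left unresolved for the user
    return merged, conflicts


base = ["x = 1", "y = 2", "z = 3"]
file_a = ["x = 10", "y = 2", "z = 3"]
file_b = ["x = 1", "y = 2", "z = 30"]
print(three_way_merge(base, file_a, file_b))
# (['x = 10', 'y = 2', 'z = 30'], [])
```

   Because the parent reveals which side changed each line, both
    edits are kept without any user decision. A conflict is reported
    only when both sides change the same line in different ways.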
ADVANTAGES OF THREE WAY MERGE

   This type of merge is the most reliable and performs well
    in practice.

   It also requires the least amount of user intervention.

   In many cases it requires no intervention at all, which makes
    the process suitable for task automation.
K-WAY MERGE ALGORITHM
Let there be two arrays:

• An array of k sorted lists, and
• An array of k index values, each giving the position of the current
  element in the corresponding list.
Main loop of the K-Way Merge algorithm:
• Find the minimum current item, minItem, among the lists that are
  not yet exhausted.
• Process minItem (output it to the output list).
• For i = 0 to k-1 (in increments of 1):
     If the current item of list i is equal to minItem, then
     advance list i.
• Go back to the first step, stopping once every list is exhausted.
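
   The main loop above can be sketched in Python as follows. The
    names k_way_merge and positions are illustrative, and the lists
    are assumed to be sorted in ascending order. Note one consequence
    of the advance step: a value that is current in several lists at
    once is output only once.

```python
def k_way_merge(lists):
    """Merge k sorted lists following the main loop described above."""
    k = len(lists)
    positions = [0] * k  # index of the current element in each list
    output = []

    while True:
        # Lists that still have a current item.
        active = [i for i in range(k) if positions[i] < len(lists[i])]
        if not active:
            break

        # Find the minimum current item.
        min_item = min(lists[i][positions[i]] for i in active)

        # Process it (output it to the output list).
        output.append(min_item)

        # Advance every list whose current item equals min_item.
        for i in active:
            if lists[i][positions[i]] == min_item:
                positions[i] += 1

    return output


print(k_way_merge([[1, 4, 7], [2, 4, 8], [3, 4, 9]]))
# [1, 2, 3, 4, 7, 8, 9]
```

   Finding the minimum by scanning all k current items costs k
    comparisons per output item, which is acceptable for small k.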
K-WAY MERGE
   A tournament tree, as in tournament sort, can be used to find the
    minimum of the k current items.
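
   As a rough illustration of selecting the minimum without
    scanning all k lists, the sketch below uses Python's heapq module.
    This is a binary heap standing in for a tournament tree, not a
    tournament tree itself; both select the next item in about log k
    comparisons. The function name k_way_merge_heap is hypothetical,
    and unlike the main loop shown earlier this version keeps
    duplicate values.

```python
import heapq


def k_way_merge_heap(lists):
    """Merge k sorted lists using a binary heap to pick each minimum."""
    # Each heap entry is (value, list index, position within that list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)

    output = []
    while heap:
        value, i, position = heapq.heappop(heap)
        output.append(value)
        # Replace the removed item with the next one from the same list.
        if position + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][position + 1], i, position + 1))
    return output


print(k_way_merge_heap([[1, 4, 7], [2, 4, 8], [3, 4, 9]]))
# [1, 2, 3, 4, 4, 4, 7, 8, 9]
```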
PERFORMANCE FACTORS

   The number of records to be sorted.

   The size of the records.

   The number of storage devices used.

   The distribution of those devices on the available I/O
    channels.

   The distribution of key values in the input files.
THANK YOU
