Instructions:
The input to this homework is a set of DNA reads. You can assume that if two reads have a
suffix-prefix overlap longer than 50 bp, they can be considered as having been sampled from
the same region of the genome and can be reliably connected. Also note that minor sequencing
errors (e.g., mismatches or insertions/deletions) may be present in the overlapping region.
Submit your reconstructed genomic sequence in FASTA format.
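The overlap rule above can be sketched in Python as follows. This is only an illustration, not a prescribed solution: the function name suffix_prefix_overlap, the min_overlap=51 default (reading "longer than 50 bp" as at least 51 bases), and the 5% mismatch tolerance are all assumptions. The sketch tolerates substitutions only; handling insertions/deletions would need an alignment-based comparison instead.

```python
def suffix_prefix_overlap(a, b, min_overlap=51, max_mismatch_rate=0.05):
    """Length of the longest suffix of `a` that matches a prefix of `b`.

    A candidate overlap of length L is accepted when it contains at most
    int(L * max_mismatch_rate) mismatching positions (substitutions only).
    Returns 0 when no overlap of at least `min_overlap` bases qualifies.
    `min_overlap` is assumed to be >= 1.
    """
    # Try the longest candidate first so the first hit is the longest overlap.
    for length in range(min(len(a), len(b)), min_overlap - 1, -1):
        allowed = int(length * max_mismatch_rate)
        mismatches = 0
        for x, y in zip(a[len(a) - length:], b[:length]):
            if x != y:
                mismatches += 1
                if mismatches > allowed:
                    break  # too many errors, try a shorter overlap
        else:
            return length  # loop finished without exceeding the tolerance
    return 0


# Toy example with a small threshold so it is easy to check by hand:
# the suffix "ACGTACGT" of the first read equals the prefix of the second.
print(suffix_prefix_overlap("TTTTACGTACGT", "ACGTACGTGGGG",
                            min_overlap=4, max_mismatch_rate=0.0))  # 8
```

Scanning from the longest candidate down keeps the code short; for larger read sets an index over k-mers or a suffix structure would likely be preferable.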
File data:
>1
CCCTGTCTACCACCCAGACTATCGTGTAGTTCTGCCTGTTCCGTAAGTCGTAGATTGCTATCCTGGAAATCATCGTGCTCAGGATGTTAATATCTAGCGT
>2
AGACTATCGTGTAGTTCTGCCTGTTCCGTAAGTCGTAGATTGCTATCCTGGAAATCATCGTGCTCAGGATGTTAATATCTAGCGTCCTACGTTACGAGTT
>3
TCTGCCTGTTCCGTAAGTCGTAGATTGCTATCCTGGAAATCATCGTGCTCAGGATGTTAATATCTAGCGTCCTACGTTACGAGTTGGCAGATGACAGATC
>4
AGTCGTAGATTGCTATCCTGGAAATCATCGTGCTCAGGATGTTAATATCTAGCGTCCTACGTTACGAGTTGGCAGATGACAGATCGTAGTCGTGGTAAGG
>5
TCCTGGAAATCATCGTGCTCAGGATGTTAATATCTAGCGTCCTACGTTACGAGTTGGCAGATGACAGATCGTAGTCGTGGTAAGGGGCATTGCCGCTTGT
>6
TGCTCAGGATGTTAATATCTAGCGTCCTACGTTACGAGTTGGCAGATGACAGATCGTAGTCGTGGTAAGGGGCATTGCCGCTTGTGACCCAGTTCGCGTG
>7
TATCTAGCGTCCTACGTTACGAGTTGGCAGATGACAGATCGTAGTCGTGGTAAGGGGCATTGCCGCTTGTGACCCAGTTCGCGTGCCTAGCAGCACTCCA
>8
GTTACGAGTTGGCAGATGACAGATCGTAGTCGTGGTAAGGGGCATTGCCGCTTGTGACCCAGTTCGCGTGCCTAGCAGCACTCCAAAATAAAGTTTACAG
>9
ATGACAGATCGTAGTCGTGGTAAGGGGCATTGCCGCTTGTGACCCAGTTCGCGTGCCTAGCAGCACTCCAAAATAAAGTTTACAGTACCGTCCGGACGGC
>10
CGTGGTAAGGGGCATTGCCGCTTGTGACCCAGTTCGCGTGCCTAGCAGCACTCCAAAATAAAGTTTACAGTACCGTCCGGACGGCAGAACTGTCCTCTAG
>11
TGCCGCTTGTGACCCAGTTCGCGTGCCTAGCAGCACTCCAAAATAAAGTTTACAGTACCGTCCGGACGGCAGAACTGTCCTCTAGATCGTCCTAACGCCT
>12
AGTTCGCGTGCCTAGCAGCACTCCAAAATAAAGTTTACAGTACCGTCCGGACGGCAGAACTGTCCTCTAGATCGTCCTAACGCCTTAGTCGAATCCCTTG
>13
CAGCACTCCAAAATAAAGTTTACAGTACCGTCCGGACGGCAGAACTGTCCTCTAGATCGTCCTAACGCCTTAGTCGAATCCCTTGCCGTCGGTAACCACT
>14
AAGTTTACAGTACCGTCCGGACGGCAGAACTGTCCTCTAGATCGTCCTAACGCCTTAGTCGAATCCCTTGCCGTCGGTAACCACTGAATAAACTACGCGT
>15
TCCGGACGGCAGAACTGTCCTCTAGATCGTCCTAACGCCTTAGTCGAATCCCTTGCCGTCGGTAACCACTGAATAAACTACGCGTTAGGACTTTGTCAGA
>16
TGTCCTCTAGATCGTCCTAACGCCTTAGTCGAATCCCTTGCCGTCGGTAACCACTGAATAAACTACGCGTTAGGACTTTGTCAGACGCGAGGAGCTAGTA
>17
CCTAACGCCTTAGTCGAATCCCTTGCCGTCGGTAACCACTGAATAAACTACGCGTTAGGACTTTGTCAGACGCGAGGAGCTAGTAGGAGGACAAATCAGC
>18
GAATCCCTTGCCGTCGGTAACCACTGAATAAACTACGCGTTAGGACTTTGTCAGACGCGAGGAGCTAGTAGGAGGACAAATCAGCAAACGACCCTGAATT
>19
GGTAACCACTGAATAAACTACGCGTTAGGACTTTGTCAGACGCGAGGAGCTAGTAGGAGGACAAATCAGCAAACGACCCTGAATTGAACAATGTGAGTAG
>20
AACTACGCGTTAGGACTTTGTCAGACGCGAGGAGCTAGTAGGAGGACAAATCAGCAAACGACCCTGAATTGAACAATGTGAGTAGGTATAACTGTGCTTG
>21
CTTTGTCAGACGCGAGGAGCTAGTAGGAGGACAAATCAGCAAACGACCCTGAATTGAACAATGTGAGTAGGTATAACTGTGCTTGTATGACGTCCCGTTC
>22
GGAGCTAGTAGGAGGACAAATCAGCAAACGACCCTGAATTGAACAATGTGAGTAGGTATAACTGTGCTTGTATGACGTCCCGTTCGGTCGTTCTTGAGCA
>23
ACAAATCAGCAAACGACCCTGAATTGAACAATGTGAGTAGGTATAACTGTGCTTGTATGACGTCCCGTTCGGTCGTTCTTGAGCAACTTCGGCCAGTGCA
>24
ACCCTGAATTGAACAATGTGAGTAGGTATAACTGTGCTTGTATGACGTCCCGTTCGGTCGTTCTTGAGCAACTTCGGCCAGTGCATGCTATGGGGGAAGC
>25
ATGTGAGTAGGTATAACTGTGCTTGTATGACGTCCCGTTCGGTCGT.
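A possible way to go from reads like these to a FASTA contig is sketched below. It is a minimal greedy approach under stated assumptions, not the expected solution: the helper names (parse_fasta, overlap, greedy_assemble, to_fasta), the 51-base threshold, the 5% mismatch tolerance, and the 60-column line width are all hypothetical choices. Only substitutions are tolerated, and where two reads disagree inside an overlap the left-hand read's bases are kept.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)  # sequences may be wrapped over several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def overlap(a, b, min_overlap=51, max_mismatch_rate=0.05):
    """Longest suffix of `a` matching a prefix of `b` within the tolerance."""
    for length in range(min(len(a), len(b)), min_overlap - 1, -1):
        mismatches = sum(x != y for x, y in zip(a[len(a) - length:], b[:length]))
        if mismatches <= int(length * max_mismatch_rate):
            return length
    return 0


def greedy_assemble(reads, min_overlap=51, max_mismatch_rate=0.05):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap, max_mismatch_rate)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left that can be reliably connected
        # Keep the left read in full and append the non-overlapping tail.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy run with a small threshold so the result can be checked by hand.
toy = parse_fasta(">1\nAAAACCCCGGGG\n>2\nCCCCGGGGTTTT\n>3\nGGGGTTTTACAC\n")
contigs = greedy_assemble([seq for _, seq in toy],
                          min_overlap=6, max_mismatch_rate=0.0)
print(to_fasta("contig_1", contigs[0]), end="")
```

On the reads listed above the overlaps are long; for instance, the last 85 bases of read 1 are identical to the first 85 bases of read 2. Read 25 is cut off in the listing, so it is shorter than the threshold and this sketch would leave it as a separate contig.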