BIOPERL
Table of contents
01 Problem statement
02 Code in BioPerl
03 Explanation
04 Output
01 Problem statement
• Write a Perl program that reads a DNA sequence and calculates the frequency of each codon.
• Extend the script to determine the most and least frequently used codons in the sequence.
• Output the results in a formatted table.
02 Code in BioPerl
#!/usr/bin/perl
use strict;
use warnings;

# Prompt the user for a DNA sequence
print "Enter a DNA sequence: ";
my $dna = <STDIN>;
chomp($dna);

# Initialize a hash to store codon frequencies
my %codon_count;

# Iterate through the DNA sequence in steps of 3 (codon length);
# one or two leftover bases at the end are not counted
for (my $i = 0; $i < length($dna) - 2; $i += 3)
{
    my $codon = substr($dna, $i, 3);
    $codon_count{$codon}++;
}

# Stop early if the input held no complete codon
die "No complete codon found in the input.\n" unless %codon_count;

# Find the most and least frequently used codons
# (ties are resolved by hash iteration order, which Perl does not guarantee)
my ($most_frequent, $least_frequent);
my $max_count = 0;
my $min_count = undef;
foreach my $codon (keys %codon_count)
{
    if ($codon_count{$codon} > $max_count)
    {
        $max_count = $codon_count{$codon};
        $most_frequent = $codon;
    }
    if (!defined $min_count || $codon_count{$codon} < $min_count)
    {
        $min_count = $codon_count{$codon};
        $least_frequent = $codon;
    }
}

# Print results in a formatted (tab-separated) table
print "\nCodon Frequencies:\n";
print "Codon\tFrequency\n";
foreach my $codon (sort keys %codon_count)
{
    printf "%s\t%d\n", $codon, $codon_count{$codon};
}
print "\nMost Frequent Codon: $most_frequent ($max_count times)\n";
print "Least Frequent Codon: $least_frequent ($min_count times)\n";
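The script counts whatever characters the user types, so lowercase letters or stray spaces would produce separate or meaningless codon keys. The following is a minimal sketch of one way to normalize the input before counting. The helper name `count_codons` and the choice to reject any character other than A, C, G, or T are assumptions made for illustration, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): normalize the raw
# input, then count non-overlapping codons starting at the first base.
sub count_codons {
    my ($raw) = @_;
    my $dna = uc $raw;      # accept lowercase input
    $dna =~ s/\s+//g;       # drop spaces and line breaks

    # Assumption: anything other than A, C, G, T is treated as an error.
    die "Sequence contains characters other than A, C, G, T\n"
        if $dna =~ /[^ACGT]/;

    my %codon_count;
    # Same stepping rule as the original loop: one or two leftover
    # bases at the end are ignored.
    for (my $i = 0; $i < length($dna) - 2; $i += 3) {
        $codon_count{ substr($dna, $i, 3) }++;
    }
    return \%codon_count;   # hash reference: codon => count
}

# Usage: mixed case and whitespace are tolerated.
my $counts = count_codons("aaa aaa ttt\ngg");
print "$_\t$counts->{$_}\n" for sort keys %$counts;
```

In this usage example the cleaned sequence is `AAAAAATTTGG`, so the two trailing `G` bases do not form a codon and are left out of the count.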
03 Explanation
The script is structured as follows:
1. User Input: It prompts the user to enter a DNA sequence.
2. Codon Counting: It iterates through the sequence in steps of three to count the occurrences of each codon.
3. Frequency Analysis: It determines the most and least frequently occurring codons.
4. Output: Finally, it prints the frequency of each codon in a formatted table along with the most and least frequent codons.
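Step 3 reports a single codon for each extreme, so when several codons share the same count the winner depends on hash iteration order. As a hedged alternative, the sketch below returns every codon tied for the highest and lowest counts. The helper name `extreme_codons` is hypothetical; `max` and `min` come from the core `List::Util` module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max min);   # core module

# Hypothetical helper: return every codon tied for the highest count and
# every codon tied for the lowest count, each list sorted alphabetically.
sub extreme_codons {
    my ($count) = @_;                 # hash reference: codon => count
    return ([], []) unless %$count;   # nothing to report for empty input
    my $max = max values %$count;
    my $min = min values %$count;
    my @most  = sort grep { $count->{$_} == $max } keys %$count;
    my @least = sort grep { $count->{$_} == $min } keys %$count;
    return (\@most, \@least);
}

# Counts taken from the sample run shown in this deck.
my %codon_count = (AAA => 2, ATT => 1, CCC => 2, GCC => 1,
                   GGG => 1, TGG => 1, TTT => 2);
my ($most, $least) = extreme_codons(\%codon_count);
print "Most frequent:  @$most\n";    # prints AAA CCC TTT
print "Least frequent: @$least\n";   # prints ATT GCC GGG TGG
```

Sorting the tied codons makes the report reproducible from run to run, at the cost of printing a list rather than one codon.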
04 Output
Input:
Enter a DNA sequence: AAAAAAATTTTTTTTTGGGGGGCCCCCCCCC
Output:
Codon Frequencies:
Codon Frequency
AAA 2
ATT 1
CCC 2
GCC 1
GGG 1
TGG 1
TTT 2
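Most Frequent Codon: CCC (2 times)
Least Frequent Codon: GCC (1 times)
Note: AAA, CCC, and TTT each occur twice and four codons occur once, so which codon is reported for each extreme depends on Perl's hash iteration order and may differ between runs. The input has 31 bases, so the final C does not form a complete codon and is not counted.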
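The table above is tab-separated, which can look uneven depending on terminal tab stops. A possible variation, sketched below, uses fixed-width `sprintf` columns and adds each codon's share of the total. The helper name `format_table` and the extra "Share" column are illustrative assumptions, not something the original script prints.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the table as a string so it can be printed
# or inspected. Columns: codon, raw count, share of all counted codons.
sub format_table {
    my ($count) = @_;                 # hash reference: codon => count
    my $total = 0;
    $total += $_ for values %$count;  # total number of counted codons

    my $out = sprintf "%-6s %9s %8s\n", 'Codon', 'Frequency', 'Share';
    for my $codon (sort keys %$count) {
        # Guard against division by zero when the hash is empty.
        my $share = $total ? 100 * $count->{$codon} / $total : 0;
        $out .= sprintf "%-6s %9d %7.1f%%\n",
                        $codon, $count->{$codon}, $share;
    }
    return $out;
}

# Counts taken from the sample run shown in this deck.
my %codon_count = (AAA => 2, ATT => 1, CCC => 2, GCC => 1,
                   GGG => 1, TGG => 1, TTT => 2);
print format_table(\%codon_count);
```

With the sample counts there are 10 codons in total, so each codon seen twice accounts for 20.0% and each codon seen once for 10.0%.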
Thanks!
