BioPerl Language in Bioinformatics

Table of contents
01 Problem statement
02 Code in Bioperl
03 Explanation
04 Output

• Write a Perl program that reads a DNA sequence and calculates the frequency of each codon.
• Extend the script to determine the most and least frequently used codons in the sequence.
• Output the results in a formatted table.

#!/usr/bin/perl
use strict;
use warnings;
# prompt the user for a DNA sequence
print "Enter a DNA sequence: ";
my $dna = <STDIN>;
chomp($dna);
# initialize a hash to store codon frequencies
my %codon_count;

# iterate through the DNA sequence in steps of 3 (codon length)
for (my $i = 0; $i < length($dna) - 2; $i += 3)
{
    my $codon = substr($dna, $i, 3);
    $codon_count{$codon}++;
}
# find the most and least frequently used codons
my ($most_frequent, $least_frequent);
my $max_count = 0;
my $min_count = undef;

foreach my $codon (keys %codon_count)
{
    if ($codon_count{$codon} > $max_count)
    {
        $max_count = $codon_count{$codon};
        $most_frequent = $codon;
    }
    if (!defined $min_count || $codon_count{$codon} < $min_count)
    {
        $min_count = $codon_count{$codon};
        $least_frequent = $codon;
    }
}

# print results in a formatted, tab-separated table
print "\nCodon Frequencies:\n";
print "Codon\tFrequency\n";
foreach my $codon (sort keys %codon_count)
{
    printf "%s\t%d\n", $codon, $codon_count{$codon};
}
print "\nMost Frequent Codon: $most_frequent ($max_count times)\n";
print "Least Frequent Codon: $least_frequent ($min_count times)\n";

The script is structured as follows:
1. User Input: It prompts the user to enter a DNA sequence and strips the trailing newline.
2. Codon Counting: It steps through the sequence three bases at a time, starting at the first base, and counts each codon in a hash. Leftover bases at the end that do not form a complete codon are ignored.
3. Frequency Analysis: It scans the hash once to find the codons with the highest and lowest counts.
4. Output: Finally, it prints the frequency of each codon in a tab-separated table, sorted alphabetically, followed by the most and least frequent codons.
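
The counting step described above can also be packaged as a reusable subroutine. The following is only a minimal sketch, not part of the original slides: the name count_codons is hypothetical, and the upper-casing and whitespace stripping are assumptions added so that input such as "atg atg" is counted the same way as "ATGATG".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): count non-overlapping
# codons starting at the first base and return a reference to a hash
# mapping codon => number of occurrences.
sub count_codons {
    my ($dna) = @_;
    $dna =~ s/\s+//g;    # assumption: ignore spaces and line breaks
    $dna = uc $dna;      # assumption: treat lower- and upper-case bases alike

    my %count;
    # Stop before any trailing fragment shorter than three bases
    for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
        $count{ substr($dna, $i, 3) }++;
    }
    return \%count;
}

# Usage: print the table for a sequence, sorted alphabetically by codon
my $counts = count_codons("AAAAAAATTTTTTTTTGGGGGGCCCCCCCCC");
printf "%s\t%d\n", $_, $counts->{$_} for sort keys %$counts;
```

Returning a hash reference keeps the counting logic separate from input and printing, so the same routine could be applied to sequences read from a file instead of the keyboard.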

Input:
Enter a DNA sequence: AAAAAAATTTTTTTTTGGGGGGCCCCCCCCC
Output:
Codon Frequencies:
Codon   Frequency
AAA     2
ATT     1
CCC     2
GCC     1
GGG     1
TGG     1
TTT     2
Most Frequent Codon: CCC (2 times)
Least Frequent Codon: GCC (1 times)
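
In this sample run, AAA, CCC and TTT each occur twice and four codons occur once, so the script reports only one codon from each tied group, and which one it picks depends on Perl's hash ordering. A possible way to report every tied codon is sketched below. This is an illustration rather than the original author's code: the helper name codons_with_count is hypothetical, and the counts are hard-coded from the sample output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max min);

# Codon counts copied from the sample output above
my %codon_count = (
    AAA => 2, ATT => 1, CCC => 2, GCC => 1,
    GGG => 1, TGG => 1, TTT => 2,
);

# Hypothetical helper: return every codon whose count equals $target,
# sorted alphabetically so the result does not depend on hash order.
sub codons_with_count {
    my ($counts, $target) = @_;
    my @matches = grep { $counts->{$_} == $target } keys %$counts;
    return sort @matches;
}

my $max_count = max(values %codon_count);
my $min_count = min(values %codon_count);
my @most  = codons_with_count(\%codon_count, $max_count);
my @least = codons_with_count(\%codon_count, $min_count);

print "Most frequent codons:  @most ($max_count times)\n";
print "Least frequent codons: @least ($min_count times)\n";
```

For the sample counts this lists AAA, CCC and TTT as most frequent and ATT, GCC, GGG and TGG as least frequent.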
