This document describes a program for multiple sequence alignment using the star alignment method. The program takes DNA or protein sequences as input, performs pairwise alignments of each sequence to a central sequence, and outputs an alignment scoring matrix and phylogenetic tree showing the evolutionary relationships between the sequences. It was developed in C# using Visual Studio.NET for a graphical user interface. The key program files take input, perform the dynamic programming alignment algorithm, and output XHTML files with the alignment results.
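As a rough illustration of the star alignment idea summarized above, the following Python sketch scores every pair of sequences with a global (Needleman-Wunsch style) dynamic programming recurrence and picks as the central sequence the one with the highest summed pairwise score. This is not the authors' C# implementation: the scoring values (`MATCH`, `MISMATCH`, `GAP`), the function names `nw_score` and `pick_center`, and the choice of a summed-score criterion for the center are all assumptions made for this example.

```python
from itertools import combinations

# Illustrative scoring values; these are assumptions, not taken from the paper.
MATCH, MISMATCH, GAP = 1, -1, -2


def nw_score(a, b):
    """Global alignment score of strings a and b, keeping only two DP rows."""
    prev = [j * GAP for j in range(len(b) + 1)]  # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * GAP]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (MATCH if a[i - 1] == b[j - 1] else MISMATCH)
            curr.append(max(diag, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


def pick_center(seqs):
    """Index of the sequence whose summed pairwise score is highest."""
    totals = [0] * len(seqs)
    for i, j in combinations(range(len(seqs)), 2):
        score = nw_score(seqs[i], seqs[j])
        totals[i] += score
        totals[j] += score
    return max(range(len(seqs)), key=totals.__getitem__)
```

In a full star alignment, each remaining sequence would then be aligned pairwise to the chosen center and the pairwise alignments merged; that merging step is omitted here to keep the sketch short.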
A Program for Multiple Sequence Alignment by Star Alignment
International Journal of Electrical, Electronics and Computers (EEC Journal) [Vol-3, Issue-6, Nov-Dec 2018]
https://dx.doi.org/10.22161/eec.3.6.1 ISSN: 2456-2319
www.eecjournal.com
Dibyaranjan Samal¹*, Nibedita Sahoo¹, Meesala Krishna Murthy²
¹Department of Biotechnology, AMIT College, Khurda 752057, Odisha, India
²Department of Zoology, Mizoram University, Aizawl 796004, Mizoram, India
*Corresponding author: Dibyaranjan Samal, Email: Dibyaranjan.daredivya@gmail.com
Abstract— Sequence alignment is the process of lining up
two or more sequences to achieve a maximal level of identity,
for the purpose of assessing the degree of similarity and the
possibility of homology. Comparison of more than two
strings is known as multiple sequence alignment. Multiple
sequence alignment (MSA) is a powerful tool for locating
similar patterns in biological (DNA and protein) sequences.
The objective of this application is to develop a multiple
sequence alignment tool using the star alignment method. The
application was developed using Visual Studio.Net, with C#
as the page language, which was found to be appropriate for
a rich user experience, faster code construction, automatic
memory allocation and faster debugging.
Keywords— Star alignment, Multiple Sequence
Alignment, Algorithms.
I. INTRODUCTION
Multiple sequence alignment is one of the challenging
tasks in bioinformatics. It plays an essential role in
finding conserved regions among a set of sequences and in
inferring the evolutionary history and common properties of
species (Richer et al., 2007). The problem is known to be
NP-complete, and current implementations of multiple
alignment algorithms are heuristics owing to the high space
and time complexity of performing alignments (Elias, 2006).
Most existing algorithms for multiple sequence alignment
fall into three categories (Brudno et al., 2003). The first
class consists of algorithms that use high-quality
heuristics very close to optimality. They can only handle a
small number of sequences with length less than 20 and are
limited to the sum-of-pairs objective function (Carrillo &
Lipman, 1988). The second class consists of algorithms
using a progressive alignment strategy, by far the most
widely used heuristic for aligning a large number of
sequences (Sze et al., 2006). The third class consists of
algorithms using iterative refinement strategies to improve
an initial alignment until convergence or until a
user-defined maximum number of iterations is reached
(Hirosawa et al., 1995). Heuristics are needed to compute
the MSA (Notredame, 2007). A heuristic usually does not
guarantee the quality of the resulting alignment, but it is
faster and in many cases gives reasonably good answers. In
star alignment, the multiple alignment is built from
pairwise alignments between one of the sequences (called
the center of the star) and all the other sequences
(Kleinjung et al., 2002).
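The choice of the center can be illustrated with a short sketch. It is not taken from the program described here, which assumes a fixed central sequence. It applies the common textbook criterion of picking the sequence whose summed pairwise global alignment score against all other sequences is highest. The class name StarCenterSketch, the method names and the scoring values used in the usage note are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original program.
public static class StarCenterSketch
{
    // Global alignment score of a and b (dynamic programming),
    // keeping only two rows of the score table in memory.
    public static int PairScore(string a, string b, int match, int misMatch, int gap)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++)
        {
            prev[j] = j * gap;              // first row: only gaps
        }
        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i * gap;              // first column: only gaps
            for (int j = 1; j <= b.Length; j++)
            {
                int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : misMatch);
                int up = prev[j] + gap;
                int left = curr[j - 1] + gap;
                curr[j] = Math.Max(diag, Math.Max(up, left));
            }
            int[] tmp = prev; prev = curr; curr = tmp;   // reuse the two rows
        }
        return prev[b.Length];
    }

    // Index of the sequence with the highest summed score against all others.
    public static int ChooseCenter(IList<string> seqs, int match, int misMatch, int gap)
    {
        int best = -1;
        int bestSum = int.MinValue;
        for (int c = 0; c < seqs.Count; c++)
        {
            int sum = 0;
            for (int k = 0; k < seqs.Count; k++)
            {
                if (k != c)
                {
                    sum += PairScore(seqs[c], seqs[k], match, misMatch, gap);
                }
            }
            if (sum > bestSum)
            {
                bestSum = sum;
                best = c;
            }
        }
        return best;
    }
}
```

With match = 1, mismatch = -1 and gap = -2, calling ChooseCenter on the three sequences ACGT, ACGA and TCGA returns index 1 (ACGA), because its summed score (2 + 2) exceeds that of the other two candidates (2 + 0 each).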
This application has been developed in Visual
Studio.Net with C# as the page language. XHTML and CSS are
also used for better result display, and Adobe Photoshop
for animation (Fig. 1).
II. PROGRAM
Three important program files are used to produce the
multiple sequence alignment result:
Program.cs
Input.cs
Result.cs
Program.cs:
The Program.cs file is used for global configuration,
namely to decide which form runs first.
using System;
using System.Collections.Generic;
using System.Windows.Forms;

namespace STARALIGNMENT
{
    static class Program
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            // Start the application with the result form.
            Application.Run(new result());
        }
    }
}
Input.cs:
This is the input form, where the sequences are given to
the tool. It is the most important part of the application,
because the dynamic programming algorithm is implemented in
this form.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.IO;

namespace STARALIGNMENT
{
    public partial class INPUT : Form
    {
        public INPUT()
        {
            InitializeComponent();
        }

        // Close the input form.
        private void button2_Click(object sender, EventArgs e)
        {
            this.Close();
        }
        // GET MAX VALUE
        // Returns the largest of the three candidate scores.
        private int ReturnMax(int i, int j, int k)
        {
            int retval = i;
            if (j > retval)
            {
                retval = j;
            }
            if (k > retval)
            {
                retval = k;
            }
            return retval;
        }
        // CONSTRUCT MATRIX
        // Row 0 holds the letters of the central sequence (textBox1),
        // column 0 holds the letters of seq, and scores start at [1, 1].
        private string[,] ReturnArray(string seq)
        {
            int gap = Convert.ToInt32(textBox7.Text);
            int match = Convert.ToInt32(textBox5.Text);
            int misMatch = Convert.ToInt32(textBox6.Text);
            int cols = textBox1.Text.Length + 2;
            int rows = seq.Length + 2;
            string[,] ar = new string[rows, cols];
            for (int i = 0; i <= rows - 1; i++)
            {
                for (int j = 0; j <= cols - 1; j++)
                {
                    if (i == 0 && (j == 0 || j == 1))
                    {
                        ar[i, j] = " ";
                    }
                    else if (i == 0 && j >= 2)
                    {
                        // Header row: letters of the central sequence.
                        ar[i, j] = textBox1.Text[j - 2].ToString().ToUpper();
                    }
                    else if (j == 0 && i == 1)
                    {
                        ar[i, j] = " ";
                    }
                    else if (j == 0 && i != 1)
                    {
                        // Header column: letters of seq. The original listing read
                        // textBox2.Text here, which is only correct when seq is
                        // the content of that text box.
                        ar[i, j] = seq[i - 2].ToString().ToUpper();
                    }
                    else if (j == 1 && i == 1)
                    {
                        ar[i, j] = "0";
                    }
                    else if (i == 1 && j != 1)
                    {
                        // First score row: accumulate gap penalties.
                        ar[i, j] = Convert.ToString(Convert.ToInt32(ar[i, j - 1]) + gap);
                    }
                    else if (i != 1 && j == 1)
                    {
                        // First score column: accumulate gap penalties.
                        ar[i, j] = Convert.ToString(Convert.ToInt32(ar[i - 1, j]) + gap);
                    }
                    else
                    {
                        int val1 = Convert.ToInt32(ar[i - 1, j]) + gap;
            seq = seq.Substring(0, seq.Length - 1);
            return seq;
        }
    }
}
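Because the listing above is only partly reproduced, the following self-contained sketch shows how a global alignment score table of this kind can be filled and traced back. It is an illustration under stated assumptions, not the authors' code: it stores scores in an int[,] table without header cells (the listing above uses a string[,] with two header rows and columns), and the names GlobalAlignSketch, Fill and Traceback are hypothetical. The match, mismatch and gap parameters play the role of the values read from textBox5, textBox6 and textBox7.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original program.
public static class GlobalAlignSketch
{
    // Fill the score table; rows follow seq, columns follow center.
    public static int[,] Fill(string center, string seq, int match, int misMatch, int gap)
    {
        int[,] s = new int[seq.Length + 1, center.Length + 1];
        for (int j = 1; j <= center.Length; j++)
        {
            s[0, j] = s[0, j - 1] + gap;    // first row: accumulated gaps
        }
        for (int i = 1; i <= seq.Length; i++)
        {
            s[i, 0] = s[i - 1, 0] + gap;    // first column: accumulated gaps
        }
        for (int i = 1; i <= seq.Length; i++)
        {
            for (int j = 1; j <= center.Length; j++)
            {
                int diag = s[i - 1, j - 1] + (seq[i - 1] == center[j - 1] ? match : misMatch);
                int up = s[i - 1, j] + gap;
                int left = s[i, j - 1] + gap;
                s[i, j] = Math.Max(diag, Math.Max(up, left));
            }
        }
        return s;
    }

    // Walk back from the bottom-right cell and rebuild both aligned rows.
    // Returns { aligned center, aligned seq }.
    public static string[] Traceback(int[,] s, string center, string seq,
                                     int match, int misMatch, int gap)
    {
        var top = new StringBuilder();
        var bottom = new StringBuilder();
        int i = seq.Length;
        int j = center.Length;
        while (i > 0 || j > 0)
        {
            if (i > 0 && j > 0 &&
                s[i, j] == s[i - 1, j - 1] + (seq[i - 1] == center[j - 1] ? match : misMatch))
            {
                // Diagonal move: both letters are aligned.
                top.Insert(0, center[j - 1]);
                bottom.Insert(0, seq[i - 1]);
                i--;
                j--;
            }
            else if (i > 0 && s[i, j] == s[i - 1, j] + gap)
            {
                // Vertical move: gap in the central sequence.
                top.Insert(0, '-');
                bottom.Insert(0, seq[i - 1]);
                i--;
            }
            else
            {
                // Horizontal move: gap in seq.
                top.Insert(0, center[j - 1]);
                bottom.Insert(0, '-');
                j--;
            }
        }
        return new[] { top.ToString(), bottom.ToString() };
    }
}
```

For center = ACGT and seq = AGT with match = 1, mismatch = -1 and gap = -2, the bottom-right score is 1 and the traceback yields the rows ACGT and A-GT.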
III. RESULTS
This is the result form of the tool, where an XHTML
document written using file handling (System.IO namespace)
is displayed in the application.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;

namespace STARALIGNMENT
{
    public partial class result : Form
    {
        public result()
        {
            InitializeComponent();
        }

        // Close the result form.
        private void button2_Click_1(object sender, EventArgs e)
        {
            this.Close();
        }

        // Show the generated report in the embedded browser control.
        private void result_Load(object sender, EventArgs e)
        {
            webBrowser1.Navigate(Application.StartupPath + "/rpt.html");
        }
    }
}
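The code that writes rpt.html is not reproduced in the listing. A minimal sketch of how such a report could be written with the System.IO namespace is given below. The class ReportSketch, its method names and the page layout are assumptions for illustration; only the file name rpt.html comes from the result form above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper, not part of the original program.
public static class ReportSketch
{
    // Build a small XHTML page that shows one aligned sequence per line.
    public static string BuildXhtml(IList<string> alignedRows)
    {
        var sb = new StringBuilder();
        sb.AppendLine("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" " +
                      "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">");
        sb.AppendLine("<html xmlns=\"http://www.w3.org/1999/xhtml\">");
        sb.AppendLine("<head><title>Star alignment result</title></head>");
        sb.AppendLine("<body><pre>");
        foreach (string row in alignedRows)
        {
            // Encode the text so that any markup characters stay well-formed.
            sb.AppendLine(WebUtility.HtmlEncode(row));
        }
        sb.AppendLine("</pre></body>");
        sb.AppendLine("</html>");
        return sb.ToString();
    }

    // Write the page as rpt.html into the given folder and return its path.
    public static string WriteReport(string folder, IList<string> alignedRows)
    {
        string path = Path.Combine(folder, "rpt.html");
        File.WriteAllText(path, BuildXhtml(alignedRows), Encoding.UTF8);
        return path;
    }
}
```

In the application, the folder argument would be Application.StartupPath, so that result_Load finds the file it navigates to. The pre element keeps the aligned rows in a fixed-width layout so that gap characters line up column by column.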
IV. CONCLUSION AND FUTURE SCOPE
This program only converts the star alignment method into
a program in which a fixed number of sequences is taken
into consideration, and we also assume our own central
sequence for programming efficiency. This should be done
dynamically: the central sequence should be selected
automatically and any number of sequences should be
accepted. A single repository database of protein, DNA and
RNA sequences is to be built in the next version of this
project / application.
REFERENCES
[1] Richer J.M., Derrien V., Hao J.K. (2007). A New
Dynamic Programming Algorithm for Multiple
Sequence Alignment. COCOA LNCS, 4616, 52-61.
[2] Elias I. (2006). Settling the Intractability of
Multiple Alignment. Journal of Computational
Biology, 13(7), 1323-1339.
[3] Brudno M., Chapman M., Gottgens B., Batzoglou S.,
Morgenstern B. (2003). Fast and Sensitive Multiple
Alignments of Large Genomic Sequences. BMC
Bioinformatics, 4, 66.
[4] Carrillo H., Lipman D.J. (1988). The Multiple
Sequence Alignment Problem in Biology. SIAM
Journal of Applied Mathematics, 48(5), 1073-1082.
[5] Sze S.H., Lu Y., Yang Q. (2006). A Polynomial Time
Solvable Formulation of Multiple Sequence Alignment.
Journal of Computational Biology, 13(2), 309-319.
[6] Hirosawa M., Totoki Y., Hoshida M., Ishikawa M.
(1995). Comprehensive Study on Iterative Algorithms
of Multiple Sequence Alignment. Comput Appl
Biosci, 11, 13-18.
[7] Notredame C. (2007). Recent Evolutions of Multiple
Sequence Alignment Algorithms. PLoS Computational
Biology, 3(8:e123), 1405-1408.
[8] Kleinjung J., Douglas N., Heringa J. (2002). Multiple
Sequence Alignments with Parallel Computing.
Bioinformatics, 18(9), 1270-1271.
Fig. 1: The Visual Studio.Net integrated development environment in which the application was developed.