This document describes a study that developed an integrated biomedical ontology for extracting information from Medline abstracts about Alzheimer's disease. The ontology integrated the Gene Ontology and Medical Subject Headings by mapping gene names, GO terms, and MeSH keywords related to Alzheimer's. The integrated ontology was validated structurally, syntactically, and semantically. It was then used to discover significant associations between proteins, genes, and Alzheimer's disease extracted from Medline abstracts.
TWO LEVEL SELF-SUPERVISED RELATION EXTRACTION FROM MEDLINE USING UMLS - IJDKP
The biomedical research literature is one of many domains that hide precious knowledge, and
the biomedical community makes extensive use of this literature to discover facts about
biomedical entities such as diseases, drugs, etc. MEDLINE is a huge database of biomedical research
papers that remains a significantly underutilized source of biological information. Discovering useful
knowledge from such a huge corpus raises various problems related to the type of information, such as the
concepts related to the domain of the texts and the semantic relationships associated with them. In this paper,
we propose a two-level model for self-supervised relation extraction from MEDLINE using the Unified
Medical Language System (UMLS) knowledge base. The model uses a self-supervised approach for
relation extraction (RE) by constructing enhanced training examples using information from UMLS. The
model shows better results in comparison with the current state of the art and naïve approaches.
NLP based retrieval of medical information for diagnosis of human diseases - eSAT Publishing House
This document describes a proposed natural language processing (NLP) system to retrieve medical information from clinical documents for disease diagnosis. The system would use NLP techniques like named entity recognition, part-of-speech tagging, and relationship extraction to process both clinical documents and user queries. For queries asking for disease information, the system would retrieve and score relevant documents, then output disease information. For queries describing symptoms, the system would attempt to output the corresponding disease name. The system would be implemented using modules for data extraction, processing, query analysis, document retrieval and scoring, and output filtering.
A Semantic Retrieval System for Extracting Relationships from Biological Corpus - ijcsit
The World Wide Web holds a large amount of diverse information. When searching the World Wide Web, users do not always obtain the type of information they expect. In the field of information extraction, extracting semantic relationships between terms from documents has become a challenge. This paper proposes a system that helps retrieve documents based on query expansion and tackles the extraction of semantic relationships from biological documents. The system retrieves documents that are relevant to the input terms and then determines whether a relationship exists. It uses the Boolean model and pattern recognition, which help in determining the relevant documents and in locating the relationship within the biological document. The system constructs a term-relation table that accelerates the relation extraction step. The proposed method offers another use of the system: researchers can use it to figure out the relationship between two biological terms from the information available in the biological documents. For the retrieved documents, the system also measures precision and recall.
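The Boolean retrieval and the precision/recall measurement mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple AND query over whitespace-tokenized text; the function names, the toy documents, and the relevance judgment are hypothetical and are not taken from the paper, which also uses query expansion and pattern recognition that are not shown here.

```python
def boolean_and_search(docs, terms):
    """Return ids of documents that contain every query term (Boolean AND)."""
    wanted = {t.lower() for t in terms}
    return {doc_id for doc_id, text in docs.items()
            if wanted <= set(text.lower().split())}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy corpus, made up for illustration only.
docs = {
    "d1": "insulin regulates glucose uptake",
    "d2": "glucose levels rise after meals",
    "d3": "insulin resistance impairs glucose control",
}
retrieved = boolean_and_search(docs, ["insulin", "glucose"])
# Hypothetical relevance judgment: only d1 is considered relevant.
precision, recall = precision_recall(retrieved, relevant={"d1"})
```

With these toy inputs the query retrieves two documents, one of which is judged relevant, so precision is lower than recall.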
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ... - IJNSA Journal
In health research, one of the major tasks is to retrieve and analyze heterogeneous databases containing
a single patient's information gathered from a large volume of data over a long period of time. The
main objective of this paper is to present our ontology-based information retrieval approach for
clinical information systems. We performed a case study in a real-life hospital setting. The results
obtained illustrate the feasibility of the proposed approach, which significantly improved the information
retrieval process on a large volume of data over a long period of time, from August 2011 until January
2012.
The document describes Carlos Manuel Estévez-Bretón's doctoral research on functionally characterizing metabolic networks. The goals are to classify metabolic pathways based solely on their functional features using machine learning methods, develop a system for functionally representing metabolic networks, and apply machine learning methods to study systems biology in new ways. The methodology involves using data from MetaCyc and KEGG databases, developing a functional representation model, classifying networks with supervised and unsupervised machine learning methods, and evaluating the results using various metrics.
COMPUTATIONAL METHODS FOR FUNCTIONAL ANALYSIS OF GENE EXPRESSION - csandit
Sequencing projects arising from high-throughput technologies, including DNA microarrays, have made it possible to simultaneously measure the expression levels of millions of genes in a biological sample, as well as to annotate and identify the role (function) of those genes. Consequently, to better manage and organize this significant amount of information, bioinformatics approaches have been developed. These approaches provide a representation and a more 'relevant' integration of data in order to test and validate researchers' hypotheses throughout the experimental cycle. In this context, this article describes and discusses some of the techniques used for the functional analysis of gene expression data.
1) Systems biology aims to understand biology at the system level rather than just individual components. This requires advanced modeling and data analysis techniques.
2) Challenges in systems biology include understanding complex relationships between components, dynamic behavior over time, and controlling systems with unknown functions.
3) Artificial intelligence can help address these challenges through techniques like machine learning, knowledge representation, and problem solving. It has already been applied to tasks like gene alignment modeling and phylogenetic inference.
University of Southampton - ORC seminar - Ogan Gurel MD
Dr. Ogan Gurel will present on "Protein Electrodynamics & Terahertz Medicine: A New Frontier?". Proteins exhibit dynamic behavior with vibrations at terahertz frequencies that are essential to their function. These vibrations interact with electromagnetic radiation in the terahertz band, as confirmed by experiments showing specific absorption of terahertz radiation by met-hemoglobin. This suggests terahertz molecular medical imaging and manipulation of protein motions for new therapies. Dr. Gurel is a director at Samsung Advanced Institute of Technology with experience in biomedicine, biophysics, and computer science.
Sample Work For Engineering Literature Review and Gap Identification - PhD Assistance - http://bit.ly/2E9fAVq
2.1 INTRODUCTION
2.2 RESEARCH GAPS IN EXISTING METHODS
2.3 OBJECTIVES OF THIS WORK
International Journal of Biometrics and Bioinformatics(IJBB) Volume (2) Issue... - CSCJournals
This document is the front matter of the International Journal of Biometrics and Bioinformatics (IJBB) Volume 2, Issue 1 published on February 28, 2008. It includes information about the editor in chief, copyright details, a table of contents listing one paper, and brief descriptions of the paper titled "Inference Networks for Molecular Database Similarity Searching" which explores using Bayesian networks for molecular similarity searching in chemical databases.
This paper proposes a natural-language-based discourse analysis method for extracting
information from news articles of different domains. The discourse analysis uses Rhetorical Structure
Theory (RST), which finds the coherent groups of text that are most prominent for extracting information
from text. RST uses the nucleus-satellite concept to find the most prominent text in a
document. After discourse analysis, text analysis is performed to extract domain-related objects
and relate them. To extract the information, a knowledge-based system consisting of a
domain dictionary is used. The domain dictionary holds a bag of words for the domain. The system is
evaluated against a gold-of-art analysis and human judgment of the extracted information.
A Review of Various Methods Used in the Analysis of Functional Gene Expressio... - ijitcs
Sequencing projects arising from high-throughput technologies, including DNA microarrays, have made it possible to measure simultaneously the expression levels of millions of genes in a biological sample, as well as to annotate and identify the role (function) of those genes. Consequently, to better manage and organize this significant amount of information, bioinformatics approaches have been developed. These approaches provide a representation and a more 'relevant' integration of data in order to test and validate the researchers' hypotheses. In this context, this article describes and discusses some techniques used for the functional analysis of gene expression data.
This document discusses the use of ontologies for big data. It provides examples of several ontology projects including the Informed Consent Ontology, miRNA and Aging Ontology, and Ontology of Drug Neuropathy Adverse Events. It also describes projects linking ontologies and big data like LINCS-BD2K and the SOCR Analytics Dashboard which provides graphical querying and exploration of complex datasets. Ontologies are concluded to be important for big data integration and reuse, though not a complete solution, and their use can go beyond reasoning to include various analytical techniques when combined with large datasets.
Construction of phylogenetic tree from multiple gene trees using principal co... - IAEME Publication
This document describes a method for constructing a phylogenetic tree from multiple gene trees using principal component analysis. Multiple gene trees are generated from different protein sequences from various organisms. Distance matrices are calculated for each gene tree and combined into a single data matrix. Principal component analysis is performed on the data matrix to extract the first principal component, which represents the consensus distance vector combining information from all gene trees. A phylogenetic tree is then generated from the consensus distance vector using UPGMA, providing a species tree that integrates information from multiple genes. The method is demonstrated on protein sequence data from primates and placental mammals.
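The PCA step in this summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the combined data matrix has one row per species pair and one column per gene tree, uses plain power iteration to find the first principal component, and leaves out the UPGMA step. The function name and the distance values are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first PC via power iteration.

    rows: one row per species pair, one column per gene tree (assumed layout).
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance between gene-tree columns (m x m).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(m)]
           for a in range(m)]
    vec = [1.0] * m
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Project each centered row onto the leading direction.
    scores = [sum(c[j] * vec[j] for j in range(m)) for c in centered]
    return vec, scores


# Hypothetical pairwise distances for species pairs A-B, A-C, B-C
# read from three gene trees; the numbers are made up for illustration.
pair_distances = [
    [0.10, 0.12, 0.09],  # A-B
    [0.40, 0.38, 0.45],  # A-C
    [0.42, 0.41, 0.44],  # B-C
]
loadings, consensus = first_principal_component(pair_distances)
```

The scores are centered, so they would need to be shifted or rescaled into non-negative distances before being passed to a clustering method such as UPGMA.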
This paper concerns computational methods for analyzing biological data. It introduces some of the many resources available for analyzing sequence data with bioinformatics software. The paper covers theoretical approaches to data resources and discusses some sequence alignments and their databases. As an interdisciplinary field of science, bioinformatics combines biology, computer science, information engineering, mathematics, and statistics to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques. Databases are essential for bioinformatics research and applications. Many databases exist, covering various information types, for example DNA and protein sequences, molecular structures, phenotypes, and biodiversity. Databases may contain empirical data. Bioinformatics means conceptualizing biology in terms of molecules and then applying informatics techniques from mathematics, computer science, and statistics to understand and organize the information associated with these molecules on a large scale. People study bioinformatics in different ways. Some are devoted to developing new computational tools, from both software and hardware viewpoints, for better handling and processing of biological data. They develop new models and new algorithms for existing questions, and propose and tackle new questions when new experimental techniques bring in new data. Others treat bioinformatics as the study of biology from the viewpoint of informatics and systems.
Durgesh Raghuvanshi | Vivek Solanki | Neha Arora | Faiz Hashmi "Computational of Bioinformatics" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd30891.pdf Paper Url :https://www.ijtsrd.com/engineering/computer-engineering/30891/computational-of-bioinformatics/durgesh-raghuvanshi
- The document discusses various approaches for applying machine learning and artificial intelligence to drug discovery.
- It describes how molecules and proteins can be represented as graphs, fingerprints, or sequences to be used as input for models.
- Different tasks in drug discovery like target binding prediction, generative design of new molecules, and drug repurposing are framed as questions that AI models can aim to answer.
- Techniques discussed include graph neural networks, reinforcement learning, and conditional generation using techniques like translation models.
- Several recent works applying these approaches for tasks like predicting drug-target interactions and generating synthesizable molecules are referenced.
Knowledge Driven User Interfaces for Complex Biological Queries - alexander garcia
With the explosion of biological data in the postgenomic era, there has been a growing need for semantic data integration, supported by ontologies. Semantic integration techniques enable biologists to construct complex biological queries. However, the construction of these queries and analysis of their results can place a high cognitive load on biologists. This paper presents a proposed information visualisation tool, Digr, to aid biologists in these processes within the context of DigraBase, a graph database for semantic data integration. A working example of a query is presented, to illustrate the complexity of the information spaces under consideration. Visualisation techniques that have been applied to similar problems are discussed in the context of their applicability to the problem of aiding the construction of complex queries over DigraBase, and the interpretation of their results.
Biomedical indexing and retrieval system based on language modeling approach - ijseajournal
This summarizes a research paper that proposes a biomedical indexing and retrieval system called BIOINSY. It uses a language modeling approach to select the best Medical Subject Headings (MeSH) descriptors to index medical articles from sources like PUBMED. The system first preprocesses articles by splitting text, stemming words, and removing stop words. It then extracts terms using a hybrid linguistic and statistical approach. Terms are weighted based on semantic relationships in MeSH, not just statistics. Descriptors are selected by disambiguating terms and estimating the probability a descriptor was generated by the article's language model. Experiments showed the effectiveness of this conceptual indexing approach.
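The idea of estimating the probability that a descriptor was generated by the article's language model can be sketched with a smoothed unigram model. This is an illustrative assumption, not the BIOINSY implementation: the function name, the additive smoothing, and the toy article are hypothetical, and the MeSH-based semantic weighting and term disambiguation described in the summary are omitted.

```python
import math
from collections import Counter


def descriptor_log_prob(descriptor, article_tokens, vocab_size, alpha=1.0):
    """Log-probability of a descriptor under a unigram article model.

    Uses additive (Laplace-style) smoothing so unseen terms keep a
    small non-zero probability.
    """
    counts = Counter(article_tokens)
    total = len(article_tokens)
    log_p = 0.0
    for term in descriptor.lower().split():
        p = (counts[term] + alpha) / (total + alpha * vocab_size)
        log_p += math.log(p)
    return log_p


# Toy article and candidate descriptors, made up for illustration.
article = ("amyloid plaques and tau tangles in alzheimer disease "
           "brain tissue").split()
candidates = ["alzheimer disease", "parkinson disease"]
vocab = set(article) | {w for c in candidates for w in c.split()}
ranked = sorted(
    candidates,
    key=lambda d: descriptor_log_prob(d, article, len(vocab)),
    reverse=True,
)
```

Descriptors whose terms occur in the article score higher, so the candidate list is ordered by how plausibly the article's model would generate each one.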
Inference Networks for Molecular Database Similarity Searching - CSCJournals
Molecular similarity searching is a process to find chemical compounds that are similar to a target compound. The concept of molecular similarity plays an important role in modern computer-aided drug design methods, and has been successfully applied in the optimization of lead series. It is used for chemical database searching and the design of combinatorial libraries. In this paper, we explore the possibility and effectiveness of using a Bayesian inference network for similarity searching. The topology of the network represents the dependence relationships between molecular descriptors and molecules, as well as the quantitative knowledge of probabilities encoding the strength of these relationships, mined from our compound collection. The retrieval of an active compound for a given target structure is obtained by means of an inference process through a network of dependences. The new approach is tested by its ability to retrieve seven sets of active molecules seeded in the MDDR. Our empirical results suggest that similarity methods based on Bayesian networks provide a promising and encouraging alternative to existing similarity searching methods.
Uses of Artificial Intelligence in Bioinformatics - Pragya Pai
This presentation is about the usage of Artificial Intelligence in Bioinformatics. These slides give the basic knowledge about usage of Artificial Intelligence in Bioinformatics.
Role of Bioinformatics in Cancer Research - Akash Arora
The document discusses the role of bioinformatics in cancer research. It explains that cancer is abnormal cell growth caused by chromosomal rearrangements, mutations, and errors in molecular machinery. Bioinformatics is the science of collecting and analyzing complex biological data like genetic codes, using tools to analyze data from databases of cancer information. This data can be used for cancer progression insight, drug target identification, early detection through biomarkers, and personalized medicine through risk analysis and bio-simulations. Software tools and packages like R-Project are used to analyze this genomic and molecular interaction data to further the understanding and treatment of cancer.
This document summarizes the relationship between systems biology and theoretical physics. It discusses how systems biology combines experimental techniques with mathematical modeling to understand biological processes, and how this field draws from both engineering and physics approaches. While engineering aims to numerically simulate biological systems, physics seeks universal principles and laws. The document reviews how concepts from physics, like statistical physics and nonlinear dynamics, have influenced systems biology research and how further integrating theoretical physics perspectives could aid understanding of biological systems.
Improving the effectiveness of information retrieval system using adaptive ge... - ijcsit
The document describes research into improving the effectiveness of information retrieval systems using an adaptive genetic algorithm. A genetic algorithm with variable crossover and mutation probabilities (adaptive GA) is investigated. The adaptive GA is tested on 242 Arabic abstracts using three information retrieval models: vector space model, extended Boolean model, and language model. Results show the adaptive GA approach improves retrieval effectiveness over traditional genetic algorithms and baseline information retrieval systems, as measured by average recall and precision. Key aspects of the adaptive GA used include variable crossover and mutation probabilities tuned during the search process, and fitness functions based on document retrieval order.
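The summary does not say how the crossover and mutation probabilities are tuned, so the following is only a sketch of one fitness-based scheme, given as an assumption: individuals at or above the population average get rates scaled down by their distance from the best fitness, while below-average individuals keep the maximum rates. The function name and the default values `pc_max` and `pm_max` are hypothetical.

```python
def adaptive_rates(fitness, f_max, f_avg, pc_max=0.9, pm_max=0.1):
    """Return (crossover_prob, mutation_prob) for one individual.

    Fitter individuals are disrupted less; weak ones keep the maximum rates.
    """
    if f_max == f_avg or fitness < f_avg:
        return pc_max, pm_max
    scale = (f_max - fitness) / (f_max - f_avg)
    return pc_max * scale, pm_max * scale


# Hypothetical fitness values for a small population.
population_fitness = [0.2, 0.5, 0.8, 0.9]
f_max = max(population_fitness)
f_avg = sum(population_fitness) / len(population_fitness)
rates = [adaptive_rates(f, f_max, f_avg) for f in population_fitness]
```

Under this scheme the best individual receives zero crossover and mutation probability, which preserves it, while low-fitness individuals are recombined and mutated at the full rates.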
Bioinformatics emerged from the marriage of computer science and molecular biology to analyze massive amounts of biological data, like that produced by the Human Genome Project. It uses algorithms and techniques from computer science to solve problems in molecular biology, like comparing genomic sequences to understand evolution. As genomic data exploded publicly, bioinformatics was needed to efficiently store, analyze, and make sense of this information, which has applications in molecular medicine, drug development, agriculture, and more.
The document discusses how artificial intelligence can be used for human welfare in various fields such as biology, medicine, and agriculture. It provides examples of how AI is inspired by biological systems to make intelligent decisions. AI is being used in medical applications such as cancer treatment, regenerative medicine, and precision agriculture to increase crop yields in a sustainable way. The document concludes that AI systems have great potential to help address challenges in healthcare access and delivery in India by powering virtual assistants and precision farming technologies.
This document discusses using natural language processing (NLP) techniques to extract biological information from literature to help interpret large genomics datasets. The author describes developing a method to identify gene regulatory interactions by parsing Medline abstracts. This information can then be combined with data from experiments to classify protein associations and interactions. While literature provides important context, it should not be used alone. The author also intends to apply these NLP methods to full text articles to extract information from different sections like introductions and discussions.
This article discusses opportunities and challenges for efficient parallel data processing in cloud computing environments. It introduces Nephele, a new data processing framework designed specifically for clouds. Nephele is the first framework to leverage dynamic resource allocation in clouds for task scheduling and execution. The article analyzes how existing frameworks assume static resource environments unlike clouds, and how Nephele addresses this by dynamically allocating different compute resources during job execution. It then provides initial performance results for Nephele and compares it to Hadoop for MapReduce-style jobs on cloud infrastructure.
This document discusses the estimation of very fast transient overvoltages (VFTOs) in a 3-phase 132kV gas insulated substation (GIS). It presents the modeling of a GIS system in MATLAB to analyze VFTOs generated during switching operations and 3-phase faults. Each re-strike of the disconnector switch during opening/closing generates high frequency transient overvoltages that can damage equipment. The paper develops an accurate electrical model of the GIS components using lumped parameters to simulate transient behavior and calculate overvoltage waveforms.
The document discusses how bioinformatics can be used to identify new cancer drug targets. It describes analyzing gene sequences to find homologs of known cancer genes. Microarray data can be mined to find genes that are differentially expressed in cancer versus normal tissues. Digital expression data from EST and SAGE tags provides another method to analyze gene expression levels in cancers. Integrating these diverse genomic and expression datasets through bioinformatics allows detection of cancer-causing mutations, gene amplifications and differentially expressed genes to identify potential new drug targets.
The document discusses the design and implementation of a virtual client honeypot to collect internet malware. A client honeypot is an active security system that simulates client-side software to detect attacks against clients. The proposed virtual client honeypot collects URLs from a database, launches them in a virtual machine, and monitors for malware downloads and changes to the file system and network activity. The honeypot was able to successfully collect malware samples and network packet captures from malicious websites exploiting client-side vulnerabilities.
Sample Work For Engineering Literature Review and Gap IdentificationPhD Assistance
Sample Work For Engineering Literature Review and Gap Identification - PhD Assistance - http://bit.ly/2E9fAVq
2.1 INTRODUCTION
2.2 RESEARCH GAPS IN EXISTING METHODS
2.3 OBJECTIVES OF THIS WORK
Read More : http://bit.ly/2Rl7XT5
#gapanalysis #strategicmanagement #datagapanalysis #gapanalysisppt #gapanalysishealthcare #gapanalysisfinance #gapanalysisEngineering
International Journal of Biometrics and Bioinformatics(IJBB) Volume (2) Issue...CSCJournals
This document is the front matter of the International Journal of Biometrics and Bioinformatics (IJBB) Volume 2, Issue 1 published on February 28, 2008. It includes information about the editor in chief, copyright details, a table of contents listing one paper, and brief descriptions of the paper titled "Inference Networks for Molecular Database Similarity Searching" which explores using Bayesian networks for molecular similarity searching in chemical databases.
This paper proposes Natural language based Discourse Analysis method used for extracting
information from the news article of different domain. The Discourse analysis used the Rhetorical Structure
theory which is used to find coherent group of text which are most prominent for extracting information
from text. RST theory used the Nucleus- Satellite concept for finding most prominent text from the text
document. After Discourse analysis the text analysis has been done for extracting domain related object
and relates this object. For extracting the information knowledge based system has been used which
consist of domain dictionary .The domain dictionary has a bag of words for domain. The system is
evaluated according gold-of-art analysis and human decision for extracted information.
A Review of Various Methods Used in the Analysis of Functional Gene Expressio...ijitcs
Sequencing projects arising from high-throughput technologies, including DNA microarrays, have made it possible to measure simultaneously the expression levels of millions of genes in a biological sample, as well as to annotate those genes and identify their roles (functions). Consequently, to better manage and organize this significant amount of information, bioinformatics approaches have been developed. These approaches provide a representation and a more 'relevant' integration of data in order to test and validate researchers' hypotheses. In this context, this article describes and discusses some techniques used for the functional analysis of gene expression data.
This document discusses the use of ontologies for big data. It provides examples of several ontology projects including the Informed Consent Ontology, miRNA and Aging Ontology, and Ontology of Drug Neuropathy Adverse Events. It also describes projects linking ontologies and big data like LINCS-BD2K and the SOCR Analytics Dashboard which provides graphical querying and exploration of complex datasets. Ontologies are concluded to be important for big data integration and reuse, though not a complete solution, and their use can go beyond reasoning to include various analytical techniques when combined with large datasets.
Construction of phylogenetic tree from multiple gene trees using principal co...IAEME Publication
This document describes a method for constructing a phylogenetic tree from multiple gene trees using principal component analysis. Multiple gene trees are generated from different protein sequences from various organisms. Distance matrices are calculated for each gene tree and combined into a single data matrix. Principal component analysis is performed on the data matrix to extract the first principal component, which represents the consensus distance vector combining information from all gene trees. A phylogenetic tree is then generated from the consensus distance vector using UPGMA, providing a species tree that integrates information from multiple genes. The method is demonstrated on protein sequence data from primates and placental mammals.
This paper concerns computational methods for analyzing biological data. It introduces some of the many resources available for analyzing sequence data with bioinformatics software, covers the theoretical approaches to data resources, and describes some sequence alignments together with their databases. As an interdisciplinary field of science, bioinformatics combines biology, computer science, information engineering, mathematics, and statistics to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques. Databases are essential for bioinformatics research and applications. Many databases exist, covering various information types, for example DNA and protein sequences, molecular structures, phenotypes, and biodiversity. Databases may contain empirical data. Bioinformatics means conceptualizing biology in terms of molecules and then applying informatics techniques from mathematics, computer science, and statistics to understand and organize the information associated with these molecules on a large scale. People study bioinformatics in different ways. Some are devoted to developing new computational tools, from both software and hardware viewpoints, for better handling and processing of biological data. They develop new models and new algorithms for existing questions, and propose and tackle new questions when new experimental techniques bring in new data. Others treat bioinformatics as the study of biology from the viewpoint of informatics and systems.
Durgesh Raghuvanshi | Vivek Solanki | Neha Arora | Faiz Hashmi "Computational of Bioinformatics" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd30891.pdf Paper Url :https://www.ijtsrd.com/engineering/computer-engineering/30891/computational-of-bioinformatics/durgesh-raghuvanshi
- The document discusses various approaches for applying machine learning and artificial intelligence to drug discovery.
- It describes how molecules and proteins can be represented as graphs, fingerprints, or sequences to be used as input for models.
- Different tasks in drug discovery like target binding prediction, generative design of new molecules, and drug repurposing are framed as questions that AI models can aim to answer.
- Techniques discussed include graph neural networks, reinforcement learning, and conditional generation using techniques like translation models.
- Several recent works applying these approaches for tasks like predicting drug-target interactions and generating synthesizable molecules are referenced.
Knowledge Driven User Interfaces for Complex Biological Queriesalexander garcia
With the explosion of biological data in the postgenomic era, there has been a growing need for semantic data integration, supported by ontologies. Semantic integration techniques enable biologists to construct complex biological queries. However, the construction of these queries and analysis of their results can place a high cognitive load on biologists. This paper presents a proposed information visualisation tool, Digr, to aid biologists in these processes within the context of DigraBase, a graph database for semantic data integration. A working example of a query is presented, to illustrate the complexity of the information spaces under consideration. Visualisation techniques that have been applied to similar problems are discussed in the context of their applicability to the problem of aiding the construction of complex queries over DigraBase, and the interpretation of their results.
Biomedical indexing and retrieval system based on language modeling approachijseajournal
This summarizes a research paper that proposes a biomedical indexing and retrieval system called BIOINSY. It uses a language modeling approach to select the best Medical Subject Headings (MeSH) descriptors to index medical articles from sources like PUBMED. The system first preprocesses articles by splitting text, stemming words, and removing stop words. It then extracts terms using a hybrid linguistic and statistical approach. Terms are weighted based on semantic relationships in MeSH, not just statistics. Descriptors are selected by disambiguating terms and estimating the probability a descriptor was generated by the article's language model. Experiments showed the effectiveness of this conceptual indexing approach.
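The descriptor-selection step mentioned above (estimating the probability that a descriptor was generated by the article's language model) can be illustrated with a minimal sketch. It assumes a plain unigram model with linear interpolation against a background collection; the function name, the smoothing weight `lam`, and the toy terms are hypothetical, and the MeSH-based semantic weighting that BIOINSY is described as using is not modeled here.

```python
import math
from collections import Counter


def descriptor_log_prob(descriptor_terms, doc_terms, collection_terms, lam=0.5):
    """Log-probability that a unigram model of the document generates the descriptor.

    Each term probability is interpolated between the document model and a
    background collection model (an assumed smoothing choice).
    """
    doc_counts = Counter(doc_terms)
    coll_counts = Counter(collection_terms)
    log_p = 0.0
    for term in descriptor_terms:
        p_doc = doc_counts[term] / len(doc_terms)
        p_coll = coll_counts[term] / len(collection_terms)
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            return float("-inf")  # term unseen in both document and collection
        log_p += math.log(p)
    return log_p


# Toy, already-preprocessed terms (hypothetical data, not from the paper).
doc = ["alzheimer", "disease", "protein", "amyloid", "protein"]
collection = doc + ["heart", "disease", "cardiac", "muscle"]

score_a = descriptor_log_prob(["amyloid", "protein"], doc, collection)
score_b = descriptor_log_prob(["cardiac", "muscle"], doc, collection)
```

Under these assumptions the descriptor whose terms occur in the article scores higher than one whose terms occur only in the background collection, so candidate descriptors can be ranked by this score.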
Inference Networks for Molecular Database Similarity SearchingCSCJournals
Molecular similarity searching is the process of finding chemical compounds that are similar to a target compound. The concept of molecular similarity plays an important role in modern computer-aided drug design methods and has been successfully applied in the optimization of lead series. It is used for chemical database searching and the design of combinatorial libraries. In this paper, we explore the possibility and effectiveness of using a Bayesian inference network for similarity searching. The topology of the network represents the dependence relationships between molecular descriptors and molecules, as well as the quantitative knowledge of probabilities encoding the strength of these relationships, mined from our compound collection. The retrieval of an active compound for a given target structure is obtained by means of an inference process through a network of dependences. The new approach is tested by its ability to retrieve seven sets of active molecules seeded in the MDDR. Our empirical results suggest that similarity methods based on Bayesian networks provide a promising and encouraging alternative to existing similarity searching methods.
Uses of Artificial Intelligence in BioinformaticsPragya Pai
This presentation is about the usage of Artificial Intelligence in Bioinformatics. These slides give the basic knowledge about usage of Artificial Intelligence in Bioinformatics.
Role of Bioinformatics in Cancer Research Akash Arora
The document discusses the role of bioinformatics in cancer research. It explains that cancer is abnormal cell growth caused by chromosomal rearrangements, mutations, and errors in molecular machinery. Bioinformatics is the science of collecting and analyzing complex biological data like genetic codes, using tools to analyze data from databases of cancer information. This data can be used for cancer progression insight, drug target identification, early detection through biomarkers, and personalized medicine through risk analysis and bio-simulations. Software tools and packages like R-Project are used to analyze this genomic and molecular interaction data to further the understanding and treatment of cancer.
This document summarizes the relationship between systems biology and theoretical physics. It discusses how systems biology combines experimental techniques with mathematical modeling to understand biological processes, and how this field draws from both engineering and physics approaches. While engineering aims to numerically simulate biological systems, physics seeks universal principles and laws. The document reviews how concepts from physics, like statistical physics and nonlinear dynamics, have influenced systems biology research and how further integrating theoretical physics perspectives could aid understanding of biological systems.
Improving the effectiveness of information retrieval system using adaptive ge...ijcsit
The document describes research into improving the effectiveness of information retrieval systems using an adaptive genetic algorithm. A genetic algorithm with variable crossover and mutation probabilities (adaptive GA) is investigated. The adaptive GA is tested on 242 Arabic abstracts using three information retrieval models: vector space model, extended Boolean model, and language model. Results show the adaptive GA approach improves retrieval effectiveness over traditional genetic algorithms and baseline information retrieval systems, as measured by average recall and precision. Key aspects of the adaptive GA used include variable crossover and mutation probabilities tuned during the search process, and fitness functions based on document retrieval order.
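The summary does not state how the crossover and mutation probabilities vary, so the following is only a sketch of one commonly used adaptive rule, not the paper's method: individuals with below-average fitness keep the maximum rates, while above-average individuals have their rates scaled down linearly toward zero as they approach the best fitness. The function name, the maximum rates, and the fitness values are assumptions for illustration.

```python
def adaptive_rates(fitness, f_max, f_avg, pc_max=1.0, pm_max=0.5):
    """Return (crossover_prob, mutation_prob) for one individual.

    Assumed rule: below-average individuals get the maximum rates; above-average
    individuals get rates proportional to their distance from the best fitness.
    """
    if fitness < f_avg or f_max == f_avg:
        # Weak individual, or a fully converged population: use maximum rates.
        return pc_max, pm_max
    scale = (f_max - fitness) / (f_max - f_avg)
    return pc_max * scale, pm_max * scale


# Hypothetical fitness values for a small population.
population_fitness = [3.0, 5.0, 7.5, 10.0, 4.5]
f_max = max(population_fitness)
f_avg = sum(population_fitness) / len(population_fitness)

rates = [adaptive_rates(f, f_max, f_avg) for f in population_fitness]
```

With this rule the best individual is protected from disruption (both rates are zero), while poor individuals are recombined and mutated aggressively.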
Bioinformatics emerged from the marriage of computer science and molecular biology to analyze massive amounts of biological data, like that produced by the Human Genome Project. It uses algorithms and techniques from computer science to solve problems in molecular biology, like comparing genomic sequences to understand evolution. As genomic data exploded publicly, bioinformatics was needed to efficiently store, analyze, and make sense of this information, which has applications in molecular medicine, drug development, agriculture, and more.
The document discusses how artificial intelligence can be used for human welfare in various fields such as biology, medicine, and agriculture. It provides examples of how AI is inspired by biological systems to make intelligent decisions. AI is being used in medical applications such as cancer treatment, regenerative medicine, and precision agriculture to increase crop yields in a sustainable way. The document concludes that AI systems have great potential to help address challenges in healthcare access and delivery in India by powering virtual assistants and precision farming technologies.
This document discusses using natural language processing (NLP) techniques to extract biological information from literature to help interpret large genomics datasets. The author describes developing a method to identify gene regulatory interactions by parsing Medline abstracts. This information can then be combined with data from experiments to classify protein associations and interactions. While literature provides important context, it should not be used alone. The author also intends to apply these NLP methods to full text articles to extract information from different sections like introductions and discussions.
This article discusses opportunities and challenges for efficient parallel data processing in cloud computing environments. It introduces Nephele, a new data processing framework designed specifically for clouds. Nephele is the first framework to leverage dynamic resource allocation in clouds for task scheduling and execution. The article analyzes how existing frameworks assume static resource environments unlike clouds, and how Nephele addresses this by dynamically allocating different compute resources during job execution. It then provides initial performance results for Nephele and compares it to Hadoop for MapReduce-style jobs on cloud infrastructure.
This document discusses the estimation of very fast transient overvoltages (VFTOs) in a 3-phase 132kV gas insulated substation (GIS). It presents the modeling of a GIS system in MATLAB to analyze VFTOs generated during switching operations and 3-phase faults. Each re-strike of the disconnector switch during opening/closing generates high frequency transient overvoltages that can damage equipment. The paper develops an accurate electrical model of the GIS components using lumped parameters to simulate transient behavior and calculate overvoltage waveforms.
The document discusses how bioinformatics can be used to identify new cancer drug targets. It describes analyzing gene sequences to find homologs of known cancer genes. Microarray data can be mined to find genes that are differentially expressed in cancer versus normal tissues. Digital expression data from EST and SAGE tags provides another method to analyze gene expression levels in cancers. Integrating these diverse genomic and expression datasets through bioinformatics allows detection of cancer-causing mutations, gene amplifications and differentially expressed genes to identify potential new drug targets.
The document describes a proposed model for representing user profiles using ontologies for personalized web information gathering. The model uses both a world knowledge base (encoded from the Library of Congress Subject Headings) and a user's local instance repository to construct personalized ontologies representing the user's concept models and background knowledge. The proposed model is compared against existing benchmark models through experiments using a large standard dataset, and results show the proposed model improves web information gathering performance.
3.[18 22]hybrid association rule mining using ac treeAlexander Decker
This document proposes a new hybrid algorithm called AC Tree (AprioriCOFI tree) for efficiently mining association rules from large datasets at multiple concept levels. The AC Tree algorithm combines aspects of the Apriori, FP-Tree, and COFI Tree algorithms. It first uses Apriori to identify frequent 1-itemsets, then constructs an FP Tree header table and builds smaller trees for each frequent item to mine patterns at different levels. Experimental results on a 20 Newsgroups dataset show that AC Tree outperforms Apriori, FP-Tree, and APFT algorithms by discovering more interesting patterns faster.
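The first step described above, finding frequent 1-itemsets before any tree is built, can be sketched as follows. This is a generic illustration of an Apriori-style first pass under assumed inputs, not the AC Tree implementation; the function name, the absolute support threshold, and the toy transactions are hypothetical.

```python
from collections import Counter


def frequent_1_itemsets(transactions, min_support):
    """Return {item: count} for items found in at least min_support transactions."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count each item once per transaction
    return {item: n for item, n in counts.items() if n >= min_support}


# Toy transactions (hypothetical data, not from the paper).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

frequent = frequent_1_itemsets(transactions, min_support=3)

# Order items by descending support (ties broken alphabetically), the usual
# ordering for an FP-tree style header table.
header_order = sorted(frequent, key=lambda item: (-frequent[item], item))
```

Only the surviving items need to be carried into the later tree-building stages, which is what keeps the per-item trees small.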
Java was created in the 1990s by Sun Microsystems to improve on existing programming languages and make them simpler, more modern, and more powerful. It was designed to be cross-platform and secure, and it adapted quickly to the Internet. Publicly presented in 1995, Java offers advantages such as being object-oriented, powerful, and secure, in addition to not being able to run viruses.
This document provides a summary of a seminar presentation on bio-ontology and its application in bioinformatics. It discusses key topics like the goals and elements of ontology, applications of ontology including in bioinformatics, importance of bioinformatics, need for ontology in bioinformatics, types of bioinformatics ontologies and relations used in cancer ontologies. It also summarizes the growth of bio-ontology papers over time, top ontologies in different biology domains, limitations and future prospects.
Nlp based retrieval of medical information for diagnosis of human diseaseseSAT Journals
Abstract: NLP-based retrieval of medical information is the extraction of medical data from narrative clinical documents. In this paper, we provide a way to diagnose diseases with the help of natural language interpretation and classification techniques. However, extraction of medical information is a difficult task due to complex symptom names and complex disease names. For diagnosis we use two approaches: one obtains disease names with the help of classifiers, and the other uses patterns with the help of NLP to get information related to diseases. Both approaches are applied according to the question type. Keywords: NLP, narrative text, extraction, medical information, expert system
Bioinformatics is the application of computer technology to the management of biological information. It plays a role in areas like experimental molecular biology, genetics, genomics, and structural biology. It helps analyze and organize the large amounts of data generated by projects like the Human Genome Project. It is important for understanding diseases and developing new drug targets. It also aids research in fields like systems biology, genomics, and proteomics.
Bioinformatics is an interdisciplinary field involving biology, computer science, mathematics and statistics. It addresses large-scale biological problems from a computational perspective. Common problems include modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution typically involves collecting statistics from biological data, building a computational model, solving a computational problem, and testing the algorithm. Bioinformatics plays a role in areas like structural genomics, functional genomics and nutritional genomics. It is used for applications such as transcriptome analysis, drug discovery, cheminformatics analysis, and more. It is an important tool in fields like molecular medicine, gene therapy, microbial genome applications, antibiotic resistance, and evolutionary studies. Biological databases are important for organizing
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
This document summarizes a research paper that proposes a machine learning approach to identify disease-treatment relationships from biomedical text. It extracts sentences mentioning diseases and treatments from medical publications and classifies the semantic relationships between them. The researchers evaluate their methodology on a dataset of sentences annotated with diseases, treatments and their relationships. Their results show the machine learning models can reliably extract this information and outperform previous methods on the same data. The proposed approach could be integrated into applications to disseminate healthcare information from published literature to medical professionals and patients.
The document discusses the exponential growth of biomedical research data and literature. It describes challenges researchers face in keeping up with the vast amount of information. Text mining techniques can help by automatically extracting relevant information and facts from literature and organizing them into structured knowledgebases. Named entity recognition is an important text mining task that involves identifying mentions of biomedical entities in text. Both rule-based and machine learning approaches have been used for named entity recognition.
Ontologies for Semantic Normalization of Immunological DataYannick Pouliot
This document discusses using ontologies to semantically normalize immunological data from the Human Immune Profiling Consortium (HIPC). 57 ontologies covering domains like anatomy, disease, pathways were evaluated. Text from HIPC datasets and protocols was annotated using these ontologies, with the NCI Thesaurus, Medical Subject Headings, and Gene Ontology mapping to the most terms. Many failures were due to missing commercial reagent terms. The conclusions are that ImmPort, the HIPC data repository, could adopt ontology-based encoding with additions to ontologies and text pre-processing.
introduction, history, scope and applications of
relation to other fields, bioinformatics, biological databases, computers internet, sequence development, and
introduction to sequence development and alignment
CDAO presentation.
The idea of the comparative analysis ontology has been presented worldwide, including at NESCent (USA), IGBMC (France), and UFRJ (Brazil). Providing a semantic framework for high-throughput evolutionary analysis, following next- and third-generation sequencing, is the way to bring evolutionary-based studies into genome-wide analysis. The Darwinian core of reasoning also allows CDAO to be used with other entities.
Integrating Large, Disparate, Biomedical Ontologies to Boost Organ Developmen...Chimezie Ogbuji
This document discusses integrating large biomedical ontologies like the Gene Ontology and Foundational Model of Anatomy to increase connectivity in biological networks. The goal is to link genetic diseases to anatomical abnormalities through biological processes. An initial mapping was done linking GO development terms to FMA anatomy concepts. Evaluation found this significantly increased paths between diseases, genes, and anatomy. While overlap between ontologies was low, the mapping provides opportunities for more meaningful integration and network analysis.
2011-10-11 Open PHACTS at BioIT World Europeopen_phacts
The document discusses the Innovative Medicines Initiative's Open PHACTS project, which aims to develop robust standards and apply them in a semantic integration platform ("Open Pharmacological Space") to integrate drug discovery data from various public and private sources. The project brings together partners from industry, academia, and non-profits to build an open infrastructure for linking drug discovery knowledge and supporting ongoing research. It outlines the technical approach, priorities, and initial progress on developing exemplar applications and a prototype "lash up" system.
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...ijcsa
This document summarizes a research paper that introduces a text mining-based method for answering biological queries and testing hypotheses. The proposed approach analyzes hypotheses stated as natural language questions and measures their statistical significance based on existing literature. It computes a p-value to determine whether to accept or reject each hypothesis. The method also generates a network of related biological entities to provide context and suggest new hypotheses for further investigation. The goal is to help researchers quantitatively evaluate assumptions and guide relevant discovery of new biological knowledge.
An Automatic Approach for Bilingual Tuberculosis Ontology Based on Ontology D...TELKOMNIKA JOURNAL
Ontology is a representation term used to describe and represent a domain of knowledge. Manual ontology development is currently considered complex, requiring a lot of time and effort. This research proposes methods for automatically building a bilingual domain ontology, in Indonesian and English, for tuberculosis using a corpus and ontology design patterns (ODPs). The methods combine ontology learning from text with ontology design patterns to decrease the role of expert knowledge. They consist of six stages: term and relation extraction, matching with a tuberculosis glossary, matching with ODPs, computing similarity scores between the terms and relations and the ODPs, ontology building, and ontology evaluation. The constructed ontology contained 362 terms and 44 relations, with 260 terms added. The accuracy of the ontology construction was 71%. The ontology construction had higher complexity and took less time, and it decreased the role of expert knowledge, which indicates that the automatic approach is better than manual ontology construction.
Domain ontology development for communicable diseasescsandit
This document discusses the development of a domain ontology for communicable diseases. The researchers developed an ontology with concepts like diseases, symptoms, and causes arranged in a taxonomy. They created over 600 concepts with properties and relations. The ontology development process included specification, conceptualization, creation of instances, and evaluation using a description logic reasoner to verify the concepts and relations were correctly represented. The ontology will be expanded to include more diseases and connections to related web content to provide information retrieval.
DOMAIN ONTOLOGY DEVELOPMENT FOR COMMUNICABLE DISEASEScscpconf
The Web has become the first resource people turn to when searching for any kind of information. With the emergence of the semantic web, search queries have started generating more informed results. Ontologies are at the core of any semantic web application. They help in the rapid development of
distributed systems by providing information on the fly. This key feature of distributing and
sharing information has made ontologies a new knowledge representation mechanism, one
that is strongly backed by a sound inference system. In this paper, we discuss the development, verification and validation of an ontology in a health domain.
The document describes a framework for biological relation extraction using biomedical ontologies and text mining. It discusses introducing biomedical text mining and outlines the problem, motivation, and challenges. It then presents the overall system components and architecture, including searching/browsing, Swanson's algorithm, protein-protein interactions, and gene clustering applications. The framework concept issues, design issues, sequence diagram, and database are also covered at a high level.
The document discusses bioinformatics, defining it as the application of information technology to the field of molecular biology. It describes how bioinformatics uses biology, mathematics, and computer science to analyze and manage biological data. Some key applications of bioinformatics mentioned are sequence analysis, prediction of protein structure, genome annotation, comparative genomics, and health/drug discovery. Several important bioinformatics resources are also outlined, including NCBI, PubMed, EMBL, and OMIM.
The proposed research aims to develop a computational approach to analyze associations between transcription factor genes and diseases like cancer. The approach will extract gene-disease relationships from literature based on supporting evidence between genes, diseases, and evidence. Relationships will be quantitatively evaluated to extract strongly supported gene-disease linkages and rank them. Existing methods are reviewed that use properties of representative disease genes to find similar candidate genes, but the proposed method will emphasize verifiable evidence for predicted associations and their strength. The goal is to predict gene-disease relationships based on relationships between other entities to help discover disease genes.
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksIJERD Editor
Distributed Denial of Service (DDoS) attacks have become a massive threat to the Internet. The traditional
architecture of the Internet is vulnerable to attacks such as DDoS. An attacker first acquires an army of zombies,
then instructs that army when to start an attack and whom to
attack. This paper reviews the techniques used to perform DDoS attacks, the tools used to
perform them, and the countermeasures used to detect attackers and eliminate Bandwidth Distributed
Denial of Service (B-DDoS) attacks. DDoS attacks are carried out using various flooding
techniques.
The main purpose of this paper is to design an architecture that can reduce Bandwidth
Distributed Denial of Service attacks and keep the victim site or server available to normal users by
eliminating the zombie machines. The primary focus of the paper is to discuss how normal machines
turn into zombies (bots), how an attack is initiated, the DDoS attack procedure, and how an organization can
save its server from becoming a DDoS victim. To present this, we implemented a simulated environment
with Cisco switches, routers, a firewall, some virtual machines and some attack tools to demonstrate a real DDoS
attack. Using time scheduling, resource limiting, system logs, access control lists and a Modular
Policy Framework, we stopped the attack and identified the attacker (bot) machines.
Hearing loss is one of the most common human impairments. It is estimated that by year 2015 more
than 700 million people will suffer mild deafness. Most can be helped by hearing aid devices depending on the
severity of their hearing loss. This paper describes the implementation and characterization details of a dual
channel transmitter front end (TFE) for digital hearing aid (DHA) applications that use novel
micro-electromechanical-systems (MEMS) audio transducers and ultra-low power-scalable analog-to-digital
converters (ADCs), which enable a very-low form factor, energy-efficient implementation for next-generation
DHA. The contribution of the design is the implementation of the dual channel MEMS microphones and
power-scalable ADC system.
Influence of tensile behaviour of slab on the structural Behaviour of shear c...IJERD Editor
A composite beam is composed of a steel beam and a slab connected by means of shear connectors,
such as studs installed on the top flange of the steel beam, to form a structure that behaves monolithically. This study
analyzes the effects of the tensile behavior of the slab on the structural behavior of the shear connection, such as slip
stiffness and maximum shear force, in composite beams subjected to hogging moment. The results show that the
shear studs located in the crack-concentration zones due to large hogging moments sustain significantly smaller
shear force and slip stiffness than the other zones. Moreover, the reduction of the slip stiffness in the shear
connection appears also to be closely related to the change in the tensile strain of rebar according to the increase
of the load. Further experimental and analytical studies shall be conducted considering variables such as the
reinforcement ratio and the arrangement of shear connectors to achieve efficient design of the shear connection
in composite beams subjected to hogging moment.
Gold prospecting using Remote Sensing ‘A case study of Sudan’IJERD Editor
Gold has been extracted from northeast Africa for more than 5000 years, and this may be the first
place where the metal was extracted. The Arabian-Nubian Shield (ANS) is an exposure of Precambrian
crystalline rocks on the flanks of the Red Sea. The crystalline rocks are mostly Neoproterozoic in age. ANS
includes the nations of Israel, Jordan. Egypt, Saudi Arabia, Sudan, Eritrea, Ethiopia, Yemen, and Somalia.
Arabian Nubian Shield Consists of juvenile continental crest that formed between 900 550 Ma, when intra
oceanic arc welded together along ophiolite decorated arc. Primary Au mineralization probably developed in
association with the growth of intra oceanic arc and evolution of back arc. Multiple episodes of deformation
have obscured the primary metallogenic setting, but at least some of the deposits preserve evidence that they
originate as sea floor massive sulphide deposits.
The Red Sea Hills Region is a vast span of rugged, harsh and inhospitable sector of the Earth with
inimical moon-like terrain, nevertheless since ancient times it is famed to be an abode of gold and was a major
source of wealth for the Pharaohs of ancient Egypt. The Pharaohs old workings have been periodically
rediscovered through time. Recent endeavours by the Geological Research Authority of Sudan led to the
discovery of a score of occurrences with gold and massive sulphide mineralizations. In the nineties of the
previous century the Geological Research Authority of Sudan (GRAS) in cooperation with BRGM utilized
satellite data of Landsat TM using spectral ratio technique to map possible mineralized zones in the Red Sea
Hills of Sudan. The outcome of the study mapped a gossan type gold mineralization. Band ratio technique was
applied to Arbaat area and a signature of alteration zone was detected. The alteration zones are commonly
associated with mineralization. The alteration zones are commonly associated with mineralization. A filed check
confirmed the existence of stock work of gold bearing quartz in the alteration zone. Another type of gold
mineralization that was discovered using remote sensing is the gold associated with metachert in the Atmur
Desert.
Reducing Corrosion Rate by Welding DesignIJERD Editor
This document summarizes a study on reducing corrosion rates in steel through welding design. The researchers tested different welding groove designs (X, V, 1/2X, 1/2V) and preheating temperatures (400°C, 500°C, 600°C) on ferritic malleable iron samples. Testing found that X and V groove designs with 500°C and 600°C preheating had corrosion rates of 0.5-0.69% weight loss after 14 days, compared to 0.57-0.76% for 400°C preheating. Higher preheating reduced residual stresses which decreased corrosion. Residual stresses were 1.7 MPa for optimal X groove and 600°C
Router 1X3 – RTL Design and VerificationIJERD Editor
Routing is the process of moving a packet of data from source to destination and enables messages
to pass from one computer to another and eventually reach the target machine. A router is a networking device
that forwards data packets between computer networks. It is connected to two or more data lines from different
networks (as opposed to a network switch, which connects data lines from one single network). This paper,
mainly emphasizes upon the study of router device, it‟s top level architecture, and how various sub-modules of
router i.e. Register, FIFO, FSM and Synchronizer are synthesized, and simulated and finally connected to its top
module.
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...IJERD Editor
This paper presents a component within the flexible ac-transmission system (FACTS) family, called
distributed power-flow controller (DPFC). The DPFC is derived from the unified power-flow controller (UPFC)
with an eliminated common dc link. The DPFC has the same control capabilities as the UPFC, which comprise
the adjustment of the line impedance, the transmission angle, and the bus voltage. The active power exchange
between the shunt and series converters, which is through the common dc link in the UPFC, is now through the
transmission lines at the third-harmonic frequency. DPFC multiple small-size single-phase converters which
reduces the cost of equipment, no voltage isolation between phases, increases redundancy and there by
reliability increases. The principle and analysis of the DPFC are presented in this paper and the corresponding
simulation results that are carried out on a scaled prototype are also shown.
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRIJERD Editor
Power quality has been an issue that is becoming increasingly pivotal in industrial electricity
consumers point of view in recent times. Modern industries employ Sensitive power electronic equipments,
control devices and non-linear loads as part of automated processes to increase energy efficiency and
productivity. Voltage disturbances are the most common power quality problem due to this the use of a large
numbers of sophisticated and sensitive electronic equipment in industrial systems is increased. This paper
discusses the design and simulation of dynamic voltage restorer for improvement of power quality and
reduce the harmonics distortion of sensitive loads. Power quality problem is occurring at non-standard
voltage, current and frequency. Electronic devices are very sensitive loads. In power system voltage sag,
swell, flicker and harmonics are some of the problem to the sensitive load. The compensation capability
of a DVR depends primarily on the maximum voltage injection ability and the amount of stored
energy available within the restorer. This device is connected in series with the distribution feeder at
medium voltage. A fuzzy logic control is used to produce the gate pulses for control circuit of DVR and the
circuit is simulated by using MATLAB/SIMULINK software.
Study on the Fused Deposition Modelling In Additive ManufacturingIJERD Editor
Additive manufacturing process, also popularly known as 3-D printing, is a process where a product
is created in a succession of layers. It is based on a novel materials incremental manufacturing philosophy.
Unlike conventional manufacturing processes where material is removed from a given work price to derive the
final shape of a product, 3-D printing develops the product from scratch thus obviating the necessity to cut away
materials. This prevents wastage of raw materials. Commonly used raw materials for the process are ABS
plastic, PLA and nylon. Recently the use of gold, bronze and wood has also been implemented. The complexity
factor of this process is 0% as in any object of any shape and size can be manufactured.
Spyware triggering system by particular string valueIJERD Editor
This computer programme can be used for good and bad purpose in hacking or in any general
purpose. We can say it is next step for hacking techniques such as keylogger and spyware. Once in this system if
user or hacker store particular string as a input after that software continually compare typing activity of user
with that stored string and if it is match then launch spyware programme.
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...IJERD Editor
This paper presents a blind steganalysis technique to effectively attack the JPEG steganographic
schemes i.e. Jsteg, F5, Outguess and DWT Based. The proposed method exploits the correlations between
block-DCTcoefficients from intra-block and inter-block relation and the statistical moments of characteristic
functions of the test image is selected as features. The features are extracted from the BDCT JPEG 2-array.
Support Vector Machine with cross-validation is implemented for the classification.The proposed scheme gives
improved outcome in attacking.
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeIJERD Editor
- Data over the cloud is transferred or transmitted between servers and users. Privacy of that
data is very important as it belongs to personal information. If data get hacked by the hacker, can be
used to defame a person’s social data. Sometimes delay are held during data transmission. i.e. Mobile
communication, bandwidth is low. Hence compression algorithms are proposed for fast and efficient
transmission, encryption is used for security purposes and blurring is used by providing additional
layers of security. These algorithms are hybridized for having a robust and efficient security and
transmission over cloud storage system.
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...IJERD Editor
A thorough review of existing literature indicates that the Buckley-Leverett equation only analyzes
waterflood practices directly without any adjustments on real reservoir scenarios. By doing so, quite a number
of errors are introduced into these analyses. Also, for most waterflood scenarios, a radial investigation is more
appropriate than a simplified linear system. This study investigates the adoption of the Buckley-Leverett
equation to estimate the radius invasion of the displacing fluid during waterflooding. The model is also adopted
for a Microbial flood and a comparative analysis is conducted for both waterflooding and microbial flooding.
Results shown from the analysis doesn’t only records a success in determining the radial distance of the leading
edge of water during the flooding process, but also gives a clearer understanding of the applicability of
microbes to enhance oil production through in-situ production of bio-products like bio surfactans, biogenic
gases, bio acids etc.
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraIJERD Editor
- Gesture gaming is a method by which users having a laptop/pc/x-box play games using natural or
bodily gestures. This paper presents a way of playing free flash games on the internet using an ordinary webcam
with the help of open source technologies. Emphasis in human activity recognition is given on the pose
estimation and the consistency in the pose of the player. These are estimated with the help of an ordinary web
camera having different resolutions from VGA to 20mps. Our work involved giving a 10 second documentary to
the user on how to play a particular game using gestures and what are the various kinds of gestures that can be
performed in front of the system. The initial inputs of the RGB values for the gesture component is obtained by
instructing the user to place his component in a red box in about 10 seconds after the short documentary before
the game is finished. Later the system opens the concerned game on the internet on popular flash game sites like
miniclip, games arcade, GameStop etc and loads the game clicking at various places and brings the state to a
place where the user is to perform only gestures to start playing the game. At any point of time the user can call
off the game by hitting the esc key and the program will release all of the controls and return to the desktop. It
was noted that the results obtained using an ordinary webcam matched that of the Kinect and the users could
relive the gaming experience of the free flash games on the net. Therefore effective in game advertising could
also be achieved thus resulting in a disruptive growth to the advertising firms.
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...IJERD Editor
-LLC resonant frequency converter is basically a combo of series as well as parallel resonant ckt. For
LCC resonant converter it is associated with a disadvantage that, though it has two resonant frequencies, the
lower resonant frequency is in ZCS region[5]. For this application, we are not able to design the converter
working at this resonant frequency. LLC resonant converter existed for a very long time but because of
unknown characteristic of this converter it was used as a series resonant converter with basically a passive
(resistive) load. . Here, it was designed to operate in switching frequency higher than resonant frequency of the
series resonant tank of Lr and Cr converter acts very similar to Series Resonant Converter. The benefit of LLC
resonant converter is narrow switching frequency range with light load[6] . Basically, the control ckt plays a
very imp. role and hence 555 Timer used here provides a perfect square wave as the control ckt provides no
slew rate which makes the square wave really strong and impenetrable. The dead band circuit provides the
exclusive dead band in micro seconds so as to avoid the simultaneous firing of two pairs of IGBT’s where one
pair switches off and the other on for a slightest period of time. Hence, the isolator ckt here is associated with
each and every ckt used because it acts as a driver and an isolation to each of the IGBT is provided with one
exclusive transformer supply[3]. The IGBT’s are fired using the appropriate signal using the previous boards
and hence at last a high frequency rectifier ckt with a filtering capacitor is used to get an exact dc
waveform .The basic goal of this particular analysis is to observe the wave forms and characteristics of
converters with differently positioned passive elements in the form of tank circuits.
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...IJERD Editor
LLC resonant frequency converter is basically a combo of series as well as parallel resonant ckt. For
LCC resonant converter it is associated with a disadvantage that, though it has two resonant frequencies, the
lower resonant frequency is in ZCS region [5]. For this application, we are not able to design the converter
working at this resonant frequency. LLC resonant converter existed for a very long time but because of
unknown characteristic of this converter it was used as a series resonant converter with basically a passive
(resistive) load. . Here, it was designed to operate in switching frequency higher than resonant frequency of the
series resonant tank of Lr and Cr converter acts very similar to Series Resonant Converter. The benefit of LLC
resonant converter is narrow switching frequency range with light load[6] . Basically, the control ckt plays a
very imp. role and hence 555 Timer used here provides a perfect square wave as the control ckt provides no
slew rate which makes the square wave really strong and impenetrable. The dead band circuit provides the
exclusive dead band in micro seconds so as to avoid the simultaneous firing of two pairs of IGBT’s where one
pair switches off and the other on for a slightest period of time. Hence, the isolator ckt here is associated with
each and every ckt used because it acts as a driver and an isolation to each of the IGBT is provided with one
exclusive transformer supply[3]. The IGBT’s are fired using the appropriate signal using the previous boards
and hence at last a high frequency rectifier ckt with a filtering capacitor is used to get an exact dc
waveform .The basic goal of this particular analysis is to observe the wave forms and characteristics of
converters with differently positioned passive elements in the form of tank circuits. The supported simulation
is done through PSIM 6.0 software tool
Amateurs Radio operator, also known as HAM communicates with other HAMs through Radio
waves. Wireless communication in which Moon is used as natural satellite is called Moon-bounce or EME
(Earth -Moon-Earth) technique. Long distance communication (DXing) using Very High Frequency (VHF)
operated amateur HAM radio was difficult. Even with the modest setup having good transceiver, power
amplifier and high gain antenna with high directivity, VHF DXing is possible. Generally 2X11 YAGI antenna
along with rotor to set horizontal and vertical angle is used. Moon tracking software gives exact location,
visibility of Moon at both the stations and other vital data to acquire real time position of moon.
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...IJERD Editor
Simple Sequence Repeats (SSR), also known as Microsatellites, have been extensively used as
molecular markers due to their abundance and high degree of polymorphism. The nucleotide sequences of
polymorphic forms of the same gene should be 99.9% identical. So, Microsatellites extraction from the Gene is
crucial. However, Microsatellites repeat count is compared, if they differ largely, he has some disorder. The Y
chromosome likely contains 50 to 60 genes that provide instructions for making proteins. Because only males
have the Y chromosome, the genes on this chromosome tend to be involved in male sex determination and
development. Several Microsatellite Extractors exist and they fail to extract microsatellites on large data sets of
giga bytes and tera bytes in size. The proposed tool “MS-Extractor: An Innovative Approach to extract
Microsatellites on „Y‟ Chromosome” can extract both Perfect as well as Imperfect Microsatellites from large
data sets of human genome „Y‟. The proposed system uses string matching with sliding window approach to
locate Microsatellites and extracts them.
Importance of Measurements in Smart GridIJERD Editor
- The need to get reliable supply, independence from fossil fuels, and capability to provide clean
energy at a fixed and lower cost, the existing power grid structure is transforming into Smart Grid. The
development of a smart energy distribution grid is a current goal of many nations. A Smart Grid should have
new capabilities such as self-healing, high reliability, energy management, and real-time pricing. This new era
of smart future grid will lead to major changes in existing technologies at generation, transmission and
distribution levels. The incorporation of renewable energy resources and distribution generators in the existing
grid will increase the complexity, optimization problems and instability of the system. This will lead to a
paradigm shift in the instrumentation and control requirements for Smart Grids for high quality, stable and
reliable electricity supply of power. The monitoring of the grid system state and stability relies on the
availability of reliable measurement of data. In this paper the measurement areas that highlight new
measurement challenges, development of the Smart Meters and the critical parameters of electric energy to be
monitored for improving the reliability of power systems has been discussed.
Study of Macro level Properties of SCC using GGBS and Lime stone powderIJERD Editor
The document summarizes a study on the use of ground granulated blast furnace slag (GGBS) and limestone powder to replace cement in self-compacting concrete (SCC). Tests were conducted on SCC mixes with 0-50% replacement of cement with GGBS and 0-20% replacement with limestone powder. The results showed that replacing 30% of cement with GGBS and 15% with limestone powder produced SCC with the highest compressive strength of 46MPa, meeting fresh property requirements. The study concluded that this ternary blend of cement, GGBS and limestone powder can improve SCC properties while reducing costs.
How to Manage Your Lost Opportunities in Odoo 17 CRMCeline George
Odoo 17 CRM allows us to track why we lose sales opportunities with "Lost Reasons." This helps analyze our sales process and identify areas for improvement. Here's how to configure lost reasons in Odoo 17 CRM
How to Fix the Import Error in the Odoo 17Celine George
An import error occurs when a program fails to import a module or library, disrupting its execution. In languages like Python, this issue arises when the specified module cannot be found or accessed, hindering the program's functionality. Resolving import errors is crucial for maintaining smooth software operation and uninterrupted development processes.
This presentation was provided by Steph Pollock of The American Psychological Association’s Journals Program, and Damita Snow, of The American Society of Civil Engineers (ASCE), for the initial session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session One: 'Setting Expectations: a DEIA Primer,' was held June 6, 2024.
Executive Directors Chat Leveraging AI for Diversity, Equity, and InclusionTechSoup
Let’s explore the intersection of technology and equity in the final session of our DEI series. Discover how AI tools, like ChatGPT, can be used to support and enhance your nonprofit's DEI initiatives. Participants will gain insights into practical AI applications and get tips for leveraging technology to advance their DEI goals.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...RitikBhardwaj56
Discover the Simplified Electron and Muon Model: A New Wave-Based Approach to Understanding Particles delves into a groundbreaking theory that presents electrons and muons as rotating soliton waves within oscillating spacetime. Geared towards students, researchers, and science buffs, this book breaks down complex ideas into simple explanations. It covers topics such as electron waves, temporal dynamics, and the implications of this model on particle physics. With clear illustrations and easy-to-follow explanations, readers will gain a new outlook on the universe's fundamental nature.
This presentation includes basic of PCOS their pathology and treatment and also Ayurveda correlation of PCOS and Ayurvedic line of treatment mentioned in classics.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...PECB
Denis is a dynamic and results-driven Chief Information Officer (CIO) with a distinguished career spanning information systems analysis and technical project management. With a proven track record of spearheading the design and delivery of cutting-edge Information Management solutions, he has consistently elevated business operations, streamlined reporting functions, and maximized process efficiency.
Certified as an ISO/IEC 27001: Information Security Management Systems (ISMS) Lead Implementer, Data Protection Officer, and Cyber Risks Analyst, Denis brings a heightened focus on data security, privacy, and cyber resilience to every endeavor.
His expertise extends across a diverse spectrum of reporting, database, and web development applications, underpinned by an exceptional grasp of data storage and virtualization technologies. His proficiency in application testing, database administration, and data cleansing ensures seamless execution of complex projects.
What sets Denis apart is his comprehensive understanding of Business and Systems Analysis technologies, honed through involvement in all phases of the Software Development Lifecycle (SDLC). From meticulous requirements gathering to precise analysis, innovative design, rigorous development, thorough testing, and successful implementation, he has consistently delivered exceptional results.
Throughout his career, he has taken on multifaceted roles, from leading technical project management teams to owning solutions that drive operational excellence. His conscientious and proactive approach is unwavering, whether he is working independently or collaboratively within a team. His ability to connect with colleagues on a personal level underscores his commitment to fostering a harmonious and productive workplace environment.
Date: May 29, 2024
Tags: Information Security, ISO/IEC 27001, ISO/IEC 42001, Artificial Intelligence, GDPR
-------------------------------------------------------------------------------
Find out more about ISO training and certification services
Training: ISO/IEC 27001 Information Security Management System - EN | PECB
ISO/IEC 42001 Artificial Intelligence Management System - EN | PECB
General Data Protection Regulation (GDPR) - Training Courses - EN | PECB
Webinars: https://pecb.com/webinars
Article: https://pecb.com/article
-------------------------------------------------------------------------------
For more information about PECB:
Website: https://pecb.com/
LinkedIn: https://www.linkedin.com/company/pecb/
Facebook: https://www.facebook.com/PECBInternational/
Slideshare: http://www.slideshare.net/PECBCERTIFICATION
Introduction to AI for Nonprofits with Tapp NetworkTechSoup
Dive into the world of AI! Experts Jon Hill and Tareq Monaur will guide you through AI's role in enhancing nonprofit websites and basic marketing strategies, making it easy to understand and apply.
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
Digital Artefact 1 - Tiny Home Environmental Design
www.ijerd.com
1. International Journal of Engineering Research and Development
ISSN: 2278-067X, Volume 1, Issue 11 (July 2012), PP.01-10
www.ijerd.com
Design and Development of Integrated Biomedical Ontology for
Information Extraction from Medline Abstracts
Dr. B. L. Shivakumar1, R. Porkodi2
1 Professor and Head, Department of Computer Applications, SNR Sons College, Coimbatore - 6
2 Assistant Professor, Department of Computer Science, Bharathiar University, Coimbatore - 46
Abstract––Due to the ever-increasing number of scientific articles in the biomedical domain, Information Extraction in Text
Mining has been recognized as one of the key technologies for future biomedical research. Information extraction from
the biomedical literature plays a vital role in bioinformatics. Bioinformatics researchers therefore work on
both information extraction and the construction of biomedical knowledge sources. In knowledge extraction, researchers
develop efficient and effective techniques that combine Natural Language Processing (NLP) and text mining
to extract information and to find significant associations among the extracted items. On the other
hand, researchers also construct knowledge sources or repositories for the biomedical
domain, which simplify the knowledge extraction process. This paper presents a semiautomatic
framework that integrates two well-known ontologies, the Gene Ontology (GO) and the Medical Subject Headings (MeSH)
ontology, by adding semantic mappings or relations between GO terms, gene names and MeSH keywords related to a
particular disease (Alzheimer disease). The integrated ontology has been validated using structural,
syntactic and semantic validation measures. The framework is used to discover significant associations or relationships
between proteins and genes related to Alzheimer disease that are extracted from Medline abstracts.
Keywords––Ontology, Alzheimer Disease, GO, MeSH, Stemming, Tagging.
I. INTRODUCTION
NCBI is a fast-growing knowledge source for the bioinformatics community. Its Medline database [1]
currently contains over 15 million citations of biological abstracts and is growing by more than 40,000 abstracts per
month. Extracting desired information directly from biological literature is a challenging problem in text mining and Natural
Language Processing (NLP). Many biomedical information sources have been developed and used in the extraction process.
Most text mining methods use the vector space model to represent a document. The vector space model represents a
document as a feature vector of the terms contained in it. Each feature vector contains term weights, and the similarity between
documents is computed using various similarity measures. This approach does not consider the semantic relations between terms in
documents. The ontology approach offers an effective knowledge representation within a controlled vocabulary. The
WordNet ontology [14] is a lexical database covering most general English concepts. In the biomedical
domain, the Unified Medical Language System (UMLS) framework [13] includes many biomedical ontologies.
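The vector space model described above can be illustrated with a minimal sketch. It assumes TF-IDF term weights and cosine similarity, which are common choices but are not stated in this paper. The function names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse {term: weight} vector per document (TF-IDF weights)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if a norm is zero."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, not taken from Medline.
docs = [
    "app gene linked to alzheimer disease",
    "app protein linked to alzheimer plaques",
    "steel beam with shear connectors",
]
vectors = tfidf_vectors(docs)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

In this sketch "gene" and "protein" occupy unrelated dimensions even when they refer to the same biological entity. This is the kind of missing semantic relation that an ontology is meant to supply.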
This paper integrates the Gene Ontology (GO), Medical Subject Headings (MeSH) and all human genes, including
the genes that cause Alzheimer disease in humans. Alzheimer's disease (AD) is the most common cause of progressive
decline of cognitive function in aged humans, and it is characterized by the presence of numerous senile plaques and
neurofibrillary tangles accompanied by neuronal loss. The integrated ontology is developed in the Protégé tool, a
widely used tool for designing ontologies. Protégé provides facilities such as visualization of concepts, which clearly shows
the semantic relations of a concept, and querying of results based on concepts, object properties, data properties, etc.
The paper is organized as follows. Section 2 reviews the literature related to this work. Section 3 gives a
brief introduction to biomedical ontologies, and Section 4 discusses the ontology-based framework in detail.
The experimental design and discussion of results are presented in Section 5. Finally, Section 6 concludes the paper.
II. RELATED WORK
Many NLP-based works have been reported over the past decades on concept extraction [2], association
rule discovery [3, 4] and extracting relationships among various concepts [5, 6]. Many approaches have been developed for
extracting significant associations and interactions among various biological entities [5, 6, 7] and discovering protein-disease
associations. However, these approaches have not produced promising results because of inconsistencies
in gene names. On gene name extraction, paper [8] presented the extraction of gene names from article titles
and abstracts and identified genes related to colon cancer. Paper [6] presented a statistical approach for
discovering groups of genes related to breast cancer. In paper [5], the author constructed a relationship network among
biomedical entities extracted from Medline abstracts.
In paper [9], the authors proposed a new text mining approach that uses the concepts of expectation, evidence and
Z-score to determine significant associations between genes and Alzheimer disease. In paper [10], the researchers described
a method that uses an association and functional relationship discovery algorithm to extract gene relations from Medline
abstracts.
Recent works have reported that ontology is a useful tool for improving the performance of text mining
tasks such as text clustering and association rule mining. In text clustering, paper [11] uses conceptual features
extracted from text using an ontology and shows that the ontology can improve clustering performance. Paper
[12] presents a case study on the integration of biomedical information into an ontology. In paper [21], the author proposed a
bio-ontology methodology and compared it with other bio-ontologies. The limitations and benefits of the GO ontology are
discussed in paper [22]. The strengths and limitations of biomedical ontologies with respect to their text and
concept representation are studied in [23].
This paper presents a new ontology that integrates two well-known ontologies, the Gene Ontology (GO) and MeSH, by
adding semantic mappings or relations between GO terms, gene names and MeSH keywords related to a particular disease
(Alzheimer disease). Finally, the integrated ontology is validated using syntactic, structural and semantic
validation measures in order to establish its correctness and validity.
III. BIOMEDICAL ONTOLOGIES
The Molecular Biology Ontology (MBO) [16] was the first attempt to begin to define the entities in the domain to
promote consistent interpretation across resources. A second phase saw the adoption of ontology by the biological
community itself. Pre-eminent among these is the Gene Ontology (GO) [15]. The Microarray Gene Expression Data
(MGED) ontology [18] provides a vocabulary for describing a biological sample used in an experiment, the treatment that
the sample receives in the experiment and the microarray chip technology used in the experiment. The Functional Genomics
Ontology (FUGO) [17] is another ontology in the field of bioinformatics. Another popular resource, the MeSH thesaurus,
is the NLM's controlled vocabulary for subject indexing in MEDLINE. It is structured as a hierarchy of descriptors,
with each descriptor including a set of concepts, and each concept itself containing a set of terms, which are
synonyms and lexical variants.
The following sections briefly explain the Gene Ontology (GO) and the MeSH ontology, which are integrated in our work.
3.1 GO
Biological knowledge is most often represented in “bio-ontologies”, formal representations of knowledge areas in
which the essential concepts are combined with properties that describe relationships between concepts.
Bio-ontologies are constructed according to textual descriptions of biological activities. One of the most popular
bio-ontologies is the Gene Ontology (GO) [15], which contains more than 18 thousand terms. GO is a controlled
vocabulary of gene and protein roles in cells, addressing the need for consistent description of gene products. It is
used in almost all biological research, for example to predict gene functions based on patterns of annotation. GO
describes the molecular function of a gene product, the biological process in which the gene product participates,
and the cellular component where the gene product can be found.
3.2 MeSH
Medical Subject Headings (MeSH) [24] is another popular ontology, designed by the National Library of Medicine, which
mainly consists of a controlled vocabulary and a MeSH Tree. The controlled vocabulary contains several different
types of terms, such as Descriptors, Qualifiers, Scope notes, Tree numbers and Entry terms. Descriptor terms are main
concepts or main headings. Entry terms are synonyms of, or terms related to, descriptors. For example, the descriptor
“Amyloid beta-Protein Precursor” has the entry terms “Amyloid A4 Protein Precursor”, “Amyloid beta Precursor
Protein”, “Amyloid Protein Precursor”, etc. MeSH descriptors are organized in a MeSH Tree, which can be seen as a
MeSH Concept Hierarchy. In the MeSH Tree there are 15 categories (e.g. category A for anatomic terms), and each
category is further divided into subcategories.
For example, the MeSH tree structure of Alzheimer Disease is shown in Fig. 1. For each subcategory, the corresponding
descriptors are arranged hierarchically from most general to most specific. In addition to their ontology role, MeSH
descriptors were originally used to index MEDLINE articles.
Fig. 1 MeSH Tree Structure for Alzheimer Disease
IV. PROPOSED FRAMEWORK
The proposed framework, shown in Fig. 2, integrates the two popular ontologies GO and MeSH by mapping them with gene
details, and is used to extract significant associations among concepts from Medline abstracts related to Alzheimer
disease. The main objective of integrating the two ontologies is to make use of the semantic relations among the
concepts in Medline abstracts through MeSH ontology terms. The gene products of genes are taken from GO in order to
find the associations of gene products related to Alzheimer disease genes. The integrated ontology consists of four
parts: MeSH concepts related to Alzheimer disease; links from these MeSH concepts to the proteins that cause
Alzheimer disease; links from these proteins to the genes that inhibit Alzheimer disease; and links from these genes
to their respective gene products. For each gene product we identify the exact molecular functions that result in
Alzheimer disease, the biological processes in which the gene product participates to result in Alzheimer disease,
and the cellular component where the gene product can be found. The components of this framework are explained below.
Fig. 2 The Proposed Ontology Framework
The first step in this work is preprocessing, which transforms Medline abstracts, typically strings of characters,
into a suitable representation.
a. Removal of stop-words: Stop-words are high-frequency words that carry no information (e.g. pronouns,
prepositions, conjunctions). Removal of stop-words improves clustering results [19].
b. Stemming: Word stemming is the process of removing suffixes to generate word stems. The well-known Porter stemmer
[20] is used for this task.
c. Filtering: The domain vocabulary V in the ontology is used for filtering. After filtering, a document is
represented only by its domain-related words (terms), which reduces the dimensionality of the documents. The
filtering task in our work selects the documents related to Alzheimer disease.
d. Tagging: The concepts in Medline abstracts are identified using the Genia tagger [26], and the identified concepts
are mapped to concepts in the categories specified in the proposed ontology. The categories used in the proposed
ontology are gene, MeSH and GO.
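Steps a to c can be illustrated with a minimal sketch. The stop-word list and the domain vocabulary below are hypothetical placeholders, and the suffix stripper is a deliberately crude stand-in for the Porter stemmer [20], not the Porter algorithm itself; tagging with the Genia tagger is not covered.

```python
import re

# Hypothetical stop-word list and domain vocabulary V, for illustration only.
STOP_WORDS = {"the", "of", "in", "and", "is", "are", "with", "a", "an", "to"}
DOMAIN_VOCABULARY = {"alzheimer", "amyloid", "gene", "protein"}

# Crude suffix list standing in for the Porter stemmer; NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "s")


def strip_suffix(word):
    """Remove the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(abstract):
    """Lowercase and tokenize an abstract, drop stop-words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", abstract.lower())
    return [strip_suffix(token) for token in tokens if token not in STOP_WORDS]


def filter_terms(stems):
    """Keep only the stems that belong to the domain vocabulary."""
    return [stem for stem in stems if stem in DOMAIN_VOCABULARY]


stems = preprocess("The genes are associated with Alzheimer disease")
print(stems)
print(filter_terms(stems))
```

Under this sketch, a document whose filtered term list is empty would be treated as unrelated to the domain and dropped.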
e. Semantic analysis & Concept mapping:
Class/Concept | Description
GO_0003674 | This is the base class for molecular function. All molecular functions are subclasses of GO_0003674, and the respective molecular function class for a gene is mapped to that gene.
GO_0005575 | This is the base class for cellular components. All cellular components are subclasses of GO_0005575, and the respective cellular component for a gene is mapped to that gene.
GO_0008150 | This is the base class for biological process. All biological processes are subclasses of GO_0008150, and the respective biological process for a gene is mapped to that gene.
GO_Functionality | This class specifies the three GO functionalities (cellular component, molecular function and biological process) as classes; these classes are mapped to the above three classes.
Gene | This specifies all human genes.
Gene_Type | This specifies the gene type for a gene, such as protein coding, pseudo coding and unknown.
Mesh | This specifies the MeSH keywords, including proteins, disease level, etc., for Alzheimer disease.
Table 1. Main Concepts in Proposed Ontology
After preprocessing, the concepts extracted from Medline abstracts are analyzed for their semantic meaning and added
to the ontology if they are not already present. The concepts or classes used in this integrated ontology are shown
in Table 1. The first step in adding a concept is to look for an equivalent concept in the ontology; if the concept
itself is not found, it is added together with its possible semantic relations, which include object properties and
data properties. Some of the important object properties and data properties created in the integrated ontology are
shown in Tables 2 and 3.
For example, suppose the gene name “A2MP1” is extracted from a Medline abstract and is to be added to the ontology.
If an equivalent gene “A2M” is already in the ontology, “A2MP1” is added and an is-a relationship is assigned between
“A2M” and “A2MP1”, along with other possible properties such as has_go, has_synonym, has_genetype_as,
has_inducing_protien, inhibits, etc.
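The add-if-missing step in the A2MP1 example can be sketched as follows. The `TinyOntology` class is a hypothetical in-memory triple store used only for illustration; it is not the OWL ontology built in Protégé, and the property value attached to “A2MP1” is an illustrative assumption.

```python
class TinyOntology:
    """A toy triple store; a hypothetical stand-in for the real ontology."""

    def __init__(self):
        self.concepts = set()
        self.triples = set()  # each triple is (subject, property, object)

    def add_concept(self, name):
        self.concepts.add(name)

    def relate(self, subject, prop, obj):
        self.triples.add((subject, prop, obj))

    def add_if_missing(self, name, equivalent=None, properties=()):
        """Add `name` unless it already exists.

        If `equivalent` is already in the ontology, link the new concept
        to it with an is-a relation, then attach any other properties.
        Returns True when the concept was added, False otherwise.
        """
        if name in self.concepts:
            return False
        self.add_concept(name)
        if equivalent in self.concepts:
            self.relate(name, "is-a", equivalent)
        for prop, obj in properties:
            self.relate(name, prop, obj)
        return True


onto = TinyOntology()
onto.add_concept("A2M")
added = onto.add_if_missing(
    "A2MP1",
    equivalent="A2M",
    properties=[("has_genetype_as", "pseudo")],  # illustrative value only
)
print(added, sorted(onto.triples))
```

Calling `add_if_missing` a second time with the same name leaves the ontology unchanged, which mirrors the "if it is not found" condition in the text.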
Object property | Description
belongs_to | This property is used to map GO classes.
curated_GO_References | This property is used to map genes referred to in PubMed literature.
has_gene | This is used to relate a gene to GO.
has_genetype_as | The gene types protein coding, pseudo coding and unknown are mapped to genes.
has_go | This is used to map all applicable GO concepts to a gene at all functional levels.
has_inducing_protien | This is used to map a disease to its respective proteins.
has_synonym | This is used to specify all possible synonyms for a gene.
inhibits | This property is used to map a gene to a disease.
is_found_in | This is used to map a disease to a gene; it has a transitive relationship with the inhibits property.
Table 2. List of Object Properties in Proposed Ontology
Data property | Description
gene_annotations | This property is used to specify the different database references for a particular gene, such as ENSEMBL, HGNC, HPRD, MIM and UNIPROT.
gene_descriptions | This is used to specify the alternate name and full name of a gene.
has_gene_id | This is used to specify the gene identifier of a gene.
is_in_chromosome | This property is used to specify the chromosome map location and chromosome number.
Table 3. List of Data Properties in Proposed Ontology
V. EXPERIMENTAL DESIGN & RESULTS
The integrated ontology consists of three main concepts or classes: GO term functionalities, all human genes whether
or not related to a particular disease (in this ontology, Alzheimer disease), and MeSH terms related to that disease.
The subclasses of the GO term functionalities class are all possible GO functionalities for the genes added to the
ontology, and the Gene main class consists of all human genes whether or not related to the disease. The subclasses
created for the MeSH class include all disease branches from which Alzheimer disease is derived, together with the
amino acids, peptides and proteins related to the disease. The integrated ontology provides all types of information
related to GO terms, genes and the details of a particular disease.
The ontology can be explored in different ways, the most important of which are OntoGraf and DL Query. Structural
evaluation is necessary to verify the consistency of the ontology; if the ontology is not structurally evaluated, it
may produce wrong or inconsistent results when information is retrieved from it. The visualization of concepts with
their semantic relations was carried out using the OntoGraf tool. The important subclasses of the MeSH main concepts
or classes are shown in Fig. 3. The visualization of human genes related to Alzheimer disease is shown in Fig. 4. The
visualization of proteins that induce Alzheimer disease is shown in Fig. 5.
Fig. 3 The Overview of MeSH Concept
Another way to explore the ontology is the DL Query tool, an effective tool for retrieving any kind of semantically
related information from a given ontology. Some of the information retrieval queries and their results are shown in
the figures below. Fig. 6 shows the extraction of Alzheimer disease related human genes with the respective DL query,
and Fig. 7 shows the extraction of the names of proteins that induce Alzheimer disease. Finally, Fig. 8 shows the
extraction of human genes that inhibit Alzheimer disease at a particular chromosome level; in our data set there are
3 genes (the gene identifiers mapped to the genes are shown in Fig. 8) that inhibit Alzheimer disease at chromosome
level “10”.
Fig. 4 Genes Related to Alzheimer Disease
Fig. 5 Proteins Related to Alzheimer Disease
Fig. 6 Genes Extracted by DL query “Gene and inhibits some Alzheimer_Disease”
Fig. 7 Proteins Extracted by DL query “Proteins and has_inducing_disease some Alzheimer_Disease”
Fig. 8 Mapping Identifiers for Genes Extracted by DL query
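To make the meaning of a query such as “Gene and inhibits some Alzheimer_Disease” concrete, the following sketch evaluates it over a handful of asserted facts. The names `GeneX`, `GeneY`, `ProteinP` and `AD_case` are hypothetical placeholders, not entries from the authors' data set, and the function only checks asserted triples; a real DL reasoner such as the one behind the DL Query tool also draws inferences from the class hierarchy.

```python
def query_some(triples, types, cls, prop, filler_cls):
    """Evaluate '<cls> and <prop> some <filler_cls>' over asserted facts.

    Returns the sorted subjects that are typed as `cls` and have at least
    one `prop` link to an object typed as `filler_cls`.
    """
    matches = {
        subject
        for subject, predicate, obj in triples
        if predicate == prop
        and cls in types.get(subject, set())
        and filler_cls in types.get(obj, set())
    }
    return sorted(matches)


# Hypothetical individuals and facts, for illustration only.
TYPES = {
    "GeneX": {"Gene"},
    "GeneY": {"Gene"},
    "ProteinP": {"Proteins"},
    "AD_case": {"Alzheimer_Disease"},
}
TRIPLES = {
    ("GeneX", "inhibits", "AD_case"),
    ("ProteinP", "has_inducing_disease", "AD_case"),
    ("GeneY", "is_in_chromosome", "10"),
}

print(query_some(TRIPLES, TYPES, "Gene", "inhibits", "Alzheimer_Disease"))
# prints ['GeneX']
```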
5.1 Validating the Ontology
The integrated ontology has to be validated to check its correctness. This section describes the evaluation methods
used for validating our proposed ontology framework. The proposed ontology is syntactically verified for consistency
using the FaCT++ reasoner available in the Protégé tool. The second method is semantic validation, which was carried
out by domain experts in the biological field. The third method is structural validation, performed using the metrics
defined in paper [25]: class match measure, density measure, betweenness measure and semantic similarity measure.
Each ontology concept is ranked based on the total score of all four metrics. The weights are assigned based on the
concept representation, in such a way that the overall score lies between 0 and 1.
Class Match Measure (CMM) – This measure evaluates how well the ontology covers the specified concepts. Each
specified concept is searched for in the ontology. If it occurs directly as a concept, the maximum weight is given to
it. If it occurs only partially, as an instance of some class, 50% of the maximum weight may be assigned. The CMM thus
scores each concept as either an exact match or a partial match in the ontology.
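A minimal sketch of this scoring rule is given below. The function name, the averaging over search terms (used here so that the result stays between 0 and 1) and the sample class and instance names are assumptions made for illustration; the paper does not specify how the per-concept weights are aggregated.

```python
def class_match_measure(search_terms, class_names, instance_names, max_weight=1.0):
    """Score how well an ontology covers the given search terms.

    An exact match (term is a class) earns `max_weight`; a partial match
    (term appears only as an instance) earns half of it. The total is
    averaged over the terms, which is an assumed normalization.
    """
    if not search_terms:
        return 0.0
    total = 0.0
    for term in search_terms:
        if term in class_names:
            total += max_weight
        elif term in instance_names:
            total += 0.5 * max_weight
    return total / (len(search_terms) * max_weight)


# Hypothetical ontology contents, for illustration only.
classes = {"Gene", "Mesh"}
instances = {"A2M"}
print(class_match_measure(["Gene", "A2M", "Unknown_Term"], classes, instances))
# prints 0.5
```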
Density Measure (DEM) - The DEM evaluates the ontology based on the degree of richness of attributes of a specified
concept and includes the details of subclasses, inner attributes, siblings and relations with other classes in the ontology. The
weight may be assigned based on the degree of richness of attributes of a concept.
Betweenness Measure (BEM) – This measure evaluates the ontology based on the centrality of a specified concept in the
ontology. The centrality of a concept is computed from the count of shortest paths between the specified concept and
the other concepts in the ontology. The weight may be assigned based on these shortest paths.
Semantic Similarity Measure (SSM) – The SSM evaluates the ontology based on the proximity of the classes that the
specified concept matches, that is, the count of links through which the specified concept maps to the existing
concepts in the ontology.
5.2 Dataset
We framed four corpora from our integrated ontology to validate it; each corpus consists of ontology concepts and
their important properties. Each corpus is the superset of the previous one. Corpus C1 consists of main concepts that
have many subclasses and rich relations or links with other sub-concepts. Corpus C2 consists of the sub-concepts in
C1 and other concepts in the ontology. Corpus C3 is the subset of C1 and has concepts in C1 and two important
properties related to those concepts. Corpus C4 contains concepts in C1 and three important properties related to
those concepts. All four corpora framed from the ontology are shown in Table 4. The overall score is computed from the
measures described above: let O be the set of corpora framed from the proposed ontology, let w_i be a weight factor,
and let M be the set of similarity metrics CMM, DEM, SSM and BEM.
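The scoring formula itself does not appear in this copy of the text. One form consistent with the definitions of O, w_i and M, offered here as an assumption rather than the authors' exact expression, is a weighted sum of the four metric values for a corpus o:

```latex
\mathrm{Score}(o \in O) = \sum_{i=1}^{4} w_i \, M_i(o),
\qquad \sum_{i=1}^{4} w_i = 1,
\qquad 0 \le M_i(o) \le 1
```

With weights that sum to 1 and each metric scaled to the unit interval, the overall score stays between 0 and 1, as Section 5.1 requires. Equal weighting corresponds to w_i = 0.25 for each of CMM, DEM, BEM and SSM.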
From the overall score, corpus C1 has the maximum score because its concepts are direct matches. The score may be
lower when a concept is only a partial match. Corpus C2 has the lowest score and is ranked 4 because of its DEM and
BEM values. The DEM and BEM measures give the lowest scores because the concepts in C2 have no related inner
attributes and no links with other concepts in C2. Corpus C3 has the second highest score because of its CMM and SSM
values, since the concepts in C2 are direct concepts and have a good number of links among concepts in C3. Corpus C4
has the third highest score because of its CMM and SSM values, and C4 is also the subset of C3.
All four metrics are given equal weights, and we found that some of the corpora may produce low scores because of the
DEM and BEM measures. The DEM and BEM score values may increase when different weights are used. In our proposed
ontology, we found that the concepts and their relations are linked correctly; some of the missing relations may be
added in future to produce more promising results.
Corpus (constructed from ontology) | Score | Rank
C1 | 0.79 | 1
C2 | 0.39 | 4
C3 | 0.46 | 2
C4 | 0.41 | 3
Table 4. Overall Scores and Ranks for Corpuses
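The ranks in Table 4 follow directly from sorting the overall scores, as the short sketch below shows. The `overall_score` helper assumes an equally weighted sum of per-metric values, in line with the equal weights mentioned in the text; the per-metric values themselves are not reported in the paper, so the helper is shown without invented inputs.

```python
def overall_score(metric_scores, weights=None):
    """Weighted sum of per-metric scores; equal weights are assumed by default."""
    if weights is None:
        weights = {name: 1.0 / len(metric_scores) for name in metric_scores}
    return sum(weights[name] * value for name, value in metric_scores.items())


def rank(scores):
    """Map each corpus to its rank, with rank 1 for the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Overall scores as reported in Table 4.
TABLE_4_SCORES = {"C1": 0.79, "C2": 0.39, "C3": 0.46, "C4": 0.41}
print(rank(TABLE_4_SCORES))
# prints {'C1': 1, 'C3': 2, 'C4': 3, 'C2': 4}
```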
Finally, the class match measure produces a high score when an exact match is found in the ontology; the score may
decrease when only a partial match is found. The density measure score is good when more relations exist among
concepts. The betweenness measure is good when a concept is related to more of the other concepts in the ontology.
The semantic similarity measure is good when a concept has more synonyms and relations. The corpus versus metric
scores are shown as a bar chart in Fig. 9.
Fig. 9 The Bar chart of Corpus Vs Similarity Metric Score
VI. CONCLUSION
We studied almost all biomedical ontologies and identified their merits and demerits. With these in mind, the
integrated ontology was proposed by combining the essential features of the specified ontologies. The integrated
ontology is implemented in the Protégé tool and consists of three main concepts, namely GO term functionalities, all
human genes whether or not related to Alzheimer disease, and MeSH terms related to the same disease. This framework
also addresses a problem of the GO ontology, in which all information is given as annotations that are not directly
accessible to the user because they are given as HTTP links. The MeSH ontology represents the entry terms for a
particular term and the associated links in various repositories such as PubMed, Medline, MIM, etc. In our work, all
associated information on GO functionalities and genes is specified directly in our ontology, not as links. This
ontology gives all possible semantic relations applicable to the concepts defined in it. The ontology is also
evaluated for its correctness and validity using various metrics. From the experimental results, we found that the
ontology is modeled correctly, providing the necessary concepts and relations. The ontology may be further improved
by adding more relations to the existing gene and MeSH concepts to obtain a higher score.
Further, the integrated ontology is to be used in association rule mining to extract significant associations among
the proteins and genes related to Alzheimer disease. Instead of using a simple vector space model to calculate term
frequency and inverse document frequency from Medline abstracts, we chose an ontology approach that considers the
semantic relationships of terms appearing in the abstracts, which may give better results.
VII. ACKNOWLEDGMENT
This work was performed as part of the Minor Research Project, which is supported and funded by University
Grants Commission, New Delhi, India.
REFERENCES
[1]. NCBI PubMed, http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
[2]. Uramoto, N., H. Matsuzawa, T. Nagano, A. Murami and H. Takeuchi, 2004. A text-mining system for knowledge
discovery from biomedical documents.
[3]. Hristovski, D., J. Stare, B. Peterlin and S. Dzeroski, 2001. Supporting discovery in medicine by association rule
mining in Medline and UMLS. Proc. MedInfo Conf., London, England, Sep. 2-5, 10: 1344-1348.
[4]. Creighton, C. and S. Hanash, 2003. Mining gene expression databases for association rules. Bioinformatics, 19-1:
79-86.
[5]. Wren, J.D., R. Bekeredjian, J.A. Stewart, R.V. Shohet and H.R. Garner, 2004. Knowledge discovery by automated
identification and ranking of implicit relationships. Bioinformatics, 20: 3.
[6]. Adamic, L.A., D. Wilkinson, B.A. Huberman and E. Adar, 2002. A literature based method for identifying gene-
disease connections. IEEE Computer Soc. Bioinformatics Conf.
[7]. Palakal, M., M. Stephens, S. Mukhopadhyay, R. Raje and S. Rhodes, 2002. A Multi-level Text Mining Method to
Extract Biological Relationships. Proc. IEEE Computer Soc. Bioinformatics (CSB) Conf., pp: 97-108.
[8]. Wilkinson, D.M. and B.A. Huberman, 2004. A method for finding communities of related genes. Proc. Natl. Acad.
Sci. U.S.A., 101 Suppl. 1: 5241- 5248.
[9]. Hisham Al-Mubaid and Rajit K Singh, 2005. A New Text Mining Approach for Finding Protein-to-Disease
Associations, American Journal of Biochemistry and Biotechnology 1 (3): 145-152, ISSN 1553-3668.
[10]. M. Stephens, M. Palakal, S. Mukhopadhyay, R. Raje, 2001. Detecting Gene Relations From Medline Abstracts,
Pacific Symposium on Biocomputing 6:483-496.
[11]. A. Hotho,A.Maedche and S.Staab, “Ontology-based text document clustering”[A], Proc. of the Conf. on Intelligent
Information Systems[C],2003.
[12]. Paulo Gottgtroy, Prof. Nik Kasabov, Stephen MacDonell, 2004. An ontology driven approach for
knowledge discovery in Biomedicine.
[13]. R. Kleinsorge, C. Tilley, and J.Willis. (2000). Unified Medical Language System (UMLS) Basics [Online].
Available: http://www.nlm.nih.gov/research/umls/pdf/UMLS_Basics.pdf.
[14]. G. A. Miller, “WordNet: A lexical database for English,” Commun. ACM, vol. 38, pp. 39–41, 1995.
[15]. http://www.geneontology.org/
[16]. Schulze-Kremer S. Adding semantics to genome databases: Towards ontology for molecular biology. In:
Proceedings of the Fifth International Conference for Intelligent Systems for Molecular Biology Conference
(ISMB), 1997; pp. 272–5.
[17]. http://www.fugo.org
[18]. Whetzel PL, Parkinson H, Causton HC, et al. The MGED Ontology: a resource for semantics-based description of
microarray experiments. Bioinformatics 2006;22:866–73.
[19]. Mark Sinka and David Corne, “A Large Benchmark Dataset for Web Document Clustering”, In Soft Computing
Systems: Design, Management and Applications, Vol. 87 of Frontiers in Artificial Intelligence and Applications, pages
881-890, 2002.
[20]. M.F.Porter, “An Algorithm for Suffix Stripping”, Program 14(3), July 1980, pp.130-137.
[21]. Robert Stevens et al., “Ontology based knowledge representation for bioinformatics”, published in Briefings in
Bioinformatics, 2000.
[22]. Barry Smith et al., “The Ontology of the Gene Ontology”, Proceedings of AMIA Symposium 2003.
[23]. Olivier Corby et al., “Searching the Semantic Web: Approximate Query processing based on bio ontologies”.
Published in IEEE Computer Society, 2006.
[24]. http://www.ncbi.nlm.nih.gov/mesh/meshhome.html
[25]. Amal Zouaq, Roger Nkambou, “Building Domain Ontologies from Text for Educational purposes”. IEEE
Transactions on Learning Technologies, Vol.1, No.1, Jan-Mar 2008.
[26]. http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/tagger/