Identifying drug targets and candidate sequences is an important process and an unmet challenge in drug development. Creative Biolabs has developed an original AI-augmented drug discovery platform to accelerate drug discovery.
https://ai.creative-biolabs.com/ai-augmented-drug-discovery.htm
Over the past decades, Creative Biolabs has become a leader in antibody drug discovery and manufacturing, providing high quality services to clients in academia and industry around the world. Now, we are able to provide solutions to accelerate drug discovery and development by deploying artificial intelligence technologies. Here, we will briefly introduce the basics of AI-augmented drug discovery, algorithm classification, common AI models, and related services.
Machine learning algorithms show promise in improving medical image analysis and diagnosis by helping physicians more accurately interpret images. Such algorithms can be trained using labeled medical image data to learn the differences between benign and malignant tumors, and then apply that learning to analyze new images and predict the likelihood of tumors being benign or malignant. However, it is important to address the potential pitfalls of machine learning and ensure its safe and effective use in medical applications.
Health Care Application using Machine Learning and Deep Learning | IRJET Journal
This document presents a study on using machine learning and deep learning techniques for healthcare applications like disease prediction. It discusses algorithms like logistic regression, decision trees, random forests, SVMs and deep learning models like VGG16 applied on various disease datasets. For diabetes, heart and liver diseases, ML algorithms were used while CNN models were used for malaria and pneumonia image datasets. Random forest achieved the highest accuracy of 84.81% for diabetes prediction, SVM had 81.57% accuracy for heart disease and random forest was best at 83.33% for liver disease. The VGG16 model attained accuracies of 94.29% and 95.48% for malaria and pneumonia respectively. The study aims to develop an intelligent healthcare application for predicting different
This document discusses a thesis that uses machine learning algorithms to diagnose mental illness using MRI brain scans. Specifically, it analyzes schizophrenia, bipolar disorder, and healthy control subject data from multiple imaging modalities. It trains and tests eight machine learning classifiers - support vector machines, k-nearest neighbors, logistic regression, naive Bayes, and random forests - on the raw imaging data as well as data transformed through dimensionality reduction techniques. The results aim to demonstrate the efficacy of these algorithms at classifying subjects based on their brain scans and diagnosing their mental condition.
This document discusses using machine learning algorithms to screen for mental health issues in adolescents. It describes supervised, unsupervised, and reinforcement learning approaches. Classification and regression techniques are discussed for supervised learning, including K-nearest neighbors, grid search, random search, logistic regression, decision trees, random forests, bagging, AdaBoosting, and naive Bayes algorithms. The document outlines implementing these algorithms to classify mental health status and discusses advantages like early intervention identification, and disadvantages like potential stigmatization or incorrect results. Future applications of machine learning in healthcare are also mentioned.
This document discusses using genetic programming (GP) for data classification. It begins with an introduction to classification problems and algorithms like decision trees, neural networks, and Bayesian classification. It then provides background on genetic algorithms and genetic programming, explaining that GP uses tree-based representations of computer programs to evolve solutions. The document presents a case study on using GP for classification, discussing fitness measures, creating training sets, and an interleaved data format to address class imbalance. The goal is to automatically generate classification models for data sets using GP.
This document describes two machine learning techniques, particle swarm optimization with support vector machines (PSO-SVM) and recursive feature elimination with support vector machines (RFE-SVM), that were used to classify autism neuroimaging data from the Autism Brain Imaging Data Exchange database. PSO-SVM was used to select discriminative features for classification, while RFE-SVM ranked features by importance. Both techniques aimed to improve classification accuracy and reduce overfitting by selecting optimal feature subsets from the high-dimensional neuroimaging data. The results could help develop brain-based diagnostic criteria for autism.
HEALTH PREDICTION ANALYSIS USING DATA MINING | Ashish Salve
Health care decisions often begin with assumptions that are later tested and verified through various tests, and patients have to depend on the doctor's knowledge of the topic. We therefore built a system that uses data mining techniques to predict a person's health from various medical test results. The system is currently designed only for heart conditions. For training, we used the Statlog (Heart) Data Set from the UCI Machine Learning Repository, which includes attributes such as age, sex, chest pain type, cholesterol, sugar, and outcomes. Only a few general inputs are needed to generate a prediction. The prediction results from all algorithms are merged by calculating their mean value, and that value is reported as the outcome of the prediction process, which runs entirely in the background.
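The merging step described above, averaging the outputs of several algorithms, could be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `merge_predictions`, the three example probabilities, and the 0.5 decision threshold are all assumptions made for the example.

```python
from statistics import mean


def merge_predictions(probabilities):
    """Merge per-model disease probabilities by taking their mean."""
    if not probabilities:
        raise ValueError("need at least one model prediction")
    return mean(probabilities)


# Hypothetical outputs of three models for one patient
# (each value is that model's estimated probability of heart disease).
model_outputs = [0.5, 0.75, 0.25]

score = merge_predictions(model_outputs)

# Assumed decision rule: flag the patient when the mean reaches 0.5.
label = "disease" if score >= 0.5 else "no disease"
print(score, label)
```

Averaging treats every model as equally reliable. A weighted mean would be a natural variation if some models were known to perform better, but the description does not say whether weights were used.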
The document provides an overview of machine learning concepts and applications in bioinformatics. It discusses key topics like supervised vs unsupervised learning, classification vs regression, linear vs non-linear models, and examples of machine learning algorithms like naive Bayes, neural networks, and support vector machines. Specific examples mentioned include using these algorithms for protein function prediction, gene finding, and predicting RNA binding sites in proteins.
Effect of Data Size on Feature Set Using Classification in Health Domain | dbpublications
In the health domain, a critical issue is predicting disease at an early stage. Disease prediction depends mainly on the physician's experience, so many machine learning approaches have been applied to it. Existing approaches concentrate on either prediction or feature selection. The aim of this paper is to present the effect of data size and feature set on disease prediction in the health domain using Naïve Bayes. It shows how each attribute, or combination of attributes, behaves on datasets of different sizes.
IRJET- Disease Prediction using Machine Learning | IRJET Journal
This document discusses using machine learning techniques to predict diseases based on patient symptoms. Specifically, it proposes using naive bayes, k-nearest neighbors (KNN), and logistic regression algorithms on structured and unstructured hospital data to predict diseases like diabetes, malaria, jaundice, dengue, and tuberculosis. The system is intended to make disease prediction more accessible to end users by analyzing their symptoms without needing to visit a doctor. It aims to improve prediction accuracy by handling both structured and unstructured data using machine learning models.
This document summarizes and compares different classification algorithms that can be used for disease prediction in data mining. It first introduces disease prediction and classification processes. It then reviews related works that have used various classification algorithms like random forest, support vector machine, and naive Bayes for tasks like disease diagnosis, text classification, and rainfall forecasting. The document also discusses supervised, unsupervised, and semi-supervised machine learning. It provides details on support vector machine and random forest algorithms, describing how each works and is used for classification. Finally, it analyzes the random forest algorithm construction process.
Classification Of Iris Plant Using Feedforward Neural Network | irjes
The classification and recognition of type on the basis of individual features and behaviors constitute a preliminary measure and an important target in the behavioral sciences. Current statistical methods do not always yield satisfactory answers. A feedforward artificial neural network is a computer model inspired by the structure of the human brain; it can be viewed as a set of interconnected artificial nerve cells (neurons). The primary aim of this paper is to demonstrate the process of developing an artificial neural network based classifier for the Iris database. The problem concerns the identification of Iris plant species on the basis of plant attribute measurements: sepal length, sepal width, petal length, and petal width. A neural network (NN) is used to classify this data set, and the EBPA is used to train it. The simulation results illustrate the effectiveness of the neural system in iris class identification.
Machine learning techniques can be used to enable computers to learn from data and perform tasks. Some key techniques discussed in the document include decision tree learning, artificial neural networks, Bayesian learning, support vector machines, genetic algorithms, graph-based learning, reinforcement learning, and pattern recognition. Each technique has its own strengths and applications.
Predicting disease at an early stage is critical, and the most difficult challenge is predicting it correctly. The prediction is based on an individual's symptoms. The model presented can work like a digital doctor for disease prediction, helping to diagnose the disease in time so that the person can take immediate measures. The authors report that the model is more accurate in predicting potential ailments. The work was tested with four machine learning algorithms, and Random Forest gave the best accuracy.
Evolving Efficient Clustering and Classification Patterns in Lymphography Dat... | ijsc
Data mining refers to the process of retrieving knowledge by discovering novel and relative patterns from large datasets. Clustering and classification are two distinct phases in data mining that work to provide an established, proven structure from a voluminous collection of facts. A dominant area of modern-day research in medical investigations is disease prediction and malady categorization. In this paper, our focus is to analyze clusters of patient records obtained via unsupervised clustering techniques and to compare the performance of classification algorithms on the clinical data. Feature selection is a supervised method that attempts to select a subset of the predictor features based on the information gain. The Lymphography dataset comprises 18 predictor attributes and 148 instances, with the class label having four distinct values. This paper reports the accuracy of eight clustering algorithms in detecting clusters of patient records and predictor attributes, and the performance of sixteen classification algorithms on the Lymphography dataset for multi-class categorization of medical data. Our work shows that the Random Tree algorithm and Quinlan's C4.5 algorithm give 100 percent classification accuracy with all the predictor features and also with the feature subset selected by the Fisher Filtering feature selection algorithm. It is also stated here that the Density Based Spatial Clustering of Applications with Noise (DBSCAN) clustering algorithm offers increased clustering accuracy in less computation time.
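The information-gain criterion mentioned above for feature selection could be sketched as follows. This is a minimal illustration on made-up categorical data, not the paper's pipeline (which reports using Fisher Filtering for the selected subset); the function names and the toy features `f1` and `f2` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets induced by the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): f1 separates the classes perfectly, f2 not at all.
labels = ["a", "a", "b", "b"]
features = {"f1": [0, 0, 1, 1], "f2": [0, 1, 0, 1]}

gains = {name: information_gain(values, labels) for name, values in features.items()}
# Rank features from most to least informative.
ranking = sorted(gains, key=gains.get, reverse=True)
print(gains, ranking)
```

A feature subset would then be chosen by keeping the top-ranked features or those whose gain exceeds a chosen threshold.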
ARTIFICIAL NEURAL NETWORKING
The first step to knowledge is to know that we are ignorant.
Knowledge in the medical field is characterized by uncertainty and vagueness.
Historically as well as currently, this fact has motivated the development of medical decision support systems based on fuzzy logic.
A Greek philosopher visualized a basic model of brain function as early as 300 BC.
To date, the nervous system is not completely understood by humankind.
This document discusses the use of artificial intelligence in drug discovery and development. It begins by defining artificial intelligence, machine learning, and deep learning. It then provides examples of how AI is currently used in various stages of the drug development process, including identifying molecular targets, finding hit compounds, optimizing lead compounds, predicting toxicity, and drug repurposing. It also discusses startups applying AI to drug discovery. Finally, it notes some limitations and drawbacks of using AI, such as potential bias in algorithms.
Simplified Knowledge Prediction: Application of Machine Learning in Real Life | Peea Bal Chakraborty
Machine learning is the scientific study of algorithms and statistical models that machines use to perform a specific task by relying on patterns and inference rather than explicit instructions. This research aims to observe how precisely a machine can predict whether a patient suspected of breast cancer has a malignant or benign tumor. In this paper, the classification of cancer type and the prediction of risk levels are done by various machine learning models and depicted pictorially using various visual analytics tools.
This document summarizes a presentation on artificial intelligence in medical research given by Dr. Ahmed Elngar. It discusses the history and foundations of AI, as well as key research directions like machine learning, expert systems, computer vision, neural networks, and deep learning. Deep learning is seen as core to AI and medicine, with applications in medical imaging, video analysis, and clinical tasks like screening, diagnosis and monitoring. Challenges remain around data scarcity, bias and ethical deployment, but AI has great potential to improve healthcare accuracy, efficiency and access worldwide.
Early Identification of Diseases Based on Responsible Attribute using Data Mi... | IRJET Journal
This document describes a proposed method for early identification of diseases using data mining and classification techniques. It begins with an introduction to classification and discusses how it is commonly used in healthcare for tasks like predicting patient risk levels. It then reviews related literature applying classification methods to diseases like heart disease and diabetes. The document outlines the problem of selecting the best classification technique for a given healthcare dataset. It proposes an architecture and method for disease prediction that assigns recommended values to attributes and classifies unknown data based on calculating totals. The method is experimentally analyzed using a heart disease dataset, and its accuracy is compared to Bayesian classification. In conclusion, the proposed method seeks to reduce attributes and complexity while accurately classifying patient data for early disease identification.
Performance Evaluation of Different Data Mining Classification Algorithm and ... | IOSR Journals
This document evaluates the performance of different data mining classification algorithms and predictive analysis. It applies algorithms like decision trees, naive Bayes, k-nearest neighbor, neural networks, and support vector machines to datasets like Iris, liver disorder, and E. coli. The results show that neural networks and k-nearest neighbor achieved the best performance on these datasets, with accuracy rates up to 97.33% for Iris classification. Feature selection techniques like removing zero-weighted attributes were also found to improve some algorithm performance. Predictive analysis experiments found that neural networks and k-nearest neighbor were most accurate at predicting new class labels.
The document discusses machine learning concepts including supervised learning, unsupervised learning, and reinforcement learning. It describes several machine learning algorithms like decision trees, k-nearest neighbors, naive bayes, and support vector machines that are used in supervised learning. Unsupervised learning techniques like clustering, association, and k-means clustering are also covered. The document concludes that machine learning approaches can help with systematic reviews by assisting in document screening and improving reviewer agreement.
How to create your own artificial neural networks | Agrata Shukla
See how to create your own neural networks. Artificial neural networks are used to simulate the functioning of the human brain. The machine cannot think, but it can predict. These ANNs are inspired by the nervous system of the human brain.
When deep learners change their mind: learning dynamics for active learning | Devansh16
Abstract:
Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples and do this based on the certainty of the network predictions for samples. However, it is well-known that neural networks are overly confident about their prediction and are therefore an untrustworthy source to assess sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to the sample during the training of the network and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results.
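One way to make the label-dispersion idea concrete is sketched below. The abstract does not give the formula, so this uses an assumed formalization: one minus the fraction of training checkpoints at which the sample received its most frequent label. The function name, the sample identifiers, and the recorded label histories are hypothetical.

```python
from collections import Counter


def label_dispersion(predicted_labels):
    """Dispersion of the labels assigned to one sample across training checkpoints.

    Returns 0.0 when the same label is assigned every time and grows toward 1.0
    as the assigned label changes more often (assumed formalization).
    """
    if not predicted_labels:
        raise ValueError("need at least one recorded prediction")
    most_common_count = Counter(predicted_labels).most_common(1)[0][1]
    return 1 - most_common_count / len(predicted_labels)


# Hypothetical label histories for two unlabeled samples over four checkpoints.
history = {
    "x1": [0, 0, 0, 0],  # network never changes its mind
    "x2": [0, 1, 2, 1],  # network changes its mind frequently
}

scores = {sample: label_dispersion(labels) for sample, labels in history.items()}

# Active learning step: request an annotation for the most uncertain sample.
to_annotate = max(scores, key=scores.get)
print(scores, to_annotate)
```

Because the score depends only on predicted labels over time, it avoids relying on the network's softmax confidence, which the abstract describes as untrustworthy.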
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif... | ahmad abdelhafeez
Abstract- The goal of this paper is to compare different classifiers and multi-classifier fusion with respect to accuracy in detecting breast cancer on four different datasets. We implement several of the best-known classification techniques in this field on four breast cancer datasets, two for diagnosis and two for prognosis. We fuse classifiers to find the best multi-classifier fusion approach for each dataset individually. Classification accuracy is obtained from the confusion matrix, built with 10-fold cross-validation, and fusion uses majority voting (the mode of the classifier outputs). The experimental results show that no single classification technique is best across all datasets, since the classification task is affected by the type of dataset. With multi-classifier fusion, accuracy improved on three of the four datasets.
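Majority-voting fusion, taking the mode of the classifier outputs for each sample, could be sketched as follows. This is an illustration only: the three classifier output lists and the ground truth are contrived toy data (labels "M" for malignant and "B" for benign), and the function names are hypothetical. It does not reproduce the paper's classifiers or its 10-fold cross-validation.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among one sample's votes (the mode)."""
    return Counter(votes).most_common(1)[0][0]


def fuse(classifier_outputs):
    """Fuse several classifiers' prediction lists sample by sample."""
    return [majority_vote(votes) for votes in zip(*classifier_outputs)]


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Contrived outputs of three classifiers on four samples.
c1 = ["M", "B", "M", "M"]
c2 = ["M", "M", "B", "B"]
c3 = ["B", "B", "B", "M"]
truth = ["M", "B", "B", "M"]

fused = fuse([c1, c2, c3])
print(fused, accuracy(fused, truth))
print([accuracy(c, truth) for c in (c1, c2, c3)])
```

With an odd number of classifiers and two classes, a tie cannot occur. With an even number or with more classes, `most_common` returns the label that appeared first among the tied ones, so a deliberate tie-breaking rule would be needed. The toy data is chosen so that fusion helps; as the abstract notes, that is not guaranteed on every dataset.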
AI-based One-stop Antibody Discovery Platform.pdf | Candy Swift
To help our clients develop a safe and effective human antibody, Creative Biolabs is continuously exploring novel AI technologies and phage display libraries for antibody development solutions.
https://ai.creative-biolabs.com/ai-based-one-stop-antibody-discovery-platform.htm
AntInfect™ Platform at Creative Biolabs.pdf | Candy Swift
Equipped with world-leading technology platforms and professional scientific staff, Creative Biolabs has developed an advanced AntInfect™ Platform which provides various excellent strategies for different purposes.
https://www.creative-biolabs.com/antinfect/antinfect-platform.htm
The document provides an overview of machine learning concepts and applications in bioinformatics. It discusses key topics like supervised vs unsupervised learning, classification vs regression, linear vs non-linear models, and examples of machine learning algorithms like naive Bayes, neural networks, and support vector machines. Specific examples mentioned include using these algorithms for protein function prediction, gene finding, and predicting RNA binding sites in proteins.
Effect of Data Size on Feature Set Using Classification in Health Domaindbpublications
In health domain, the major critical issue is prediction of disease in early stage. Prediction of disease is mainly based on the experience of physician so many machine learning approach contribute their work in the prediction of disease. In existing approaches, either prediction or feature selection has been concentrated. The aim of this paper is to present the effect of data size and set of features in the prediction of disease in health domain using Naïve Bayes. This shows how each attribute or combination of attribute behaves on different size of dataset.
IRJET- Disease Prediction using Machine LearningIRJET Journal
This document discusses using machine learning techniques to predict diseases based on patient symptoms. Specifically, it proposes using naive bayes, k-nearest neighbors (KNN), and logistic regression algorithms on structured and unstructured hospital data to predict diseases like diabetes, malaria, jaundice, dengue, and tuberculosis. The system is intended to make disease prediction more accessible to end users by analyzing their symptoms without needing to visit a doctor. It aims to improve prediction accuracy by handling both structured and unstructured data using machine learning models.
This document summarizes and compares different classification algorithms that can be used for disease prediction in data mining. It first introduces disease prediction and classification processes. It then reviews related works that have used various classification algorithms like random forest, support vector machine, and naive Bayes for tasks like disease diagnosis, text classification, and rainfall forecasting. The document also discusses supervised, unsupervised, and semi-supervised machine learning. It provides details on support vector machine and random forest algorithms, describing how each works and is used for classification. Finally, it analyzes the random forest algorithm construction process.
Classification Of Iris Plant Using Feedforward Neural Networkirjes
The classification and recognition of type on the basis of individual features and behaviors constitute
a preliminary measure and is an important target in the behavioral sciences. Current statistical methods do not
always yield satisfactory answers. A Feed Forward Artificial Neural Network is the computer model inspired by
the structure of the Human Brain. It views as in the set of artificial nerve cells that are interconnected with the
other neurons. The primary aim of this paper is to demonstrate the process of developing the Artificial Neural
network based classifier which classifies the Iris database. The problem concerns the identification of Iris plant
species on the basis of plant attribute measurements. This paper is related to the use of feed forward neural
networks towards the identification of iris plants on the basis of the following measurements: sepal length, sepal
width, petal length, and petal width. Using this data set a Neural Network (NN) is used for the classification of
iris data set. The EBPA is used for training of this ANN. The results of simulations illustrate the effectiveness of
the neural system in iris class identification.
Machine learning techniques can be used to enable computers to learn from data and perform tasks. Some key techniques discussed in the document include decision tree learning, artificial neural networks, Bayesian learning, support vector machines, genetic algorithms, graph-based learning, reinforcement learning, and pattern recognition. Each technique has its own strengths and applications.
Predicting disease at an early stage becomes critical, and the most difficult challenge is to predict it correctly along with the sickness. The prediction happens based on the symptoms of an individual. The model presented can work like a digital doctor for disease prediction, which helps to timely diagnose the disease and can be efficient for the person to take immediate measures. The model is much more accurate in the prediction of potential ailments. The work was tested with four machine learning algorithms and got the best accuracy with Random Forest.
Evolving Efficient Clustering and Classification Patterns in Lymphography Dat...ijsc
Data mining refers to the process of retrieving knowledge by discovering novel and relative patterns from large datasets. Clustering and Classification are two distinct phases in data mining that work to provide an established, proven structure from a voluminous collection of facts. A dominant area of modern-day research in the field of medical investigations includes disease prediction and malady categorization. In this paper, our focus is to analyze clusters of patient records obtained via unsupervised clustering techniques and compare the performance of classification algorithms on the clinical data. Feature selection is a supervised method that attempts to select a subset of the predictor features based on the information gain. The Lymphography dataset comprises 18 predictor attributes and 148 instances, with the class label having four distinct values. This paper highlights the accuracy of eight clustering algorithms in detecting clusters of patient records and predictor attributes, and the performance of sixteen classification algorithms on the Lymphography dataset, which enables the classifier to accurately perform multi-class categorization of medical data. Our work shows that the Random Tree algorithm and Quinlan’s C4.5 algorithm give 100 percent classification accuracy with all the predictor features and also with the feature subset selected by the Fisher Filtering feature selection algorithm. It is also stated here that the Density Based Spatial Clustering of Applications with Noise (DBSCAN) clustering algorithm offers increased clustering accuracy in less computation time.
ARTIFICIAL NEURAL NETWORKING.
The first step to knowledge is to know that we are ignorant.
Knowledge in the medical field is characterized by uncertainty and vagueness.
Historically as well as currently, this fact remains a motivation for the development of medical decision support systems based on fuzzy logic.
Greek philosophers visualized a basic model of brain function as early as 300 BC.
To date, the nervous system is not completely understood.
This document discusses the use of artificial intelligence in drug discovery and development. It begins by defining artificial intelligence, machine learning, and deep learning. It then provides examples of how AI is currently used in various stages of the drug development process, including identifying molecular targets, finding hit compounds, optimizing lead compounds, predicting toxicity, and drug repurposing. It also discusses startups applying AI to drug discovery. Finally, it notes some limitations and drawbacks of using AI, such as potential bias in algorithms.
Simplified Knowledge Prediction: Application of Machine Learning in Real LifePeea Bal Chakraborty
Machine learning is the scientific study of algorithms and statistical models that machines use to perform a specific task by relying on patterns and inference rather than explicit instructions. This research aims to observe how precisely a machine can predict whether a patient suspected of having breast cancer has a malignant or benign tumor. In this paper, the classification of cancer type and the prediction of risk levels are done by various machine learning models and depicted pictorially using various visual analytics tools.
This document summarizes a presentation on artificial intelligence in medical research given by Dr. Ahmed Elngar. It discusses the history and foundations of AI, as well as key research directions like machine learning, expert systems, computer vision, neural networks, and deep learning. Deep learning is seen as core to AI and medicine, with applications in medical imaging, video analysis, and clinical tasks like screening, diagnosis and monitoring. Challenges remain around data scarcity, bias and ethical deployment, but AI has great potential to improve healthcare accuracy, efficiency and access worldwide.
Early Identification of Diseases Based on Responsible Attribute using Data Mi...IRJET Journal
This document describes a proposed method for early identification of diseases using data mining and classification techniques. It begins with an introduction to classification and discusses how it is commonly used in healthcare for tasks like predicting patient risk levels. It then reviews related literature applying classification methods to diseases like heart disease and diabetes. The document outlines the problem of selecting the best classification technique for a given healthcare dataset. It proposes an architecture and method for disease prediction that assigns recommended values to attributes and classifies unknown data based on calculating totals. The method is experimentally analyzed using a heart disease dataset, and its accuracy is compared to Bayesian classification. In conclusion, the proposed method seeks to reduce attributes and complexity while accurately classifying patient data for early disease identification.
Performance Evaluation of Different Data Mining Classification Algorithm and ...IOSR Journals
This document evaluates the performance of different data mining classification algorithms and predictive analysis. It applies algorithms like decision trees, naive Bayes, k-nearest neighbor, neural networks, and support vector machines to datasets like Iris, liver disorder, and E. coli. The results show that neural networks and k-nearest neighbor achieved the best performance on these datasets, with accuracy rates up to 97.33% for Iris classification. Feature selection techniques like removing zero-weighted attributes were also found to improve some algorithm performance. Predictive analysis experiments found that neural networks and k-nearest neighbor were most accurate at predicting new class labels.
The document discusses machine learning concepts including supervised learning, unsupervised learning, and reinforcement learning. It describes several machine learning algorithms like decision trees, k-nearest neighbors, naive bayes, and support vector machines that are used in supervised learning. Unsupervised learning techniques like clustering, association, and k-means clustering are also covered. The document concludes that machine learning approaches can help with systematic reviews by assisting in document screening and improving reviewer agreement.
How to create your own artificial neural networksAgrata Shukla
See how to create your own neural networks. Artificial neural networks are used to simulate the functioning of the human brain. The machine cannot think, but it can predict. These ANNs are inspired by the nervous system of the human brain.
When deep learners change their mind learning dynamics for active learningDevansh16
Abstract:
Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples and do this based on the certainty of the network predictions for samples. However, it is well-known that neural networks are overly confident about their prediction and are therefore an untrustworthy source to assess sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to the sample during the training of the network and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results.
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...ahmad abdelhafeez
Abstract: The goal of this paper is to compare different classifiers and multi-classifier fusion with respect to accuracy in detecting breast cancer on four different datasets. We present an implementation of various classification techniques, which represent the best-known algorithms in this field, on four different breast cancer datasets, two for diagnosis and two for prognosis. We present a fusion of classifiers to find the best multi-classifier fusion approach for each dataset individually. Classification accuracy is obtained from the confusion matrix, built with 10-fold cross-validation, and fusion uses majority voting (the mode of the classifier outputs). The experimental results show that no classification technique is better than the others across all datasets, since the classification task is affected by the type of dataset. With multi-classifier fusion, accuracy improved in three datasets out of four.
AI-based One-stop Antibody Discovery Platform.pdfCandy Swift
To help our clients develop a safe and effective human antibody, Creative Biolabs is continuously exploring novel AI technologies and phage display libraries for antibody development solutions.
https://ai.creative-biolabs.com/ai-based-one-stop-antibody-discovery-platform.htm
AntInfect™ Platform at Creative Biolabs.pdfCandy Swift
Equipped with world-leading technology platforms and professional scientific staff, Creative Biolabs has developed an advanced AntInfect™ Platform which provides various excellent strategies for different purposes.
https://www.creative-biolabs.com/antinfect/antinfect-platform.htm
Background of Therapeutic Non-IgG Antibodies.pdfCandy Swift
Therapeutic antibodies are mainly used for the treatment of cancer and infectious diseases. In addition to the common IgG antibodies, non-IgG antibodies have received increasing attention for the treatment of cancer and infectious diseases in recent years.
https://non-igg-ab.creative-biolabs.com/non-igg-antibody-application-for-disease-therapy.htm
Based on advanced strategies and to fit the desired target-product profile, Creative Biolabs can help develop different formats of BsAbs with adapted valency, structure, half-life, biodistribution, etc.
https://www.creative-biolabs.com/bsab/category/custom-bispecific-antibodies-35.htm
Based on leading-edge facilities and profound knowledge, the seasoned scientific team in Creative Biolabs has developed an integrated CellFace™ conjugate technology platform to bring customers impeccable cancer cell surface engineering.
https://cellface-conjugate.creative-biolabs.com/cancer-cells-surface-engineering-service.htm
Creative Biolabs provides state-of-the-art adenoviral vectors in various designs and constructions to meet the demands of basic research and preclinical applications.
https://www.creative-biolabs.com/gene-therapy/adenovirus-vector.htm
This document discusses oncolytic viruses, which are viruses that can selectively target and kill cancer cells. It covers the mechanisms of oncolytic viruses, factors to consider in their development like size and pathogenicity, and different families of oncolytic viruses. Specific oncolytic viruses discussed include herpes simplex virus, T-VEC, HF10, and vaccinia virus. The document also examines combination therapies using oncolytic viruses and biomarkers for assessing their delivery and activity in tumor immunotherapy.
Creative Biolabs unveils advanced services combining the phage display service with the single-chain minor histocompatibility complexes (SC-MiHC) and Minor Histocompatibility Antigen Display Services.
https://allogo.creative-biolabs.com/display-platform.htm
Creative Biolabs offers a series of AI-based antibody screening services based on the prediction of antibody-antigen binding and a unique way to find rare antibody clusters and get more candidate antibody sequences by augmenting our data-driven AI screening services.
https://ai.creative-biolabs.com/ai-based-antibody-screening-services.htm
Creative Biolabs has over a decade of working experience in Anti-Virus Biomolecular Discovery. Based on our advanced AntInfect™ Platform, our seasoned scientists can offer high-quality Zika virus (ZIKV) neutralizing antibody and ZIKV-specific peptide discovery services.
https://www.creative-biolabs.com/antinfect/discovery-of-neutralizing-antibody-nab-and-peptide-targeting-zika-virus.htm
Introduction and Mechanism of Oncolytic Virus Therapy.pdfCandy Swift
With a better understanding of cancer biology and virology, oncolytic virus therapy is becoming a promising treatment modality for tumor targeting and offers unique opportunities for tumor therapy. Taking advantage of our OncoVirapy™ platform, Creative Biolabs can provide high-quality oncolytic virus-based therapy development.
https://www.creative-biolabs.com/oncolytic-virus/introduction-and-mechanism-of-oncolytic-virus-therapy.htm
Gene Editing ZFN, TALEN, and CRISPRCas9.pdfCandy Swift
Creative Biolabs provides Transcription activator-like effector nucleases (TALENs), and CRISPR/Cas (CRISPR associated) systems services to clients all over the world.
https://www.creative-biolabs.com/gene-therapy/gene-editing-for-gene-therapy.htm
Creative Biolabs has developed services combining the phage display service with the single-chain minor histocompatibility complexes (SC-MiHC).
https://allogo.creative-biolabs.com/display-platform.htm
Infectious Diseases and Anti-Virus Biomolecular Discovery.pdfCandy Swift
This document provides information about Anti-Virus Biomolecular Discovery services from Creative Biolabs for developing antibodies and peptides against infectious diseases. It outlines their platform for membrane protein expression including VLP immunization and DNA immunization. Their technologies include fluorescent cell sorting, hybridoma production of monoclonal antibodies, phage display, and high-density peptide arrays. Their goal is to provide cost-effective discovery of functional antibody and peptide candidates against various virus targets.
Tandem Fabs are a class of Fab-based bispecific antibody fragments. Creative Biolabs offers a wide range of Tandem Fab products.
https://www.creative-biolabs.com/bsab/category/tandem-fab-28.htm
Oncolytic Virus Biodistribution Study.pdfCandy Swift
The biodistribution assay of an oncolytic virus is comparable to the PK study in small-molecule drug development. Scientists at Creative Biolabs have developed efficient methods to study the biodistribution of oncolytic viruses in vivo.
https://www.creative-biolabs.com/oncolytic-virus/oncolytic-virus-biodistribution-study.htm
As a frontier biotech service provider, Creative Biolabs provides superior recombinant lentivirus products for our clients all over the world.
https://www.creative-biolabs.com/gene-therapy/category-recombinant-lentivirus-306.htm
With years of experience in cell surface conjugation, Creative Biolabs is dedicated to offering high-quality EV engineering services to modulate their function and interactions.
https://cellface-conjugate.creative-biolabs.com/extracellular-vesicles-surface-engineering-service.htm
Creative Biolabs provides BsAb engineering service to adjust the properties of BsAbs such as valency, size, half-life, flexibility, etc., to meet the specific requirements.
https://www.creative-biolabs.com/bsab/bsab-engineering.htm
Discovery of Antibody and Peptide Targeting Escherichia.pdfCandy Swift
The Anti-Bacteria Biomolecular Discovery platform at Creative Biolabs, utilizing diverse technologies such as phage display, hybridoma technology, and peptide arrays, provides services to facilitate anti-Escherichia projects for global customers.
https://www.creative-biolabs.com/antinfect/discovery-of-antibody-and-peptide-targeting-escherichia.htm
PPT on Direct Seeded Rice presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' workshop on April 22, 2024.
The technology uses reclaimed CO₂ as the dyeing medium in a closed loop process. When pressurized, CO₂ becomes supercritical (SC-CO₂). In this state CO₂ has a very high solvent power, allowing the dye to dissolve easily.
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With an increasing population, people need to rely on packaged foodstuffs. Packaging of food materials requires the preservation of food. There are various methods of treating food to preserve it, and irradiation is one of them. It is the most common and the most harmless method of food preservation, as it does not alter the necessary micronutrients of food materials. Although irradiated food does not harm human health, quality assessment is still required to provide consumers with the necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during its processing. The ESR spin trapping technique is useful for the detection of highly unstable radicals in food. The antioxidant capability of liquid food and beverages is mainly assessed by the spin trapping technique.
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfSelcen Ozturkcan
Ozturkcan, S., Berndt, A., & Angelakis, A. (2024). Mending clothing to support sustainable fashion. Presented at the 31st Annual Conference by the Consortium for International Marketing Research (CIMaR), 10-13 Jun 2024, University of Gävle, Sweden.
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...Scintica Instrumentation
Targeting Hsp90 and its pathogen Orthologs with Tethered Inhibitors as a Diagnostic and Therapeutic Strategy for cancer and infectious diseases with Dr. Timothy Haystead.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 10^4 M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10^-8 photons cm^-2 s^-1. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
The cost of acquiring information by natural selectionCarl Bergstrom
This is a short talk that I gave at the Banff International Research Station workshop on Modeling and Theory in Population Biology. The idea is to try to understand how the burden of natural selection relates to the amount of information that selection puts into the genome.
It's based on the first part of this research paper:
The cost of information acquisition by natural selection
Ryan Seamus McGee, Olivia Kosterlitz, Artem Kaznatcheev, Benjamin Kerr, Carl T. Bergstrom
bioRxiv 2022.07.02.498577; doi: https://doi.org/10.1101/2022.07.02.498577
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
Authoring a personal GPT for your research and practice: How we created the Q...Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
AI-augmented Drug Discovery.pdf
1. AI-Augmented
Drug Discovery
Creative Biolabs provides innovative drug discovery services
based on our original Artificial Intelligence-augmented
technology, especially for the discovery of therapeutic
antibodies and small molecules.
Email: info@creative-biolabs.com
Address: SUITE 203, 17 Ramsey Road, Shirley, NY 11967, USA
Web: www.creative-biolabs.com
4. WHY USE AI IN DRUG DISCOVERY?
$2.6 B: Introducing a new drug to market can cost pharmaceutical companies an average of $2.6 billion and 11-15 years of research and development.
10%: Even once new drug candidates show potential in laboratory testing, less than 10% of drug candidates make it to market following Phase I trials.
76%: Between 2010 and 2017, 76% of new drugs approved by the US Food and Drug Administration (FDA) were small molecules.
5. AI in Drug Discovery (Phase I)
The drug discovery process ranges from reading and analyzing already existing literature to testing the ways potential drugs interact with targets. According to one report, AI could curb drug discovery costs for companies by as much as 70%.
AI in Preclinical Development (Phase II)
The preclinical development phase of drug discovery involves testing potential drug targets on animal models. Utilizing AI during this phase could help trials run smoothly and enable researchers to more quickly and successfully predict how a drug might interact with the animal model.
AI in Clinical Trials (Phase III)
After making it through the preclinical development phase, and receiving approval from the FDA, researchers begin testing the drug with human participants. AI can facilitate participant monitoring during clinical trials (generating a larger set of data more quickly) and aid in participant retention by personalizing the trial experience.
6. AI in Drug Discovery
AI in drug design
- Predicting 3D structure of target protein
- Predicting drug-protein interactions
- AI in determining drug activity
- AI in de novo drug design
AI in polypharmacology
- Designing biospecific drug molecules
- Designing multitarget drug molecules
AI in chemical synthesis
- Predicting reaction yield
- Predicting retrosynthesis pathways
- Developing insights into reaction mechanisms
- Designing synthetic route
AI in drug repurposing
- Identification of therapeutic target
- Prediction of new therapeutic use
AI in drug screening
- Prediction of toxicity
- Prediction of bioactivity
- Prediction of physicochemical property
- Identification and classification of target cells
8. Classes of Learning Tasks and Techniques
Supervised Learning: The goal is to reconstruct the unknown function f that assigns output values y to data points x.
Unsupervised Learning: Can be treated as a geometric or topological problem, the goal is to find similarities and differences between data points used to spatially order data.
Semisupervised Learning (Fig. A): Mix of supervised and unsupervised learning, where less expensive and more abundant unlabeled data can be utilized to train a classifier.
Active Learning (Fig. B): A learning algorithm can interactively query the user to determine labels for unlabeled data in the regions of the input space about which the model is least certain.
Reinforcement Learning (Fig. C): To some extent strives to emulate reward-driven learning, and in its simplest configuration, an agent attempts to find the optimal set of actions to promote some outcome.
Transfer Learning (Fig. D): Describes a family of algorithms that relax the common assumption that the training and test data should be in the same feature space and follow the same distribution.
Multitask Learning (Fig. E): Instead of learning only one task at a time, as in single-task learning, several different but conceptually related tasks are learned in parallel and make use of a shared internal representation.
Yang X, et al. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019, 119 (18): 10520-10594.
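The active learning idea above (query labels where the model is least certain) can be sketched in a few lines. This is only an illustration under stated assumptions: a binary classifier that outputs a probability for the positive class, and a hypothetical helper name `least_certain` that is not taken from the slides or the cited review.

```python
def least_certain(probabilities):
    """Return the index of the unlabeled sample the model is least sure about.

    `probabilities` holds the predicted probability of the positive class for
    each unlabeled sample (binary classification is assumed here).
    """
    # A prediction is most uncertain when its probability is closest to 0.5.
    return min(range(len(probabilities)),
               key=lambda i: abs(probabilities[i] - 0.5))


# Hypothetical model outputs for four unlabeled compounds.
pool_probs = [0.97, 0.52, 0.10, 0.70]
query_index = least_certain(pool_probs)
print(query_index)  # index of the sample to send to the human annotator
```

In a full loop, the queried sample would be labeled, added to the training set, and the model retrained before the next query.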
9. Common Learning Algorithms
Bayesian Algorithms: Bayesian methods are those that explicitly apply Bayes’ theorem to classification and regression problems.
Instance-Based Methods: It is called instance-based because it builds the hypotheses from the training instances. It is also known as memory-based learning or lazy-learning.
Decision Tree Algorithms: Algorithms for constructing decision trees usually work top-down, by choosing a variable at each step that best splits the set of items.
Ensemble Algorithms: In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
Dimensionality Reduction: Dimensionality reduction seeks a lower-dimensional representation of numerical input data that preserves the salient relationships in the data.
Artificial Neural Networks: Artificial neural networks (ANNs) consist of input, hidden, and output layers with connected neurons (nodes) to simulate the human brain.
10. Bayesian Algorithms
Liu ZH, et al. ChemStable: A web server for rule-embedded naïve Bayesian learning approach to predict compound stability. J. Comput. Aided Mol. Des. 2014, 28: 941-950.
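As a rough illustration of how Bayes' theorem is applied to classification, the sketch below fits a small Bernoulli naive Bayes model in plain Python. It is not the ChemStable method; the binary "fingerprint" features, the labels, and the function names `train_naive_bayes` and `predict` are all hypothetical and chosen only for this example.

```python
import math
from collections import defaultdict


def train_naive_bayes(samples, labels):
    """Fit a Bernoulli naive Bayes model with add-one (Laplace) smoothing."""
    n_features = len(samples[0])
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: [0] * n_features)
    for x, y in zip(samples, labels):
        class_counts[y] += 1
        for j, bit in enumerate(x):
            feature_counts[y][j] += bit
    model = {}
    for y, count in class_counts.items():
        prior = count / len(labels)
        # Smoothed estimate of P(feature_j = 1 | class y).
        likelihoods = [(feature_counts[y][j] + 1) / (count + 2)
                       for j in range(n_features)]
        model[y] = (prior, likelihoods)
    return model


def predict(model, x):
    """Return the class with the highest (unnormalized) log posterior."""
    def log_posterior(y):
        prior, likelihoods = model[y]
        score = math.log(prior)
        for bit, p in zip(x, likelihoods):
            score += math.log(p if bit else 1.0 - p)
        return score
    return max(model, key=log_posterior)


# Hypothetical data: each bit marks the presence of a substructure.
fingerprints = [[1, 0, 1], [1, 1, 1], [0, 0, 1],
                [0, 1, 0], [0, 0, 0], [1, 1, 0]]
stability = ["stable", "stable", "stable",
             "unstable", "unstable", "unstable"]
model = train_naive_bayes(fingerprints, stability)
print(predict(model, [1, 0, 1]))
```

The "naive" part is the assumption that features are independent given the class, which is why the per-feature log likelihoods can simply be summed.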
11. Instance-Based Methods
SVM is a supervised machine learning algorithm used for both classification
and regression. The objective of SVM algorithm is to find a hyperplane in an
N-dimensional space that distinctly classifies the data points.
Support Vector Machine
A SOM or self-organizing feature map is an unsupervised machine learning
technique used to produce a low-dimensional representation of a higher
dimensional data set while preserving the topological structure of the data.
Self-organizing Map
KNN is a simple, supervised machine learning algorithm that can be used to
solve both classification and regression problems.
K-nearest Neighbor
Xin Y, et al. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019,
119 (18): 10520-10594.
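The K-nearest neighbor idea from this slide can be sketched in a few lines: store the training instances, then classify a query by majority vote among its k closest points. The descriptor values and the "active"/"inactive" labels below are hypothetical, and Euclidean distance is an assumed choice of metric.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training instances by Euclidean distance to the query and keep k.
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical 2D descriptors for six compounds.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["inactive", "inactive", "inactive", "active", "active", "active"]

print(knn_predict(points, labels, (3.9, 4.1), k=3))  # active
```

No model is fitted ahead of time; all work happens at query time, which is why such methods are also called lazy learning.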
12. Decision Tree Algorithms
A random forest (or random decision forest) is an ensemble learning method for
classification, regression, and other tasks that operates by constructing a multitude of
decision trees at training time.
Random Forest
A decision tree is a decision support tool that uses a tree-like model of decisions and their
possible consequences, including chance event outcomes, resource costs, and utility.
Decision Tree
Xin Y, et al. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019, 119 (18): 10520-10594.
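Top-down tree construction repeatedly picks the variable and threshold that best split the current set of items. The sketch below shows one such step, using Gini impurity as the split criterion (an assumed choice; entropy-based criteria are also common). The feature values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put items on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best

# Hypothetical rows: (molecular weight, logP); labels are made up.
rows = [(250.0, 1.0), (400.0, 1.5), (300.0, 4.0), (350.0, 4.5)]
labels = ["soluble", "soluble", "insoluble", "insoluble"]

print(best_split(rows, labels))  # (1, 1.5, 0.0)
```

A full tree would apply `best_split` recursively to the left and right subsets until the nodes are pure or a stopping rule is met.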
13. Ensemble Algorithms
Boosting is an ensemble learning method that combines a set of weak
learners into a strong learner to minimize training errors. In boosting, a
random sample of data is selected and fitted with a model, and further
models are then trained sequentially; that is, each model tries to
compensate for the weaknesses of its predecessor.
Boosting
Bagging is an ensemble learning method that is commonly used
to reduce variance within a noisy dataset. In bagging, a random
sample of data in a training set is selected with replacement,
meaning that individual data points can be chosen more than
once.
Bagging
Xin Y, et al. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019, 119 (18): 10520-10594.
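The sampling-with-replacement step in bagging can be sketched as follows. The base learner here is a deliberately simple one-dimensional threshold classifier, and the data, seed, and number of models are hypothetical choices made for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, so items may repeat."""
    return [rng.choice(data) for _ in data]

def fit_stump(sample):
    """Fit a 1-D threshold classifier: the midpoint between the class means.

    Assumes class 1 tends to have larger x values than class 0.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate bootstrap sample: always predict the only class seen.
        only = 1 if pos else 0
        return lambda x: only
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: int(x > threshold)

def bagged_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical (feature, label) pairs.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
rng = random.Random(0)  # fixed seed so the run is reproducible
models = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]

print(bagged_predict(models, 0.85), bagged_predict(models, 0.15))
```

Each model sees a slightly different resampled dataset, and averaging their votes is what reduces the variance of the combined prediction.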
14. Dimensionality Reduction
LDA is a generalization of Fisher's linear discriminant, a method used in
statistics, pattern recognition and machine learning to find a linear
combination of features that characterizes or separates two or more classes
of objects or events.
Linear Discriminant Analysis
Image From Wikipedia
A visual depiction of the resulting PCA projection for a set of 2D points.
A visual depiction of the resulting LDA projection for a set of 2D points.
PCA is a popular technique for analyzing large datasets containing a high
number of dimensions/features per observation, increasing the
interpretability of data while preserving the maximum amount of information,
and enabling the visualization of multidimensional data.
Principal Component Analysis
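To make the PCA projection concrete, the sketch below finds the first principal component of a small point set by power iteration on the covariance matrix and projects the points onto it. Power iteration is one possible way to obtain the leading eigenvector and is used here only to keep the example dependency-free; the starting vector of all ones and the sample points are assumptions for illustration.

```python
from math import sqrt

def first_principal_component(points, iterations=200):
    """Return (unit direction of maximum variance, 1-D projections of the points)."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - means[j] for j in range(d)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(d)] for i in range(d)]
    v = [1.0] * d  # assumed start; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    scores = [sum(row[j] * v[j] for j in range(d)) for row in centered]
    return v, scores

# Hypothetical 2D points lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(points)
print(direction, scores)
```

For these points the direction is proportional to (1, 2), and the one-dimensional scores preserve the ordering of the points along the line.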
15. Artificial Neural Networks
A DNN is an ANN that has several hidden layers between its input and
output layers. Deep nets process data in complex ways by composing
many nonlinear transformations, one per layer.
Deep Neural Networks
ANNs are computing systems inspired by the biological neural networks
that constitute animal brains. A typical ANN architecture contains many
artificial neurons arranged in a series of layers: an input layer; one or
more hidden layers, where intermediate representations of the input data
are computed; and an output layer (the top layer), which generates the
desired prediction (ADMET properties, activity, a fingerprint vector,
etc.).
Artificial neural networks
Image From Wikipedia
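The layered structure described on this slide can be sketched as a forward pass through an input layer, one hidden layer, and an output layer. The weights, biases, layer sizes, and activation functions below are arbitrary illustrative values, not a trained model.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W . x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Propagate the input through each layer in turn."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], relu),  # hidden layer
    ([[1.0, -1.0]], [0.0], sigmoid),                            # output layer
]
output = forward([1.0, 2.0, 3.0], layers)
print(output)
```

Here the hidden activations are about 0.4 and 0.5, so the output neuron receives -0.1 and returns roughly 0.475. A deep neural network follows the same pattern with more entries in `layers`.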
17. DeepVS: Boosting Docking-Based Virtual Screening with DL
Pereira J.C. Boosting docking-based virtual screening with deep learning. J. Chem. Inf. Model. 2016;56:2495–2506.
The deep neural network that is introduced, DeepVS, uses the output of a
docking program and learns how to extract relevant features from basic
data. The approach introduces the use of atom and amino acid
embeddings and implements an effective way of creating distributed
vector representations of protein–ligand complexes by modeling the
compound as a set of atom contexts that is further processed by a
convolutional layer.
DeepVS
18. DeepAffinity: DL Method Used to Measure DTBA
Mostafa K. DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks. Bioinformatics. 2019, 35(18):3329–3338.
DeepAffinity is a deep learning method used to predict drug-target
binding affinity (DTBA). Using novel representations of
structurally annotated protein sequences, it is a semi-supervised
deep learning model that unifies recurrent and convolutional
neural networks to exploit both unlabeled and labeled data,
jointly encoding molecular representations and predicting
affinities. Performance for new protein classes with few labeled
data is further improved by transfer learning.
DeepAffinity
19. DeepTox: Toxicity Prediction Using Deep Learning
Mayr A. DeepTox: toxicity prediction using deep learning. Front. Environ. Sci. 2016, 3:80.
Representation of a toxicophore by hierarchically related features.
22. • High throughput: screens large numbers of clones
• Large library capacity: from 10^7 to over 10^8
• Various phage display systems (M13, λ, T7)
• Tailored biopanning strategies
• Wide range of applications
Antibody Production by Phage Display
Creative Biolabs has combined AI, big data, machine learning, and phage display to generate a novel AI-powered computational antibody drug discovery
platform. Aided by this innovative platform, one-stop human antibody discovery services are provided, including antibody-antigen binding prediction,
antibody candidate generation, antibody sequence optimization, and antibody production & characterization.
AI-Based One-stop Antibody Discovery Platform
• Discover and analyze new antibody clusters
• Generate new sequences within existing clusters
• Accelerate the generation of high-affinity antibodies
• Rapidly generate novel antibody sequences using
computational algorithms to help improve affinity, solubility,
manufacturability, specificity, and stability
Augmented Antibody Discovery with AI
23. AI can typically generate 10 times more
antibody sequence clusters than a laboratory-
based approach alone. Diversity leads to the
discovery of new binding modalities and
potentially new therapeutic modes-of-action.
Antibody Discovery Services
Creative Biolabs specializes in designing and
performing high-quality custom AI-based antibody
screening assays with different formats, endpoints,
and parameters to satisfy any specific requirement.
Antibody Screening Services
Creative Biolabs offers a wide variety of antibody
engineering services that use AI-based algorithms to
quickly and efficiently optimize properties of existing
antibodies, such as affinity, solubility, cross-reactivity,
manufacturability, immunogenicity, specificity, and stability.
Creative Biolabs has applied AI technology to small
molecule design and optimization to improve affinity,
specificity, and validity. Our innovative AI methods range
from in silico molecule screening and molecular modeling to
AI-based molecule optimization.
Small Molecule Design & Optimization
Creative Biolabs provides strategies and customized
protocols for model training data services,
ultimately accelerating the discovery of novel
drug candidates.
AI-Augmented Drug Discovery at Creative Biolabs
Antibody Engineering Services
Model Training Data Services