The document discusses developing quantitative structure-activity relationship (QSAR) models to predict the biological responses of nanomaterials. It describes using descriptors of pristine and weathered nanomaterials, as well as experimental parameters, to develop linear regression models between descriptors and responses. Partial least squares regression is used to handle correlations between descriptors. The data is also analyzed using k-means clustering to identify separate descriptor clusters, and QSAR models are developed for each cluster to improve predictions. The resulting models could then be used to predict responses of emerging nanomaterials based on their similarity to existing clusters.
2. Assumptions
• Setting up and running the same experiment in the laboratory should get
the same results, time after time (within an error).
• The results of experiments, and how experiments are set up and run can
be described by a quantitative relationship.
• This relationship is a function $y = f(x_1, x_2, \ldots, x_m)$, where $y$ is the result of
the experiment and $x_1, \ldots, x_m$ are descriptors of the experiment. Every
time the values of the descriptors are the same, the result is the same.
• What that function looks like and what descriptors should be used are what
we are trying to find out.
3. Descriptors and Responses
• The descriptors of an experiment may be divided into:
o Properties of “pristine” material (e.g. surface charge, zeta potential);
o Properties of “weathered” or “aged” material (e.g. hydration);
o Parameters of experiment and assay increments (e.g. temperature,
nanomaterial concentration)
• The experimental responses may be results such as:
o The percentage of human lung cells that expire after 1 day
o The percentage of human lung cells that expire after 2 days
o Similar results for different cell types
5. Descriptor and Response Relationship
• A row is generated for each experiment conducted, recording the values
the descriptors take on and the results of the experiment.
• If we assume a linear relationship between descriptors and the results,
the function becomes $y = f(x_1, x_2, \ldots, x_m) = b_0 + b_1 x_1 + \ldots + b_m x_m$
• The results of multiple experiments can be represented using the matrix
notation
$y = Xb + e$
where $X$ has m columns of descriptors and n rows of experiments, $b$ is the
vector of regression coefficients, and $e$ is the vector of error terms.
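The matrix form above can be illustrated with a minimal least-squares sketch. Everything in it is an assumption made for illustration: the data are synthetic, the sizes `n` and `m` and the coefficients `b_true` are made up, and a column of ones is prepended to the descriptor matrix so that the intercept $b_0$ is estimated along with $b_1, \ldots, b_m$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 3                               # n experiments, m descriptors (made-up sizes)
X_desc = rng.normal(size=(n, m))           # synthetic descriptor values, one row per experiment
b_true = np.array([1.0, 2.0, -1.0, 0.5])   # hypothetical b0, b1, ..., bm

# Prepend a column of ones so that b0 acts as the intercept.
X = np.column_stack([np.ones(n), X_desc])
e = rng.normal(scale=0.01, size=n)         # small error term
y = X @ b_true + e                         # synthetic experimental results

# Least-squares estimate of b in y = Xb + e.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b_hat, 2))
```

With little noise and uncorrelated descriptors, the estimated coefficients land close to the values used to generate the data.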
7. NanoQSAR
• Select 80% of experimental results randomly to build a QSAR model
$R^2 = 1 - \frac{\sum (y_{actual} - y_{model})^2}{\sum (y_{actual} - y_{mean})^2}$
• How close $R^2$ is to 1.0 reflects the quality of the model and the size of the error terms
• With the remaining 20%, predict results
$Q^2 = 1 - \frac{\sum (y_{actual} - y_{predict})^2}{\sum (y_{actual} - y_{mean})^2}$
• In general, $R^2 \geq Q^2$
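The 80/20 procedure above could be sketched as follows. This is an illustration on synthetic data, not the NanoQSAR implementation: the helper names `fit_linear`, `predict_linear` and `one_minus_ratio` are hypothetical, and using the training-set mean as $y_{mean}$ in $Q^2$ is one common convention that the slides do not specify.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bm*xm; returns [b0, ..., bm]."""
    A = np.column_stack([np.ones(len(X)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return b

def predict_linear(b, X):
    return b[0] + X @ b[1:]

def one_minus_ratio(y_actual, y_est, y_mean):
    """1 - sum((y_actual - y_est)^2) / sum((y_actual - y_mean)^2)."""
    return 1.0 - np.sum((y_actual - y_est) ** 2) / np.sum((y_actual - y_mean) ** 2)

rng = np.random.default_rng(1)
n, m = 100, 4                              # made-up sizes
X = rng.normal(size=(n, m))                # synthetic descriptors
y = 2.0 + X @ np.array([1.0, -0.5, 0.3, 0.0]) + rng.normal(scale=0.3, size=n)

# Random 80/20 split of the experiments.
idx = rng.permutation(n)
n_train = (4 * n) // 5
train, test = idx[:n_train], idx[n_train:]

b = fit_linear(X[train], y[train])
r2 = one_minus_ratio(y[train], predict_linear(b, X[train]), y[train].mean())
# Assumption: the training mean is the reference for the held-out 20%.
q2 = one_minus_ratio(y[test], predict_linear(b, X[test]), y[train].mean())
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

Because the model is fitted to the 80% subset, $R^2$ measures goodness of fit while $Q^2$ measures predictive ability on data the model has not seen.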
8. Latent Structure of X (and Y)
• When there are correlations (collinearity) between the columns of $X$, the
calculated regression coefficients $b$ become unstable.
• Because of this, multivariate projection methods such as PLS (Projections
to Latent Structures) are increasingly being used in QSAR analysis.
• This method projects the descriptors onto a lower-dimensional hyperplane
of latent variables.
• More stable regression coefficients $b$ can be calculated using this
inherent latent structure of matrix $X$.
• A similar reduction of dimensions can be done for the experimental results.
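To make the collinearity point concrete, here is a small sketch of PLS for a single response (often called PLS1) written directly in NumPy. It is an assumption-laden illustration, not the code behind the slides: the function name `pls1` is hypothetical, the data are synthetic, and two descriptors are deliberately made almost identical so that ordinary least squares and a two-component PLS fit can be compared.

```python
import numpy as np

def pls1(X, y, n_components):
    """Fit PLS1 (single response) by successive deflation.

    Returns (b0, b) such that predictions are b0 + X @ b.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # weight: direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                     # scores (latent variable)
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = (yk @ t) / tt              # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original descriptor space
    return y_mean - x_mean @ b, b

rng = np.random.default_rng(2)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)       # nearly collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + 0.1 * rng.normal(size=n)

A = np.column_stack([np.ones(n), X])
b_ols = np.linalg.lstsq(A, y, rcond=None)[0][1:]
b0_pls, b_pls = pls1(X, y, n_components=2)
print("OLS:", np.round(b_ols, 2))
print("PLS:", np.round(b_pls, 2))
```

With two latent components, the PLS coefficients for the two nearly identical descriptors stay close to each other, whereas the ordinary least-squares coefficients for that pair can swing widely from one sample to the next.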
10. Many Separate Clusters
• Nature is found to organize experimental results in a clustered and
discontinuous way.
• How many clusters exist may be found using a k-means algorithm that starts
from n clusters, where n is the number of experimental results.
• The number of clusters is reduced at each iteration by combining the closest clusters.
• Also at each iteration, QSAR modeling is performed for all clusters that are
large enough, and $Q^2$, which measures how close the predicted values are to
the actual values, is calculated.
• At the final step, the number of clusters with the best $Q^2$ is selected.
• If there are any clusters that are still not large enough for QSAR modeling,
new experimental data needs to be generated.
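The merging loop described above can be sketched as below. Note that starting from n singleton clusters and repeatedly combining the closest pair is, in standard terminology, agglomerative (bottom-up) clustering, so that is what this sketch implements; the centroid distance, the function names, the synthetic data and the `min_size` threshold are all assumptions made for illustration. The per-cluster QSAR fit and the $Q^2$ comparison are left out to keep the example short.

```python
import numpy as np

def merge_closest(clusters, X):
    """Merge the two clusters whose centroids are closest; return the new cluster list."""
    centroids = [X[c].mean(axis=0) for c in clusters]
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = np.linalg.norm(centroids[i] - centroids[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    merged = clusters[i] + clusters[j]
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

def partitions_by_count(X):
    """Start from one cluster per experiment and record the partition at every cluster count."""
    clusters = [[i] for i in range(len(X))]
    history = {len(clusters): clusters}
    while len(clusters) > 1:
        clusters = merge_closest(clusters, X)
        history[len(clusters)] = clusters
    return history

# Synthetic example: two well-separated groups of 10 experiments each.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 0.5, size=(10, 2)),
               rng.normal(10.0, 0.5, size=(10, 2))])
history = partitions_by_count(X)
two = sorted(sorted(c) for c in history[2])

# Hypothetical size threshold below which a cluster is too small for QSAR modeling.
min_size = 5
large_enough = [c for c in history[2] if len(c) >= min_size]
print(two)
```

In a fuller version, each entry of `history` would be scored by fitting a regression model to every sufficiently large cluster and computing $Q^2$, and the cluster count with the best score would be kept.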
12. Emerging NanoMaterials
• What cluster an emerging nanomaterial is most similar to can be
identified by including theoretical descriptors like SMILES strings, and the
x, y, z coordinates of different molecules in the nanostructure.
• The emerging nanomaterials can then be associated with the closest
cluster.
• Experimental results are predicted using the regression equation found for
that particular cluster:
$y = b_0 + b_1 x_1 + \ldots + b_m x_m$
• Like before, if an emerging nanomaterial is found very far from any
existing cluster, new experimental data needs to be generated to fill that
hole in the database.
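A minimal sketch of this assignment-and-prediction step follows. It assumes that each nanomaterial has already been reduced to a numeric descriptor vector, that similarity is measured by Euclidean distance to cluster centroids, and that a hypothetical `max_distance` threshold decides when a material is too far from every cluster; none of these choices come from the slides, and the centroids and coefficients are made-up numbers.

```python
import numpy as np

def assign_and_predict(x_new, centroids, coefs, max_distance):
    """Find the closest cluster and apply its regression equation.

    Returns (cluster_index, prediction); prediction is None when the material
    lies farther than max_distance from every cluster centroid.
    """
    dists = np.linalg.norm(centroids - x_new, axis=1)
    k = int(np.argmin(dists))
    if dists[k] > max_distance:
        return k, None                      # gap in the database: new experiments needed
    b = coefs[k]                            # [b0, b1, ..., bm] for cluster k
    return k, float(b[0] + b[1:] @ x_new)

# Made-up centroids and per-cluster coefficients for two clusters and two descriptors.
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
coefs = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, 1.0]])

near = assign_and_predict(np.array([9.0, 11.0]), centroids, coefs, max_distance=5.0)
far = assign_and_predict(np.array([50.0, 50.0]), centroids, coefs, max_distance=5.0)
print(near, far)
```

Returning `None` for distant materials keeps the model from extrapolating well beyond the data it was built on, which matches the point that such cases call for new experiments.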