Simulation of Stokes Shifts and Analysis of
Substituent Effects in Boron Difluoride
Formazanate Complexes
By:
Rayner B. Mendes
Supervisor:
Viktor N. Staroverov
CHEM 4491E Thesis
submitted in partial fulfillment of the requirements
for the degree: Honours Bachelor of Science (Chemistry)
Department of Chemistry
The University of Western Ontario
London, Ontario, Canada
2016
Thesis Examiner 1: __________________ Styliani Constas
Thesis Examiner 2: __________________ J. Clara Wren
Supervisor: __________________ Viktor N. Staroverov
Abstract
Simulation of spectroscopic properties such as absorption, emission, and the Stokes shifts is of
great interest for the tuning of functional materials. Structural tuning of boron difluoride (BF2)
complexes has been shown to affect these spectroscopic properties. This study attempts to simulate
the Stokes shifts of BF2 formazanate complexes using density-functional theory. Additionally, it
analyzes the effect of structural tuning through a quantitative structure-property relationship study
using multivariable regression of quantum-chemical descriptors. The proposed methodology is
able to reproduce experimental data and provide insights into trends in substituent effects. The
PBE0/6-311+G(d,p) level of theory is optimal in simulating absorption, emission, and Stokes
shifts. This level of theory has the lowest mean absolute error (0.0048 eV) among the functionals
and basis sets tested across 60 unique trials. For small π-conjugated molecules the simulated
absorption, emission, and Stokes shift had errors under 0.5%, 0.4%, and 1.4%, respectively. BF2
formazanate complexes had absorption and emission errors under 3% and 7%. Electronic trends
from previous research were reproduced: electron-donating groups caused a red shift while
electron-withdrawing groups caused a blue shift in the R3 position. The quantitative
structure-property relationship model suggests that this trend is reversed when substituents are placed in the
R1 and R2 positions. The multivariable regression analysis found that the absorption energy could
be described by the dipole moment, excited-state dipole moment, dipole moment derivatives,
HOMO and excited-state HOMO energies. The regression equation for a training series had a 6.0
nm mean absolute error, an adjusted r² of 0.9949, and an F-value under 0.04%; all descriptors
used had a p-value under 1%, so each descriptor is significant at the 99% confidence level.
Contents
Abstract
Acknowledgments
1 Introduction
1.1 Stokes shifts and structural tuning
1.2 Objectives
1.3 Simulation of Stokes shifts
1.4 Quantitative structure-property relationship (QSPR)
2 Experimental
2.1 Simulation of Stokes shifts
2.1.1 Defining a molecular training set
2.1.2 Computation details
2.1.3 Choice of functionals and basis sets
2.2 QSPR model
2.2.1 QSPR training series
2.2.2 Electronic and steric descriptors
2.2.3 Regression methodology
2.2.4 Optimizing QSPR model
3 Results and Discussion
3.1 Stokes shifts
3.2 Effect of structural tuning on spectroscopic properties
3.2.1 Steric and electronic influence on Stokes shift in the R3 position
3.2.2 Electrostatics and electronic influence on Stokes shift in the R1 and R2 positions
3.2.3 Multivariable regression of hydrogen training series
4 Conclusion
4.1 Further Research
References
Acknowledgments
I would like to thank Dr. Viktor Staroverov whose supervision, guidance and input have allowed
me to gain a better grasp of computational chemistry. I would also like to thank Dr. Joe Gilroy for
helping with the analysis of his group's experimental data. Finally, I am grateful to Mr. Sviataslau
Kohut, Dr. Rogelio Cuevas, Ms. Darya Komsa, and Mr. Hanqing Zhao for the insights they
provided through conversations and discussions.
List of Abbreviations
B3LYP    Becke, three-parameter, Lee-Yang-Parr
BF2      boron difluoride
DCM      dichloromethane
DFT      density-functional theory
μ        dipole moment
HF       Hartree-Fock
HOMO     highest occupied molecular orbital
PBE      Perdew–Burke–Ernzerhof
PCM      polarizable continuum model (excited-state energy)
LUMO     lowest unoccupied molecular orbital
M06      Minnesota 06
MV       molar volume
SCF      self-consistent field (ground-state energy)
SCRF     self-consistent reaction field
SS       Stokes shift
TD-DFT   time-dependent density-functional theory
THF      tetrahydrofuran
QSPR     quantitative structure-property relationship
VWN      Vosko-Wilk-Nusair
XC       exchange-correlation
1 Introduction
1.1 Stokes shifts and structural tuning
Spectroscopic properties such as the Stokes shift (SS), expressed as a wavelength in nm,
$$\lambda_{\mathrm{SS}} = \lambda_{\mathrm{em}} - \lambda_{\mathrm{abs}} \tag{1}$$
are of particular interest for molecules which exhibit extended π-conjugation. These spectroscopic
properties allow such molecules to be characterized for structural elucidation. The
tuning of such properties through structural variation allows π-conjugated functional materials to
be used in photovoltaic cells,¹ luminescent materials,² and field-effect transistors.³ It has been
shown that increasing the π-conjugation of fused aromatic rings affects their spectroscopic
properties. Aromatic molecules absorb at longer wavelengths as π-conjugation is increased:
benzene has a maximum absorption at 260 nm, naphthalene at 310 nm, and anthracene
at 375 nm.⁴
The wavelength of emission differs from that of absorption because the potential energy curve of
a molecule's excited state is displaced relative to that of its ground state. Figure 1 illustrates the
Franck-Condon principle, which states that an electronic transition is more likely to occur when
the vibrational wavefunctions of the two states overlap. A molecule absorbs a photon as it is
promoted to its excited state; the geometry then distorts, and a photon is released (emission) as
the molecule relaxes to the ground state.
Figure 1: Absorption and emission due to fluorescence.
Boron difluoride (BF2) functional materials have been shown to have tunable spectroscopic
properties which vary with ligand type, position of substitution, and extent of π-conjugation. Fu and
co-workers⁵ analyzed two naphthyridine BF2 complexes (Figure 2, 1 and 2) and observed that extending
the π-conjugation of the system results in a red shift of 20 nm. Piers et al.⁶ modified the structure
of anilido-pyridine ligands (Figure 2, 3 and 4) to increase the degree of π-conjugation, which
red-shifted the absorption and emission by 50 nm.
Figure 2: BF2 complexes synthesized by the Fu⁵ and Piers⁶ groups.
Functional materials such as the BF2 formazanate complexes synthesized in the Gilroy group are
of particular interest.⁷ These complexes have numerous applications as dyes⁸ or indicators of cell
activity.⁹ The Gilroy group has shown¹⁰ that BF2 complexes derived from formazans have
desirable spectroscopic properties which can be tuned through the structural variation shown in
Figure 3. It was found that electron-withdrawing groups on the complex blue-shift the maximum
absorption and emission wavelengths.¹¹ Electron-donating groups cause a red shift, even
compared to their phenyl-substituted counterparts.¹¹ Because these trends have been observed
experimentally, BF2 formazanate complexes are ideal candidates for a computational study.
Figure 3: Synthesized boron difluoride formazanate complexes from the Gilroy group.¹²
1.2 Objectives
The objectives of this thesis are to:
1) Model the Stokes shifts of BF2 formazanate complexes in commonly used solvents such as
dichloromethane (DCM) and tetrahydrofuran (THF).
2) Conduct a quantitative structure-property relationship (QSPR) study to determine whether
the above model can identify which electronic and steric descriptors affect the absorption,
emission, and Stokes shifts, and whether it can predict the influence of structural variation
on spectroscopic properties.
1.3 Simulation of Stokes shifts
The above objectives will be realized through density-functional theory (DFT) approximations
using the GAUSSIAN 09 program.¹³ Calculations will utilize protocols such as the self-consistent
reaction field (SCRF)¹⁴ to model solvation, and time-dependent DFT (TD-DFT) to model
excited states.¹⁵
DFT is a computational quantum-mechanical method used in a variety of fields, from physics to
materials science and chemistry. DFT investigates the electronic structure of many-body systems
such as atoms and molecules¹⁶ through the use of functionals of the electron density. Various
properties can be computed to a high degree of accuracy without the need for experiments.¹⁷
These properties include excitation energies,¹² ground- and excited-state geometries,¹¹ and various
spectroscopic properties.¹⁸ The distortion of geometry from the ground state to the first excited
state results in the difference between absorption and emission wavelengths.¹⁹
1.4 Quantitative structure-property relationship (QSPR)
As mentioned, a QSPR study will be performed on the absorption and emission of BF2 formazanate
complexes through the regression of various electronic and steric descriptors. QSPRs are primarily
used in pharmaceutical design and medicinal drug synthesis,²⁰ where a dependent variable such
as binding affinity is related to an independent variable such as hydrophobicity.²¹ QSPR studies
transformed the search for compounds with desired properties from chemical intuition into a
mathematically quantified form.²² These studies can be conducted both experimentally²³ and
computationally.²⁴ Building a QSPR model for the BF2 formazanate complexes requires a
multivariable regression model.²⁵ Multivariable regressions quantitatively relate a property such as
absorption or emission to a block of predictor variables, such as quantum-chemical descriptors, in
the form of an equation.²⁶ To adequately study the QSPRs of BF2 formazanate complexes, the
electronic and steric descriptors need to be quantified. These descriptors should provide insight
into the chemical nature of the property under consideration.²²
As with QSPRs of biological systems, the substituted ligands in the molecules need to be varied so
that the data are meaningful.²⁷ For this purpose, the skeletal BF2 formazanate structure will be
substituted in the R1-R3 positions with the various substituents shown in Figure 4. The substituents
will vary in their position, in their sterics from low ('H') to high ('naphthyl'), and in their electronic
properties, from electron-withdrawing groups such as 'NO2' to electron-donating groups such
as 'methyl'.
Figure 4: Boron difluoride skeletal structure with positions of substitution
The results from these computations will allow analysis of how a variety of substituents affect the
spectroscopic properties of BF2 formazanate complexes. The results of this
study can be used to develop a predictive model based on electronic and steric descriptors, adapt
the QSPR methodology outside biological systems, and estimate the Stokes shift in a time-efficient
manner through the derived regression equation.
2 Experimental
2.1 Simulation of Stokes shifts
Developing a model for the Stokes shifts of BF2 formazanate complexes requires three
components: a molecular training set, protocols for the self-consistent reaction field and
time-dependent DFT, and an optimal level of theory for the calculations.
2.1.1 Defining a molecular training set
Applying the methodology directly to larger molecules such as BF2 formazanate complexes is
computationally expensive, often taking in excess of 150 hours. Using smaller molecules which
exhibit similar extended π-conjugation allows the model to be optimized in a time-efficient manner.
Naphthalene and fluorene (Figure 5) were chosen as training molecules for Stokes shift simulations
due to their similar π-conjugation and π → π* transitions. The absorption and emission of the
training molecules are simulated in both DCM and THF.
Figure 5: Structure of training set molecules.
2.1.2 Computation details
To simulate the Stokes shifts of the training molecules in solvents such as DCM and THF, a solvation
protocol must be employed. The SCRF protocol in GAUSSIAN 09 will be used to simulate solvent
interactions. It uses a polarizable continuum model (PCM) with the integral equation formalism,
in which the molecule of interest is placed within a cavity of two overlapping solvent spheres,²⁸
thereby reproducing the solvent's interactions with the molecule.
For excited-state calculations in solution, there is a distinction between equilibrium and
non-equilibrium calculations. The solvent responds to changes in the state of the solute in two
ways: it polarizes its electron distribution, which is a rapid process, and the solvent
molecules reorient themselves, which is a slower process. An equilibrium calculation describes a
situation in which the solvent has had time to fully respond to the solute in both of these ways. A
non-equilibrium calculation is appropriate for processes which are too rapid for the solvent to fully
respond, such as vertical electronic excitation.
TD-DFT is used for calculations which require non-equilibrium or excited-state geometries.
TD-DFT can be described in the following way: for a given interaction potential, the Runge-Gross
(RG) theorem²⁹ shows that the external potential uniquely determines the density. The Kohn-Sham
approach chooses a non-interacting system, for which the interaction potential is zero, whose
density is equal to that of the interacting system. The wave function of a non-interacting system
can be represented as a Slater determinant of single-particle orbitals. This determines a potential
which can be used to determine a non-interacting Hamiltonian Hs,
(2)
which determines a determinantal wave function
(3)
and generates a time-dependent density
(4)
such that ρs is equal to the density of the interacting system at all times. In this way, if the potential
can be determined, then the original Schrödinger equation, a single partial differential equation in
3N variables, is replaced by N differential equations in 3 dimensions.
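For reference, the three relations described above are commonly written as follows in the standard time-dependent Kohn-Sham formulation (atomic units). The notation here ($\hat{T}$, $\hat{V}_s$, $\Phi$, $\varphi_i$, $v_s$) is an assumption taken from the textbook form and may differ from the author's own.

```latex
\begin{align}
  \hat{H}_s(t) &= \hat{T} + \hat{V}_s(t)
    && \text{non-interacting Hamiltonian} \\
  \hat{H}_s(t)\,\Phi(t) &= i\,\frac{\partial}{\partial t}\,\Phi(t)
    && \text{determinantal wave function } \Phi \\
  \rho_s(\mathbf{r},t) &= \sum_{i=1}^{N} \left|\varphi_i(\mathbf{r},t)\right|^2
    && \text{time-dependent density}
\end{align}
% Each of the N orbitals obeys a single-particle equation in 3 dimensions:
% \left(-\tfrac{1}{2}\nabla^2 + v_s(\mathbf{r},t)\right)\varphi_i(\mathbf{r},t)
%   = i\,\frac{\partial}{\partial t}\,\varphi_i(\mathbf{r},t)
```

The commented orbital equation is what makes the closing remark concrete: one equation in 3N variables is traded for N coupled equations in three spatial variables each.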
Absorption and emission calculations
Simulations of Stokes shifts require the following calculations: ground-state geometry optimization,
non-equilibrium solvation, absorption calculation, single-point TD-DFT calculation, excited-state
geometry optimization, and emission calculation, as shown in Figure 6.
Figure 6: Calculations required to simulate absorption, emission, and Stokes shifts.
Ground-state geometry optimization
The ground-state geometry optimization yields the optimized geometry and the energy of the molecule in solution.
Non-equilibrium solvation
This calculation stores the information about the non-equilibrium solvation based on the ground-
state. This calculation yields the ground-state SCF energy.
Absorption calculation
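The actual state-specific calculation is then done by reading in the required information for
non-equilibrium solvation. The energy of the first excited state is then calculated at the ground-state
optimized geometry. This calculation yields the ground-state PCM energy.
The maximum absorption is then calculated by:
$$E_{\mathrm{abs}} = E_{\mathrm{XS}} - E_{\mathrm{GS}} \tag{5}$$
$$\lambda_{\mathrm{abs}} = \frac{hc}{E_{\mathrm{abs}}} \tag{6}$$
Following the List of Abbreviations, $E_{\mathrm{XS}}$ is the PCM (excited-state) energy and
$E_{\mathrm{GS}}$ is the SCF (ground-state) energy, both at the ground-state geometry.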
Single-point TD-DFT calculation
This TD-DFT calculation determines the vertical excitation energy based on a linear response from
the ground-state to the first allowed excited-state.
Excited-state geometry optimization
Using TD-DFT, the force constants from the single-point calculation are read and the geometry is
optimized in equilibrium solvation.
Emission calculation
The first step of this calculation writes the solvation data for the state-specific equilibrium solvation
of the excited state at its equilibrium geometry. This calculation yields the excited-state PCM
energy.
The second step of this procedure reads the solvation data and computes the ground-state energy
at the excited-state geometry, with the static solvation of the first excited state (non-equilibrium).
This calculation yields the excited-state SCF energy.
$$E_{\mathrm{em}} = E^{*}_{\mathrm{XS}} - E^{*}_{\mathrm{GS}} \tag{7}$$
$$\lambda_{\mathrm{em}} = \frac{hc}{E_{\mathrm{em}}} \tag{8}$$
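The conversions in Eqs. (5) to (8), together with the Stokes shift of Eq. (1), can be sketched in a few lines of Python. This is a minimal illustration, not the author's workflow: the function names are hypothetical, the energies in the usage lines are made-up placeholders, and it assumes the state energies are available in Hartree.

```python
# h*c in eV*nm and the Hartree-to-eV factor (CODATA values, rounded).
HC_EV_NM = 1239.841984
HARTREE_TO_EV = 27.211386


def transition_energy_ev(e_upper_hartree, e_lower_hartree):
    """Energy gap between two states, converted from Hartree to eV (Eqs. 5 and 7)."""
    return (e_upper_hartree - e_lower_hartree) * HARTREE_TO_EV


def wavelength_nm(energy_ev):
    """Wavelength in nm for a transition energy in eV (Eqs. 6 and 8)."""
    if energy_ev <= 0:
        raise ValueError("transition energy must be positive")
    return HC_EV_NM / energy_ev


def stokes_shift_nm(e_abs_ev, e_em_ev):
    """Stokes shift in nm (Eq. 1): emission wavelength minus absorption wavelength."""
    return wavelength_nm(e_em_ev) - wavelength_nm(e_abs_ev)


# Hypothetical energies in Hartree, chosen only to exercise the functions.
e_abs = transition_energy_ev(-385.7000, -385.8600)  # E_XS - E_GS, ground-state geometry
e_em = transition_energy_ev(-385.7150, -385.8550)   # E*_XS - E*_GS, excited-state geometry
print(f"absorption {wavelength_nm(e_abs):.1f} nm, "
      f"emission {wavelength_nm(e_em):.1f} nm, "
      f"Stokes shift {stokes_shift_nm(e_abs, e_em):.1f} nm")
```

Because the emission gap is smaller than the absorption gap, the emission wavelength is longer and the Stokes shift comes out positive.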
Sample outputs of the above calculations are in Supplemental Section A.1.
Calculations were performed using the protocol outlined in the Gaussian 09 user manual,³⁰ with
the addition of the following keywords:
Int=(grid=UltraFine) – ultrafine integration grid for DFT calculations
Opt=(MaxCycle=100) – increased number of optimization cycles allowed for convergence to a
minimum in ground- and excited-state geometry optimizations
TD=(…, NStates=3, …) and TD=(…, NStates=3, Root=x) – calculations were
specified to three excited states to model allowed singlet transitions, the primary excited state of
interest, x, being the first.
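As an illustration of how the keywords above combine, the sketch below assembles a Gaussian-style route line as a plain string. The helper `route_line` is hypothetical, and the thesis does not list its actual route sections, so the layout is an assumption. `PBE1PBE` is the Gaussian keyword for PBE0, and the TD options elided with "…" above are left out rather than guessed.

```python
def route_line(functional, basis_set, solvent, optimize=False, td_options=None):
    """Build a route line from the keywords named in the text (illustrative only)."""
    parts = [
        f"# {functional}/{basis_set}",
        f"SCRF=(Solvent={solvent})",   # continuum solvation via the SCRF protocol
        "Int=(Grid=UltraFine)",        # ultrafine integration grid
    ]
    if optimize:
        parts.append("Opt=(MaxCycle=100)")  # allow up to 100 optimization cycles
    if td_options:
        parts.append("TD=(" + ",".join(td_options) + ")")  # excited-state options
    return " ".join(parts)


# Excited-state optimization of the first of three states, in dichloromethane.
print(route_line("PBE1PBE", "6-311+G(d,p)", "Dichloromethane",
                 optimize=True, td_options=["NStates=3", "Root=1"]))
```

Keeping the route line in one helper makes it easier to vary functional, basis set, and solvent systematically when many levels of theory are screened.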
2.1.3 Choice of functionals and basis sets
The optimal combination of functional and basis set, or 'level of theory', needs to be chosen for the
training set and protocols outlined. Calculations other than ground-state geometry optimizations
were conducted using combinations of various functionals, basis sets, and solvents. Ground-state
geometries were calculated using the B3LYP functional and the 6-31G(d) basis set.
Functionals
The functionals chosen for modeling absorption and emission are B3LYP, PBE0, and M06-2X.
These three functionals are well known for their accuracy in spectroscopy calculations.³¹ B3LYP
and PBE0 are hybrid exchange-correlation functionals, constructed as a linear combination of the
Hartree-Fock (HF) exact exchange functional and density-functional exchange and correlation
terms. The parameters of the functionals are fitted so that the functionals reproduce experimental
or calculated thermochemical data.³² Since the functionals being tested have an HF component,
the self-interaction error is reduced, leading to good performance in TD-DFT calculations.³¹
The HF exact exchange energy is
$$E_x^{\mathrm{HF}} = -\frac{1}{2} \sum_{i,j} \iint \psi_i^{*}(\mathbf{r}_1)\,\psi_j^{*}(\mathbf{r}_2)\,\frac{1}{r_{12}}\,\psi_j(\mathbf{r}_1)\,\psi_i(\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2 \tag{9}$$
The B3LYP³³ functional is based on the Becke 88 exchange functional,³⁴ a generalized-gradient
approximation (GGA), and the VWN local-density approximation (LDA),³⁵ and is given by
$$E_{xc}^{\mathrm{B3LYP}} = E_x^{\mathrm{LDA}} + E_c^{\mathrm{LDA}} + 0.20\,(E_x^{\mathrm{HF}} - E_x^{\mathrm{LDA}}) + 0.72\,(E_x^{\mathrm{GGA}} - E_x^{\mathrm{LDA}}) + 0.81\,(E_c^{\mathrm{GGA}} - E_c^{\mathrm{LDA}}) \tag{10}$$
The PBE0 functional³² mixes the PBE exchange energy and the HF exchange energy according to the
equation
$$E_{xc}^{\mathrm{PBE0}} = \frac{1}{4} E_x^{\mathrm{HF}} + \frac{3}{4} E_x^{\mathrm{PBE}} + E_c^{\mathrm{PBE}} \tag{11}$$
The M06-2X functional³⁶ is a global hybrid functional with 54% HF exchange. The M06
suite of functionals is constructed by empirical fitting of parameters while constraining to
the uniform electron gas.³¹
Basis sets
A basis set is a set of functions that are combined in linear combinations to create molecular
orbitals. Pople basis sets are typically denoted X-YZg.³⁷ X represents the
number of primitive Gaussians for each core atomic orbital basis function; Y and Z indicate that the
valence orbitals are composed of linear combinations of Y and Z primitive Gaussian functions. A
'*' adds polarization functions of the p, d, and f types, and a '+' adds diffuse functions.³⁸
The basis sets tested were 6-31G(d), 6-31G(d,p), 6-31+G(d), 6-31+G(d,p), 6-311G(d),
6-311G(d,p), 6-311+G(d), 6-311+G(d,p), 6-311+G(2d,p), and 6-311+G(2df,2p), covering a variety of
polarization and diffuse functions on the Pople basis sets.
Table 1: Summary of functionals and basis sets tested
Functionals tested    Basis sets tested
B3LYP                 6-31G(d)
PBE0                  6-31G(d,p)
M06-2X                6-31+G(d)
                      6-31+G(d,p)
                      6-311G(d)
                      6-311G(d,p)
                      6-311+G(d)
                      6-311+G(d,p)
                      6-311+G(2d,p)
                      6-311+G(2df,2p)
Across the two training molecules, two solvents, three unique calculations, three functionals, and
10 basis sets, there are a total of 360 unique calculations, equivalent to decades of computing time.
The level of theory which yields the smallest mean absolute error (MAE) across the training
molecules and solvents compared to experimental data will be chosen.
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t| \tag{12}$$
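The size of the screening grid and the MAE of Eq. (12) can be checked with a short Python sketch. The names are hypothetical; the three calculation labels are placeholders because the text does not say which three calculations are meant. The MAE example uses the calculated and experimental absorption wavelengths of the training molecules from Table 4, so it is in nm rather than the eV used for Table 3.

```python
from itertools import product

molecules = ["naphthalene", "fluorene"]
solvents = ["DCM", "THF"]
calculations = ["calc_1", "calc_2", "calc_3"]  # placeholders: not named in the text
functionals = ["B3LYP", "PBE0", "M06-2X"]
basis_sets = ["6-31G(d)", "6-31G(d,p)", "6-31+G(d)", "6-31+G(d,p)", "6-311G(d)",
              "6-311G(d,p)", "6-311+G(d)", "6-311+G(d,p)", "6-311+G(2d,p)",
              "6-311+G(2df,2p)"]

# Every combination that would have to be computed.
grid = list(product(molecules, solvents, calculations, functionals, basis_sets))
print(len(grid))  # 2 * 2 * 3 * 3 * 10 = 360


def mean_absolute_error(calculated, experimental):
    """Eq. (12): mean of the absolute errors e_t = calculated_t - experimental_t."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(c - e) for c, e in zip(calculated, experimental)) / len(calculated)


# Absorption wavelengths (nm) for naphthalene and fluorene in DCM and THF (Table 4).
calcd = [276.8, 277.2, 264.6, 264.4]
expt = [277.0, 276.0, 265.0, 264.0]
print(round(mean_absolute_error(calcd, expt), 2))
```

The same function applies unchanged to energies in eV, which is how the levels of theory are ranked in the results.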
Calculating the emission for every level of theory under the same methodology would be far too
computationally expensive, as a single excited-state geometry optimization can take in excess of
two weeks. To determine whether the model can reproduce emission energies and Stokes shifts, the
optimal level of theory will instead be used to calculate the Stokes shifts of naphthalene, fluorene,
and BF2 formazanate complexes.
2.2 QSPR model
The most accurate model for simulating the Stokes shift of BF2 formazanate complexes using the
methodology above will be used in the subsequent QSPR study. The model will simulate the
spectroscopic properties of the structurally tuned variations of the BF2 formazanate skeletal
structure. The following are required to conduct the QSPR study: a representative training series to
simulate, electronic and steric descriptors to be used as independent regression variables, a
regression methodology, and a means of optimization.
2.2.1 QSPR training series
The training series for the QSPR needs to be more robust than the model for the Stokes shift. A
large training set is required to accomplish the goal of creating a predictive model. The training
set should contain substituents which vary in their sterics and electronics, and should be able to
model currently synthesized complexes from the Gilroy group. Table 2 lists five training series
which meet these criteria. Each series can be used independently or in combination to build
a QSPR model.
Table 2: Training series for QSPR study
Hydrogen series   | Phenyl series     | Naphthyl series   | Varied series     | Equivalent series
R1    R2    R3    | R1    R2    R3    | R1    R2    R3    | R1    R2    R3    | R1    R2    R3
H     H     H     | Ph    Ph    H     | Nh    Nh    H     | Me    Me    H     | Me    Me    Me
H     H     Me    | Ph    Ph    Me    | Nh    Nh    Me    | Cl    Cl    H     | Cl    Cl    Cl
H     H     Cl    | Ph    Ph    Cl    | Nh    Nh    Cl    | CO2H  CO2H  H     | CO2H  CO2H  CO2H
H     H     CO2H  | Ph    Ph    CO2H  | Nh    Nh    CO2H  | OH    OH    H     | OH    OH    OH
H     H     OH    | Ph    Ph    OH    | Nh    Nh    OH    | NMe2  NMe2  H     | NMe2  NMe2  NMe2
H     H     NMe2  | Ph    Ph    NMe2  | Nh    Nh    NMe2  | NO2   NO2   H     | NO2   NO2   NO2
H     H     NO2   | Ph    Ph    NO2   | Nh    Nh    NO2   | CN    CN    H     | CN    CN    CN
H     H     CN    | Ph    Ph    CN    | Nh    Nh    CN    | CO    CO    H     | CO    CO    CO
H     H     CO    | Ph    Ph    CO    | Nh    Nh    CO    |                   |
H     H     Ph    | Ph    Ph    Ph    | Nh    Nh    Ph    |                   |
H     H     Nh    | Ph    Ph    Nh    | Nh    Nh    Nh    |                   |
The substituents in each series vary in their sterics, from low to high, and in their electronics, from
electron-donating to electron-withdrawing. The hydrogen, varied, and equivalent series were created
to draw inferences on the effect of substituents at the R3 position, at the R1 and R2 positions, and
at all three positions, respectively. The phenyl and naphthyl series attempt to do the same but
represent complexes which have already been synthesized in the Gilroy group.¹²
2.2.2 Electronic and steric descriptors
The dependent variables in the regression model will be the absorption and emission wavelengths.
The independent variables, or 'descriptors', need to have a theoretical basis in other
computational QSPR studies; quantum-chemical descriptors are used because they lack the
inherent error normally associated with experimental measurements. Systematic error may exist
in the simulation model; however, this error is assumed to apply evenly across the series
analyzed and thus does not influence the modeled trends.²²
To determine the influence of structural tuning on the absorption and emission, the following
descriptors will be calculated for both the ground and excited states: the highest occupied molecular
orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energies,³⁹ the HOMO-LUMO (HL)
gap,³⁹ the dipole moment (μ),⁴⁰ and the square root of μ (√μ).⁴⁰ Additionally, molar volume (MV)⁴¹
will be used to describe the steric influence of the substituents. All descriptors with the exception
of molar volume are contained within the output files of the ground- and excited-state geometry
optimizations.
2.2.3 Regression methodology
The linear regression model for multiple independent variables can be described through the
following equations.⁴² Each set of values of the independent variables ($x$) is associated with a
value of the dependent variable ($y$).
For $p$ independent variables $x_1, x_2, \ldots, x_p$, the mean ($\mu_y$), or the 'fit', is
$$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \tag{13}$$
The observed values of $y$ vary about $\mu_y$ and are assumed to have the same standard deviation.
The fitted values $b_0, b_1, \ldots, b_p$ estimate the parameters $\beta_0, \beta_1, \ldots, \beta_p$ of the population. The
regression model includes a deviation term ($\varepsilon$), which represents the deviations of the observed
$y$ from $\mu_y$ and is normally distributed with a mean of 0. The model for multiple linear regression with $n$
observations, where $i = 1, 2, \ldots, n$, is
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i \tag{14}$$
The least-squares method finds the line of best fit by minimizing the sum of squares of the residuals
($e_i$), or vertical deviations from the line. A residual equal to 0 represents a point which
lies exactly on the line. The residuals are given by
$$e_i = y_i - \hat{y}_i \tag{15}$$
where $\hat{y}_i$ represents the fitted values given by the equation
$$\hat{y}_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \cdots + b_p x_{ip} \tag{16}$$
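The least-squares fit of Eqs. (13) to (16) can be sketched in dependency-free Python by solving the normal equations. This is an illustration, not the tool used in the thesis (which relies on Excel, as described in the next section); the function names are hypothetical and the data are synthetic, generated from y = 1 + 2·x1 − 3·x2 so that the recovered coefficients are known in advance.

```python
def fit_least_squares(X, y):
    """Return [b0, b1, ..., bp] minimizing the sum of squared residuals (Eq. 15)."""
    n = len(X)
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y)
    M = [[sum(rows[i][a] * rows[i][c] for i in range(n)) for c in range(k)]
         + [sum(rows[i][a] * y[i] for i in range(n))]
         for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("descriptors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(k):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][k] for r in range(k)]


def predict(b, x_row):
    """Fitted value y-hat for one observation (Eq. 16)."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x_row))


# Synthetic data generated from y = 1 + 2*x1 - 3*x2 (not thesis data).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, -2, 0, 2]
b = fit_least_squares(X, y)
residuals = [yi - predict(b, xi) for xi, yi in zip(X, y)]  # Eq. (15)
print([round(v, 6) for v in b])
```

With real descriptors the residuals will not vanish, and their size is what the metrics in the next section summarize.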
2.2.4 Optimizing QSPR model
The multivariable linear regression tool from the Analysis ToolPak in Microsoft Excel will be
used. The optimized model will ideally have a low standard error (SE), a low MAE for the residuals,
a high adjusted r² value to account for the use of multiple descriptors, and p- and F-values under 5%
to ensure the underlying model is statistically sound at the 95% confidence level.
With $N$ the sample size and $D$ the number of descriptors,
$$\mathrm{SE} = \frac{\sigma_x}{\sqrt{N}} \tag{17}$$
$$r^2 = \frac{\sum (\hat{y}_i - \mu_y)^2}{\sum (y_i - \mu_y)^2} \tag{18}$$
$$\mathrm{Adj.}\ r^2 = 1 - \frac{(1 - r^2)(N - 1)}{N - D - 1} \tag{19}$$
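Equations (18) and (19) translate directly into code. The sketch below uses hypothetical function names and made-up numbers purely to show how the adjustment penalizes extra descriptors; it is not taken from the thesis data.

```python
def r_squared(y, y_hat):
    """Eq. (18): explained variation over total variation about the mean of y."""
    mean_y = sum(y) / len(y)
    explained = sum((f - mean_y) ** 2 for f in y_hat)
    total = sum((v - mean_y) ** 2 for v in y)
    if total == 0:
        raise ValueError("y has no variation")
    return explained / total


def adjusted_r_squared(r2, n_samples, n_descriptors):
    """Eq. (19): r^2 corrected for the number of descriptors D and sample size N."""
    if n_samples - n_descriptors - 1 <= 0:
        raise ValueError("need more samples than descriptors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


# Made-up illustration: the same r^2 is worth less when more descriptors are used.
print(adjusted_r_squared(0.90, n_samples=11, n_descriptors=2))
print(adjusted_r_squared(0.90, n_samples=11, n_descriptors=5))
```

This is why the adjusted value, rather than the raw r², is the appropriate metric when searching for the model with the fewest descriptors.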
A p-value is defined as the probability, under the assumption of a hypothesis (H), of obtaining a
result equal to or more extreme than that observed; here a normal distribution is assumed.
Figure 7: Visual representation of a p-value.
The two-tailed p-value for a Gaussian distribution is
$$p_{\mathrm{value}} = 2 \min\left\{ \Pr(X \le x \mid H),\ \Pr(X \ge x \mid H) \right\} \tag{20}$$
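F-tests compare the variance of a quantifiable variable between and within pre-defined groups.
This can be used to make sure that the groups of descriptors are significant to the regression equation.
With $K$ the number of groups, $N$ the total number of observations, $n_i$ the number of observations
in the $i$th group, $\bar{y}_i$ the mean of the $i$th group, and $y_{ij}$ the $j$th observation in the $i$th out of $K$ groups,
$$F = \frac{\text{explained variance}}{\text{unexplained variance}} = \frac{\sum_{i} n_i (\bar{y}_i - \mu_y)^2 \times (N - K)}{\sum_{i,j} (y_{ij} - \bar{y}_i)^2 \times (K - 1)} \tag{21}$$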
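Equation (21) can be computed in a few lines. The sketch below uses a hypothetical function name and three small made-up groups; it illustrates the ratio itself, not the regression F reported by Excel.

```python
def f_statistic(groups):
    """Eq. (21): between-group variance over within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Explained: spread of the group means about the grand mean
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Unexplained: spread of the observations about their own group mean
    within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (between * (n_total - k)) / (within * (k - 1))


# Made-up groups: means 2, 3, and 6 with identical spread inside each group.
print(f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))  # 13.0
```

A large F indicates that the differences between group means are large relative to the scatter inside each group.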
Descriptors will be analyzed both individually and in conjunction with each other to determine
how they affect the above metrics. The goal is to develop a model with the fewest descriptors that
predicts the spectroscopic properties of interest.
3 Results and Discussion
3.1 Stokes shifts
The PBE0 functional consistently outperforms both the B3LYP and M06-2X functionals across basis
sets; B3LYP consistently underestimates the absorption energy while M06-2X overestimates it, as
shown in Figures 7 and 8.
Figure 7: Calculated absorption energy (eV) for naphthalene in DCM and THF compared to
experimental results (dashed line).
Figure 8: Calculated absorption energy (eV) for fluorene in DCM and THF compared to
experimental results (dashed line).
Table 3: MAE of calculated absorption energy (eV) compared to experimental data for all
calculated levels of theory
                   MAE (eV)
Basis set          B3LYP     PBE0      M06-2X
6-31G(d)           0.098     0.0939    0.2942
6-31G(d,p)         0.085     0.0863    0.2889
6-31+G(d)          0.128     0.0359    0.2238
6-31+G(d,p)        0.137     0.0459    0.2157
6-311G(d)          0.083     0.0253    0.2610
6-311G(d,p)        0.090     0.0355    0.2542
6-311+G(d)         0.077     0.0052    0.1879
6-311+G(d,p)       0.165     0.0048    0.1017
6-311+G(2d,p)      0.189     0.0951    0.0686
6-311+G(2df,2p)    0.183     0.1100    0.0659
Minimum MAE        0.0767    0.0048    0.0669
The minimum MAE for calculated absorption energy across training molecules and solvents is
found at the PBE0/6-311+G(d,p) level of theory.
Table 4: Percent error in absorption, emission, and Stokes shifts (nm) for naphthalene, fluorene,
and BF2 formazanate complexes¹² in solution using PBE0/6-311+G(d,p).
         Absorption (nm)            Emission (nm)              Stokes shift (nm)
Solvent  Calcd   Expt    Error (%)  Calcd   Expt    Error (%)  Calcd   Expt    Error (%)
Naphthalene
DCM      276.8   277     0.07       321.4   321     0.12       44.6    44      1.36
THF      277.2   276     0.43       321.7   320.5   0.40       44.6    44.5    0.22
Fluorene
DCM      264.6   265     0.15       311     311.5   0.16       46.9    46.5    0.86
THF      264.4   264     0.15       310.5   309.5   0.32       46.1    45.5    1.32
BF2 Complex (R1 – Ph, R2 – Ph, R3 – CN)
DCM      477     491     2.85       550     584     5.82       73      93      21.5
THF      488.4   489     0.12       550     585     6.98       61.6    96      35.8
BF2 Complex (R1 – Ph, R2 – Ph, R3 – NO2)
DCM      482     491     1.83       552     587     5.96       70      96      27.1
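The percent errors in Table 4 appear to follow the usual definition, the absolute difference between calculated and experimental values relative to the experimental value. The sketch below assumes that definition (the thesis does not state it explicitly) and uses a hypothetical function name, with two Stokes shift entries from Table 4 as input.

```python
def percent_error(calculated, experimental):
    """Absolute error relative to the experimental value, in percent (assumed definition)."""
    if experimental == 0:
        raise ValueError("experimental value must be non-zero")
    return abs(calculated - experimental) / abs(experimental) * 100.0


# Stokes shifts (nm) in DCM taken from Table 4.
print(round(percent_error(44.6, 44.0), 2))  # naphthalene
print(round(percent_error(73.0, 93.0), 1))  # BF2 complex with R3 = CN
```

Because the Stokes shift is a small difference between two larger wavelengths, modest errors in emission translate into large relative errors in the shift.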
The chosen level of theory gives good results for absorption across all molecules, including
the BF2 formazanate test set, with errors under 5%. The emission wavelengths and Stokes
shifts for the training molecules also agree with experiment to within 5%. Unfortunately, the
emission wavelengths and Stokes shifts calculated for the BF2 formazanate complexes do not show
good agreement. Given the agreement for emission in the training set, it is likely that the
methodology does not scale with molecular size or π-conjugation. A possible reason for the high
error is that excited-state calculations which utilize the SCRF protocol are unable to model bulk
solution effects such as π-stacking. The emission calculations show a systematic underestimation
of the emission wavelength, that is, an overestimation of the emission energy (Table 4).
Systematic errors in a regression model are held constant throughout a training series; therefore
the PBE0/6-311+G(d,p) level of theory will be utilized for modeling all spectroscopic properties
in the QSPR study.
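The percent errors in Table 4 can be checked with a few lines of arithmetic. The sketch below assumes an unsigned percent error relative to experiment and takes the Stokes shift as emission minus absorption wavelength, which is consistent with most rows of Table 4. The function and variable names are hypothetical. It also shows why a modest emission error becomes a large Stokes-shift error: the Stokes shift is a small difference between two large numbers.

```python
def percent_error(calculated, experimental):
    """Unsigned percent error of a calculated value relative to experiment."""
    return abs(calculated - experimental) / experimental * 100.0


# (calculated, experimental) wavelengths in nm, DCM rows of Table 4.
ROWS = {
    "naphthalene": {"absorption": (276.8, 277.0), "emission": (321.4, 321.0)},
    "BF2 complex (R3 = CN)": {"absorption": (477.0, 491.0), "emission": (550.0, 584.0)},
}

for name, data in ROWS.items():
    abs_calc, abs_expt = data["absorption"]
    em_calc, em_expt = data["emission"]
    # Assumption: Stokes shift = emission wavelength - absorption wavelength.
    ss_calc = em_calc - abs_calc
    ss_expt = em_expt - abs_expt
    print(
        f"{name}: absorption {percent_error(abs_calc, abs_expt):.2f}%, "
        f"emission {percent_error(em_calc, em_expt):.2f}%, "
        f"Stokes shift {percent_error(ss_calc, ss_expt):.2f}%"
    )
```

For the BF2 complex, absorption and emission errors of a few percent translate into a Stokes-shift error above 20%, because the 34 nm emission error is large compared with the 93 nm experimental Stokes shift.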
3.2 Effect of structural tuning on spectroscopic properties
Given the computationally heavy nature of simulating the spectroscopic properties of 49 training
molecules and their descriptors of interest, not all of the data required for a thorough analysis
could be obtained. See Supplemental Information A.2 for more information on computation
time. However, enough data were acquired to carry out a partial QSPR study and analyze trends.
Table 5: Compilation of absorption, emission, and descriptor data acquired from training series.

Series    Hydrogen Series         Phenyl Series           Varied Series           Equivalent Series
Position  R3 (a)                  R3                      R1, R2                  R1, R2, R3
Calc. (b) Abs     Emis    SS      Abs     Emis    SS      Abs     Emis    SS      Abs     Emis    SS
          (nm)    (nm)    (nm)    (nm)    (nm)    (nm)    (nm)    (nm)    (nm)    (nm)    (nm)    (nm)
H         392.3   493.2   101.0   331.1   543.9   212.8   392.3   493.2   101.0   392.3   493.2   101.0
Me        385.2   565.2   180.0   423.3   555.4   132.1   379.2   468.1   88.9    362.5   ND      ND
Cl        388.1   485.5   97.4    506.4   577.3   70.9    331.6   429.4   97.8    377.6   429.3   51.6
CO2H      393.8   485.4   91.6    468.1   539.9   71.8    492.7   634.8   142.1   497.2   624.5   127.3
OH        401.8   542.5   140.7   433.0   566.7   133.8   319.2   357.9   38.8    321.6   366.5   44.9
NMe2      727.8   ND      ND      932.0   2210.7  1278.7  372.3   ND      ND      439.7   ND      ND
NO2       385.3   463.3   78.0    481.7   552.3   70.6    423.5   551.0   127.5   418.9   ND      ND
CN        379.9   467.8   87.9    ND      ND      ND      454.0   609.7   155.8   435.4   ND      ND
CO        960.3   1522.2  561.8   ND      ND      ND      493.9   ND      ND      ND      ND      ND
Ph        492.1   1120.4  628.3   ND      ND      ND      ND      ND      ND      ND      ND      ND
Nh        734.4   2958.7  2224.3  ND      ND      ND      ND      ND      ND      ND      ND      ND

(a) If substituent position is not specified then Rn is Ph for the Phenyl Series, and H for all others.
(b) ND = No Data. No data were obtained for the naphthyl series due to the computation time required.
3.2.1 Steric and electronic influence on Stokes shift in the R3 position
Steric influence
The influence of a substituent’s steric bulk is unclear; however, there seems to be a trend in which
bulkier substituents disproportionately increase the emission wavelength compared to
absorption. The emission for ‘naphthyl’ in the hydrogen series is 2959 nm, about 4 times its
absorption, while ‘phenyl’ in the same series has an emission of 1120 nm, about 2 times its
absorption. Molecules with much less steric bulk have much smaller absorption-to-emission ratios,
1:1.25 and 1:1.47 for ‘chloride’ and ‘methyl’ respectively.
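The absorption-to-emission ratios quoted above follow directly from the hydrogen-series entries of Table 5. A minimal sketch of that calculation is given here; the dictionary and function names are hypothetical and only the four substituents discussed in the text are included.

```python
# (absorption, emission) wavelengths in nm for R3 substituents
# of the hydrogen series, transcribed from Table 5.
HYDROGEN_SERIES = {
    "naphthyl": (734.4, 2958.7),
    "phenyl": (492.1, 1120.4),
    "chloride": (388.1, 485.5),
    "methyl": (385.2, 565.2),
}


def emission_to_absorption_ratio(absorption_nm, emission_nm):
    """Ratio of emission wavelength to absorption wavelength."""
    return emission_nm / absorption_nm


ratios = {
    name: emission_to_absorption_ratio(absorption, emission)
    for name, (absorption, emission) in HYDROGEN_SERIES.items()
}

for name, ratio in ratios.items():
    print(f"{name}: absorption:emission = 1:{ratio:.2f}")
```

The two aromatic substituents give ratios above 2, while chloride and methyl stay below 1.5, which is the trend described in the text.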
Electronic influence
Using the hydrogen series baseline (R1 = R2 = R3 = H), one can see that electron-donating groups
such as ‘methyl’ from the hydrogen series cause a red shift of 79 nm in the Stokes shift. Highly
electron-withdrawing groups such as ‘NO2’ cause a blue shift of 23 nm. This trend holds for the
phenyl series (R1 = R2 = Ph, R3 = H), where ‘methyl’ causes a red shift of 80.7 nm and ‘NO2’ a blue
shift of 142.2 nm. One can conclude that the ability of electron-donating and
electron-withdrawing groups to change the Stokes shift increases when large π-conjugated
substituents are in the R1 and R2 positions, thus agreeing with previous research.11
3.2.2 Steric and electronic influence on Stokes shift in the R1 and R2 positions
Steric influence
The influence of a substituent’s steric bulk is unclear from the data calculated for the R1 and R2
positions.
Electronic influence
Compared to the R3 position, the trends in electronic influence on absorption and emission are
reversed for substituents in the R1 and R2 positions. Compared to the hydrogen series baseline, the
electron-donating ‘methyl’ substituents from the varied series cause a blue shift of 12 nm in the
Stokes shift, while electron-withdrawing groups like ‘NO2’ from the varied series cause a red
shift of 27 nm.
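The shifts quoted for the hydrogen and varied series are differences between a Stokes shift in Table 5 and the baseline value of 101.0 nm. The sketch below reproduces them, following the convention of the text that a larger Stokes shift is a red shift and a smaller one a blue shift. All names are hypothetical, and only the methyl and NO2 entries discussed above are included.

```python
# Stokes shifts (nm) from Table 5; the baseline is R1 = R2 = R3 = H.
BASELINE_STOKES_SHIFT_NM = 101.0

STOKES_SHIFT_NM = {
    ("R3", "Me"): 180.0,        # hydrogen series
    ("R3", "NO2"): 78.0,        # hydrogen series
    ("R1, R2", "Me"): 88.9,     # varied series
    ("R1, R2", "NO2"): 127.5,   # varied series
}


def shift_relative_to_baseline(stokes_shift_nm, baseline_nm=BASELINE_STOKES_SHIFT_NM):
    """Return the signed change in Stokes shift (nm) and a label for its direction."""
    delta = stokes_shift_nm - baseline_nm
    if delta > 0:
        label = "red shift"
    elif delta < 0:
        label = "blue shift"
    else:
        label = "no shift"
    return delta, label


for (position, group), value in STOKES_SHIFT_NM.items():
    delta, label = shift_relative_to_baseline(value)
    print(f"{group} at {position}: {delta:+.1f} nm ({label})")
```

The sign of the change flips between the R3 position and the R1, R2 positions for both substituents, which is the reversal described in this section.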
3.2.3 Multivariable regression of hydrogen training series
Series without sufficient emission data must be excluded from the QSPR analysis because the
chosen excited-state quantum-chemical descriptors are taken from the output of the emission
calculations. The hydrogen series is the only series for which multivariable regression
analysis can be conducted.
Table 6: Tests to optimize multivariable regression model for absorption (nm) of hydrogen series

Hydrogen Series - Absorption Regression Model
Trial #   Descriptors     r2       Adj. r2   F-value   MAE (nm)   Standard Error (nm)
1         All             NA       NA        NA        NA         NA
2         Less MV         0.9998   -0.0016   NA        2.0        7.9
3         Less HL gap     0.9998   0.9984    2.93%     2.0        7.9
4         Less E_LUMO     0.9995   0.9979    0.16%     2.8        9.1
5 (a)     Less E*_LUMO    0.9983   0.9949    0.03%     6.0        14.1

(a) Supplemental Information A.3 contains output of trial 5
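The adjusted r2 column penalizes r2 for the number of descriptors relative to the number of observations. A minimal sketch of the standard formula is given below and checked against the trial 5 output in Supplemental Information A.3 (10 observations, 6 descriptors). The function name is hypothetical.

```python
def adjusted_r2(r2, n_observations, n_descriptors):
    """Adjusted r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)."""
    residual_dof = n_observations - n_descriptors - 1
    if residual_dof <= 0:
        # With no residual degrees of freedom the statistic is undefined.
        raise ValueError("adjusted r2 requires n_observations > n_descriptors + 1")
    return 1.0 - (1.0 - r2) * (n_observations - 1) / residual_dof


# Trial 5 of Table 6: R Square = 0.998301798, 10 observations, 6 descriptors.
print(round(adjusted_r2(0.998301798, 10, 6), 6))
```

The formula also indicates why the early trials are fragile. If trial 2 used nine descriptors, as the elimination sequence in Table 6 suggests, ten observations leave no residual degrees of freedom, and the adjusted r2 is then not meaningful even though r2 itself is close to 1.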
The goals for the optimized regression model were a low standard error, a low MAE, a high adjusted
r2, and an F-value (the significance of the F-test) below 5%, to ensure a statistically sound
model at the 95% confidence level; these goals were realized in five trials. Descriptors were
eliminated for the following reasons: in trial 1, the MV had a regression coefficient of 0;
trial 2 was overfitted, as indicated by the negative adjusted r2 value caused by the HL gap.
Trials 4 and 5 removed the statistically insignificant LUMO and excited-state LUMO energies,
which carried p-values of 42.4% and 14.9% respectively.
\[
\lambda_{\mathrm{abs}}\ (\mathrm{nm}) = 5071 - 6897\,E_{\mathrm{HOMO}} + 8142\,E^{*}_{\mathrm{HOMO}} - 1214\,\mu + 2619\,\mu^{*} + 4110\sqrt{\mu} - 9058\sqrt{\mu^{*}} \qquad (22)
\]
The final regression model has a 6.0 nm MAE and a 14.1 nm standard error. The large
coefficients in the equation arise because the numerical values of the descriptors are orders of
magnitude smaller than the wavelengths in nm.
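Equation 22 can be evaluated with a short function, sketched here with the rounded coefficients printed in the equation. The function name and argument names are hypothetical, and the descriptor values must be supplied in the same units that were used in the fit, which are not restated in this section. No descriptor values are assumed here; the example only shows how strongly a small change in one descriptor moves the predicted wavelength.

```python
import math


def predict_absorption_nm(e_homo, e_homo_excited, mu, mu_excited):
    """Evaluate Eq. (22) with its rounded coefficients.

    Inputs are the HOMO energy, excited-state HOMO energy, dipole moment and
    excited-state dipole moment, in the units used for the original fit.
    """
    return (
        5071.0
        - 6897.0 * e_homo
        + 8142.0 * e_homo_excited
        - 1214.0 * mu
        + 2619.0 * mu_excited
        + 4110.0 * math.sqrt(mu)
        - 9058.0 * math.sqrt(mu_excited)
    )


# Effect of changing only the HOMO energy by 0.001 (in fit units),
# with every other descriptor held at zero for illustration.
change = predict_absorption_nm(0.001, 0.0, 0.0, 0.0) - predict_absorption_nm(0.0, 0.0, 0.0, 0.0)
print(f"{change:.3f} nm")
```

A change of 0.001 in the HOMO energy moves the prediction by about 6.9 nm, which illustrates why descriptors with small numerical values lead to coefficients in the thousands.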
Table 7: Statistical output of multivariable linear regression on emission (nm) for hydrogen
training series.

Hydrogen Series - Emission Regression Model
Trial #   Descriptors     r2       Adj. r2   F-value   MAE (nm)   Standard Error (nm)
1         All             NA       NA        NA        NA         NA
2         Less MV         0.9905   -0.0855   NA        58.3       234.7
3         Less HL gap     0.9905   0.9145    21.12%    58.3       234.7
4         Less E_LUMO     0.9840   0.9281    5.48%     89.0       215.3
5         Less E_HOMO     0.9833   0.9500    0.92%     85.7       179.4
6         Less E*_HOMO    0.9826   0.9608    0.13%     82.5       158.8
7         Less E*_LUMO    0.9767   0.9581    0.03%     72.8       164.3
The emission regression model took a series of 7 trials to optimize. Descriptors were
eliminated for the following reasons: trials 1 through 3 followed the same logic as the absorption
model. Trials 4 through 7 eliminated the descriptors of LUMO energy, HOMO energy, excited-state
HOMO energy, and excited-state LUMO energy, which had p-values of 56.0%, 79.9%, 73.7% and
30.9% respectively. This suggests that these descriptors are unimportant in emission simulation.
\[
\lambda_{\mathrm{em}}\ (\mathrm{nm}) = 22048 - 10187\,\mu + 17285\,\mu^{*} + 31918\sqrt{\mu} - 57244\sqrt{\mu^{*}} \qquad (23)
\]
Examining the above equation might explain why the emission model yields a higher MAE and standard
error, 72.8 nm and 164.3 nm respectively. The intercept is roughly four times greater than that of
the absorption equation (22048 vs. 5071). To compensate, the coefficients of the independent
variables must be larger in magnitude (-10187 vs. -1214 for μ), so subtle variations in the
descriptor inputs yield a higher associated error. It should be noted that even though the
adjusted r2 value is lower and the errors much higher for this model, it is still statistically
significant with an F-value of 0.03%. It is likely that other descriptors which were not tested
need to be employed to predict behavior; one would still expect the trends replicated by this
model to be accurate.
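The sensitivity argument can be made explicit by differentiating Equations 22 and 23 with respect to the two dipole-moment descriptors. This is a sketch that uses the rounded coefficients as printed and treats the descriptors as independent variables:

```latex
\begin{align*}
\frac{\partial \lambda_{\mathrm{abs}}}{\partial \mu} &= -1214 + \frac{4110}{2\sqrt{\mu}}, &
\frac{\partial \lambda_{\mathrm{em}}}{\partial \mu} &= -10187 + \frac{31918}{2\sqrt{\mu}}, \\
\frac{\partial \lambda_{\mathrm{abs}}}{\partial \mu^{*}} &= 2619 - \frac{9058}{2\sqrt{\mu^{*}}}, &
\frac{\partial \lambda_{\mathrm{em}}}{\partial \mu^{*}} &= 17285 - \frac{57244}{2\sqrt{\mu^{*}}}.
\end{align*}
```

Each term in the emission derivatives is roughly six to eight times its absorption counterpart (for example 10187/1214 ≈ 8.4 and 57244/9058 ≈ 6.3), so the same small uncertainty in μ or μ* is expected to produce a correspondingly larger uncertainty in the predicted emission wavelength.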
4 Conclusion
The proposed methodology for simulating the Stokes shift and the subsequent QSPR study of
substituent effects in BF2 formazanate complexes reproduce experimental data and provide a
foundation for future research into substituent analysis. It has been shown that the
PBE0/6-311+G(d,p) level of theory is optimal in simulating absorption, emission, and Stokes shifts,
with an MAE of 0.0048 eV, the lowest among the functionals and basis sets tested across 60 unique
trials. For π-conjugated molecules in the training set, errors compared to experiment in the
absorption, emission and Stokes shift were under 0.5%, 0.4% and 1.4% respectively. The BF2
formazanate complexes tested at the same level of theory had errors under 3% for absorption and 7%
for emission. The increased error in emission energy is likely due to intermolecular effects not
modeled in the system. Over 45 training molecules were structurally tuned and analyzed based on
their electronic and steric features. Observed electronic trends from previous research were
reproduced;11 specifically, electron-donating groups in the R3 position caused a red shift in the
Stokes shift, while electron-withdrawing groups caused a blue shift. The QSPR training model
further suggests that this trend is reversed when substituents are placed in the R1 and R2
positions, with electron-donating groups causing a blue shift and electron-withdrawing groups a
red shift. When bulky substituents were added in the R3 position of the BF2 formazanate skeletal
structure, the ratio of emission to absorption increased: about 4:1 for ‘naphthyl’ vs about 1.25:1
for ‘chloride’. Finally, a multivariable regression analysis found that the absorption wavelength
could be described by the dipole moment, the excited-state dipole moment, the square roots of both
dipole moments, and the HOMO and excited-state HOMO energies. The hydrogen series regression
equation has a 14.1 nm standard error, a 6.0 nm mean absolute error, an adjusted r2 of 0.9949, and
an F-value under 0.04%; all descriptors had a p-value under 1%, so each is statistically
significant at the 99% confidence level.
4.1 Further Research
Further research may extend the analysis to all of the training series using the methodology
outlined herein. The regression equations from this and future research could be applied to a test
set to determine whether they can reproduce experimental spectroscopic properties. Other
descriptors could be examined to identify their relationships to properties such as emission
wavelengths. Databases of molecules could be created, analyzed, and used to optimize substituents
for the strategic synthesis of functional materials.
References
1. J. Roncali, P. Leriche, and P. Blanchard, Adv. Mater. 26, 3821 (2014).
2. D. Frath, J. Massue, G. Ulrich, and R. Ziessel, Angew. Chem. Int. Ed. Engl. 53, 2290 (2014).
3. W. Wu, Y. Liu, and D. Zhu, Chem. Soc. Rev. 39, 1489 (2010).
4. M. Montalti, A. Credi, L. Prodi, and M.T. Gandolfi, Handbook of Photochemistry (2006).
5. L. Quan, Y. Chen, X.-J. Lv, and W.-F. Fu, Chem. Eur. J. 18, 14599 (2012).
6. J.F. Araneda, W.E. Piers, B. Heyne, M. Parvez, and R. McDonald, Angew. Chem. Int. Ed. Engl. 50, 12214 (2011).
7. S.M. Barbon, V.N. Staroverov, P.D. Boyle, and J.B. Gilroy, Dalton Trans. 43, 240 (2014).
8. M. Szymczyk, A. El-Shafei, and H.S. Freeman, Dye. Pigment. 72, 8 (2007).
9. W.M. Frederiks, J. van Marle, C. van Oven, B. Comin-Anduix, and M. Cascante, J. Histochem. Cytochem. 54, 47 (2006).
10. M. Hesari, S.M. Barbon, V.N. Staroverov, Z. Ding, and J.B. Gilroy, Chem. Commun. 51, 3766 (2015).
11. S.M. Barbon, P.A. Reinkeluers, J.T. Price, V.N. Staroverov, and J.B. Gilroy, Chem. Eur. J. 20, 11340 (2014).
12. S.M. Barbon, V.N. Staroverov, and J.B. Gilroy, J. Org. Chem. 80, 5226 (2015).
13. M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, J.A. Cheeseman, G.
    Scalmani, V. Barone, B. Mennucci, G.A. Petersson, H. Nakatsuji, M. Caricato, X. Li, H.P.
    Hratchian, A.F. Izmaylov, J. Bloino, G. Zheng, J.L. Sonnenberg, W. Liang, M. Hada, M. Ehara,
    K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T.
    Vreven, J.A. Montgomery, J.E. Peralta, F. Ogliaro, M.J. Bearpark, J.J. Heyd, E. Brothers, K.N.
    Kudin, V.N. Staroverov, T. Keith, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J.C.
    Burant, S.S. Iyengar, J. Tomasi, M. Cossi, N. Rega, J.M. Millam, M. Klene, J.E. Knox, J.B.
    Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R.E. Stratmann, O. Yazyev, A.J.
    Austin, R. Cammi, C. Pomelli, J.W. Ochterski, R.L. Martin, K. Morokuma, V.G. Zakrzewski,
    G.A. Voth, P. Salvador, J.J. Dannenberg, S. Dapprich, P.V. Parandekar, N.J. Mayhall, A.D.
    Daniels, O. Farkas, J.B. Foresman, J.V. Ortiz, J. Cioslowski, and D.J. Fox, Gaussian Dev.
    Version, Revis. H. 32 Gaussian (2010).
14. T. Mineva, N. Russo, and M. Toscano, Int. J. Quantum Chem. 56, 663 (1995).
15. E.K.U. Gross and W. Kohn, Adv. Quantum Chem (1990).
16. P. Hohenberg, Phys. Rev. 136, B864 (1964).
17. W. Kohn and L.J. Sham, Phys. Rev. 140, A1133 (1965).
18. M. Bourass, A. Touimi Benjelloun, M. Hamidi, M. Benzakour, M. Mcharfi, M. Sfaira, F.
    Serein-Spirau, J.-P. Lère-Porte, J.-M. Sotiropoulos, S.M. Bouzzine, and M. Bouachrine, J. Saudi
    Chem. Soc. (2013).
19. F. Cervantes-Navarro and D. Glossman-Mitnik, Chem. Cent. J. 6, 70 (2012).
20. G.-F. Yang and X. Huang, Curr. Pharm. Des. 12, 4601 (2006).
21. J. Verma, V.M. Khedkar, and E.C. Coutinho, Curr. Top. Med. Chem. 10, 95 (2010).
22. M. Karelson, V.S. Lobanov, and A.R. Katritzky, Chem. Rev. 96, 1027 (1996).
23. T.W. Schultz, M.T.D. Cronin, J.D. Walker, and A.O. Aptula, J. Mol. Struct. THEOCHEM 622, 1 (2003).
24. A.E. Soffers, M.G. Boersma, W.H. Vaes, J. Vervoort, B. Tyrakowska, J.L. Hermens, and I.M.
    Rietjens, Toxicol. In Vitro 15, 539.
25. M.M.C. Ferreira, J. Braz. Chem. Soc. 13, 742 (2002).
26. R. Kiralj and M.M.C. Ferreira, J. Braz. Chem. Soc. 20, 770 (2009).
27. M.O. Taha, A.M. Qandil, D.D. Zaki, and M.A. AlDamen, Eur. J. Med. Chem. 40, 701 (2005).
28. S. Miertuš, E. Scrocco, and J. Tomasi, Chem. Phys. 55, 117 (1981).
29. E. Runge and E.K.U. Gross, Phys. Rev. Lett. 52, 997 (1984).
30. 1 (2015).
31. Y. Zhao and D.G. Truhlar, J. Phys. Chem. A 110, 13126 (2006).
32. J.P. Perdew, M. Ernzerhof, and K. Burke, J. Chem. Phys. 105, 9982 (1996).
33. K. Kim and K.D. Jordan, J. Phys. Chem. 98, 10089 (1994).
34. A.D. Becke, Phys. Rev. A 38, 3098 (1988).
35. S.H. Vosko, L. Wilk, and M. Nusair, Can. J. Phys. 58, 1200 (1980).
36. Y. Zhao and D.G. Truhlar, Theor. Chem. Acc. 120, 215 (2007).
37. R. Ditchfield, J. Chem. Phys. 54, 724 (1971).
38. J.A. Montgomery, M.J. Frisch, J.W. Ochterski, and G.A. Petersson, J. Chem. Phys. 110, 2822 (1999).
39. A.R. Katritzky, V.S. Lobanov, and M. Karelson, Chem. Soc. Rev. 24, 279 (1995).
40. L. Buydens, D.L. Massart, and P. Geerlings, Anal. Chem. 55, 738 (1983).
41. D.F.V. Lewis, C. Ioannides, and D.V. Parke, Xenobiotica 24, 401 (2008).
42. J.O. Rawlings, S.G. Pantula, and D.A. Dickey, editors, Applied Regression Analysis
    (Springer-Verlag, New York, 1998).
A Supplemental Information
A.1 Sample output for simulation calculations using acetaldehyde
Ground-state geometry optimization
SCF Done: E(RB3LYP) = -153.851761719 A.U. after 1 cycles
Non-equilibrium solvation
No output for interpretation
Absorption calculation
After PCM corrections, the energy is -153.687679826 a.u.
Single-point TD-DFT calculation
No output for interpretation
Excited-state geometry optimization
Total Energy, E(TD-HF/TD-KS) = -153.705918726
Emission calculation
After PCM corrections, the energy is -153.707148980 a.u.
SCF Done: E(RB3LYP) = -153.822024722 A.U. after 10 cycles
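The energies in this sample output can be turned into absorption and emission wavelengths as sketched below. The pairing of energies is an assumption made for illustration: absorption is taken as the PCM-corrected excited-state energy from the absorption step minus the ground-state SCF energy from the geometry optimization, and emission as the PCM-corrected energy from the emission step minus the ground-state SCF energy reported in the same step. The function name and the constant names are hypothetical.

```python
HARTREE_TO_EV = 27.211386   # 1 hartree in eV
EV_TIMES_NM = 1239.841984   # h*c in eV*nm


def transition(e_upper_hartree, e_lower_hartree):
    """Return (energy in eV, wavelength in nm) for an energy gap given in hartree."""
    gap_ev = (e_upper_hartree - e_lower_hartree) * HARTREE_TO_EV
    return gap_ev, EV_TIMES_NM / gap_ev


# Energies copied from the acetaldehyde sample output (hartree).
# Assumed pairing: excited-state energy first, ground-state energy second.
absorption_ev, absorption_nm = transition(-153.687679826, -153.851761719)
emission_ev, emission_nm = transition(-153.707148980, -153.822024722)
stokes_shift_nm = emission_nm - absorption_nm

print(f"absorption: {absorption_ev:.3f} eV, {absorption_nm:.1f} nm")
print(f"emission:   {emission_ev:.3f} eV, {emission_nm:.1f} nm")
print(f"Stokes shift: {stokes_shift_nm:.1f} nm")
```

Under these assumptions the emission energy is lower than the absorption energy, so the emission wavelength is longer and the Stokes shift is positive, as expected.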
A.2 Total computation time utilized on SHARCNET for calculations
A.3 Sample output of regression trial
Final regression output for optimized hydrogen series absorption
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.999150538
R Square             0.998301798
Adjusted R Square    0.994905395
Standard Error       14.1149689
Observations         10

ANOVA
             df   SS             MS          F          Significance F
Regression   6    351361.0959    58560.18    293.9291   0.000305546
Residual     3    597.6970412    199.2323
Total        9    351958.7929

            Coefficients    Standard Error   t Stat     P-value   Lower 95%       Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   5071.433211     310.4157034      16.33755   0.0%      4083.551902     6059.315    4083.552      6059.314519
HOMO        -6896.995793    1150.644052      -5.99403   0.9%      -10558.85871    -3235.13    -10558.9      -3235.13288
DM          -1214.378491    117.1538478      -10.3657   0.2%      -1587.214321    -841.543    -1587.21      -841.5426608
RDM         4109.608812     402.0016826      10.22286   0.2%      2830.260042     5388.958    2830.26       5388.957582
ExHOMO      8141.875651     1263.507913      6.443866   0.8%      4120.829561     12162.92    4120.83       12162.92174
ExDM        2619.329506     213.6784301      12.25828   0.1%      1939.309376     3299.35     1939.309      3299.349637
ExRDM       -9058.120425    746.8661692      -12.1282   0.1%      -11434.98191    -6681.26    -11435        -6681.258944

RESIDUAL OUTPUT
Observation   Predicted Abs   Residuals       Abs. Error
1             377.5660811     14.72161536     14.72162
2             393.3729351     -8.212519227    8.212519
3             389.9451678     -1.893151418    1.893151
4             396.5349594     -2.730898376    2.730898
5             388.9116662     12.85973835     12.85974
6             389.2860815     -3.975836674    3.975837
7             382.7882077     -2.88556638     2.885566
8             957.9521865     2.373056185     2.373056
9             502.4186819     -10.36025899    10.36026
10            734.260612      0.103821168     0.103821
MAE                                           6.011646
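The MAE, residual sum of squares and standard error in this output can be recomputed from the ten residuals alone. The sketch below does so using the usual definitions, with the standard error taken as the square root of the residual sum of squares divided by the residual degrees of freedom (observations minus descriptors minus one). Variable names are hypothetical.

```python
import math

# Residuals (nm) copied from the RESIDUAL OUTPUT of trial 5.
RESIDUALS = [
    14.72161536, -8.212519227, -1.893151418, -2.730898376, 12.85973835,
    -3.975836674, -2.88556638, 2.373056185, -10.36025899, 0.103821168,
]
N_DESCRIPTORS = 6  # HOMO, DM, RDM, ExHOMO, ExDM, ExRDM

# Mean absolute error of the fit.
mae = sum(abs(r) for r in RESIDUALS) / len(RESIDUALS)

# Residual sum of squares and residual degrees of freedom.
ss_residual = sum(r * r for r in RESIDUALS)
residual_dof = len(RESIDUALS) - N_DESCRIPTORS - 1

# Standard error of the regression.
standard_error = math.sqrt(ss_residual / residual_dof)

print(f"MAE = {mae:.3f} nm")
print(f"SS(residual) = {ss_residual:.3f}, dof = {residual_dof}")
print(f"Standard error = {standard_error:.3f} nm")
```

The recomputed values agree with the MAE of 6.01 nm, the residual sum of squares of 597.70 and the standard error of 14.11 nm listed in the summary output.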

More Related Content

What's hot

Jacob Curry presentation
Jacob Curry presentationJacob Curry presentation
Jacob Curry presentationJacob Curry
ย 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.comIJERD Editor
ย 
Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...
Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...
Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...TELKOMNIKA JOURNAL
ย 
Presentation solid
Presentation solidPresentation solid
Presentation solidGaurav Rai
ย 
Poster
PosterPoster
PosterJose Lopez
ย 
MOF, metal organic frameworks
MOF, metal organic frameworksMOF, metal organic frameworks
MOF, metal organic frameworksSumanta Chakrabarty
ย 
Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...
Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...
Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...iosrjce
ย 
Complexation and Protein Binding [Part-1] (Introduction and Classification an...
Complexation and Protein Binding [Part-1](Introduction and Classification an...Complexation and Protein Binding [Part-1](Introduction and Classification an...
Complexation and Protein Binding [Part-1] (Introduction and Classification an...Ms. Pooja Bhandare
ย 
Dual-layer remote phosphor structure: a novel technique to enhance the color ...
Dual-layer remote phosphor structure: a novel technique to enhance the color ...Dual-layer remote phosphor structure: a novel technique to enhance the color ...
Dual-layer remote phosphor structure: a novel technique to enhance the color ...IJECEIAES
ย 
synthesis Of metal organic frame work ( MOFs)
synthesis Of metal organic frame work ( MOFs)synthesis Of metal organic frame work ( MOFs)
synthesis Of metal organic frame work ( MOFs)University Of Wah
ย 
Io2515321537
Io2515321537Io2515321537
Io2515321537IJERA Editor
ย 
Correlation between corrosion inhibitive effect and quantum molecular structu...
Correlation between corrosion inhibitive effect and quantum molecular structu...Correlation between corrosion inhibitive effect and quantum molecular structu...
Correlation between corrosion inhibitive effect and quantum molecular structu...Al Baha University
ย 
Kacaretal Adv Mater2009
Kacaretal Adv Mater2009Kacaretal Adv Mater2009
Kacaretal Adv Mater2009labkit
ย 
Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...
Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...
Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...aaaa zzzz
ย 

What's hot (19)

Jacob Curry presentation
Jacob Curry presentationJacob Curry presentation
Jacob Curry presentation
ย 
SIRFinalPaper1
SIRFinalPaper1SIRFinalPaper1
SIRFinalPaper1
ย 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
ย 
JPC C Guru
JPC C GuruJPC C Guru
JPC C Guru
ย 
Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...
Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...
Improving optical properties of remote phosphor LED using green Y2O3:Ho3+ and...
ย 
Presentation solid
Presentation solidPresentation solid
Presentation solid
ย 
D04010021033
D04010021033D04010021033
D04010021033
ย 
Poster
PosterPoster
Poster
ย 
MOF, metal organic frameworks
MOF, metal organic frameworksMOF, metal organic frameworks
MOF, metal organic frameworks
ย 
Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...
Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...
Experimental (FT-IR, UV-visible, NMR) spectroscopy and molecular structure, g...
ย 
Complexation and Protein Binding [Part-1] (Introduction and Classification an...
Complexation and Protein Binding [Part-1](Introduction and Classification an...Complexation and Protein Binding [Part-1](Introduction and Classification an...
Complexation and Protein Binding [Part-1] (Introduction and Classification an...
ย 
Dual-layer remote phosphor structure: a novel technique to enhance the color ...
Dual-layer remote phosphor structure: a novel technique to enhance the color ...Dual-layer remote phosphor structure: a novel technique to enhance the color ...
Dual-layer remote phosphor structure: a novel technique to enhance the color ...
ย 
Kaushal report
Kaushal reportKaushal report
Kaushal report
ย 
synthesis Of metal organic frame work ( MOFs)
synthesis Of metal organic frame work ( MOFs)synthesis Of metal organic frame work ( MOFs)
synthesis Of metal organic frame work ( MOFs)
ย 
Io2515321537
Io2515321537Io2515321537
Io2515321537
ย 
Correlation between corrosion inhibitive effect and quantum molecular structu...
Correlation between corrosion inhibitive effect and quantum molecular structu...Correlation between corrosion inhibitive effect and quantum molecular structu...
Correlation between corrosion inhibitive effect and quantum molecular structu...
ย 
Kacaretal Adv Mater2009
Kacaretal Adv Mater2009Kacaretal Adv Mater2009
Kacaretal Adv Mater2009
ย 
2017 imms roli
2017 imms roli2017 imms roli
2017 imms roli
ย 
Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...
Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...
Bis-perfluorocycloalkenyl (PFCA) aryl ether monomers towards a versatile clas...
ย 

Viewers also liked

Sprint โ€“ Strategy Analysis by JNZnetwors.com v1.0
Sprint โ€“ Strategy Analysis by JNZnetwors.com   v1.0Sprint โ€“ Strategy Analysis by JNZnetwors.com   v1.0
Sprint โ€“ Strategy Analysis by JNZnetwors.com v1.0Jamshed Nazar
ย 
mass communication
mass communicationmass communication
mass communicationeba ali
ย 
review article
review articlereview article
review articleMustafa Khan
ย 
Valerie_Swanson-Portfolio
Valerie_Swanson-PortfolioValerie_Swanson-Portfolio
Valerie_Swanson-PortfolioValerie Swanson
ย 
Eng.Mostafa Ezzat.CV
Eng.Mostafa Ezzat.CVEng.Mostafa Ezzat.CV
Eng.Mostafa Ezzat.CVmostafa ezzat
ย 
MANUAL Tร‰CNICO
MANUAL Tร‰CNICO MANUAL Tร‰CNICO
MANUAL Tร‰CNICO Winker Morales
ย 
Fire safety mnemonics
Fire safety mnemonicsFire safety mnemonics
Fire safety mnemonicsnadyapratt
ย 
Relationship Workshop
Relationship Workshop Relationship Workshop
Relationship Workshop Eula Mae Society
ย 
Barc0de - Pitch
Barc0de - PitchBarc0de - Pitch
Barc0de - PitchRayner Mendes
ย 
IFP Team 56 Final Report Bryan Walkey V33 Master
IFP Team 56 Final Report Bryan Walkey V33 MasterIFP Team 56 Final Report Bryan Walkey V33 Master
IFP Team 56 Final Report Bryan Walkey V33 MasterRayner Mendes
ย 
Maritime experience application
Maritime experience applicationMaritime experience application
Maritime experience applicationDusan Dijan
ย 
Designers rooms: Welcome to the Made in Spain
Designers rooms: Welcome to the Made in SpainDesigners rooms: Welcome to the Made in Spain
Designers rooms: Welcome to the Made in SpainGabriela Marengo
ย 
Presentation boards and working drawings
Presentation boards and working drawingsPresentation boards and working drawings
Presentation boards and working drawingsRobert John Gelito
ย 
Business Plan- Team Mt. Jr
Business Plan- Team Mt. JrBusiness Plan- Team Mt. Jr
Business Plan- Team Mt. JrRayner Mendes
ย 
Ciclo celular: Mitosis
Ciclo celular: Mitosis Ciclo celular: Mitosis
Ciclo celular: Mitosis Angie Nina
ย 
Management system of pepsi co
Management system of pepsi coManagement system of pepsi co
Management system of pepsi coOwais Hassan
ย 

Viewers also liked (20)

Sprint โ€“ Strategy Analysis by JNZnetwors.com v1.0
Sprint โ€“ Strategy Analysis by JNZnetwors.com   v1.0Sprint โ€“ Strategy Analysis by JNZnetwors.com   v1.0
Sprint โ€“ Strategy Analysis by JNZnetwors.com v1.0
ย 
mass communication
mass communicationmass communication
mass communication
ย 
review article
review articlereview article
review article
ย 
Valerie_Swanson-Portfolio
Valerie_Swanson-PortfolioValerie_Swanson-Portfolio
Valerie_Swanson-Portfolio
ย 
CV
CVCV
CV
ย 
Eng.Mostafa Ezzat.CV
Eng.Mostafa Ezzat.CVEng.Mostafa Ezzat.CV
Eng.Mostafa Ezzat.CV
ย 
MANUAL Tร‰CNICO
MANUAL Tร‰CNICO MANUAL Tร‰CNICO
MANUAL Tร‰CNICO
ย 
Fire safety mnemonics
Fire safety mnemonicsFire safety mnemonics
Fire safety mnemonics
ย 
EL BULLYING
EL BULLYINGEL BULLYING
EL BULLYING
ย 
Relationship Workshop
Relationship Workshop Relationship Workshop
Relationship Workshop
ย 
Barc0de - Pitch
Barc0de - PitchBarc0de - Pitch
Barc0de - Pitch
ย 
IFP Team 56 Final Report Bryan Walkey V33 Master
IFP Team 56 Final Report Bryan Walkey V33 MasterIFP Team 56 Final Report Bryan Walkey V33 Master
IFP Team 56 Final Report Bryan Walkey V33 Master
ย 
Maritime experience application
Maritime experience applicationMaritime experience application
Maritime experience application
ย 
Designers rooms: Welcome to the Made in Spain
Designers rooms: Welcome to the Made in SpainDesigners rooms: Welcome to the Made in Spain
Designers rooms: Welcome to the Made in Spain
ย 
Presentation boards and working drawings
Presentation boards and working drawingsPresentation boards and working drawings
Presentation boards and working drawings
ย 
Business Plan- Team Mt. Jr
Business Plan- Team Mt. JrBusiness Plan- Team Mt. Jr
Business Plan- Team Mt. Jr
ย 
Ciclo celular: Mitosis
Ciclo celular: Mitosis Ciclo celular: Mitosis
Ciclo celular: Mitosis
ย 
Speaking Tiger books on travel
Speaking Tiger books on travelSpeaking Tiger books on travel
Speaking Tiger books on travel
ย 
Literature Paper
Literature PaperLiterature Paper
Literature Paper
ย 
Management system of pepsi co
Management system of pepsi coManagement system of pepsi co
Management system of pepsi co
ย 

Similar to Simulation of Stokes Shifts in BF2 Formazanate Complexes

Journal of Luminescence
Journal of LuminescenceJournal of Luminescence
Journal of LuminescenceRafael Cossiello
ย 
Effect of Poling Field and Non-linearity in Quantum Breathers in Ferroelectrics
Effect of Poling Field and Non-linearity in Quantum Breathers in FerroelectricsEffect of Poling Field and Non-linearity in Quantum Breathers in Ferroelectrics
Effect of Poling Field and Non-linearity in Quantum Breathers in FerroelectricsIOSR Journals
ย 
Study of Substituent Effect on Properties of Platinum(II) Porphyrin Semicondu...
Study of Substituent Effect on Properties of Platinum(II) Porphyrin Semicondu...Study of Substituent Effect on Properties of Platinum(II) Porphyrin Semicondu...
Study of Substituent Effect on Properties of Platinum(II) Porphyrin Semicondu...UniversitasGadjahMada
ย 
Catalysis In Chemistry
Catalysis In ChemistryCatalysis In Chemistry
Catalysis In ChemistrySandy Harwell
ย 
Investigation of structural, kinetic and thermodynamic properties of proteins...
Investigation of structural, kinetic and thermodynamic properties of proteins...Investigation of structural, kinetic and thermodynamic properties of proteins...
Investigation of structural, kinetic and thermodynamic properties of proteins...Abesh Bhar
ย 
Fluoride Recognition of Amide- and Pyrrole-Based Receptors: A Theoretical Study
Fluoride Recognition of Amide- and Pyrrole-Based Receptors: A Theoretical Study Fluoride Recognition of Amide- and Pyrrole-Based Receptors: A Theoretical Study
Fluoride Recognition of Amide- and Pyrrole-Based Receptors: A Theoretical Study drboon
ย 
MScAlastalo
MScAlastaloMScAlastalo
MScAlastaloAri Alastalo
ย 
Synthesis, Growth and Characterization of Nonlinear Optical Semi Organic Pota...
Synthesis, Growth and Characterization of Nonlinear Optical Semi Organic Pota...Synthesis, Growth and Characterization of Nonlinear Optical Semi Organic Pota...
Synthesis, Growth and Characterization of Nonlinear Optical Semi Organic Pota...IRJET Journal
ย 
opti.pptx
opti.pptxopti.pptx
opti.pptxbreenaawan
ย 
1-s2.0-S0022286014012551-main
1-s2.0-S0022286014012551-main1-s2.0-S0022286014012551-main
1-s2.0-S0022286014012551-maingovindarasu kannan
ย 
Electropolymerization of Polyaniline in the Presence of Ferricyanide
Electropolymerization of Polyaniline in the Presence of FerricyanideElectropolymerization of Polyaniline in the Presence of Ferricyanide
Electropolymerization of Polyaniline in the Presence of FerricyanideFarhadAkrami1
ย 
Investigation of the Effects of Fullerene addition and Plasma Exposure on Opt...
Investigation of the Effects of Fullerene addition and Plasma Exposure on Opt...Investigation of the Effects of Fullerene addition and Plasma Exposure on Opt...
Investigation of the Effects of Fullerene addition and Plasma Exposure on Opt...iosrjce
ย 
Structural and magnetic properties on F-doped LiVO2 with two-dimensional tria...
Structural and magnetic properties on F-doped LiVO2 with two-dimensional tria...Structural and magnetic properties on F-doped LiVO2 with two-dimensional tria...
Structural and magnetic properties on F-doped LiVO2 with two-dimensional tria...Yang Li
ย 


Simulation of Stokes Shifts in BF2 Formazanate Complexes

  • 1. Simulation of Stokes Shifts and Analysis of Substituent Effects in Boron Difluoride Formazanate Complexes By: Rayner B. Mendes Supervisor: Viktor N. Staroverov CHEM 4491E Thesis submitted in partial fulfillment of the requirements for the degree: Honours Bachelor of Science (Chemistry) Department of Chemistry The University of Western Ontario London, Ontario, Canada 2016
  • 2. Thesis Examiner 1: __________________ Styliani Constas Thesis Examiner 2: __________________ J. Clara Wren Supervisor: __________________ Viktor N. Staroverov
  • 3. Abstract
    Simulation of spectroscopic properties such as absorption, emission, and Stokes shifts is of great interest for the tuning of functional materials. Structural tuning of boron difluoride (BF2) complexes has been shown to affect these spectroscopic properties. This study attempts to simulate the Stokes shifts of BF2 formazanate complexes using density-functional theory. Additionally, it analyzes the effect of structural tuning through a quantitative structure-property relationship study using multivariable regression of quantum-chemical descriptors. The proposed methodology is able to reproduce experimental data and provide insights into trends in substituent effects. Of the functionals and basis sets tested across 60 unique trials, the PBE0/6-311+G(d,p) level of theory is optimal for simulating absorption, emission, and Stokes shifts, with a mean absolute error of 0.0048 eV. For small π-conjugated molecules, the simulated absorption, emission, and Stokes shift had errors under 0.5%, 0.4%, and 1.4%, respectively. BF2 formazanate complexes had absorption and emission errors under 3% and 7%. Electronic trends from previous research were reproduced: in the R3 position, electron-donating groups caused a red shift while electron-withdrawing groups caused a blue shift. The quantitative structure-property relationship model suggests that this trend is reversed when substituents are placed in the R1 and R2 positions. The multivariable regression analysis found that the absorption energy could be described by the dipole moment, excited-state dipole moment, dipole moment derivatives, and HOMO and excited-state HOMO energies. The regression equation for a training series had a 6.0 nm mean absolute error, an adjusted r² of 0.9949, and an F-value under 0.04%; every descriptor used had a p-value under 1%, so the model is statistically significant at the 99% confidence level.
  • 4. Contents
    Abstract
    Acknowledgments
    1 Introduction
      1.1 Stokes shifts and structural tuning
      1.2 Objectives
      1.3 Simulation of Stokes shifts
      1.4 Quantitative structure-property relationship (QSPR)
    2 Experimental
      2.1 Simulation of Stokes shifts
        2.1.1 Defining a molecular training set
        2.1.2 Computation details
        2.1.3 Choice of functionals and basis sets
      2.2 QSPR model
        2.2.1 QSPR training series
        2.2.2 Electronic and steric descriptors
        2.2.3 Regression methodology
        2.2.4 Optimizing QSPR model
    3 Results and Discussion
      3.1 Stokes shifts
      3.2 Effect of structural tuning on spectroscopic properties
        3.2.1 Steric and electronic influence on Stokes shift in the R3 position
        3.2.2 Electrostatics and electronic influence on Stokes shift in the R1 and R2 positions
        3.2.3 Multivariable regression of hydrogen training series
    4 Conclusion
      4.1 Further Research
    References
  • 5. Acknowledgments
    I would like to thank Dr. Viktor Staroverov, whose supervision, guidance, and input have allowed me to gain a better grasp of computational chemistry. I would also like to thank Dr. Joe Gilroy for helping with the analysis of his group's experimental data. Finally, I am grateful to Mr. Sviataslau Kohut, Dr. Rogelio Cuevas, Ms. Darya Komsa, and Mr. Hanqing Zhao for the insights they provided through conversations and discussions.
    List of Abbreviations
      B3LYP: Becke, three-parameter, Lee-Yang-Parr
      BF2: boron difluoride
      DCM: dichloromethane
      DFT: density-functional theory
      μ: dipole moment
      HF: Hartree-Fock
      HOMO: highest occupied molecular orbital
      PBE: Perdew–Burke–Ernzerhof
      PCM: polarizable continuum model (excited-state energy)
      LUMO: lowest unoccupied molecular orbital
      M06: Minnesota 06
      MV: molar volume
      SCF: self-consistent field (ground-state energy)
      SCRF: self-consistent reaction field
      SS: Stokes shift
      TD-DFT: time-dependent density-functional theory
      THF: tetrahydrofuran
      QSPR: quantitative structure-property relationship
      VWN: Vosko-Wilk-Nusair
      XC: exchange-correlation
  • 6. 1 Introduction
    1.1 Stokes shifts and structural tuning
    Spectroscopic properties such as the Stokes shift (SS) wavelength, measured in nm,
    $\lambda_{SS} = \lambda_{em} - \lambda_{abs}$ (1)
    are of particular interest for molecules which exhibit extended π-conjugation. These spectroscopic properties allow such molecules to be characterized for structural elucidation. Tuning these properties through structural variation allows π-conjugated functional materials to be used in photovoltaic cells,1 luminescent materials,2 and field-effect transistors.3 It has been shown that increasing the π-conjugation of fused aromatic rings affects their spectroscopic properties. Aromatic molecules absorb at longer wavelengths as π-conjugation increases: benzene has a maximum absorption at 260 nm, naphthalene at 310 nm, and anthracene at 375 nm.4 The emission wavelength differs from the absorption wavelength because the potential energy curve of a molecule's excited state differs from that of its ground state. Figure 1 illustrates the Franck-Condon principle, which states that an electronic transition is more likely to occur when the vibrational wavefunctions of the two states overlap. A molecule absorbs a photon as it is promoted to its excited state; the geometry then distorts, and a photon is released (emission) as the molecule relaxes to the ground state.
    Figure 1: Absorption and emission due to fluorescence.
  • 7. Boron difluoride (BF2) functional materials have been shown to have tunable spectroscopic properties which vary with ligand type, position of substitution, and extent of π-conjugation. Fu and co-workers5 analyzed two naphthyridine BF2 complexes (1 and 2 in Figure 2) and observed that extending the π-conjugation of the system results in a red shift of 20 nm. Piers et al.6 modified the structure of anilido-pyridine ligands (3 and 4 in Figure 2) to increase the degree of π-conjugation, which red-shifted the absorption and emission by 50 nm.
    Figure 2: BF2 complexes synthesized by the Fu5 and Piers6 groups.
    Functional materials such as the BF2 formazanate complexes synthesized in the Gilroy group are of particular interest.7 These complexes have numerous applications as dyes8 or indicators of cell activity.9 The Gilroy group has shown10 that BF2 complexes derived from formazans have desirable spectroscopic properties which can be tuned through the structural variations shown in Figure 3. It was found that electron-withdrawing groups on the complex blue-shift the maximum absorption and emission wavelengths,11 while electron-donating groups cause a red shift, even compared to their phenyl-substituted counterparts.11 Given these observed trends, BF2 formazanate complexes are ideal candidates for a computational study.
  • 8. Figure 3: Synthesized boron difluoride formazanate complexes from the Gilroy group.12
    1.2 Objectives
    The objectives of this thesis are to:
    1) Model the Stokes shifts of BF2 formazanate complexes in commonly used solvents such as dichloromethane (DCM) and tetrahydrofuran (THF).
    2) Conduct a quantitative structure-property relationship (QSPR) study to determine whether the above model can identify which electronic and steric descriptors affect the absorption, emission, and Stokes shifts, and predict the influence of structural variation on spectroscopic properties.
    1.3 Simulation of Stokes shifts
    The above objectives will be realized through density-functional theory (DFT) calculations using the GAUSSIAN 09 program.13 Calculations will use the self-consistent reaction field (SCRF)14 protocol to model solvation and time-dependent DFT (TD-DFT) to model excited states.15 DFT is a computational quantum-mechanical method used in fields ranging from physics to materials science and chemistry. DFT investigates the electronic structure of many-body systems such as atoms and molecules16 by using functionals of the electron density. Various properties can be computed to a high degree of accuracy without the need for experiments.17 These properties include excitation energies,12 ground- and excited-state geometries,11 and various spectroscopic properties.18 The distortion of the geometry from the ground state to the first excited state results in the difference between the absorption and emission wavelengths.19
  • 9. 1.4 Quantitative structure-property relationship (QSPR)
    As mentioned, a QSPR study will be performed on the absorption and emission of BF2 formazanate complexes through the regression of various electronic and steric descriptors. QSPRs are primarily used in pharmaceutical design and medicinal drug synthesis,20 where a dependent variable such as binding affinity is related to an independent variable such as hydrophobicity.21 QSPR studies transformed the search for compounds with desired properties from one based on chemical intuition into a mathematically quantified form.22 These studies can be conducted both experimentally23 and computationally.24 Building a QSPR model for the BF2 formazanate complexes requires a multivariable regression model.25 Multivariable regressions quantitatively relate a property such as absorption or emission to a block of predictor variables, such as quantum-chemical descriptors, in the form of an equation.26 To adequately study the QSPRs of BF2 formazanate complexes, the electronic and steric descriptors need to be quantified. These descriptors should provide insight into the chemical nature of the property under consideration.22 As with QSPRs of biological systems, the substituents need to be varied so that the data are meaningful.27 To this end, the skeletal BF2 formazanate structure will be substituted in the R1-R3 positions with the various substituents shown in Figure 4. The substituents will vary in their position, in their steric bulk, from low ('H') to high ('naphthyl'), and in their electronic properties, from electron-withdrawing groups such as 'NO2' to electron-donating groups such as 'methyl'.
    Figure 4: Boron difluoride skeletal structure with positions of substitution
  • 10. The results from these computations will allow analysis of how a variety of substituents affect the spectroscopic properties of BF2 formazanate complexes. The results of this study can be used to develop a predictive model based on electronic and steric descriptors, adapt the QSPR methodology outside biological systems, and estimate the Stokes shift in a time-efficient manner through the derived regression equation.
    2 Experimental
    2.1 Simulation of Stokes shifts
    Developing a model for the Stokes shifts of BF2 formazanate complexes requires three components: a molecular training set, protocols for the self-consistent reaction field and time-dependent DFT calculations, and the optimal level of theory for the calculations.
    2.1.1 Defining a molecular training set
    Applying the methodology directly to larger molecules such as BF2 formazanate complexes is computationally expensive, often taking in excess of 150 hours. Using smaller molecules which exhibit similar extended π-conjugation allows the model to be optimized in a time-efficient manner. Naphthalene and fluorene (Figure 5) were chosen as training molecules for Stokes shift simulations because of their similar π-conjugation and π → π* transitions. The absorption and emission of the training molecules are simulated in both DCM and THF.
    Figure 5: Structure of training set molecules.
    2.1.2 Computation details
    To simulate the Stokes shifts of the training molecules in solvents like DCM and THF, a solvation protocol must be employed. The SCRF protocol in GAUSSIAN 09 will be used to simulate solvent interactions. It uses a polarizable continuum model (PCM) in the integral equation formalism, in which the molecule of interest is placed within a cavity of overlapping spheres inside the solvent,28 so that the solvent's interactions with the molecule are reproduced. For excited-state calculations in solution, there is a distinction between equilibrium and non-equilibrium calculations.
  • 11. There are two ways the solvent responds to changes in the state of the solute: it polarizes its electron distribution, which is a rapid process, and the solvent molecules reorient themselves, which is a slower process. An equilibrium calculation describes a situation in which the solvent has had time to fully respond to the solute in both of these ways. A non-equilibrium calculation is appropriate for processes which are too rapid for the solvent to fully respond, such as vertical electronic excitation.
    TD-DFT is used for calculations which require non-equilibrium solvation or excited-state geometries. TD-DFT can be described as follows. For a given interaction potential, the Runge-Gross (RG) theorem29 shows that the external potential uniquely determines the density. The Kohn-Sham approach chooses a non-interacting system, for which the interaction potential is zero, whose density is equal to that of the interacting system. The wave function of a non-interacting system can be represented as a Slater determinant of single-particle orbitals. This determines a potential which can be used to define a non-interacting Hamiltonian Hs (2), which determines a determinantal wave function (3) and generates a time-dependent density (4) such that ρs is equal to the density of the interacting system at all times. In this way, if the potential can be determined, the original Schrödinger equation, a single partial differential equation in 3N variables, is replaced by N differential equations in 3 dimensions.
  • 12. Absorption and emission calculations
    Simulating Stokes shifts requires the following calculations, shown in Figure 6: ground-state geometry optimization, non-equilibrium solvation, absorption calculation, single-point TD-DFT calculation, excited-state geometry optimization, and emission calculation.
    Figure 6: Calculations required to simulate absorption, emission, and Stokes shifts.
    Ground-state geometry optimization: The output of the ground-state geometry optimization is the energy of the molecule in solution.
    Non-equilibrium solvation: This calculation stores the information about the non-equilibrium solvation based on the ground state. It yields the ground-state SCF energy (E_GS).
    Absorption calculation: The actual state-specific calculation is then done by reading in the required information for non-equilibrium solvation. The energy of the first excited state is calculated at the ground-state optimized geometry. This calculation yields the ground-state PCM energy (E_XS). The maximum absorption is then calculated by
    $E_{abs} = E_{XS} - E_{GS}$ (5)
    $\lambda_{abs} = \frac{hc}{E_{abs}}$ (6)
  • 13. Single-point TD-DFT calculation: This TD-DFT calculation determines the vertical excitation energy based on a linear response from the ground state to the first allowed excited state.
    Excited-state geometry optimization: Using TD-DFT, the force constants from the single-point calculation are read and the geometry is optimized in equilibrium solvation.
    Emission calculation: The first step of this calculation writes the solvation data of the state-specific equilibrium solvation of the excited state at its equilibrium geometry. It yields the excited-state PCM energy (E*_XS). The second step reads the solvation data and computes the ground-state energy at the excited-state geometry with non-equilibrium static solvation from the first excited state. It yields the excited-state SCF energy (E*_GS).
    $E_{em} = E^{*}_{XS} - E^{*}_{GS}$ (7)
    $\lambda_{em} = \frac{hc}{E_{em}}$ (8)
    Sample outputs of the above calculations are in Supplemental Section A.1. Calculations were performed following the protocol outlined in the Gaussian 09 user manual30 with the following additional keywords:
      Int=(grid=UltraFine): ultrafine integration grid for DFT calculations
      Opt=(MaxCycle=100): increased number of cycles for convergence to a minimum in ground- and excited-state geometry optimizations
      TD=(…, NStates=3, …) and TD=(…, NStates=3, Root=x): calculations were specified to three excited states to model allowed singlet transitions, with the primary excited state of interest, x, being the first
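Equations 5 to 8, together with the Stokes shift definition in Equation 1, amount to a short unit conversion. The following Python sketch is illustrative only: the function names and the example energies are hypothetical and are not taken from the thesis or from any Gaussian output, and it assumes that the total energies are available in hartree.

```python
# Illustrative sketch of Eqs. 1 and 5-8; all names and numbers are hypothetical.
HC_EV_NM = 1239.841984     # h*c expressed in eV*nm
HARTREE_TO_EV = 27.211386  # 1 hartree expressed in eV


def transition_energy_ev(e_excited_hartree, e_ground_hartree):
    """Eqs. 5 and 7: energy gap between excited and ground state, in eV."""
    return (e_excited_hartree - e_ground_hartree) * HARTREE_TO_EV


def wavelength_nm(energy_ev):
    """Eqs. 6 and 8: lambda = h*c / E, in nm."""
    if energy_ev <= 0:
        raise ValueError("transition energy must be positive")
    return HC_EV_NM / energy_ev


def stokes_shift_nm(lambda_abs_nm, lambda_em_nm):
    """Eq. 1: Stokes shift as emission minus absorption wavelength."""
    return lambda_em_nm - lambda_abs_nm


# Hypothetical total energies (hartree), chosen only to exercise the functions.
e_gs, e_xs = -385.000, -384.835            # at the ground-state geometry
e_gs_star, e_xs_star = -384.984, -384.842  # at the excited-state geometry

lambda_abs = wavelength_nm(transition_energy_ev(e_xs, e_gs))
lambda_em = wavelength_nm(transition_energy_ev(e_xs_star, e_gs_star))
print(f"absorption   {lambda_abs:.1f} nm")
print(f"emission     {lambda_em:.1f} nm")
print(f"Stokes shift {stokes_shift_nm(lambda_abs, lambda_em):.1f} nm")
```

Because the excited-state geometry lowers the gap between the two states, the emission wavelength comes out longer than the absorption wavelength, which gives a positive Stokes shift.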
  • 14. 2.1.3 Choice of functionals and basis sets
    The optimal combination of functional and basis set, or 'level of theory', needs to be chosen for the training set and protocols outlined. Calculations other than ground-state geometry optimizations were conducted using combinations of various functionals, basis sets, and solvents. Ground-state geometries were calculated using the B3LYP functional and the 6-31G(d) basis set.
    Functionals
    The functionals chosen for modeling absorption and emission are B3LYP, PBE0, and M06-2X. These three functionals are well known for their accuracy in spectroscopy calculations.31 B3LYP and PBE0 are hybrid exchange-correlation functionals, constructed as a linear combination of the Hartree-Fock (HF) exact exchange functional and density-functional exchange and correlation terms. The parameters of the functionals are fitted so that the functionals' predictions match experimental or accurately calculated thermochemical data.32 Since the functionals being tested have an HF component, the self-interaction error is reduced, leading to good performance in TD-DFT calculations.31 The HF exact exchange energy is
    $E_x^{HF} = -\frac{1}{2} \sum_{i,j} \iint \psi_i^{*}(\mathbf{r}_1)\, \psi_j^{*}(\mathbf{r}_2)\, \frac{1}{r_{12}}\, \psi_j(\mathbf{r}_1)\, \psi_i(\mathbf{r}_2)\, d\mathbf{r}_1\, d\mathbf{r}_2$ (9)
    The B3LYP33 functional is based on the Becke 88 exchange functional,34 a generalized-gradient approximation (GGA), and the VWN local-density approximation (LDA),35 and is given by
    $E_{xc}^{B3LYP} = E_x^{LDA} + E_c^{LDA} + 0.20\,(E_x^{HF} - E_x^{LDA}) + 0.72\,(E_x^{GGA} - E_x^{LDA}) + 0.81\,(E_c^{GGA} - E_c^{LDA})$ (10)
    The PBE0 functional32 mixes the PBE exchange energy and the HF exchange energy according to
    $E_{xc}^{PBE0} = \frac{1}{4} E_x^{HF} + \frac{3}{4} E_x^{PBE} + E_c^{PBE}$ (11)
    The M06-2X functional36 is a global hybrid functional with 54% HF exchange energy. The M06 suite of functionals is constructed by empirically fitting the parameters while constraining them to the uniform electron gas limit.31
  • 15. Basis sets
    A basis set is a set of functions that are combined in linear combinations to create molecular orbitals. The Pople basis sets are typically denoted by X-YZg.37 X represents the number of primitive Gaussians for each core atomic orbital basis function; Y and Z indicate that each valence orbital is described by two basis functions, composed of linear combinations of Y and Z primitive Gaussian functions, respectively. The '*' adds polarization functions of the p, d, and f types (written explicitly in parentheses, as in 6-31G(d)), and the '+' adds diffuse functions.38 The basis sets tested were 6-31G(d), 6-31G(d,p), 6-31+G(d), 6-31+G(d,p), 6-311G(d), 6-311G(d,p), 6-311+G(d), 6-311+G(d,p), 6-311+G(2d,p), and 6-311+G(2df,2p), in order to test a variety of polarization and diffuse functions on the Pople basis sets.
    Table 1: Summary of functionals and basis sets tested
      Functionals tested: B3LYP, PBE0, M06-2X
      Basis sets tested: 6-31G(d), 6-31G(d,p), 6-31+G(d), 6-31+G(d,p), 6-311G(d), 6-311G(d,p), 6-311+G(d), 6-311+G(d,p), 6-311+G(2d,p), 6-311+G(2df,2p)
    Two training molecules, two solvents, three unique calculations, three functionals, and 10 basis sets give a total of 2 × 2 × 3 × 3 × 10 = 360 unique calculations, equivalent to decades of computation time. The level of theory which yields the smallest mean absolute error (MAE) across the training molecules and solvents, compared to experimental data, will be chosen.
    $\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t|$ (12)
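As a rough illustration of Equation 12, the MAE for one level of theory can be computed from paired calculated and experimental absorption energies. This is a minimal sketch: the function name and the four energy pairs (standing in for two molecules in two solvents) are hypothetical and are not data from the thesis.

```python
# Illustrative sketch of Eq. 12; the energies below are hypothetical.
def mean_absolute_error(calculated, experimental):
    """Return (1/n) * sum(|e_t|), where e_t = calculated_t - experimental_t."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [c - e for c, e in zip(calculated, experimental)]
    return sum(abs(err) for err in errors) / len(errors)


# Hypothetical absorption energies (eV) for one level of theory:
# two training molecules in two solvents give four comparisons.
calculated_ev = [4.48, 4.47, 4.69, 4.69]
experimental_ev = [4.48, 4.49, 4.68, 4.70]

print(f"MAE = {mean_absolute_error(calculated_ev, experimental_ev):.4f} eV")
```

Repeating this for every functional and basis-set combination and keeping the smallest value corresponds to the selection rule described above.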
  • 16. Calculating the emission under the same methodology would be far too computationally expensive, as a single excited-state geometry optimization can take in excess of two weeks. To determine whether the model can reproduce emission energies and Stokes shifts, the optimal level of theory will be used to calculate the Stokes shift for naphthalene, fluorene, and BF2 formazanate complexes.
    2.2 QSPR model
    The most accurate model for simulating the Stokes shift of BF2 formazanate complexes using the methodology above will be used in the subsequent QSPR study. The model will simulate the spectroscopic properties of the structurally tuned variations of the BF2 formazanate skeletal structure. The QSPR study requires the following: a representative training series to simulate, electronic and steric descriptors to be used as independent regression variables, a regression methodology, and a means of optimization.
    2.2.1 QSPR training series
    The training series for the QSPR needs to be more robust than the training set for the Stokes shift model. A large training set is required to accomplish the goal of creating a predictive model. The training set should contain substituents which vary in their sterics and electronics, and should be able to model currently synthesized complexes from the Gilroy group. Table 2 lists five training series which meet these criteria. Each series can be used independently or in combination to build a QSPR model.
    Table 2: Training series for QSPR study
    Hydrogen series     Phenyl series       Naphthyl series     Varied series       Equivalent series
    R1, R2, R3          R1, R2, R3          R1, R2, R3          R1, R2, R3          R1, R2, R3
    H, H, H             Ph, Ph, H           Nh, Nh, H           Me, Me, H           Me, Me, Me
    H, H, Me            Ph, Ph, Me          Nh, Nh, Me          Cl, Cl, H           Cl, Cl, Cl
    H, H, Cl            Ph, Ph, Cl          Nh, Nh, Cl          CO2H, CO2H, H       CO2H, CO2H, CO2H
    H, H, CO2H          Ph, Ph, CO2H        Nh, Nh, CO2H        OH, OH, H           OH, OH, OH
    H, H, OH            Ph, Ph, OH          Nh, Nh, OH          NMe2, NMe2, H       NMe2, NMe2, NMe2
    H, H, NMe2          Ph, Ph, NMe2        Nh, Nh, NMe2        NO2, NO2, H         NO2, NO2, NO2
    H, H, NO2           Ph, Ph, NO2         Nh, Nh, NO2         CN, CN, H           CN, CN, CN
    H, H, CN            Ph, Ph, CN          Nh, Nh, CN          CO, CO, H           CO, CO, CO
    H, H, CO            Ph, Ph, CO          Nh, Nh, CO
    H, H, Ph            Ph, Ph, Ph          Nh, Nh, Ph
    H, H, Nh            Ph, Ph, Nh          Nh, Nh, Nh
  • 17. The substituents in each series vary in their sterics, from low to high, and in their electronics, from electron-donating to electron-withdrawing. The hydrogen, varied, and equivalent series were created to draw inferences on the effect of substituents in the R3 position, in the R1 and R2 positions, and in all three positions, respectively. The phenyl and naphthyl series attempt to do the same, but represent complexes which have already been synthesized in the Gilroy group.12
    2.2.2 Electronic and steric descriptors
    The dependent variables in the regression model will be the absorption and emission wavelengths. The independent variables, or 'descriptors', need to have a theoretical basis in other computational QSPR studies; quantum-chemical descriptors are therefore used because they lack the inherent error normally associated with experimental measurements. Systematic error may exist in the simulation model; however, this error is considered to apply evenly throughout the series analyzed and thus does not influence modeled trends.22 To determine the influence of structural tuning on the absorption and emission, the following descriptors will be calculated for both the ground and excited state: highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energies,39 HOMO-LUMO (HL) gap,39 dipole moment (μ),40 and root of μ (√μ).40 Additionally, molar volume (MV)41 will be used to describe the steric influence of the substituents. All descriptors except molar volume are contained within the output files of the ground- and excited-state geometry optimizations, respectively.
    2.2.3 Regression methodology
    The linear regression model for multiple independent variables can be described through the following equations.42 Every independent variable (x) is associated with a value of the dependent variable (y). For p independent variables x_1, x_2, ..., x_p, the mean (μ_y), or the 'fit', is
    $\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$ (13)
    The observed values for y vary about μ_y and are assumed to have the same standard deviation. The fitted values b_0, b_1, ..., b_p estimate the parameters β_0, β_1, ..., β_p of the population.
  • 18. The regression model includes a model deviation term (ε) which represents the deviations of the observed y from μ_y; these are normally distributed with a mean of 0. The model for multiple linear regression with n observations, where i = 1, 2, ..., n, is
    $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i$ (14)
    The least-squares model finds the line of best fit by minimizing the sum of squares of the residuals (e_i), the vertical deviations from the line. A vertical deviation equal to 0 represents a point which lies exactly on the line. The residuals are given by
    $e_i = y_i - \hat{y}_i$ (15)
    where ŷ_i represents the values fitted by the equation
    $\hat{y}_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_p x_{ip}$ (16)
    2.2.4 Optimizing QSPR model
    The multivariable linear regression models from the Analysis ToolPak in Microsoft Excel will be used. The optimized model will ideally have a low standard error (SE), a low MAE for the residuals, a high adjusted r² value to ensure accuracy with multiple descriptors, and p- and F-values under 5% to ensure the underlying model is statistically sound at the 95% confidence level. Where N is the sample size, D is the number of descriptors, and σ_x is the standard deviation,
    $\mathrm{SE} = \frac{\sigma_x}{\sqrt{N}}$ (17)
    $r^2 = \frac{\sum (\hat{y}_i - \mu_y)^2}{\sum (y_i - \mu_y)^2}$ (18)
    $\text{Adj. } r^2 = 1 - \frac{(1 - r^2)(N - 1)}{N - D - 1}$ (19)
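The fitting procedure in Equations 14 to 16 and the quality metrics in Equations 18 and 19 can be sketched in plain Python. This is not the Excel Analysis ToolPak routine used in the thesis; it is a minimal illustration that solves the normal equations directly, and every function name and data value below is hypothetical (the two descriptor columns merely stand in for quantities such as a dipole moment and a HOMO energy).

```python
# Illustrative least-squares sketch for Eqs. 14-16, 18 and 19.
# Not the thesis workflow; all names and data are hypothetical.
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; descriptors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bp] that minimise the sum of squared residuals."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Eq. 16: fitted value for one observation."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


def r_squared(y, y_hat):
    """Eq. 18: explained sum of squares over total sum of squares."""
    mean_y = sum(y) / len(y)
    return sum((f - mean_y) ** 2 for f in y_hat) / sum((v - mean_y) ** 2 for v in y)


def adjusted_r_squared(r2, n_samples, n_descriptors):
    """Eq. 19: r^2 penalised for the number of descriptors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


# Hypothetical descriptors (two per molecule) and absorption wavelengths (nm).
descriptors = [(1.2, -6.1), (2.5, -5.9), (3.1, -6.4), (0.8, -5.6), (4.0, -6.0), (2.0, -6.2)]
wavelengths = [392.0, 401.0, 385.0, 410.0, 398.0, 388.0]

coeffs = fit_least_squares(descriptors, wavelengths)
fitted = [predict(coeffs, row) for row in descriptors]
residuals = [obs - fit for obs, fit in zip(wavelengths, fitted)]  # Eq. 15
r2 = r_squared(wavelengths, fitted)
print("coefficients:", [round(b, 3) for b in coeffs])
print("r2:", round(r2, 4), "adj r2:", round(adjusted_r_squared(r2, len(wavelengths), 2), 4))
```

The adjusted r² falls below r² whenever descriptors are added without a matching gain in explained variance, which is why it suits the stated goal of a model with the fewest descriptors.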
  • 19. A p-value is defined as the probability, under the assumption of a hypothesis (H), of obtaining a result equal to or more extreme than the one observed; here a normal distribution is assumed.
    Figure 7: Visual representation of a p-value.
    The two-tailed p-value for a Gaussian distribution is
    $p\text{-value} = 2 \min\{\Pr(X \le x \mid H),\ \Pr(X \ge x \mid H)\}$ (20)
    F-tests analyze the variance of a quantifiable variable across pre-defined groups. This can be used to make sure that the groups of descriptors are significant to the regression equation. Where K is the number of groups, N is the total sample size, n_i is the number of observations in the ith group, and y_ij is the jth observation in the ith of K groups,
    $F = \frac{\text{explained variance}}{\text{unexplained variance}} = \frac{\sum_i n_i (\bar{y}_i - \mu_y)^2 \,(N - K)}{\sum_{i,j} (y_{ij} - \bar{y}_i)^2 \,(K - 1)}$ (21)
    Descriptors will be analyzed both individually and in conjunction with each other to determine how they affect the above metrics. The goal is to develop a model with the fewest descriptors that predicts the spectroscopic properties of interest.
  • 20. 3 Results and Discussion
    3.1 Stokes shifts
    The PBE0 functional consistently outperforms both the B3LYP and M06-2X functionals across basis sets; B3LYP consistently underestimates the absorption energy while M06-2X overestimates it, as shown in Figures 7 and 8.
    Figure 7: Calculated absorption energy (eV) for naphthalene in DCM and THF compared to experimental results (dashed line).
    Figure 8: Calculated absorption energy (eV) for fluorene in DCM and THF compared to experimental results (dashed line).
  • 21. Table 3: MAE (eV) of calculated absorption energy compared to experimental data for all calculated levels of theory
    Basis set          B3LYP    PBE0     M06-2X
    6-31G(d)           0.098    0.0939   0.2942
    6-31G(d,p)         0.085    0.0863   0.2889
    6-31+G(d)          0.128    0.0359   0.2238
    6-31+G(d,p)        0.137    0.0459   0.2157
    6-311G(d)          0.083    0.0253   0.2610
    6-311G(d,p)        0.090    0.0355   0.2542
    6-311+G(d)         0.077    0.0052   0.1879
    6-311+G(d,p)       0.165    0.0048   0.1017
    6-311+G(2d,p)      0.189    0.0951   0.0686
    6-311+G(2df,2p)    0.183    0.1100   0.0659
    Minimum MAE        0.0767   0.0048   0.0669
    The minimum MAE for calculated absorption energy across training molecules and solvents is found at the PBE0/6-311+G(d,p) level of theory.
    Table 4: Percent error in absorption, emission, and Stokes shifts (nm) for naphthalene, fluorene, and BF2 formazanate complexes12 in solution using PBE0/6-311+G(d,p).
                              Absorption (nm)            Emission (nm)              Stokes shift (nm)
    Molecule / solvent        Calcd   Expt   Error (%)   Calcd   Expt   Error (%)   Calcd   Expt   Error (%)
    Naphthalene
      DCM                     276.8   277    0.07        321.4   321    0.12        44.6    44     1.36
      THF                     277.2   276    0.43        321.7   320.5  0.40        44.6    44.5   0.22
    Fluorene
      DCM                     264.6   265    0.15        311     311.5  0.16        46.9    46.5   0.86
      THF                     264.4   264    0.15        310.5   309.5  0.32        46.1    45.5   1.32
    BF2 complex (R1 = Ph, R2 = Ph, R3 = CN)
      DCM                     477     491    2.85        550     584    5.82        73      93     21.5
      THF                     488.4   489    0.12        550     585    6.98        61.6    96     35.8
    BF2 complex (R1 = Ph, R2 = Ph, R3 = NO2)
      DCM                     482     491    1.83        552     587    5.96        70      96     27.1
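The selection of the level of theory from Table 3 and the percent errors in Table 4 can be reproduced with a few lines of Python. This sketch is illustrative: the function and variable names are hypothetical, and it assumes that the MAE values in Table 3 are listed per functional in the same order as the basis sets, which is consistent with the stated minima for B3LYP and PBE0.

```python
# Illustrative sketch; names are hypothetical, and the MAE values are read
# from Table 3 assuming each functional's column follows the basis-set order.
BASIS_SETS = [
    "6-31G(d)", "6-31G(d,p)", "6-31+G(d)", "6-31+G(d,p)", "6-311G(d)",
    "6-311G(d,p)", "6-311+G(d)", "6-311+G(d,p)", "6-311+G(2d,p)", "6-311+G(2df,2p)",
]

TABLE3_MAE_EV = {
    "B3LYP": [0.098, 0.085, 0.128, 0.137, 0.083, 0.090, 0.077, 0.165, 0.189, 0.183],
    "PBE0": [0.0939, 0.0863, 0.0359, 0.0459, 0.0253, 0.0355, 0.0052, 0.0048, 0.0951, 0.1100],
    "M06-2X": [0.2942, 0.2889, 0.2238, 0.2157, 0.2610, 0.2542, 0.1879, 0.1017, 0.0686, 0.0659],
}


def best_level_of_theory(mae_table, basis_sets):
    """Return (mae, functional, basis_set) for the smallest MAE in the table."""
    return min(
        (mae, functional, basis)
        for functional, column in mae_table.items()
        for mae, basis in zip(column, basis_sets)
    )


def percent_error(calculated, experimental):
    """Absolute percent error of a calculated value relative to experiment."""
    return abs(calculated - experimental) / experimental * 100.0


mae, functional, basis = best_level_of_theory(TABLE3_MAE_EV, BASIS_SETS)
print(f"best level of theory: {functional}/{basis} (MAE = {mae} eV)")

# Naphthalene in DCM, values taken from Table 4.
print(f"absorption error:   {percent_error(276.8, 277.0):.2f}%")
print(f"Stokes shift error: {percent_error(44.6, 44.0):.2f}%")
```

The percent errors of the Stokes shifts are much larger than those of the underlying wavelengths because a Stokes shift is a small difference between two large numbers, so a small absolute error carries a large relative weight.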
  • 22. The chosen level of theory gives good results for absorption across all molecules, including the BF2 formazanate test set, with errors under 5%. The emission wavelengths and Stokes shifts for the training molecules also agree with experiment to within 5%. Unfortunately, the emission wavelengths and Stokes shifts calculated for the BF2 formazanate complexes do not show good agreement. Given the agreement for emission in the training set, it is likely that the methodology does not scale with molecular size or π-conjugation. A possible reason for the high error is that excited-state calculations which utilize the SCRF protocol are unable to model bulk solution effects such as π-stacking. The emission calculations show a systematic underestimation of the emission wavelength (Table 4), which corresponds to an overestimation of the emission energy. Systematic errors in a regression model are held constant throughout a training series; therefore the PBE0/6-311+G(d,p) level of theory will be utilized for modeling all spectroscopic properties in the QSPR study.
3.2 Effect of structural tuning on spectroscopic properties
Given the computationally heavy nature of simulating the spectroscopic properties of 49 training molecules and their descriptors of interest, not all of the data required for a thorough analysis could be obtained. See Supplemental Information A.2 for more information on computation time. However, enough data were acquired to perform a partial QSPR study and analyze trends.
Table 5: Compilation of absorption, emission, and descriptor data acquired from training series.
Series       Hydrogen Series             Phenyl Series               Varied Series               Equivalent Series
Position     R3 (a)                      R3                          R1, R2                      R1, R2, R3
Calc. (b)    Abs      Emis     SS        Abs      Emis     SS        Abs      Emis     SS        Abs      Emis     SS
             (nm)     (nm)     (nm)      (nm)     (nm)     (nm)      (nm)     (nm)     (nm)      (nm)     (nm)     (nm)
H            392.3    493.2    101.0     331.1    543.9    212.8     392.3    493.2    101.0     392.3    493.2    101.0
Me           385.2    565.2    180.0     423.3    555.4    132.1     379.2    468.1    88.9      362.5    ND       ND
Cl           388.1    485.5    97.4      506.4    577.3    70.9      331.6    429.4    97.8      377.6    429.3    51.6
CO2H         393.8    485.4    91.6      468.1    539.9    71.8      492.7    634.8    142.1     497.2    624.5    127.3
OH           401.8    542.5    140.7     433.0    566.7    133.8     319.2    357.9    38.8      321.6    366.5    44.9
NMe2         727.8    ND       ND        932.0    2210.7   1278.7    372.3    ND       ND        439.7    ND       ND
NO2          385.3    463.3    78.0      481.7    552.3    70.6      423.5    551.0    127.5     418.9    ND       ND
CN           379.9    467.8    87.9      ND       ND       ND        454.0    609.7    155.8     435.4    ND       ND
CO           960.3    1522.2   561.8     ND       ND       ND        493.9    ND       ND        ND       ND       ND
Ph           492.1    1120.4   628.3     ND       ND       ND        ND       ND       ND        ND       ND       ND
Nh           734.4    2958.7   2224.3    ND       ND       ND        ND       ND       ND        ND       ND       ND
a If the substituent position is not specified, then Rn is Ph for the Phenyl Series and H for all others.
b ND: no data. No data were obtained for the naphthyl series because of the computation time required.
  • 23. 3.2.1 Steric and electronic influence on Stokes shift in the R3 position
Steric influence
The influence of a substituent’s steric bulk is unclear; however, there seems to be a trend in which bulkier substituents increase the emission wavelength disproportionately compared to the absorption wavelength. The emission for ‘naphthyl’ in the hydrogen series is 2959 nm, over 4 times its absorption, while ‘phenyl’ in the same series has an emission of 1120 nm, about 2 times its absorption. Substituents with much lower steric bulk have much smaller absorption-to-emission ratios, 1:1.25 and 1:1.47 for ‘chloride’ and ‘methyl’ respectively.
Electronic influence
Using the hydrogen series baseline (R1 = R2 = R3 = H), one can see that electron-donating groups such as ‘methyl’ in the hydrogen series cause a red shift of 79 nm in the Stokes shift. Highly electron-withdrawing groups such as ‘NO2’ cause a blue shift of 23 nm. This trend holds for the phenyl series (R1 = R2 = Ph, R3 = H), where ‘methyl’ causes a red shift of 80.7 nm and ‘NO2’ a blue shift of 142.2 nm. One can conclude that the ability of electron-donating and electron-withdrawing groups to shift the Stokes shift increases when large π-conjugated substituents are in the R1 and R2 positions, thus agreeing with previous research.¹¹
3.2.2 Steric and electronic influence on Stokes shift in the R1 and R2 positions
Steric influence
The influence of a substituent’s steric bulk is unclear from the data calculated for the R1 and R2 positions.
Electronic influence
Compared to the R3 position, the trends in electronic influence on absorption and emission are reversed for substituents in the R1 and R2 positions. Compared to the hydrogen series baseline, the electron-donating ‘methyl’ substituents from the varied series produce a blue shift of 12 nm, while electron-withdrawing groups like ‘NO2’ from the varied series cause a red shift of 27 nm.
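The shifts quoted for the hydrogen and varied series can be recomputed directly from the Stokes shifts in Table 5. The following Python sketch does so relative to the unsubstituted baseline; the dictionary and function names are illustrative, and the sign convention (positive for a red shift, negative for a blue shift) is an assumption chosen to match the wording of the text.

```python
# Calculated Stokes shifts (nm) copied from Table 5.
hydrogen_series = {"H": 101.0, "Me": 180.0, "NO2": 78.0}  # substituent at R3
varied_series = {"H": 101.0, "Me": 88.9, "NO2": 127.5}    # substituent at R1, R2


def shift_vs_baseline(series, substituent, baseline="H"):
    """Change in Stokes shift (nm) relative to the unsubstituted baseline.

    Positive values indicate a red shift, negative values a blue shift.
    """
    return round(series[substituent] - series[baseline], 1)


for position, series in (("R3", hydrogen_series), ("R1, R2", varied_series)):
    for substituent in ("Me", "NO2"):
        print(position, substituent, shift_vs_baseline(series, substituent))
```

The signs flip between the two series: methyl gives +79.0 nm at R3 but -12.1 nm at R1 and R2, while NO2 gives -23.0 nm at R3 but +26.5 nm at R1 and R2.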
  • 24. 3.2.3 Multivariable regression of hydrogen training series
Series without sufficient emission data must be excluded from the QSPR analysis, because the chosen excited-state quantum-chemical descriptors are found in the output of the emission calculations. The hydrogen series is the only series for which multivariable regression analysis can be conducted.
Table 6: Tests to optimize multivariable regression model for absorption (nm) of hydrogen series
Hydrogen Series - Absorption Regression Model
Trial #   Descriptors      r²        Adj. r²    F-value   MAE (nm)   Standard Error (nm)
1         All              NA        NA         NA        NA         NA
2         Less MV          0.9998    -0.0016    NA        2.0        7.9
3         Less HL gap      0.9998    0.9984     2.93%     2.0        7.9
4         Less E_LUMO      0.9995    0.9979     0.16%     2.8        9.1
5 (a)     Less E_LUMO*     0.9983    0.9949     0.03%     6.0        14.1
a Supplemental Information A.3 contains the output of trial 5.
The goal for the optimized regression model was a low standard error, a low MAE, a high adjusted r², and an F-value less than 5%, to ensure a statistically sound model at the 95% confidence level; these goals were realized in five trials. The descriptors were eliminated for the following reasons: in trial 1, the MV had a regression coefficient of 0; trial 2 showed overfitting, as indicated by the negative adjusted r² value, caused by the HL gap. Trials 3 through 5 removed the statistically insignificant LUMO and excited-state LUMO energy descriptors, which carried p-values of 42.4% and 14.9% respectively.
\[ \lambda_{\mathrm{abs}}\,(\mathrm{nm}) = 5071 - 6897\,E_{\mathrm{HOMO}} + 8142\,E_{\mathrm{HOMO}}^{*} - 1214\,\mu + 2619\,\mu^{*} + 4110\sqrt{\mu} - 9058\sqrt{\mu^{*}} \tag{22} \]
The final regression model has an MAE of 6.0 nm and a standard error of 14.1 nm. The large coefficients in the equation arise because the descriptor values are orders of magnitude smaller than the wavelengths in nm.
  • 25. Table 7: Statistical output of multivariable linear regression on emission (nm) for the hydrogen training series.
Hydrogen Series - Emission Regression Model
Trial #   Descriptors      r²        Adj. r²    F-value   MAE (nm)   Standard Error (nm)
1         All              NA        NA         NA        NA         NA
2         Less MV          0.9905    -0.0855    NA        58.3       234.7
3         Less HL gap      0.9905    0.9145     21.12%    58.3       234.7
4         Less E_LUMO      0.9840    0.9281     5.48%     89.0       215.3
5         Less E_HOMO      0.9833    0.9500     0.92%     85.7       179.4
6         Less E_HOMO*     0.9826    0.9608     0.13%     82.5       158.8
7         Less E_LUMO*     0.9767    0.9581     0.03%     72.8       164.3
The emission regression model took a series of 7 trials to optimize. The descriptors were eliminated for the following reasons: trials 1 through 3 followed the same logic as the absorption model. Trials 4 through 7 eliminated the LUMO energy, HOMO energy, excited-state HOMO energy, and excited-state LUMO energy descriptors, which had p-values of 56.0%, 79.9%, 73.7%, and 30.9% respectively. This suggests that these descriptors are unimportant in emission simulation.
\[ \lambda_{\mathrm{em}}\,(\mathrm{nm}) = 22048 - 10187\,\mu + 17285\,\mu^{*} + 31918\sqrt{\mu} - 57244\sqrt{\mu^{*}} \tag{23} \]
Examining the above equation might explain why the MAE and standard error are higher for this model, at 72.8 nm and 164.3 nm respectively. The intercept is more than four times that of the absorption equation (22048 vs. 5071), and the coefficients of the independent variables are correspondingly larger in magnitude (-10187 vs. -1214 for μ). Subtle variations in the descriptor inputs therefore yield a larger associated error. It should be noted that even though the adjusted r² value is lower and the errors are much higher for this model, it is still statistically significant, with an F-value of 0.03%. It is likely that other descriptors which were not tested need to be employed to predict emission behavior; one would still expect the trends replicated by this model to be accurate.
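The adjusted r² values of the final models in Tables 6 and 7 can be cross-checked from r², the number of observations, and the number of retained descriptors. The sketch below assumes the standard adjusted r² formula for a linear model with an intercept, and assumes 10 observations for both models (Supplemental Information A.3 reports 10 for the absorption model; the count for the emission model is not stated and is an assumption here). The function name is illustrative.

```python
def adjusted_r2(r2, n_obs, n_descriptors):
    """Adjusted r^2 for a linear model with an intercept.

    r2: coefficient of determination
    n_obs: number of observations (molecules)
    n_descriptors: number of independent variables kept in the model
    """
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_descriptors - 1)


# Final absorption model (Table 6, trial 5): 6 descriptors in Eq. (22).
print(round(adjusted_r2(0.9983, 10, 6), 4))

# Final emission model (Table 7, trial 7): 4 descriptors in Eq. (23).
# The observation count of 10 is assumed, not stated in the text.
print(round(adjusted_r2(0.9767, 10, 4), 4))
```

Under these assumptions the sketch returns 0.9949 and 0.9581, matching the tabulated values. The penalty factor (n - 1)/(n - k - 1) grows quickly as the number of descriptors k approaches the number of observations, which is why removing descriptors can raise the adjusted r² even as r² itself falls.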
  • 26. 4 Conclusion
The proposed methodology for simulating the Stokes shift, and the subsequent QSPR study of substituent effects on BF2 formazanate complexes, reproduces experimental data and provides a foundation for future research into substituent analysis. It has been shown that the PBE0/6-311+G(d,p) level of theory is optimal for simulating absorption, emission, and Stokes shifts, with an MAE of 0.0048 eV compared to the other functionals and basis sets tested across 60 unique trials. For the π-conjugated molecules in the training set, errors relative to experiment in the absorption, emission and Stokes shift were under 0.5%, 0.4% and 1.4%. The BF2 formazanate complexes tested at the same level of theory had errors under 3% for absorption and 7% for emission. The increased error in emission energy is likely due to intermolecular effects not modeled in the system. Over 45 training molecules were structurally tuned and analyzed based on their electronic and steric features. Observed electronic trends from previous research were reproduced;¹¹ specifically, electron-donating groups in the R3 position caused a red shift in the Stokes shift, while electron-withdrawing groups caused a blue shift. The QSPR training model further suggests that this trend is reversed when substituents are placed in the R1 and R2 positions, with electron-donating groups causing a blue shift and electron-withdrawing groups a red shift. When substituents with high steric bulk were added in the R3 position of the BF2 formazanate skeletal structure, the ratio of emission to absorption increased: roughly 4:1 for ‘naphthyl’ vs. 1.25:1 for ‘chloride’. Finally, a multivariable regression analysis found that the absorption wavelength could be described by the dipole moment, the excited-state dipole moment, their square roots, and the HOMO and excited-state HOMO energies. 
The hydrogen series absorption regression equation has a standard error of 14.1 nm, a mean absolute error of 6.0 nm, an adjusted r² of 0.9949, and an F-value under 0.04%; all descriptors have p-values under 1%, so the model is statistically significant at the 99% confidence level.
4.1 Further Research
Further research may be conducted on the analysis of all the training series with the methodology outlined herein. The results of the regression analysis from this and future research could be applied to a test set to determine whether the regression equations can reproduce experimental spectroscopic properties. Various other descriptors could be examined to identify their relationships to properties such as emission wavelengths. Databases of various molecules could be created, analyzed, and used for optimizing substituents for strategic synthesis in functional materials.
  • 27. References
1 J. Roncali, P. Leriche, and P. Blanchard, Adv. Mater. 26, 3821 (2014).
2 D. Frath, J. Massue, G. Ulrich, and R. Ziessel, Angew. Chem. Int. Ed. Engl. 53, 2290 (2014).
3 W. Wu, Y. Liu, and D. Zhu, Chem. Soc. Rev. 39, 1489 (2010).
4 M. Montalti, A. Credi, L. Prodi, and M.T. Gandolfi, Handbook of Photochemistry (2006).
5 L. Quan, Y. Chen, X.-J. Lv, and W.-F. Fu, Chem. Eur. J. 18, 14599 (2012).
6 J.F. Araneda, W.E. Piers, B. Heyne, M. Parvez, and R. McDonald, Angew. Chem. Int. Ed. Engl. 50, 12214 (2011).
7 S.M. Barbon, V.N. Staroverov, P.D. Boyle, and J.B. Gilroy, Dalton Trans. 43, 240 (2014).
8 M. Szymczyk, A. El-Shafei, and H.S. Freeman, Dye. Pigment. 72, 8 (2007).
9 W.M. Frederiks, J. van Marle, C. van Oven, B. Comin-Anduix, and M. Cascante, J. Histochem. Cytochem. 54, 47 (2006).
10 M. Hesari, S.M. Barbon, V.N. Staroverov, Z. Ding, and J.B. Gilroy, Chem. Commun. 51, 3766 (2015).
11 S.M. Barbon, P.A. Reinkeluers, J.T. Price, V.N. Staroverov, and J.B. Gilroy, Chem. Eur. J. 20, 11340 (2014).
12 S.M. Barbon, V.N. Staroverov, and J.B. Gilroy, J. Org. Chem. 80, 5226 (2015).
13 M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, J.A. Cheeseman, G. Scalmani, V. Barone, B. Mennucci, G.A. Petersson, H. Nakatsuji, M. Caricato, X. Li, H.P. Hratchian, A.F. Izmaylov, J. Bloino, G. Zheng, J.L. Sonnenberg, W. Liang, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, J.A. Montgomery, J.E. Peralta, F. Ogliaro, M.J. Bearpark, J.J. Heyd, E. Brothers, K.N. Kudin, V.N. Staroverov, T. Keith, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J.C. Burant, S.S. Iyengar, J. Tomasi, M. Cossi, N. Rega, J.M. Millam, M. Klene, J.E. Knox, J.B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R.E. Stratmann, O. Yazyev, A.J. Austin, R. Cammi, C. Pomelli, J.W. Ochterski, R.L. Martin, K. Morokuma, V.G. Zakrzewski,
  • 28. G.A. Voth, P. Salvador, J.J. Dannenberg, S. Dapprich, P.V. Parandekar, N.J. Mayhall, A.D. Daniels, O. Farkas, J.B. Foresman, J.V. Ortiz, J. Cioslowski, and D.J. Fox, Gaussian Dev. Version, Revis. H. 32 Gaussian (2010).
14 T. Mineva, N. Russo, and M. Toscano, Int. J. Quantum Chem. 56, 663 (1995).
15 E.K.U. Gross and W. Kohn, Adv. Quantum Chem (1990).
16 P. Hohenberg, Phys. Rev. 136, B864 (1964).
17 W. Kohn and L.J. Sham, Phys. Rev. 140, A1133 (1965).
18 M. Bourass, A. Touimi Benjelloun, M. Hamidi, M. Benzakour, M. Mcharfi, M. Sfaira, F. Serein-Spirau, J.-P. Lère-Porte, J.-M. Sotiropoulos, S.M. Bouzzine, and M. Bouachrine, J. Saudi Chem. Soc. (2013).
19 F. Cervantes-Navarro and D. Glossman-Mitnik, Chem. Cent. J. 6, 70 (2012).
20 G.-F. Yang and X. Huang, Curr. Pharm. Des. 12, 4601 (2006).
21 J. Verma, V.M. Khedkar, and E.C. Coutinho, Curr. Top. Med. Chem. 10, 95 (2010).
22 M. Karelson, V.S. Lobanov, and A.R. Katritzky, Chem. Rev. 96, 1027 (1996).
23 T.W. Schultz, M.T.D. Cronin, J.D. Walker, and A.O. Aptula, J. Mol. Struct. THEOCHEM 622, 1 (2003).
24 A.E. Soffers, M.G. Boersma, W.H. Vaes, J. Vervoort, B. Tyrakowska, J.L. Hermens, and I.M. Rietjens, Toxicol. In Vitro 15, 539.
25 M.M.C. Ferreira, J. Braz. Chem. Soc. 13, 742 (2002).
26 R. Kiralj and M.M.C. Ferreira, J. Braz. Chem. Soc. 20, 770 (2009).
27 M.O. Taha, A.M. Qandil, D.D. Zaki, and M.A. AlDamen, Eur. J. Med. Chem. 40, 701 (2005).
28 S. Miertuš, E. Scrocco, and J. Tomasi, Chem. Phys. 55, 117 (1981).
29 E. Runge and E.K.U. Gross, Phys. Rev. Lett. 52, 997 (1984).
30 1 (2015).
  • 29. 31 Y. Zhao and D.G. Truhlar, J. Phys. Chem. A 110, 13126 (2006).
32 J.P. Perdew, M. Ernzerhof, and K. Burke, J. Chem. Phys. 105, 9982 (1996).
33 K. Kim and K.D. Jordan, J. Phys. Chem. 98, 10089 (1994).
34 A.D. Becke, Phys. Rev. A 38, 3098 (1988).
35 S.H. Vosko, L. Wilk, and M. Nusair, Can. J. Phys. 58, 1200 (1980).
36 Y. Zhao and D.G. Truhlar, Theor. Chem. Acc. 120, 215 (2007).
37 R. Ditchfield, J. Chem. Phys. 54, 724 (1971).
38 J.A. Montgomery, M.J. Frisch, J.W. Ochterski, and G.A. Petersson, J. Chem. Phys. 110, 2822 (1999).
39 A.R. Katritzky, V.S. Lobanov, and M. Karelson, Chem. Soc. Rev. 24, 279 (1995).
40 L. Buydens, D.L. Massart, and P. Geerlings, Anal. Chem. 55, 738 (1983).
41 D.F.V. Lewis, C. Ioannides, and D.V. Parke, Xenobiotica 24, 401 (2008).
42 J.O. Rawlings, S.G. Pantula, and D.A. Dickey, editors, Applied Regression Analysis (Springer-Verlag, New York, 1998).
  • 30. A Supplemental Information
A.1 Sample output for simulation calculations using acetaldehyde
Ground-state geometry optimization
    SCF Done: E(RB3LYP) = -153.851761719 A.U. after 1 cycles
Non-equilibrium solvation
    No output for interpretation
Absorption calculation
    After PCM corrections, the energy is -153.687679826 a.u.
Single-point TD-DFT calculation
    No output for interpretation
Excited-state geometry optimization
    Total Energy, E(TD-HF/TD-KS) = -153.705918726
Emission calculation
    After PCM corrections, the energy is -153.707148980 a.u.
    SCF Done: E(RB3LYP) = -153.822024722 A.U. after 10 cycles
A.2 Total computation time utilized on SHARCNET for calculations
  • 31. A.3 Sample output of regression trial
Final regression output for optimized hydrogen series absorption
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.999150538
R Square             0.998301798
Adjusted R Square    0.994905395
Standard Error       14.1149689
Observations         10
ANOVA
             df    SS             MS          F           Significance F
Regression   6     351361.0959    58560.18    293.9291    0.000305546
Residual     3     597.6970412    199.2323
Total        9     351958.7929
             Coefficients     Standard Error   t Stat      P-value   Lower 95%        Upper 95%   Lower 95.0%   Upper 95.0%
Intercept    5071.433211      310.4157034      16.33755    0.0%      4083.551902      6059.315    4083.552      6059.314519
HOMO         -6896.995793     1150.644052      -5.99403    0.9%      -10558.85871     -3235.13    -10558.9      -3235.13288
DM           -1214.378491     117.1538478      -10.3657    0.2%      -1587.214321     -841.543    -1587.21      -841.5426608
RDM          4109.608812      402.0016826      10.22286    0.2%      2830.260042      5388.958    2830.26       5388.957582
ExHOMO       8141.875651      1263.507913      6.443866    0.8%      4120.829561      12162.92    4120.83       12162.92174
ExDM         2619.329506      213.6784301      12.25828    0.1%      1939.309376      3299.35     1939.309      3299.349637
ExRDM        -9058.120425     746.8661692      -12.1282    0.1%      -11434.98191     -6681.26    -11435        -6681.258944
RESIDUAL OUTPUT
Observation   Predicted Abs   Residuals        Abs. Error
1             377.5660811     14.72161536      14.72162
2             393.3729351     -8.212519227     8.212519
3             389.9451678     -1.893151418     1.893151
4             396.5349594     -2.730898376     2.730898
5             388.9116662     12.85973835      12.85974
6             389.2860815     -3.975836674     3.975837
7             382.7882077     -2.88556638      2.885566
8             957.9521865     2.373056185      2.373056
9             502.4186819     -10.36025899     10.36026
10            734.260612      0.103821168      0.103821
MAE                                            6.011646
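The summary statistics reported in A.3 can be cross-checked from the ANOVA sums of squares and the residual column. The following is a minimal Python sketch that uses only the numbers printed above; the variable names are illustrative, and the formulas are the standard ordinary least squares definitions rather than anything specific to the software used in the thesis.

```python
from math import sqrt

# Sums of squares and degrees of freedom copied from the ANOVA table in A.3.
ss_regression, df_regression = 351361.0959, 6
ss_residual, df_residual = 597.6970412, 3

# Residuals (nm) copied from the RESIDUAL OUTPUT table in A.3.
residuals = [14.72161536, -8.212519227, -1.893151418, -2.730898376,
             12.85973835, -3.975836674, -2.88556638, 2.373056185,
             -10.36025899, 0.103821168]

ms_regression = ss_regression / df_regression   # mean square, regression
ms_residual = ss_residual / df_residual         # mean square, residual

f_value = ms_regression / ms_residual           # F statistic
standard_error = sqrt(ms_residual)              # standard error of the fit (nm)
r_squared = ss_regression / (ss_regression + ss_residual)
mae = sum(abs(r) for r in residuals) / len(residuals)  # mean absolute error (nm)

print(round(f_value, 4), round(standard_error, 4),
      round(r_squared, 6), round(mae, 4))
```

The recomputed values agree with the reported F of 293.9291, standard error of 14.11 nm, R Square of 0.9983, and MAE of 6.01 nm. With only 3 residual degrees of freedom, the standard error is considerably larger than the MAE, which is worth keeping in mind when comparing the two error measures.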