Bioinformatics resources and search tools - report on summer training project institute of pharma inquest to explore advanced concepts of computational biology, sc

Report on Summer Training Project
To Explore Advanced Concepts of Computational Biology, Scientific
Communication and Pharmacovigilance
Project Done At
Institute of Pharma Inquest
SCO 1-12, 4th Floor, PPR Mall, Mithapur Road, Model Town, Jalandhar-144001
Under the Supervision of
Dr. Harpreet Kaur (Director)
and Miss Geetu (Associate Professor)
Submitted By
Vir Sapan Pratap Anand
Registration No. 11202530

JULY 30, 2015
Declaration
I hereby declare that I have completed my six-week summer training at the Institute of Pharma
Inquest (SCO 1-12, 4th Floor, PPR Mall, Mithapur Road, Model Town, Jalandhar-144001) on
the project “To Explore Advanced Concepts of Literature Search, Pharmacovigilance and
Computational Biology” from 15th June 2015 to 30th July 2015 under the guidance of Dr.
Harpreet Kaur (Director) and Miss Geetu (Associate Professor). I declare that I
worked with full dedication during these six weeks of training and that my learning outcomes fulfill
the requirements of training for the award of the degree of B.Tech. Biotechnology (Hons.) at Lovely
Professional University, Phagwara.
Signature of the Supervisor

ACKNOWLEDGEMENT
Any accomplishment requires the effort of many people, and this one is no different. As I come to
the completion of my project, I wish to express my gratitude to those who guided me at every
stage of my work and made me more practical. The work would not have reached its present
shape without the able guidance, supervision and help of a number of people. With a deep sense
of gratitude, I acknowledge the encouragement and guidance received from my organizational
guide. To begin with, I would like to extend my immense gratitude and respect to Dr. Himanshu,
H.O.D., Department of Bioscience and Biotechnology, Lovely Professional University, who gave
me this wonderful opportunity to work at such an institute. I am highly obliged to Dr. Harpreet
Kaur and Miss Geetu for their exemplary guidance and motivation, which strengthened my skills
in computational biology during the training period at the Institute of Pharma Inquest. I would
like to thank Dr. Sumit K Dalal, who guided us a great deal in clarifying our concepts in clinical
research, pharmacovigilance and job opportunities in this field. His guidance and encouragement
are truly valued.
I have been privileged to work under Dr. Harpreet and Ms. Geetu, to whom I will be forever
obliged. Their zeal and scientific attitude were highly contagious and put my efforts on the
right path. I am thankful to Mr. Akash for his guidance on our professional and
personality development. I am grateful for the cooperation of my lab mates Pooja Arora,
Sugandha Dubey, Amiya Das, Salma Khatun and Priyanka Misra.
Finally, I would like to thank my family, who have been there to inspire and support me at
every step. Without the blessings of my parents and almighty God, this
accomplishment would never have been possible. I thank them for all the love and faith they have
shown in me.
I regard this training as a major opportunity in my career development. I will apply the
knowledge and skills gained in the best possible way and will continue working to improve them
in order to attain my desired career objectives.
Vir Sapan Pratap Anand

Table of Contents
1. List of Figures
2. List of Abbreviations
3. CHAPTER 1 Objectives of the Study
4. CHAPTER 2 Literature Review
• Bioinformatics
• Literature Search
• Medical Writing
• Clinical Research
• Pharmacovigilance
• HAROM
5. CHAPTER 3 Scope of Study and Future aspects
6. CHAPTER 4 Conceptual Research and Techniques
7. Chapter 4.1 Bioinformatics and Computational Biology
• SOPMA
• PHD
• PSIPRED
• Swiss PDB Viewer
• JMOL
• UCSF Chimera
• POLYVIEW 3D
• Combinatorial Extension
• Mfold
8. Chapter 4.2 Literature Search
• PubMed
• ScienceDirect
• Cochrane Library
• OMIM
• ClinicalTrials.gov
9. Chapter 4.3 Essentials of Medical Writing
10. Chapter 4.4 Clinical Research and Pharmacovigilance
11. Chapter 4.5 Human Adverse Reaction Online Monitoring (HAROM) System
12. Results and Discussion
13. Conclusion
14. Bibliography

List of Figures
1. SOPMA Front Page
2. SOPM Result Page
3. Graphical Result of SOPM
4. Pasting protein sequence in PHD
5. PHD Results
6. PHD Graphical Results
7. PSIPRED front page. By default, the chosen prediction method is PSIPRED v3.3 (Predict secondary structure)
8. Protein sequence pasted in PSIPRED and short identifier name given
9. PSIPRED results showing alpha helix, beta sheets and disordered region in different colours
10. PSIPRED results showing confirmation regions and secondary structure prediction
11. SWISS-PDB toolbar menu
12. SWISS PDB control bar and layers info window information
13. SWISS PDB toolbar
14. SWISS PDB showing structure of a protein molecule
15. SWISS PDB tool bar, graphic window showing protein and control panel
16. SWISS PDB preferences menu
17. Changing the atomic view to ribbon form through the control panel
18. The menu bar in SWISS PDB
19. SWISS PDB tools
20. Front screen of JMOL page
21. Graphical image of insulin in JMOL with ID 1ZNI
22. Complete window of JMOL
23. Various menu bars in JMOL
24. Insulin in 3D view showing ligands along with the toolbar
25. CHIMERA graphical window showing 3D protein structure
26. Window showing selection of the A chain of hemoglobin and colouring it red
27. Chimera tool graphic window showing how to change the ribbon view into the atomic view of a protein
28. Polyview 3D home page. The step shows where to enter the PDB ID.
29. The figure shows the output window and settings used
30. Combinatorial Extension window
31. Comparison of protein molecules by CE
32. CE results after comparison
33. Mfold home page
34. Mfold output page
35. JPEG image of Mfold output (RNA structure)
36. Goals and objectives of literature search
37. Parts of PubMed results
38. PubMed search result window
39. PubMed Advanced Searching I
40. PubMed Advanced Searching II
41. PubMed showing MeSH window
42. Comparison of PubMed simple search and MeSH search
43. ScienceDirect search results
44. Cochrane Library results and abstract of an article on myocardial infarction
45. OMIM search results and article of interest
46. Search results for ‘Polio AND India’ in ClinicalTrials.gov
47. Types of Medical Writing
48. Job and Career Opportunities
49. Process describing the steps of clinical research
50. Log in page in HAROM
51. HAROM main page after logging in
52. A part of 16-15 page final Report view after entering the data

List of Abbreviations
Sr. No. Abbreviation Definition
1. ADE Adverse Drug Event
2. ADR Adverse Drug Reaction
3. BPO Business Process Outsourcing
4. CDSCO Central Drugs Standard Control Organization
5. CE Combinatorial Extension
6. CRA Clinical Research Associate
7. CRC Clinical Research Coordinator
8. CRF Case Report Form
9. CRM Clinical Research Manager
10. CRO Contract Research Organization
11. DNA Deoxyribonucleic Acid
12. FDA Food and Drug Administration
13. GCP Good Clinical Practice
14. HAROM Human Adverse Reaction Online Monitoring
15. JMOL Java Molecule
16. JPEG Joint Photographic Experts Group
17. KPO Knowledge Process Outsourcing
18. MeSH Medical Subject Headings
19. mmCIF Macromolecular Crystallographic Information File
20. NCBI National Center for Biotechnology Information
21. NIH National Institutes of Health
22. NLM National Library of Medicine
23. NMR Nuclear Magnetic Resonance
24. OCA Observed Clinical Activity
25. OMIM Online Mendelian Inheritance in Man
26. PD Pharmacodynamics
27. PDB Protein Data Bank
28. PI Principal Investigator
29. PK Pharmacokinetics
30. PMS Post Marketing Surveillance
31. PV Pharmacovigilance
32. RCSB Research Collaboratory for Structural Bioinformatics
33. RNA Ribonucleic Acid
34. SOP Standard Operating Procedure
35. SOPMA Self-Optimized Prediction Method with Alignment
36. UCSF University of California, San Francisco
37. WHO World Health Organization

CHAPTER 1 Objectives of the Study
The main aim of the project was to “Explore advanced concepts of Literature search,
Pharmacovigilance and Computational Biology”. A detailed understanding of the Human Adverse
Reaction Online Monitoring (HAROM) System software played a vital role in shaping the
project.
Bioinformatics is an interdisciplinary research area at the interface between computer
science and biological science. It involves the technology that uses computer for storage,
retrieval, manipulation and distribution of information related to biological macromolecules such
as DNA, RNA and Proteins. The ultimate goal of bioinformatics is to better understand a living
cell and how it functions at the molecular level. By analyzing raw molecular sequence and
structural data, bioinformatics research can generate new insights and provide a “global”
perspective of the cell.
Biological databases play a key role in bioinformatics. They offer scientists the sequence and
structural data of a vast number of organisms. In
this project I have discussed a variety of databases that offer references and abstracts on life
sciences and biomedical topics. Most large resources or databases have structured information
content. We can base our searching on such a structure and retrieve focused information.
Literature search is the foundation of any research activity. It helps one stay updated in one’s field
or get answers to specific research questions. The main literature databases that I have
included in this project are PubMed, ScienceDirect, Cochrane Library and OMIM.
The project work also covers the concepts of clinical research and
pharmacovigilance in detail, along with the range of job and career opportunities available. In
recent years pharmacovigilance has gained significance in supporting effective clinical
practice and public health. It plays a vital role in ensuring that doctors and
patients have enough information to make a decision when selecting a drug for
treatment. In some nations, Adverse Drug Reactions (ADRs) rank among the top 10 leading causes
of mortality. In order to reduce harm to patients and thus improve public health, it is essential to
evaluate and monitor the safety of medicines. Individual Case Safety Reports (ICSRs) of patients
who suffered adverse reactions to various drugs can be formulated, tabulated and
studied using the Human Adverse Reaction Online Monitoring (HAROM) system.
The present report also focuses on the importance of medical writing and its various
types. Rudyard Kipling said, “Words are, of course, the most powerful drug used by mankind”,
which reminds us that writing has the power to change attitudes and performance. Medical
writing thus carries a serious burden of responsibility.

CHAPTER 2 Literature Review
Topics to be covered
I. Essentials of Bioinformatics and Computational Biology
II. Literature Search
III. Medical Writing
IV. Clinical Research
V. Pharmacovigilance
VI. Human Adverse Reaction Online Monitoring (HAROM) System
I. Essentials of Bioinformatics and Computational Biology
Bioinformatics, the application of computational techniques to examine the information linked
with biomolecules on a large scale, has now firmly established itself as a discipline in
molecular biology, and includes a broad range of subject areas from structural biology and genomics
to gene expression studies. In this review, I introduce various concepts for using bioinformatics in
medicine and biomedical informatics. Biological data are being produced at a
phenomenal rate. The roots of bioinformatics go back more than 100 years, to when Gregor Johann
Mendel, an Austrian monk, cross-fertilized differently coloured flowers of the same species and
demonstrated that the inheritance of traits could be more easily explained if it was controlled by
factors passed down from generation to generation. Since Mendel, bioinformatics and genetic
record keeping have come a long way.
Bioinformatics is an interdisciplinary research area at the interface between computer
science and biological science. It involves technology that uses computers for the
storage, retrieval, manipulation, and distribution of information related to biological
macromolecules such as DNA, RNA and proteins. The ultimate goal of bioinformatics is to
better understand a living cell and how it functions at the molecular level. By analyzing raw
molecular sequence and structural data, bioinformatics research can generate new insights and
provide a “global” perspective of the cell. The functions of a cell can be better
understood by analyzing sequence data because the flow of genetic information
is dictated by the “central dogma” of biology, in which DNA is transcribed to RNA, which is
translated to proteins. Cellular functions are mainly performed by proteins whose capabilities are
ultimately determined by their sequences. Therefore, solving functional problems using sequence
and sometimes structural approaches has proved to be a fruitful endeavor.
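The flow of information described above (DNA transcribed to RNA, RNA translated to protein) can be illustrated with a minimal Python sketch. This is not taken from any tool used in the project: the function names, the example sequence and the deliberately tiny codon table are assumptions chosen for illustration, and only five codons of the standard genetic code are included.

```python
# Minimal sketch of the central dogma: DNA -> RNA -> protein.
# CODON_TABLE is intentionally incomplete (illustrative assumption);
# a real tool would use the full 64-codon standard genetic code.
CODON_TABLE = {
    "AUG": "M",  # methionine (start codon)
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCA": "A",  # alanine
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence, not a real gene.
example_dna = "ATGTTTGGCGCATAA"
example_mrna = transcribe(example_dna)
example_protein = translate(example_mrna)
```

Here the protein string is fully determined by the DNA string, which is the sense in which sequence analysis can shed light on function.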
Important Applications of Bioinformatics
1. Genomic and Molecular biology research
2. Computational studies of protein ligand interaction provide a rational basis for the rapid
identification of novel leads for synthetic drugs.
3. Informatics based approach significantly reduces the time and the cost necessary to
develop drugs with higher potency, fewer side effects, and less toxicity than using the
traditional trial and error approach.
4. Genomics and bioinformatics are now poised to revolutionize our health care system by
developing personalized and customized medicine.
5. Bioinformatics tools are being used in agriculture as well. Plant genome databases and
gene expression profile analyses have played an important role in the development of
new crop varieties.
Currently, bioinformatics is carried out by a specialized group of individuals, such as
database curators, database and software engineers, and computational biologists. On the edge of
this group are the collaborative entities of biologists, mechanical or electrical engineers (bioengineers),
computer scientists, and mathematicians. The majority of biologists, however, are at the
other end of the spectrum in that they are users of the most basic bioinformatics tools. This can
be seen as the major limitation of bioinformatics today: it is simply not as accessible to most
biologists as it should be. In the future, the distribution of people across this
spectrum is expected to change to a bell curve, where the majority of biologists will have some basic skills
such as programming, database development and management of large datasets, and quantitative
and statistical analysis of data.
II. Literature Search
In the present era of information and technology, staying up to date with the latest advances in
biomedical sciences is a major challenge for clinical practitioners. Because the amount of
biomedical information doubles every five years, clinicians must have free and easy access to the
current literature databases for effective evidence-based clinical decision-making.
Traditionally, there have been several systems available that condense and dispense
medical knowledge in easily absorbable forms (e.g. medical and dental textbooks and dictionaries).
However, these are frequently based on the synopsis and ideas of established experts and may
not be refreshed with current information. In our day-to-day practice, we often come across a
single and specific clinical problem that may be explained well in a single article. Until recently,
the problem for many clinicians has been accessing this information. The World Wide Web or
Internet has resolved these dilemmas to a large extent. Its rapid growth has created a boom in the
field of biomedical investigation and research, although there is a long way to go before its full
potential is realized.
Searching biomedical literature is a very organized and specific procedure. It requires
systematic planning so as to develop a well-constructed clinical question or precise keyword.
Unplanned and messy efforts may result in the retrieval of several apparently irrelevant articles,
thus discouraging the professional from looking further. Web-based search engines are tools designed
specifically to search for information in the form of images, databases, journals and dictionaries. As
these search engines are computer operated, they mostly search algorithmically. With these
points in mind, it is very important to understand how to access the information that is
being searched for. There are several types of search engines, such as:
• Internet search engines
• Internet-Based Bibliographic Databases
• Indian Internet-Based Bibliographic Databases
• Other Non-Government Internet-Based Bibliographic Resources
• Commercial Web-Based Resources: Scholarly Research Databases

III. Medical Writing
Medical writing is the activity of producing scientific documentation by a specialized writer.
The medical writer is typically not one of the scientists or doctors who performed the research,
but the scientific information in these documents needs to be presented to suit the level of
understanding of the target audience, namely patients or the general public, physicians or
regulators. The medical writer translates scientific data into plain language because, let's face it,
lay people, like the general public and many health professionals, need help. With all the
groundbreaking discoveries in medicine, and significant advances in the health care industry,
somebody has to take the technical material and transform it into something that resembles
plain English so the rest of us can understand what all the fuss is about. CROs conduct clinical
studies and help pharmaceutical companies to get their products registered with international
regulatory authorities. In general, medical writers in CROs are involved in preparing a range
of documents for these regulatory submissions, including protocols and final reports for clinical
trials, and clinical expert reports. They may also be involved in the preparation of manuscripts
for publication in medical journals. The situation is generally similar in a pharmaceutical
company, with medical writers preparing documents for submission to the regulatory authorities
and manuscripts for publication. However, depending on the company, they may also be
involved in other writing projects such as training manuals, promotional material for marketing
purposes, and websites.
IV. Clinical Research
Clinical research is a branch of healthcare science that determines the safety and effectiveness of
medications, devices, diagnostic products and treatment regimens intended for human use. These
may be used for prevention, treatment, diagnosis or for relieving symptoms of a disease. Clinical
trials are used to test new drugs and therapies on human volunteers. Today, such investigations
are carried out using protocols that adhere to accepted standards of safety, patient care and data
interpretation. However, history shows that patient welfare was not always such a high priority.
The first clinical trial of a novel therapy was conducted unintentionally by the Renaissance
surgeon Ambroise Paré in 1537. He used a concoction of turpentine, rose oil and egg yolk to
prevent the infection of battlefield wounds, noting that the new treatment was much more
effective than the traditional formula. From 1800 onwards, clinical trials began to proliferate and
more attention was paid to study design. Placebos were first used in 1863, and the idea of
randomization was introduced in 1923. The first trial using properly randomized treatment and
control groups was carried out in 1948 by the Medical Research Council, and involved the use of
streptomycin to treat pulmonary tuberculosis. This trial also featured blind assessment (where
neither the researchers nor the patients knew which treatment group each patient was in at the
time of the study) enabling unbiased analysis of the results. Since 1945, the ethical impact of
clinical trials has become increasingly important, resulting in strict regulation of medical
experiments on human subjects. Clinical trials have thus evolved into a standard procedure,
focusing on patient safety and requiring informed consent from all participants. There will
always be a balance between medical progress and patient safety, and the regulation of clinical
trials helps to ensure that this balance is acceptable.
Today’s techniques are increasingly targeting new sources of information from
patients, recognizing the uniqueness of individual subjects and producing massive quantities of
data. At this point, the clinical trials toolbox has grown so extensively that the research
community is no longer limited by the tools themselves — we are limited instead by our ability
to combine, deploy and manage these complex tools in ways that enable us to drive innovation.
Indeed, the drive to innovate has prompted researchers and regulators to explore novel and more
complicated ways to investigate promising new products, yielding trial designs that are faster,
more flexible and more targeted. In many respects, the future of clinical research lies in the
successful completion of these complex clinical trials, many of which require the simultaneous
development of novel combinations, including drug/biologic, drug/diagnostic, drug/device, etc.
V. Pharmacovigilance
Pharmacovigilance is the science and activities relating to the Detection, Assessment,
Understanding and Prevention of adverse drug reactions or any other possible drug related
problems. The ultimate goal of this activity is to improve the safe and rational use of medicines,
thereby improving patient care and public health. As a means of pooling existing data on adverse
drug reactions (ADRs), the World Health Organization (WHO) Programme for International
Drug Monitoring was started in 1968. Currently, 86 countries participate in the programme,
which is coordinated by WHO together with its collaborating centre in Uppsala, Sweden. The
origin of pharmacovigilance in India goes back to 1986, when a formal adverse drug
reaction (ADR) monitoring system consisting of 12 regional centers, each covering a population of 50
million, was proposed for India. The National Pharmacovigilance Program, established in
January 2005, was to be overseen by the National Pharmacovigilance Advisory Committee based
in the Central Drugs Standard Control Organization (CDSCO), New Delhi. Beneficial and
harmful properties of medical remedies have been known to mankind for thousands of years. In
more recent times, serious adverse reactions associated with medical products resulted in the
evolution of regulatory changes and an effort to discover drug safety issues as early as possible.
Pharmacovigilance is the practice used by sponsors and regulatory bodies to detect harmful
effects associated with medical products to identify potential risks and enable warnings to reach
physicians in a timely manner. There must be an evolution in the current mindset to understand
patients as people. Human beings are complicated; there are numerous factors that influence how
they make decisions, especially with something as significant as medication. No longer should
organizations view a missed dose of medication as the fault of a “naïve” or “forgetful” patient,
but rather as an explicit choice based on numerous factors. It is atypical for consumers to just
accept what their doctors say without question. Instead, they leverage a variety of environmental
factors to make healthcare decisions. PV organizations must transform themselves in a way that
enhances transparency and trust to ensure future relevance. PV organizations must realize the
differentiating value of “safety as a brand.” PV organizations that address these imperatives will
be better positioned over the long term to simultaneously enhance patient safety while
reinforcing the R&D value chain with new technologies and processes that drive sustainable
revenue growth.

VI. Human Adverse Reaction Online Monitoring (HAROM) System
The Human Adverse Reaction Online Monitoring (HAROM) system is used to analyze the
individual case reports of patients affected by adverse drug reactions (ADRs). An
adverse drug event (ADE) is any injury resulting from the use of a drug. The high prevalence of
ADEs and their negative consequences for the patient, the healthcare provider and the economy
have prompted research to identify their causes so that they may be prevented.
Unfortunately study findings can be difficult to compare and interpret because different
definitions for drug related incidents are commonly used. Case studies are in-depth
investigations of a single person, group, event or community. Typically data are gathered from a
variety of sources and by using several different methods (e.g. observations & interviews).
Research may also continue for an extended period of time so processes and developments can
be studied as they happen. The HAROM system was developed because there was a pressing need
for online monitoring and examination of individual case studies of adverse events, so that
a complete report of the study can be prepared in as little time as possible and forwarded to higher
regulatory bodies.

Chapter 3 Scope of Study and Future aspects
Computational Biology and Bioinformatics: The term ‘bioinformatics’ is the short
form of ‘biological informatics’, just as biotechnology is the short form of ‘biological
technology’. Clearly, a number of divergent areas, many of them outside biotechnology, come
under bioinformatics. Biotechnology is the buzzword of the current times. It is necessary that
bioinformatics is viewed in the proper perspective in order to reap the rich benefits that accrue
out of it. In fact, serious efforts should be made to place even biotechnology in a rational
perspective. Awareness is the key to a successful deployment of both bioinformatics and
biotechnology, in enhancing the well being of people, animals and the environment. This effort
should essentially begin with biotechnology education. With several divergent claimants, it is
rather difficult to decide which areas of knowledge and information genuinely constitute
bioinformatics.
It may be helpful to identify areas that are not normally considered as bioinformatics, as for
example,
a) structure determination by crystallography and NMR,
b) ecological modelling of populations of organisms,
c) genome sequencing methods (genetic mapping),
d) radiological image processing (human structure scans),
e) artificial life simulation such as artificial immunology and life security,
f) organism phylogenies based on non-molecular data,
g) computerised diagnosis based on genetic analysis (pedigrees), and
a few others, though all these constitute computer processing of biological data.
In Genomics and Proteomics: Genomics is an important area of modern biology, where the
nucleotide sequences of all the chromosomes of an organism are mapped and thereby the
location of different genes and their sequences are determined. Genomics involves extensive
analysis of nucleic acids through molecular biological techniques, before the data are ready for
processing by computers.
Proteomics involves the sequencing of amino acids in a protein, determining its three-
dimensional structure and relating it to the function of the protein. Before computer processing
comes into the picture, extensive data, particularly through crystallography and NMR, are
required for this kind of a study. With such data on known proteins, the structure and its
relationship to function of newly discovered proteins can be understood in a very short time. In
such areas, bioinformatics has an enormous analytical and predictive potential.
In Drug Designing: Drug design through bioinformatics is one of the most actively pursued
areas of research. Several therapeutically active compounds are synthetic. Over a period of
time, synthetic organic chemists have realised that it is no longer easy or possible, to
continuously conceptualise new structures. The alternative is to use natural products with a
desired and known activity and to use them directly or to structurally modify them for improved
performance and lower levels of side effects. In this context, the natural products are of great
importance to the field of drug design.

In Glycomics: The structures of the monosaccharides, and their number and sequences in
polysaccharides, are all genetically determined, as for nucleic acids and proteins. While four
nucleotides offer only 64 triplet codes, the carbohydrates offer 34,625 combinations. As
ever more biological roles of carbohydrates are discovered, glycobiology is a rapidly
expanding area of biological research. Glycomics, the application of bioinformatic procedures to
carbohydrate research, is the future field of bioinformatics.
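The figure of 64 triplet codes follows from simple counting: with four nucleotides and three positions per codon there are 4 × 4 × 4 = 64 possible triplets. The short Python sketch below enumerates them to confirm the count. It covers only the nucleotide figure; the carbohydrate figure quoted above depends on assumptions about monomer types and linkages that the text does not state, so it is not reproduced here.

```python
from itertools import product

# The four RNA nucleotides; each codon is an ordered triplet of them.
NUCLEOTIDES = "ACGU"

# Enumerate every ordered triplet: 4 choices at each of 3 positions.
triplets = ["".join(bases) for bases in product(NUCLEOTIDES, repeat=3)]
triplet_count = len(triplets)
```

The same counting argument gives 4 to the power n possible sequences for a nucleic acid of length n.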
In Molecular phylogeny: Phylogeny is the origin and evolution of organisms. With an
estimated four million organisms, though not even a quarter of them are currently known to
science, it is necessary that they are properly classified and named. It will be of great advantage
to understand the genetic and evolutionary relationships of organisms, in order to use them in a
profitable manner, in biotechnology and elsewhere. Biologists have constructed very elegant
systems of classifications for the known organisms, though problems persist. All this
commendable work, with over three centuries of history, was done using externally visible,
structural, chemical or functional attributes of organisms. This constitutes the field of
taxonomy, which is called systematics when the theory of organic evolution is applied to it. With
the advancements in molecular biology, biologists have used data from the genetic material to
characterise organisms and to verify their classification and relationships, inferred on the basis of
other evidence. Since it is impractical to use entire genomes for this purpose, nucleotide
sequences of genes in the genomes from the mitochondria and chloroplasts are used. These
nucleotide sequences are compared using complex computer software. Extensive work has been
carried out this way, comparing a very large number of plant and animal species. A
number of systematists would benefit if bioinformaticians provided them with computer-based
services to analyse their systematic data.
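As a rough illustration of what comparing nucleotide sequences involves, the sketch below computes the proportion of differing sites (often called the p-distance) between pairs of aligned sequences. It is a deliberately simplified stand-in, not the method of any particular phylogeny package: real software also performs sequence alignment, applies substitution models and builds trees. The taxon names and sequences are invented for the example.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# Hypothetical aligned gene fragments from three organisms.
sequences = {
    "taxon_A": "ATGCCGTA",
    "taxon_B": "ATGCCGTT",
    "taxon_C": "TTGACGTT",
}

# Pairwise distance table; smaller values suggest closer relationship.
distances = {
    (x, y): p_distance(sequences[x], sequences[y])
    for x, y in combinations(sequences, 2)
}
```

In this toy data set taxon_A and taxon_B differ at the fewest sites, so a distance-based method would group them together first.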
Literature Search: The goal of searching the literature is to find the right facts and the right
references. In that endeavor, getting 100,000 hits is not better than retrieving 50,000 hits, when
there were only 100 documents that were actually relevant. This means you need a search engine
that covers the relevant content and enables you to quickly home in on the papers you need. It is
like finding a needle in a haystack; more hay just makes the task harder and longer. When you
do a literature search for the purpose of finding information, such as what genes are associated
with a particular process, extracting the desired information can be another pain point. The easier
the search engine makes that step, the better for you. Once you get a list of hits, the next thing
most of us do is peruse the results to see if the search indeed found relevant documents. You
need a search engine that makes this easy since this can be a time-consuming task.
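The point about hit counts can be made concrete with the standard information-retrieval measure of precision, the fraction of retrieved documents that are relevant. The sketch below uses the numbers from the paragraph above and assumes, for illustration only, that both searches retrieve all 100 relevant documents.

```python
def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Fraction of retrieved documents that are actually relevant."""
    if total_retrieved <= 0:
        raise ValueError("total_retrieved must be positive")
    return relevant_retrieved / total_retrieved


# Assumption: each search returns all 100 relevant documents.
RELEVANT = 100
broad_search = precision(RELEVANT, 100_000)   # 100,000 hits
narrow_search = precision(RELEVANT, 50_000)   # 50,000 hits
```

Under this assumption the narrower search has twice the precision of the broader one, yet in both cases far fewer than 1% of the hits are relevant, which is why the larger hit count is no advantage.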
PubMed: The past decade has witnessed the modern advances of high-throughput technology
and rapid growth of research capacity in producing large-scale biological data, both of which
were concomitant with an exponential growth of biomedical literature. This wealth of scholarly
knowledge is of significant importance for researchers in making scientific discoveries and
healthcare professionals in managing health-related matters. However, the acquisition of such
information is becoming increasingly difficult due to its large volume and rapid growth. In
response, the National Center for Biotechnology Information (NCBI) is continuously making
changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted
themselves to developing Web tools for helping users quickly and efficiently search and retrieve
relevant publications. These practices, together with maturity in the field of text mining, have led
to an increase in the number and quality of various Web tools that provide comparable literature
search service to PubMed.
ScienceDirect: ScienceDirect is best described as a "megasource" for over one million online
articles, dated 1995 to present, which are available from more than 1,100 journals from Elsevier
companies and other participating publishers.
Cochrane Library: The Cochrane Library is a collection of databases that contain high-quality,
independent evidence to inform healthcare decision making. Cochrane Reviews represent the
highest level of evidence on which to base clinical treatment decisions. The Cochrane Library
consists of seven databases and is used by a broad range of people interested in Evidence-Based
Health Care, including consumers, clinicians, policy-makers, researchers, educators, students and
others.
OMIM: The database “Online Mendelian Inheritance in Man”, or OMIM, is a catalogue of
diseases observed to be Mendelian, that is, inherited from parent to child within families. The
database contains 14,359 (as of September 06, 2013) genes that have been linked to Mendelian
genetic disorders. In addition to information on genetic disorders, the database contains external
links to other databases such as NCBI HomoloGene and Reactome for finding genetic
homologues of, or cellular pathways involving, the gene in question. Currently, OMIM is
managed by Dr. Ada Hamosh of Johns Hopkins University, and is a collaboration between
Johns Hopkins, the McKusick-Nathans Institute of Genetic Medicine, and the National Human
Genome Research Institute.
Medical Writing: Medical writing is about communicating clinical and scientific data and
information to a range of audiences in a wide variety of different formats. Medical writers
combine their knowledge of science and their research skills with an understanding of how to
present information and pitch it at the right level for the intended audience. We can find medical
writers in pharmaceutical companies, contract research organizations (CROs), and
communications agencies. CROs conduct clinical studies and help pharmaceutical companies to
get their products registered with international regulatory authorities. In general, medical writers
in CROs are involved with preparing a range of documents for these regulatory submissions,
including protocols and final reports for clinical trials, and clinical expert reports. They may also
be involved in the preparation of manuscripts for publication in medical journals. The situation is
generally similar in a pharmaceutical company with medical writers preparing documents for
submission to the regulatory authorities and manuscripts for publication. However, depending on
the company, they may also be involved in other writing projects such as training manuals,
promotional material for marketing purposes, and websites. Writers in communications agencies
generally prepare manuscripts for publication, items for conferences (e.g. posters, abstracts, and
slide presentations), and promotional items for pharmaceutical marketing, training material, and
multimedia (e.g. websites). The work in this environment is likely to be more creative. A
medical writer creates texts for a variety of purposes, including reports, press releases, journals,
articles and advertisements for government agencies and institutions, pharmaceutical and biotech
companies, medical institutions and associations, universities, and medical publishing firms.
Medical writing can offer a varied and interesting career with many job
opportunities for individuals with a sound knowledge of medical terminology, superior writing
skills, and a commitment to meeting deadlines.
Clinical Research: Clinical research is a branch of medical science. It is related to the
effectiveness and testing of medications, diagnostic products, medical devices and treatment
procedures for human use. These can be used for the treatment, prevention or diagnosis of a disease, or
for relief. Clinical research covers the full body of study on biologics, drugs and
devices. It is a fast-growing discipline in India, as the country’s large population and lower costs
allow multinationals to establish research facilities here. There are also many widespread diseases
and ailments in the country, which makes it important for the pharmaceutical field to conduct clinical
trials here. India has been involved in clinical research for many years and is now
on its way to becoming a major hub for it. The billion-dollar industry is already witnessing high
demand for qualified professionals in
this fast-growing field. Clinical research makes an interesting career option with great scope
for professional growth. To build a career in clinical research, basic education in this field is
necessary. Clinical research is a scientific analysis of the impact, risks, benefits and efficacy of
medicines or a medicinal product. These trials are done before the launch of the products in the
market. The tests are undertaken at different stages and after-launch surveys are held to supervise
the safety and monitor the side effects of large-scale use. The contract research organizations
(CRO) or pharmaceutical companies carry out the tests. It is a good time to become part of this
fast evolving industry. The field is categorized into various branches. The common entry-level
roles include the position of Clinical Research Associate (CRA). A CRA is a key participant in
the planning, execution and supervision of trials. CRAs are responsible for
activities such as monitoring complex clinical trials and ensuring that clinical
practices are properly followed. They also assist in preparing manuscripts and presentations
for scientific meetings and technical journals, and attend meetings and training programs.
Clinical research also includes biostatistics. Biostatisticians carry out statistical design,
programming, statistical evaluation and summary-writing for clinical trial projects. They are
additionally responsible for the submission of Biologics License Applications and New Drug
Applications. Another good post in this industry is that of Clinical Research Manager (CRM).
CRMs manage the planning and writing of protocols, informed consent forms and case report
forms for trials. They make sure the case report forms are regularly reviewed and handed over to
the data management group. Similar positions are Clinical Research Coordinator, Clinical
Research Investigator, Business Development Manager and Clinical Data Manager. With
multinational companies establishing their research facilities in India, this industry is
likely to grow rapidly. As per a report, there are over 50,000 clinical research jobs in
the country. What is needed is qualified professionals.
Pharmacovigilance: The scope of pharmacovigilance continues to broaden as the array of
medicinal products grows. There is a realization that drug safety is more than the monitoring,
detection and assessment of ADRs occurring under clearly defined conditions and within a
specific dose range. Rather, it is closely linked to the patterns of drug use within society.
Problems resulting from:
• irrational drug use
• overdoses
• polypharmacy and interactions
• increasing use of traditional and herbal medicines with other medicines
• illegal sale of medicines and drugs of abuse over the Internet
• increasing self-medication practices
• substandard medicines
• medication errors
• lack of efficacy
are all within the domain of pharmacovigilance. Current systems need to evolve in order to
address this broad scope adequately. Another aspect of the broadened scope is the lack of clear
boundaries between:
• blood products
• biologicals
• medical devices
• cosmetics
• food additives
• vaccines.
India has become a major hub for clinical trials. Despite this, several problems related to
the availability, distribution, accessibility and consumption of drugs still remain.
Pharmacovigilance is the discipline that takes care of such aspects and is concerned with
identifying, validating, quantifying and evaluating adverse reactions associated with the use of
drugs, thereby improving the safety of medicines in use. The Institute of Clinical Research, India,
offers a one year weekend post graduate diploma in Pharmacovigilance. It deals with all aspects
right from the basic principles of pharmacovigilance to pharmacovigilance in clinical research,
setting up a pharmacovigilance centre in industry, management of pharmacovigilance data, risk
management in pharmacovigilance and pharmacoepidemiology. The objective is to develop
candidates who are experts at monitoring the adverse effects of the drugs which have been
released in the market. Their goal is to establish safety of the drug and oversee the well-being of
the consumers.

CHAPTER 4 Conceptual Research and Techniques
I. Bioinformatics and Computational Biology
Tools to predict secondary structure, and visualization tools
II. Literature Searching
1. PubMed
2. ScienceDirect
3. Cochrane Library
4. OMIM
5. ClinicalTrials.gov
IV. Clinical Research and Pharmacovigilance
V. HAROM

Chapter 4.1 Bioinformatics and Computational Biology
Protein Structure Prediction Techniques and Visualization
1. Secondary structure analysis of a protein using SOPMA
2. PHD Secondary Structure Prediction Method
3. The PSIPRED Protein Sequence Analysis
4. Swiss Pdb Viewer
5. JMol
6. UCSF Chimera
7. POLYVIEW-3D
8. Combinatorial Extension (CE) toolbar
For RNA Visualization:
9. Mfold web server for nucleic acid folding and hybridization prediction
_________________________________________________________
1. Secondary structure analysis of a protein using SOPMA
The Self-Optimized Prediction Method with Alignment (SOPMA) is a tool for predicting the
secondary structure of a protein from a query, the protein's primary sequence. SOPMA
uses the homologue method of Levin et al.
According to this method, short homologous sequences of amino acids tend to form similar
secondary structures. SOPMA therefore relies on a database consisting of 126 chains of non-homologous
proteins. When the user enters an unknown protein, it is searched against the proteins in
the database that share similar properties and evolutionary history.
Figure 1. SOPMA Front Page

(URL:: https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html)
We can paste the protein sequence into the text box provided. By default the output
width is 70, which means that the output shows up to 70 amino acids on each line. We can change
the output width if we want. The parameters include options such as ‘Number of conformational
states’, ‘Similarity threshold’ and ‘Window width’. The user can set ‘Number of
conformational states’ to either ‘3 (Helix, Sheet, Coil)’ or ‘4 (Helix, Sheet, Turn, Coil)’. The
former predicts the percentage of helix, sheet and coil structure, while the latter predicts the
percentage of helix, sheet, turn and coil.
• Since the output width was set to 70, the result page shows 70 amino acids and the corresponding
predicted structures on each line.
• The sequence length is also displayed; here it is 120.
• The percentage of each structure is also listed on this page, for example α helix 20.83%.
Fig:2 SOPMA Result Page
Results:
Alpha helix: 20.83%
Extended Strand: 21.67%
Beta turn: 10.83%
Random Coil: 46.67%
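
The percentages above are simply per-state residue counts divided by the sequence length. The Python sketch below shows that calculation on a per-residue state string. The one-letter codes (h, e, t, c) and the example string are assumptions made for illustration; the example is built to have the same state counts as the reported result (25, 26, 13 and 56 residues out of 120), not copied from the actual SOPMA output.

```python
from collections import Counter

# One-letter state codes assumed here for illustration.
STATE_NAMES = {
    "h": "Alpha helix",
    "e": "Extended strand",
    "t": "Beta turn",
    "c": "Random coil",
}


def state_percentages(states: str) -> dict:
    """Return the percentage of residues in each predicted state."""
    if not states:
        raise ValueError("state string must not be empty")
    counts = Counter(states.lower())
    unknown = set(counts) - set(STATE_NAMES)
    if unknown:
        raise ValueError(f"unexpected state codes: {sorted(unknown)}")
    total = len(states)
    return {
        name: round(100 * counts[code] / total, 2)
        for code, name in STATE_NAMES.items()
    }


# Hypothetical 120-residue prediction: 25 helix, 26 strand, 13 turn, 56 coil.
example = "h" * 25 + "e" * 26 + "t" * 13 + "c" * 56
for name, pct in state_percentages(example).items():
    print(f"{name}: {pct}%")
```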

Fig.3 Graphical Result of SOPMA
There are two graphs shown on the result page of SOPMA. The first visualizes the
prediction. The second contains score curves for all predicted states. The page also shows the parameters
used for the prediction, such as window width and number of states. It provides a link to the
prediction result file, which gives the result in text format. There are also links to the
intermediate result files.
Key points:
I. A new method called the self optimized prediction method (SOPM) has been
developed to improve the success rate in the prediction of the secondary
structure of proteins.
II. This new method has been checked against an updated release of the Kabsch
and Sander database, 'DATABASE.DSSP', comprising 239 protein chains.
III. The first step of the SOPM is to build sub-databases of protein sequences and
their known secondary structures drawn from 'DATABASE.DSSP' by
i) making binary comparisons of all protein sequences and
ii) taking into account the prediction of structural classes of proteins.
IV. The second step is to submit each protein of the sub-database to a secondary
structure prediction using a predictive algorithm based on sequence similarity.
V. The third step is to iteratively determine the predictive parameters that
optimize the prediction quality on the whole sub-database.
VI. This new method correctly predicts 69% of amino acids for a three-state
description of the secondary structure (alpha helix, beta sheet and coil) in the
whole database (46,011 amino acids).
VII. The correlation coefficients are C alpha = 0.54, C beta = 0.50 and Cc = 0.48.
Root mean square deviations of 10% in the secondary structure content are
obtained. Implications for the users are drawn so as to derive an accuracy at the
amino acid level and provide the user with a guide for secondary structure
prediction.
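
The two quality measures quoted in points VI and VII can be illustrated with a small Python sketch. It assumes that the percentage of correctly predicted residues is the usual three-state accuracy (often called Q3) and that the per-state correlation coefficients are Matthews correlation coefficients; the report does not state which formula was used. The function names and the toy sequences are invented for illustration.

```python
import math


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100 * matches / len(observed)


def state_correlation(predicted: str, observed: str, state: str) -> float:
    """Matthews correlation coefficient for one state (e.g. 'H' for helix)."""
    tp = tn = fp = fn = 0
    for p, o in zip(predicted, observed):
        if p == state and o == state:
            tp += 1          # state predicted and observed
        elif p == state:
            fp += 1          # state predicted but not observed
        elif o == state:
            fn += 1          # state observed but not predicted
        else:
            tn += 1          # state neither predicted nor observed
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data, invented for illustration (H = helix, E = strand, C = coil).
observed = "HHHHCCEEEECC"
predicted = "HHHCCCEEECCC"
print(f"Q3 = {q3(predicted, observed):.1f}%")
print(f"C_helix = {state_correlation(predicted, observed, 'H'):.2f}")
```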
2. PHD Secondary Structure Prediction Method
The PHD method uses evolutionary information from multiple sequence alignments in a multi-level
system of neural networks. According to the authors, the average accuracy of the PHD method is
greater than 72%. A neural network is a machine learning approach that gives
computational processes the ability to “learn”, in an attempt to simulate the complex patterns of
synaptic connections formed among neurons in the brain during learning. Computers are trained
to recognize patterns in known secondary structures using training sets of non-homologous
structures, and are then tested with proteins of known structure. PHD is one commonly used
neural network method. Its improved prediction accuracy is attributed to its ability to
align the query sequence with other related proteins of the same family and to find family members
with known structures to aid its assignment of secondary structures. While neural networks can
detect interactions between amino acids within a window of amino acids, they have
great difficulty in dealing with variable-length motifs, because the input layer is typically a rigid
structure with a fixed number of cells, accepting sequences of only one length class.
Furthermore, neural nets are designed as black-box methods. While the weights of a weight
matrix are usually known to the user and lend themselves to a physical interpretation, the
parameters of a neural net are hidden and not meant to be biologically interpretable or of interest
to the user. Thus, while neural nets may be very powerful function prediction tools, they usually
do not tell us anything about the underlying molecular recognition process. PHD was the first
program to use evolutionary information derived from aligned homologous sequences. It is based
on a two-layered feed-forward neural network. Aligned homologous
sequences of known structure are used to "train" the network, which can then be used to predict
the secondary structure of the aligned sequences of the unknown protein.
• Neural network models are programs trained to recognize amino acid patterns located in
known secondary structures.
• They distinguish these patterns from patterns not located in such structures.
• PHD and NNPREDICT use neural networks.
Step 1: Go to the NCBI site
Step 2: Find a protein sequence and download it in FASTA format
Step 3: Paste it into the PHD website
Step 4: Change the output width (optional)
Step 5: Click on submit
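
Steps 2 and 3 rely on the FASTA format: a header line starting with ">" followed by one or more lines of sequence. As a minimal sketch, the Python function below separates the header from the sequence so that the sequence can be checked before it is pasted into the form. The function name `parse_fasta` and the example record are made up for illustration; a real record would be downloaded from NCBI.

```python
def parse_fasta(text: str) -> list:
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record, made up for illustration.
example = """>example_protein hypothetical sequence
MKTAYIAKQR
QISFVKSHFS
"""
for header, seq in parse_fasta(example):
    print(header, len(seq), seq)
```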

URL :: https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_phd.html
Fig:4 Pasting protein sequence in PHD
Fig:5 PHD Results

Fig:6 PHD Graphical Results
3. The PSIPRED Protein Sequence Analysis
(URL : http://bioinf.cs.ucl.ac.uk/psipred/)
PSIPRED (PSI-BLAST based secondary structure prediction) is a technique used to
investigate protein structure. PSIPRED employs neural network machine learning methods in its
algorithm. It is a server-side program, with a website serving as a front-end interface, which
can predict a protein's secondary structure (beta sheets, alpha helices and coils) from the primary
sequence.
The prediction method or algorithm is split into three stages:
1. Generation of a sequence profile
2. Prediction of initial secondary structure
3. Filtering of the predicted structure.
A neural network is a system of programs and data structures that approximates the operation of
the human brain. A neural network usually involves a large number of processors operating in
parallel, each with its own small sphere of knowledge and access to data in its local memory.
The idea of this method is to use the information from evolutionarily related proteins to predict
the secondary structure of a new amino acid sequence. PSI-BLAST is used to find related
sequences and to build a position-specific scoring matrix. This matrix is processed by a neural
network, which was constructed and trained to predict the secondary structure of the input
sequence; in short, it is a machine learning method. PSIPRED first normalizes the sequence profile
generated by PSI-BLAST. Then the first neural network predicts an initial secondary structure.
For each amino acid in the sequence, the neural network is fed with a window of 15
amino acids. Additional information is attached, indicating whether the window spans the N or C
terminus of the chain. This results in a final input layer of 315 input units, divided into 15 groups
of 21 units. The network has a single hidden layer of 75 units and 3 output nodes (one for each
secondary structure element: helix, sheet, coil). A second neural network is used for filtering the
predicted structure of the first network. This network is also fed with a window of 15 positions.
The indicator of the possible position of the window at a chain terminus is also forwarded. This
results in 60 input units, divided into 15 groups of four. The network has a single hidden layer of
60 units and three output nodes (one for each secondary structure element: helix, sheet,
coil). The three final output nodes deliver a score for each secondary structure element for the
central position of the window. Using the secondary structure with the highest score, PSIPRED
generates the protein prediction. Steps are as follows:
1. Open the NCBI site
2. Find a protein sequence and download it in FASTA format
3. Paste it into the PSIPRED page
4. Give a short identifier name
5. Submit and get the result
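
The input layout described above (a window of 15 positions, each contributing 21 values, giving 315 input units) can be sketched in Python as follows. This is only an illustration of the arithmetic, not PSIPRED's code: it assumes 20 profile values per position plus one flag, and it assumes that window positions falling outside the chain are filled with zeros and flagged with 1. The names `window_inputs`, `WINDOW` and `PROFILE_COLUMNS` are hypothetical.

```python
WINDOW = 15            # positions per window, as described in the text
PROFILE_COLUMNS = 20   # profile values per position (assumed: one per amino acid type)


def window_inputs(profile, centre):
    """Flatten the window centred on `centre` into 15 groups of 21 values."""
    half = WINDOW // 2
    inputs = []
    for pos in range(centre - half, centre + half + 1):
        if 0 <= pos < len(profile):
            inputs.extend(profile[pos])
            inputs.append(0.0)   # flag: position lies inside the chain
        else:
            inputs.extend([0.0] * PROFILE_COLUMNS)
            inputs.append(1.0)   # flag: window runs past the N or C terminus
    return inputs


# Dummy profile for a 30-residue chain; real values would come from PSI-BLAST.
profile = [[0.5] * PROFILE_COLUMNS for _ in range(30)]
first = window_inputs(profile, 0)
print(len(first))            # number of input units for one window
print(sum(first[20::21]))    # how many window positions fall outside the chain
```

With four values per position instead of 21, the same layout gives the 60 input units of the second, filtering network.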
Fig: 7 PSIPRED front page. By default, the chosen prediction method is PSIPRED v3.3 (Predict
secondary structure)

Fig:8 Protein Sequence Pasted in PSIPRED and short identifier name given
Fig: 9 PSIPRED results showing alpha helix, beta sheets and disordered region in different
colours

Fig:10 PSIPRED results showing confirmation regions and secondary structure prediction
4. Swiss PDB Viewer (SPDBV):
DeepView, the Swiss-PdbViewer (or SPDBV), is an interactive molecular graphics
program for viewing and analyzing protein and nucleic acid structures. In combination with
Swiss-Model (a server for automated comparative protein modeling maintained at
http://www.expasy.org/swissmod), new protein structures can also be modeled. A molecular
coordinate file (e.g. *.pdb, *.mmCIF, etc.) is a text file containing, amongst other
information, the atom coordinates of one or several molecules. It can be opened from a local
directory or imported from a remote server by entering its PDB accession code. The content
of one coordinate file is loaded into one (or more) layers; the first one is referred to as the
"reference layer".

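To make the idea of a coordinate file concrete, the Python sketch below reads atom coordinates from text in the fixed-column PDB format, in which each ATOM or HETATM line stores the atom name, residue name, chain, residue number and x, y, z coordinates at set column positions. It is independent of Swiss-PdbViewer. The helper `format_atom`, the function `parse_atoms` and the two atoms with their coordinates are invented for illustration.

```python
import math


def format_atom(serial, name, res, chain, res_seq, x, y, z):
    """Build a minimal ATOM line using the fixed PDB column layout (demo helper)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_text):
    """Extract atom records from PDB-format text by column position."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            atoms.append({
                "name": line[12:16].strip(),      # columns 13-16
                "residue": line[17:20].strip(),   # columns 18-20
                "chain": line[21],                # column 22
                "res_seq": int(line[22:26]),      # columns 23-26
                "xyz": (float(line[30:38]),       # columns 31-38
                        float(line[38:46]),       # columns 39-46
                        float(line[46:54])),      # columns 47-54
            })
    return atoms


# Two made-up atoms; a real file would be opened from disk or fetched by PDB code.
demo = "\n".join([
    format_atom(1, "N", "MET", "A", 1, 1.0, 2.0, 3.0),
    format_atom(2, "CA", "MET", "A", 1, 2.5, 2.0, 3.0),
])
atoms = parse_atoms(demo)
print(atoms[0]["name"], atoms[0]["xyz"])
print(math.dist(atoms[0]["xyz"], atoms[1]["xyz"]))   # distance between the two atoms
```
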
Fig:11 SWISS-PDB toolbar menu
Fig:12 SWISS PDB control bar and layers info window information

Fig:13 SWISS PDB toolbar
Fig:14 SWISS PDB showing structure of a protein molecule.
Fig:15 SWISS PDB tool bar, graphic window showing protein and control panel

Fig16: SWISS PDB preferences menu
Fig:17 Changing the atomic view in ribbon form through control panel

Fig.18 The menu bar in SWISS PDB
Fig:19. SWISS PDB tools
5. JMOL (URL :: http://bioinformatics.org/firstglance/fgij/ )
Introduction: JMol is highly user-friendly software for visual investigation of 3D molecular
structures of proteins, nucleic acids, and their interactions with each other and with ligands,
substrates, and drugs; and of protein evolutionary conservation. With it we can learn how to
create publication-quality molecular images, rotating molecules, and custom on-line rotating
molecular scenes. FirstGlance in Jmol is the easiest way to look at the 3D structures of proteins,
DNA, RNA, and their complexes. It works within a web browser, and was designed to enable the
readers of scientific journals to see the main features of newly published 3D models in a few
clicks, without installing anything, and in all popular web browsers and computer platforms. It
has been in use by Nature since 2007, as well as Nature Structural and Molecular Biology,
Proteopedia.Org, the Protein Data Bank, the OCA Structure Browser, and the ConSurf
Server (which automatically colors amino acids by evolutionary conservation), among others.
Main tools:
I. Rotate the molecule by dragging near it with the mouse.
II. Identify any atom by clicking on it. Its identity will be displayed to the lower left of the
molecule (and in the browser status bar). If you don't recognize the abbreviation, look for
it under Ligands + and Non-Standard Residues in the 1d66 tab.
III. Center a region of interest using Center Atom (at the bottom of any of the 4 tabs). You
can then inspect details by zooming in. Regions distant from the centered moiety can
be hidden by toggling on Slab (in the Views or Tools tabs). When finished, Center
Atom offers the option to re-center the entire molecule.
How Does It Work?
i. FirstGlance in Jmol is a user-interface to the preeminent free molecular visualization
program named Jmol.
ii. Jmol is very powerful, but in order to access that power, we either need to learn a
complicated command language, or use a wrapper (user interface) such as FirstGlance in
Jmol.
iii. FirstGlance makes some of the power inherent in Jmol accessible to people who don't have
the time or inclination to learn Jmol's command language.
iv. Each protein or nucleic acid chain is assigned a different pastel color, so it is easy to
distinguish chains. Disulfide bonds are evident. The initial secondary structural schematic
"cartoon" display makes alpha helices and beta strands obvious.
v. All atoms or compounds that are not standard protein or nucleic acid residues ("ligands")
have space-filling atoms so they are clear. A clickable list of ligands in the Molecules
Tab gives their full names, and locates them. Touching or clicking on any atom in the 3D
view identifies it. Regions with missing residues are marked with an "empty basket" icon,
residues with incomplete side chains are marked S-, and non-standard residues are
marked X.
What Does It Indicate?
The control panel for FirstGlance is separated into five Tabs: Molecule, Views, Tools,
Resources, and Preferences. The Molecule Tab digests information in the PDB file, presenting it
in readable form: Title of study/name of molecule, year of publication, method of structure
determination. For X-ray structures, resolution and free R are "graded" to interpret them for non-
specialists. The number of models is given for NMR results, and viewing all models
superimposed is one click away. A clickable list of chains gives the total number, shows which
are sequence-identical, and locates each in the 3D model. For each chain is given the number
of missing residues, and missing positive and negative charges. A clickable list of ligands and
non-standard residues gives their full names, counts, and locates each in the 3D model.
The abstract of the primary publication, and other publications, are linked.

What Can It Do?
1. Views Tab: Major structural features of the molecule are revealed. Explanatory help is
shown automatically for each view. One-click options display secondary structure, amino
and carboxy (or 3' and 5') termini, composition (protein, DNA, RNA, ligands, and solvent),
the distributions of hydrophobic, polar, charged amino acids, and local uncertainty (B
factor/temperature). Non-standard residues, missing residues, and incomplete sidechains are
clearly indicated. Centering, slab and zoom buttons make an uncluttered closeup
examination easy for any moiety of interest.
2. Tools Tab: A Find.. tool makes it easy to locate short sequences, and amino acids (or
nucleotides) by name, sequence number, or a range of sequence numbers. A Hide.. tool
hides any chain(s), residue(s), or atom(s) that you click. They remain hidden in other views
until explicitly re-displayed. An Isolate.. tool hides everything except one entity and this
persists in all views until it is undone. The Contacts.. dialog shows the non-covalent
interactions to any moiety you select (by clicking on it). The non-covalent interactions are
automatically divided into seven categories: hydrogen-bonded water, water bridges,
hydrogen-bonded non-water, hydrophobic interactions, salt bridges, cation-pi interactions,
and metal and miscellaneous interactions. Checkboxes hide categories, simplifying the
display. Examples can be seen in the Snapshot Gallery. Disulfide bonds are shown in the
initial view, but here, a disulfides/S/Se tool highlights and counts disulfide bonds, sulfur,
and selenium atoms (and counts cysteine, methionine, selenocysteine, selenomethionine).
A salt bridge and cation-pi interaction tool highlights these. Distances and angles are easily
measured with mouse clicks.
3. Resources Tab: Explanations with illustrated step-by-step guides provide visualization
of biological units, evolutionary conservation, atomic clashes, lipid bilayer boundaries for
integral membrane proteins, as well as help predicting intrinsically disordered segments of a
protein chain, and sharing customized molecular views in the Proteopedia wiki.
4. Preferences Tab: You can specify whether links to FirstGlance will use Java or not, whether
the molecule should spin at the beginning of a session, etc.
Fig 20: Front screen of JMOL page

Fig:21 3D Graphical image of insulin in JMOL with ID 1ZNI
Fig: 22 Complete window of JMOL

Fig:23 Various Menu bars in JMOL
Fig:24 Insulin in 3D View showing ligands along with the toolbar

6. UCSF Chimera
UCSF Chimera is a highly extensible program for interactive visualization and analysis of
molecular structures and related data, including density maps, supra-molecular assemblies,
sequence alignments, docking results, trajectories, and conformational ensembles. High-quality
images and animations can be generated. Chimera includes complete documentation and several
tutorials, and can be downloaded free of charge for academic, government, non-profit, and
personal use. Chimera is developed by the Resource for Biocomputing, Visualization, and
Informatics, funded by the National Institutes of Health. It is an excellent molecular graphics package
with support for a wide range of operations, including flexible molecular graphics, high-resolution
images for publication, user-driven analysis, multiple sequence alignment analysis,
multiple model analysis, and docking. Chimera is segmented into a core that provides basic services
and visualization, and extensions that provide higher-level functionality. This architecture
ensures that the extension mechanism satisfies the demands of outside developers who wish to
incorporate new features. Two unusual extensions are Multiscale, which adds the
ability to visualize large-scale molecular assemblies such as viral coats, and Collaboratory,
which allows researchers to share a Chimera session interactively despite being at separate
locales.
Fig:25 CHIMERA Graphical Window showing 3D protein structure

Fig: 26 Window showing selection of A Chain of hemoglobin and colouring it as red
Fig:27 Chimera tool graphic window showing how to change ribbon view into atomic view of
protein

7. PolyView 3D (URL: http://polyview.cchmc.org/polyview3d.html)
• POLYVIEW-3D is a web-based tool for macromolecular structure visualization and
analysis.
• In particular, it provides a wide array of options for automated structural and functional
analysis of proteins and their complexes.
• The server is available at http://polyview.cchmc.org/polyview3d.html.
• By integrating web technology with state-of-the-art software for macromolecular
visualization, POLYVIEW-3D enables versatile structural and functional annotations
coupled with publication-quality structure rendering. In addition to static pictures, high-quality
animated images for electronic resources such as PowerPoint or websites can be
easily generated with POLYVIEW-3D as well. In particular, the POLYVIEW-3D server
features the PyMol program for image rendering, providing detailed and high-quality
presentation of macromolecular structures, with an easy-to-use web-based interface.
Fig:28 Polyview 3D Home page. The step shows where to enter the PDB ID.

Fig: 29 The output window and the settings used
8. Combinatorial Extension (http://source.rcsb.org/jfatcatserver/ceHome.jsp)
Fig: 30. Combinatorial extension window

Fig 31: Comparison of protein molecules by CE
Fig 32: CE Results after comparison
9. Mfold web server for nucleic acid folding and hybridization prediction
• Nucleic acid structure prediction is a computational method to determine nucleic
acid secondary and tertiary structure from its sequence.
• Secondary structure can be predicted from a single or from several nucleic acid
sequences.
• Tertiary structure can be predicted from the sequence, or by comparative
modeling (when the structure of a homologous sequence is known).
• Mfold is used to predict the secondary structure of RNA.
• Advantages: Mfold is very easy to use and is available online free of charge.
• Mfold is an internet-based program that can be run from any computer with
internet access. Mfold has many versions and is updated regularly.
• URL: http://mfold.rna.albany.edu/results
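
To give a feel for what predicting RNA secondary structure from a single sequence involves, the Python sketch below uses the simple Nussinov base-pair maximization algorithm. This is a stand-in chosen for brevity and is not Mfold's method, which is based on free-energy minimization. The minimum loop size of 3, the allowed pairs (including G-U wobble pairs) and the example sequence are illustrative assumptions, and the recursive traceback is only suitable for short sequences.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # minimum number of unpaired bases enclosed by a pair (illustrative)


def nussinov(seq):
    """Return a dot-bracket structure with the maximum number of base pairs."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]   # best[i][j]: max pairs within seq[i..j]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                       # case: j is unpaired
            for k in range(i, j - MIN_LOOP):
                if (seq[k], seq[j]) in PAIRS:            # case: j pairs with k
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    structure = ["."] * n

    def traceback(i, j):
        if j - i <= MIN_LOOP:
            return
        if best[i][j] == best[i][j - 1]:
            traceback(i, j - 1)
            return
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        traceback(i, k - 1)
                    traceback(k + 1, j - 1)
                    return

    if n:
        traceback(0, n - 1)
    return "".join(structure)


# Short made-up sequence that can fold into a single hairpin.
print(nussinov("GGGAAAUCC"))
```

In the dot-bracket output, matching parentheses mark paired bases and dots mark unpaired ones, which is the same kind of information that a drawn secondary structure conveys graphically.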

Fig: 33 Mfold Home page
Fig:34 Mfold Output page

Fig:35 JPEG Image of Mfold output (RNA structure)

Chapter 4.2 Literature Search
Key Focus:
• What is literature search?
• Need for literature search
• Principles of literature search
• Biomedical literature search
• PubMed
• ScienceDirect
• Cochrane Library
• OMIM
• ClinicalTrials.gov
I. What is Literature Search?
Most large resources/databases have structured information content. We can base our searching
on such a structure and retrieve focused information. A literature search is a systematic and
thorough search of all types of published literature in order to identify a breadth of good quality
references relevant to a specific topic. The success of any research project is dependent on a
thorough review of the academic literature at the outset. It is therefore a fundamental element of
the methodology of any research project. Effective literature searching is a critical skill in its
own right and will prove valuable for any future information gathering activity. Getting the
literature search right will save hours of time through the course of the research project and will
inform and improve the quality of the research one can go on to do.
II. Need for Literature Searching
Before the internet was available, people had to read through a large number of books, journals,
newspapers, articles and magazines to find a particular article of interest. In the
present era of information technology, staying up to date with the latest advances in
biomedical sciences is a major challenge for clinical practitioners. Because the amount of
biomedical information is said to double roughly every five years, clinicians must have free and
easy access to the current literature for effective evidence-based clinical decision-making.
Traditionally, several systems have condensed and dispensed medical
knowledge in easily absorbable forms (e.g. medical and dental textbooks and dictionaries).
However, these are frequently based on the synopses and ideas of established experts and may
not be refreshed with current information. In day-to-day practice, we often come across a
single, specific clinical problem that may be explained well in a single article. Until recently,
the problem for many clinicians has been accessing this information. The World Wide Web
has resolved this dilemma to a large extent. Its rapid growth has created a boom in the
field of biomedical investigation and research, although there is a long way to go before its full
potential is realized. Searching biomedical literature is an organized and specific procedure.
It requires systematic planning so as to develop a well-constructed clinical question or precise
keyword. Unplanned and messy efforts may retrieve many
irrelevant articles, discouraging the professional from looking further. Web-based search engines
are tools designed specifically to search for information in the form of images, databases, journals
and dictionaries. As these search engines are computer operated, they mostly search
algorithmically.
Goals and objectives of literature search:
i. To review existing critical opinions/theories
ii. To identify current research findings on a topic.
iii. To identify potential research methods or models one can use
iv. To compare our own research with previously reported findings
v. To be more specific regarding our topic.
Fig: 36 Goals and objective of literature search
III. Principles of Literature search:
1. Plain keyword searching
2. Thesaurus Based Searching
3. Combining Search terms
4. Searching in specific fields

Fig:37 Parts of PubMed Results
1.) Plain keyword searching:
 The searching is performed using simple keywords.
 The number of results is generally large.
 The search query contains only a few terms.
 The results may not be exactly relevant.
2.) Thesaurus based searching:
MeSH Searching:
 MeSH terms are standard keywords added by indexers at the NLM to every
article from every journal that is included in PubMed.
 With MeSH terms, one does not have to think about word variations, word
endings, plural or singular forms, or synonyms. One can use MeSH features to
refine searches with subheadings and, if desired, select specific
MeSH terms to be tagged as the major focus of the references retrieved.
 The best standardized terms can be found using the MeSH database.
3.) Combining Search Terms:
 A search query generally contains a number of terms.
 Biomedical search engines like PubMed do not recognize words like of, for,
in, etc. which we use in our search queries.
 We therefore need connecting terms, the Boolean operators, to combine search terms.
 PubMed uses AND, OR, NOT.
 AND: All search terms must be present somewhere in the record. Only records that
contain both terms are retrieved. E.g. Diarrhea AND Malnutrition returns
references that contain both terms.
 OR: It means ‘any’. Either one of the terms or both terms may be present in any part of
the bibliographic details. E.g. Diarrhea OR Malnutrition
 NOT: It means exclusion. It retrieves records that contain the first term but
not the second. E.g. Diarrhea NOT Malnutrition
4.) Field Based Searches: Field searches help us search for our terms in specific fields or
areas of the bibliographic record.
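The effect of the three Boolean operators can be sketched with plain set operations. The tiny index below is made-up illustration data (hypothetical record IDs, not real PubMed records), and the function names are assumptions of this sketch.

```python
# Toy illustration of AND / OR / NOT as set operations.
# The index maps a search term to the IDs of records that contain it;
# all IDs are hypothetical and do not refer to real PubMed records.
INDEX = {
    "diarrhea": {1, 2, 3, 5},
    "malnutrition": {2, 3, 4},
}


def records_for(term):
    """Return the set of record IDs containing the term (case-insensitive)."""
    return INDEX.get(term.lower(), set())


def search_and(first, second):
    # AND: records that contain both terms
    return records_for(first) & records_for(second)


def search_or(first, second):
    # OR: records that contain either term, or both
    return records_for(first) | records_for(second)


def search_not(first, second):
    # NOT: records that contain the first term but not the second
    return records_for(first) - records_for(second)


if __name__ == "__main__":
    print(sorted(search_and("Diarrhea", "Malnutrition")))
    print(sorted(search_or("Diarrhea", "Malnutrition")))
    print(sorted(search_not("Diarrhea", "Malnutrition")))
```

AND narrows a search, OR broadens it, and NOT removes records, which is why NOT should be used with care: it can also discard relevant records that happen to mention the excluded term.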
BIOMEDICAL Literature Search
a. PubMed
b. ScienceDirect
c. Cochrane Library
d. OMIM
e. ClinicalTrials.gov
i. PubMed:
Introduction: PubMed is produced by the U.S. National Library of Medicine (NLM) and is one
of several databases available from the NLM. It covers medicine, nursing, dentistry, veterinary
medicine, the healthcare system, preclinical sciences and other life sciences. It is a
bibliographic database of citations and abstracts from about 5,000 biomedical journals. It includes
details such as authors, titles and abstracts, but not the full text of journal articles. PubMed also
has links to online full-text articles from participating publishers.
Searching:
 Enter search terms and keywords.
 Combine the keywords with Boolean operators like AND, OR, NOT.
 If we do not use Boolean operators, PubMed automatically uses AND.
 Use brackets ( ) for nested searches.
 Amend or view results on the right side of the screen.
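The same kind of nested query can also be sent to PubMed programmatically. The sketch below only builds a request URL for the NCBI E-utilities `esearch` service and does not contact the network. E-utilities is not mentioned in this report, so its use here is an assumption for illustration, and the extra search term "dysentery" and the helper name `build_esearch_url` are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base address of the NCBI E-utilities esearch service
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=20):
    """Build (but do not send) an esearch request URL for PubMed."""
    params = {
        "db": "pubmed",      # database to search
        "term": term,        # the query, exactly as typed in the search box
        "retmax": retmax,    # maximum number of record IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


# Brackets nest the OR clause so that it is evaluated before the AND.
query = "(diarrhea OR dysentery) AND malnutrition"
url = build_esearch_url(query)

if __name__ == "__main__":
    print(url)
    # Decoding the URL again recovers the original query text
    print(parse_qs(urlparse(url).query)["term"][0])
```

Opening the printed URL in a browser, or fetching it with any HTTP client, would return the matching PubMed IDs.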

Fig:38 PubMed search result window
Advanced Searching:
 Select the field you want to search
 Select MeSH Terms
 Use Boolean operators AND, OR, NOT
 Repeat as often as needed
 To add an existing search, use # and the number of the search from Search History
 Click Search
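The query text that the Advanced Search builder assembles from these steps can be sketched as follows. The field tags `[MeSH Terms]` and `[Title/Abstract]` are PubMed search field tags; the helper names `tagged` and `combine` and the example terms are assumptions of this sketch.

```python
ALLOWED_OPERATORS = {"AND", "OR", "NOT"}


def tagged(term, field):
    """Attach a PubMed field tag to a term, e.g. diarrhea[MeSH Terms]."""
    return f"{term}[{field}]"


def combine(operator, *clauses):
    """Join clauses with one Boolean operator and wrap them in brackets."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(clauses) + ")"


# A field-restricted search: one MeSH term and one title/abstract word
field_query = combine(
    "AND",
    tagged("diarrhea", "MeSH Terms"),
    tagged("malnutrition", "Title/Abstract"),
)

# Reusing earlier searches by their number in the Search History
history_query = combine("AND", "#1", "#2")

if __name__ == "__main__":
    print(field_query)
    print(history_query)
```

Either string can be pasted into the PubMed search box; the history form only works while searches #1 and #2 exist in the current Search History.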
Fig 39: PubMed Advanced Searching - I

Fig:40 PubMed Advanced searching II
MeSH Searching:
 MeSH (Medical Subject Headings) is a controlled subject vocabulary that helps you
search.
 On the main PubMed page select MeSH Database (under More Resources)
 Enter your topic in the search box
 Click on Go
 PubMed will map your topic to valid MeSH headings. If it can’t map your search, some
related suggestions may be offered. If no term comes up, try describing the topic with
different term(s).
Fig:41 PubMed showing MeSH window

Fig:42 Comparison of PubMed simple search and MeSH Search
MeSH terms are standard keywords added by indexers at the NLM to every article from every
journal that is included in PubMed.
ii. ScienceDirect
ScienceDirect is a website operated by the Anglo-Dutch publisher Elsevier. It was launched in
March 1997. It is a platform for access to nearly 2,500 academic journals and over 26,000 e-
books. The journals are grouped into four main sections: Physical Sciences and
Engineering, Life Sciences, Health Sciences, and Social Sciences and Humanities. For most
articles abstracts are freely available; access to the full text (in PDF and, for newer publications,
also HTML) generally requires a subscription or pay-per-view purchase.
Features of ScienceDirect :
1. Search terms are not case-sensitive, so it does not matter if you use lowercase or
uppercase letters.
2. Entering singular nouns will also search for plural nouns and possessives. For example:
City finds city, cities, and city's ; Criterion finds criteria and criterion
3. Entering search terms using either US or UK spellings will automatically search for both
spellings.
4. Multiple words separated by spaces will search for documents or images containing all of the words.
5. You can use either quotation marks or curly brackets to search for a phrase, but the
results will differ in these ways:
 Searches in quotation marks (such as “heart-attack”) will be fuzzy searches – the search
engine will search for plural and singular nouns and US and UK spellings, ignore symbols
and punctuation, and allow wildcards.
 Searches in curly brackets, such as {heart-attack}, will be exact searches. The search
engine will look only for that exact phrase, including symbols and punctuation.
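The difference between the two phrase styles can be sketched with a toy matcher. This is not how ScienceDirect is implemented; it only imitates two of the behaviours listed above (ignoring punctuation and ignoring letter case) and leaves out plural handling, spelling variants and wildcards. All function names and sample sentences are hypothetical.

```python
import re


def normalize(text):
    """Lowercase the text and turn every run of non-alphanumerics into one space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def fuzzy_phrase_match(phrase, document):
    """Imitate a quoted search: symbols and punctuation are ignored."""
    # Padding with spaces makes the phrase match whole words only.
    return f" {normalize(phrase)} " in f" {normalize(document)} "


def exact_phrase_match(phrase, document):
    """Imitate a curly-bracket search: punctuation must match, case need not."""
    return phrase.lower() in document.lower()


if __name__ == "__main__":
    doc_plain = "Risk factors for heart attack in adults"
    doc_hyphen = "A heart-attack registry study"
    for doc in (doc_plain, doc_hyphen):
        print(doc)
        print("  fuzzy:", fuzzy_phrase_match("heart-attack", doc))
        print("  exact:", exact_phrase_match("heart-attack", doc))
```

In this sketch the fuzzy match finds both sample sentences, while the exact match finds only the one that contains the hyphenated form.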
Fig:43 Science Direct Search Results
iii. Cochrane Library
The Cochrane Library is a collection of databases that contain different types of high-quality
independent evidence to inform health care decision-making in medical, dental and other health
care specialties. They include systematic reviews and a central register of controlled clinical
trials. The collection may be accessed through the Wiley Online system.
Features:
1. MeSH terms
MeSH terms are sourced and added to reviews post-publication by Wiley. On an annual basis,
Wiley downloads the MeSH thesaurus related files from the National Library of Medicine
(NLM) to build a new MeSH structure, which is introduced to the Cochrane Library.
2. Linking Cochrane Reviews to other related Cochrane Reviews
In 2011, Wiley introduced a ‘more content like this’ feature for Cochrane Reviews, which is
built on the topics lists. At the bottom of each Cochrane Review, there is a hyperlink to the
topics list heading under which the Cochrane Review is listed. Clicking on this hyperlink takes
the user to that level of the topics list tree so that the user can view related reviews.
3. Browse options for Cochrane Reviews: Readers are given the option of browsing
Cochrane Reviews through several different lists:
1) Browse list – as shown on the left-hand side of the Cochrane Library homepage (prepared by
Cochrane – see below)
2) New Reviews
3) Updated Reviews
4) A–Z: all Protocols and Reviews
5) A–Z: by Cochrane Review Group
6) Topics by Cochrane Review Group (prepared by Cochrane – see below)
Browse list
All of these are prepared automatically with the exception of the browse list on the homepage of
the Cochrane Library and the topics lists.
The Cochrane Editorial Unit prepares the homepage ‘Browse by topics’ menu. This is a three-
level browse menu for Cochrane Reviews only (excludes protocols) designed to present the
scope of Cochrane Reviews across the landscape of medical specialties and broad healthcare
topics, rather than being restricted to the way that coverage is defined by Review Groups. The
aim is to help users find relevant reviews. The Browse menu is populated automatically from
the Cochrane Editorial Unit’s own topics list, which is maintained for this purpose. Cochrane
Protocols are tagged appropriately in Archie, often based on the way that CRGs have assigned
the review to their own topics lists, and then appear automatically in the browse menu when the
Review is published.
Figure:44 Cochrane Library search results and abstract of an article on myocardial infarction

iv. Online Mendelian Inheritance in Man (OMIM)
Online Mendelian Inheritance in Man (OMIM) is a continuously-updated catalog of human
genes and genetic disorders and traits, with a particular focus on the gene-phenotype
relationship. As of 23 July 2015, approximately 8,062 of the over 23,000 entries in OMIM
represented phenotypes; the rest represented genes, many of which were related to known
phenotypes. OMIM has collected human Mendelian disease data for over 40 years by classifying
and naming new disorders and by cataloging associations between phenotypes and their
causative genes in humans. OMIM is authored and edited at the McKusick-Nathans Institute of
Genetic Medicine at the Johns Hopkins University School of Medicine, under the direction of
Dr. Ada Hamosh.
Features:
 In the center of the page is a basic search form. At the top there are navigation tabs. As
the name Online Mendelian Inheritance in Man indicates, OMIM focuses its content on
single-gene Mendelian diseases, disorders, and phenotypes. OMIM is intended for use
primarily by physicians and other professionals concerned with genetic disorders, by
genetics researchers, and by advanced students in science and medicine.
 The OMIM search window accepts a wide variety of advanced search syntax, such as
quotes, Boolean operators and more.
 Searching from the OMIM homepage is a powerful way to find either Gene Map data or
clinical synopses, and these can be accessed from buttons on the search results page.
 In OMIM you will have access to a wealth of gene and phenotype data in various
formats including: detailed free-text entries, Gene Map displays, Phenotypic Series and
Clinical Synopses.
 OMIM plays a central role in naming and classifying Mendelian phenotypes, and it also
provides a vetted catalog of manually curated associations between human phenotypes
and their causative genes with many dynamic views and search tools. As such it is a
valuable resource for those with an interest in Mendelian disease information.
Hence, OMIM is a comprehensive knowledgebase of human genes and genetic disorders. It
consists of full-text overviews of genes and genetic phenotypes, particularly disorders, and is
useful to students, researchers, and clinicians. It was initiated in the early 1960s as a trilogy of
catalogs of autosomal dominant, autosomal recessive, and X-linked phenotypes. It has been
maintained as an electronic file since 1964 and has been published in 12 print editions, the first in
1966 and the most recent (in three volumes) in 1998. In 1987, it became generally available on the
Internet, under the designation “OMIM,” from the Welch Medical Library at Johns Hopkins
University. Since December 1995, it has been distributed on the World Wide Web from the
National Center for Biotechnology Information (NCBI) of the National Library of Medicine.
This knowledgebase is updated daily. Authoring and editing are headquartered at Johns Hopkins
University School of Medicine.

Fig: 45 Showing OMIM search results and article of interest
v. ClinicalTrials.gov
A clinical study involves research using human volunteers (also called participants) that is
intended to add to medical knowledge. There are two main types of clinical studies: clinical trials
(also called interventional studies) and observational studies. ClinicalTrials.gov includes both
interventional and observational studies. In a clinical trial, participants receive specific
interventions according to the research plan or protocol created by the investigators. These
interventions may be medical products, such as drugs or devices; procedures; or changes to
participants' behavior, such as diet. Clinical trials may compare a new medical approach to a
standard one that is already available, to a placebo that contains no active ingredients, or to no
intervention. Clinical trials used in drug development are sometimes described by phase.
These phases are defined by the Food and Drug Administration (FDA).
In an observational study, investigators assess health outcomes in groups of participants
according to a research plan or protocol. Participants may receive interventions (which can
include medical products such as drugs or devices) or procedures as part of their routine medical
care, but participants are not assigned to specific interventions by the investigator (as in a clinical
trial). For example, investigators may observe a group of older adults to learn more about the
effects of different lifestyles on cardiac health.
 ClinicalTrials.gov is a Web-based resource that provides patients, their family members,
health care professionals, researchers, and the public with easy access to information on
publicly and privately supported clinical studies on a wide range of diseases and
conditions.

 The Web site is maintained by the National Library of Medicine (NLM) at the National
Institutes of Health (NIH).
 Information on ClinicalTrials.gov is provided and updated by the sponsor or principal
investigator of the clinical study.
Each ClinicalTrials.gov record presents summary information about a study protocol and
includes the following:
 Intervention (for example, the medical product, behavior, or procedure being studied)
 Disease or condition
 Title, description, and design of the study
 Requirements for participation (eligibility criteria)
 Locations where the study is being conducted
 Contact information for the study locations
Fig: 46 Search results for ‘Polio AND India’ in ClinicalTrials.gov

Chapter 4.3 Essentials of Medical Writing
Overview:
The health sector has gone through enormous change during the last decade, with people becoming more
aware and conscious of their health. Online searches have contributed tremendously to
this health awareness. Patients have started exploring information about different
diseases and their treatment on the internet and have become well versed in the indications and
contraindications of drugs, compelling pharmaceutical companies to search for new
molecules. Growing demand for different healthcare products such as
pharmaceuticals, alternative medicines, nutraceuticals, special nutritional supplements, cosmetics
and herbal products has intensified the competition between various national and multinational
companies, who boast about their health care products through massive campaigns. There is a
massive boom in advertisements, promotional materials for marketing and materials for creating
awareness among health care professionals as well as patients.
Types of medical writing:
I. Regulatory medical writing
II. Educational medical writing
 Scientific writing
 Consumer health writing
III. Medico marketing writing
Fig:47 Types of Medical Writing
 Regulatory medical writing: Creating documentation that regulatory agencies require in the
approval process for drugs, devices and biologics.
 Educational medical writing: Writing documents about drugs, devices and biologics for general
audiences and specific audiences such as health care professionals (physicians, nurses,
surgeons, physical therapists).
 Medico marketing writing: Production of marketing collaterals such as brochures, instrument
manuals, posters, advertisements, etc.

Fundamentals of good medical writing:
1. Writing scientific documents of different types, including regulatory and research-related
documents.
2. Preparing disease- or drug-related educational and promotional literature,
publication articles such as journal manuscripts and abstracts, content for healthcare websites,
and health-related magazine or news articles.
3. Being familiar with searching medical literature, understanding and presenting
research data, the document review process, and editing and publishing requirements.
The medical writer needs to have a clear understanding of the medical concepts and ideas,
and be able to present the data and its interpretation in the way the target audience will
understand. Medical writers combine their knowledge of science and their research
understanding to present information at the right level for the target audience. Moreover, the
writing needs to meet the specific requirements for different types of documents. Medical
writing has become established as an important function in the pharmaceutical industry,
because it requires specialized knowledge and skills to be able to write scientific documents
which are well-structured and presented in a clear and lucid manner.
The demand for medical writing has gone up considerably in the last few
years. The reasons are many – more research studies are being conducted today in the
biomedical field; pharmaceutical companies are developing more new drugs and medical
devices, and various scientific documents need to be generated for submission to regulatory
authorities during their approval process; the number of biomedical journals has gone up
considerably and many more scientific articles are now published than before; similarly, with
the addition of a new and a powerful medium like the ‘internet’ a lot of medical information
is generated as ‘web content’ for medical professionals as well as for the general public.
Job and Career Opportunities:
Medical writers mostly work with the pharmaceutical industry. However, there are many other
settings in which medical writers are required:
1. Pharmaceutical / healthcare product companies including medical device companies
2. Contract Research Organizations (CROs) & Business/Knowledge Process Outsourcing
companies (BPOs/ KPOs)
3. Scientific content and healthcare communication companies (Functional Service
Providers)
4. Media & Publishing companies and Medical Journals
5. Academic medical institutions, Medical/scientific societies
6. Healthcare Websites
The scope for medical writers is therefore tremendous and growing. This is also a profession
which one can practice either independently as a freelancer, or as an employee in an
organization, depending on one's experience, level of expertise and liking. So, learning medical
writing can be the beginning of a life-long profession.
Fig:48. Job and Career Opportunities
General Knowledge and skill of a medical writer:
 Language and grammar: Use of grammatically correct language, simple and short
sentences, active voice, appropriate punctuation marks, and a logical flow of ideas can go
a long way in making the information understandable to the readers.
 Literature reference searching: Keeping in mind what exactly you are looking for,
knowing where to search and selecting only the authentic sources, planning your search
strategy, use of correct keywords for searching and then carrying out the search as per the
set plan is more likely to bring up useful information. Reviewing your search results to
consider if the information is relevant, and systematically classifying and filing useful
information for later retrieval is equally important.
 Interpretation and Presentation of research data: writing scientific documents
involves review and interpretation of research data, presentation of those data in text,
tables, and graphs, and developing logical discussion and conclusions as to what the data
means.
 Ethical & legal issues – issues of concern to medical writers are – giving truthful and
complete information including negative findings, following copyright laws, not
indulging in plagiarism, following authorship criteria for research manuscripts, and
respecting journal review process.
Job and career opportunities (content of Fig: 48):
 Scientific medical writing: journals, review articles, letters to editors, CD-ROM and web
pages, regulatory documents and white papers.
 Responsibilities of a medical writer in a CRO: preparing study protocols, investigator
brochures, study reports and several other papers.
 Freelance writers: research writer, translator, proof reader, copy editor, web content
writer, ghost writer.

Qualities of a good medical writer:
1. Ability to understand the purpose and requirements of the project
2. Ability to write at a level appropriate to the target audience
3. Thorough research of the subject
4. Ability to think clearly and organize thoughts and ideas logically
5. Scientific accuracy
6. Attention to details
7. Ability to work across teams (often remotely) as well as independently
8. Good communication & coordination with various people involved in the process
9. Good time management, and meeting deadlines and commitments
Roles and Responsibilities of a medical writer:
a. To plan and produce scientific quality content in a specific discipline
b. Interaction and good communication with clients and working staff
c. Gathering specialized information and adequate attention towards specific details is also
very crucial
d. Improving quality of end result
Assess the audience:
i. Know the readers in order to make a good presentation.
ii. The writer should assess the knowledge and experience of the readers.
iii. The background of the topic and the purpose of the document should be made clear
to the audience.
iv. The writer should decide whether to write the document as a narrative or in sections.
Types of research publications:
1. Original articles: The authors describe their study in detail and discuss its results.
2. Review Article: Work conducted by gathering information from a collection of articles on a
single topic.
3. Short Papers: Brief original research articles meant for publication in journals.
4. Case reports: Detailed report describing symptoms, signs, diagnosis, treatment and follow up.
5. Protocol Writing: Done at pharmaceutical companies to develop a protocol that can be easily
executed in a clinical setting.
6. Editorial: Opinion of the editor of a periodical on a current topic or issue, reflecting the view of
the periodical.

For a Research article:
1. Title: Short and informative identifying general field and scientific branch
 Name of contributor
 Address
 Phone number
 E-mail
2. Abstract: Allows the reader to quickly learn about the content and decide whether to continue
reading or switch to another article
 200-300 words
 Unstructured, or structured with introduction, aim, objective and methods
3. Introduction: Contains a statement of the problem, related background and the relevance of the current work.
4. Materials and methods: To allow the reader to repeat the work and reproduce the results.
 Give details of the study and techniques.
 Give full details of any new technique.
 Citations of the relevant journal articles should be incorporated.
 Details such as study duration, temperature, measurements, amounts, animal
species, sex and weight may be given.
 Results should not be given in the procedure.
 Add a brief description of the method of statistical analysis.
5. Results: Findings of the study described
 With the help of tables, line diagrams, figures and images.
 The results should be clearly explained in a simple manner
 Data should not be too condensed or elaborate
 Tabulated data should not be repeated.
6. Discussion: Discuss the results in this section.
 Draw a conclusion: do the findings support the hypothesis or not?
 Are the findings similar to those of previously published articles, or do they contradict them?
 Give a plausible explanation for this.
7. Conclusion: This section includes major contributions of paper and related untouched areas
in which research can be conducted.
8. Acknowledgement: Author(s) can add a line of acknowledgement for any financial assistance
or assistance while preparing the manuscript or using any facility etc.
9. References: References supporting the research study are cited in the text and listed at the
end of the paper.
 Can be journals, books or websites
To write a Research protocol: Like a research article, a research protocol needs to fulfill certain
requirements, such as having a title, body and references.

1. Project Title Page: The title should be descriptive of the study. It may need to be revised
after the protocol has been written, and it should reflect the meaning of the article.
2. Primary Investigator (PI) or co-investigator: This section contains the title of the paper; the name,
mailing address, phone number and email address of the corresponding author; the names of
contributors and their affiliations; and a short running title. It is needed because someone may
want to get in touch with the PI.
3. Abstract: It gives the reader a quick idea of the topic. Generally every research or
review article and case report contains a short abstract of 200 to 300 words. The abstract can
be structured or unstructured and contains the introduction, aim, methods, results and
conclusion. Some keywords must be given at the end.
4. Background: This describes the importance of the topic in the context of current knowledge and any
results already obtained that indicate the research question can be answered by the given approach.
5. Study goals and objectives: Here the researcher needs to describe broadly the objectives and aims
of the study, which should be specific, relevant and time-bound.
6. Study Design: Here it should be stated whether the trial is randomized or not, and whether it is a
case-control or cross-sectional study. The choice of design should be explained, that is, how it
will address the study objectives.
7. Study Description: It is recommended to use the active voice and the future
tense. If subjects are to be randomized, the method should be explained. The description
should be continuous and explain each step.
8. Safety considerations: This heading covers details of the risks to the subjects enrolled
in the study. Any risks, such as drug side effects, allergic reactions and
complications, should be mentioned.
9. Follow up: Here one needs to explain what will happen to the subjects enrolled
in the study after the research study is completed.
10. Data Management: The research protocol should provide information about what
type of data will be obtained and how they will be managed. It should also describe
the statistical methods and data analysis planned for each of the
study objectives.
11. Quality Assurance: This is especially important when human subjects are enrolled, and there is a
need to state how Good Clinical Practice will be followed.

12. Expected Outcomes: The outcome expected on the basis of the objectives and goals of the
study should be described. An ideal clinical study has some impact on a segment
of the population.
13. Final Steps: After the protocol is written, it should be reviewed by a good critical
reader, who will help to identify any problems that need editing. Once the
protocol is ready, it requires approval by the human subjects committee.
For Case study:
Case Study: It is defined as an in-depth analysis and systematic description of one patient or a group
of similar patients in order to understand the circumstances. It allows in-depth exploration of
complex issues in real-life settings. Several health-related case studies are designed on the basis
of different types of circumstances. These help to explain, describe and explore events on
a daily basis. A case study helps to capture information, so that knowledge about the subject becomes
more concrete and contextual, based mainly on the reference population. This helps to answer
certain research questions.
Characteristics of case study
There are primarily three characteristics of case study. These are as follows:
I. Particularistic: This type of case study particularly focuses on a particular event, process
or situation.
II. Descriptive: This type of case study has highly descriptive content of the topic being
studied.
III. Heuristic: The case study illuminates the reader's understanding of the phenomenon under study.
It helps to bring about the discovery of new meanings, extend the reader's experience and
confirm what is already known.
Types of Case study
1. Historical: This case study is based on the development of a particular phenomenon over time.
Holistic analysis and description from a historical perspective is preferred when there is
virtually no access or control.
2. Observational: This case study involves data collection from participants through both formal and
informal interviews.
3. Illustrative case study: It is primarily a descriptive study which uses one
or two instances to show what the situation is like. It serves to make the unfamiliar familiar and
give readers a common language about the topic of interest.
4. Exploratory or Pilot case study: These case studies are conducted before implementing a
large-scale investigation. Their main function is to identify questions and select the type of
measurement before the main investigation.

5. Cumulative case studies: These case studies aggregate information from several
sites collected at different times. Collecting past studies allows for greater generalization
without additional cost or time being spent on new, repetitive studies.
6. Critical Instance case studies: These examine one or more sites either to study a situation of
unique interest, with little or no interest in generalization, or to call into question or challenge
a highly generalized assertion.
How to write a case study?
• Defining the case: A case is an instance of a particular situation; an example of
something occurring. In clinical terms, it is an instance of a disease, injury, or problem.
Each case should have a predefined boundary which clarifies the nature and time
period covered by the case study, the organization or geographical area, the relevant social group,
the type of evidence to be collected and the priorities for data collection and analysis. A theory-driven
approach to defining the case may help to generate knowledge that is potentially
transferable to a range of clinical contexts and behaviors.
• Selecting the case: Selecting the particular case is a very
important step. The case should be selected not because it is representative of
other cases but because of its uniqueness, which is what interests the researchers. In
multiple case studies, the number of cases should be chosen carefully; studying several cases has the advantage
of allowing comparisons to be made across them. The selected case should
allow the research team access to the group of individuals, the organization, the processes or
whatever else constitutes the chosen unit of analysis for the study.
• Collecting the data: To develop a proper understanding of the case, the case study approach involves
collecting multiple sources of evidence using a range of quantitative and
qualitative techniques. An underlying assumption is that data collected in different ways
should lead to similar conclusions. Data collection needs to be flexible
enough to allow a detailed description of each individual case.
• Analyzing, interpreting and reporting case studies: Where several cases are studied, it helps to analyze
the data relating to each individual component case first, before making comparisons across cases.
Data need to be organized and coded to allow key issues, both those derived from the literature
and those emerging from the dataset, to be retrieved. The Framework approach is a practical approach comprising five
stages (familiarization, identifying a thematic framework, indexing, charting, and mapping and interpretation) for
analyzing datasets, particularly when time is limited. Case study findings can have
implications both for theory development and for theory testing. When reporting findings, it is
important to give the reader enough information to understand the process that was followed.
Example of an abstract written for a drug:
