Volume 9:
June 2007
http://www.bioscience.heacademy.ac.uk/journal/vol9/beej-9-4.pdf
Research Article
Assigning Level in Data-mining Exercises
Paul Hooley, Ian J. Chilton, Daron A. Fincham, Alan T. Burns and Michael
P. Whitehead
School of Applied Sciences, University of Wolverhampton
Date received: 14/02/07 Date accepted: 16/03/07
Abstract
There is currently much interest in ascribing outcomes to Masters (M) level
programmes. It is particularly difficult to define M level outcomes in bioinformatics for
students on non-specialist programmes. An approach is described that attempts to
discriminate undergraduate from M level in a data-mining exercise. Differentiation of
level is based upon the taxonomic origin of a DNA sequence, the relative increase in
gene complexity from lower to higher eukaryote and the initiative required to use a
wider range of databases and analytical tools.
Keywords: Masters, descriptors of level, data-mining, bioinformatics
Introduction
Vast quantities of original research data, often generated by the genome
projects, are available to the public via a variety of websites and can provide
intriguing opportunities for novel teaching activities (Campbell, 2003). These
now include user-friendly packages to teach bioinformatics, notably the
European Multimedia Bioinformatics Educational Resource, EMBER (Attwood
et al., 2005). Data-mining uses computers to extract and analyse information
to generate useful biological hypotheses and insights. We have previously
described an exercise in data-mining that formed part of a final year
undergraduate module. This used fungal DNA sequence data as the starting
point for final year undergraduate (level 3) students on a range of B.Sc
(Hons.) programmes to explore some of the tools available for bioinformatics
(Hooley et al., 2004; available on the British Mycological Society sponsored
website Fungi 4 Schools, http://www.fungi4schools.org/). Fungi have
relatively short genes; for example, Aspergillus nidulans has a mean gene
length of less than 2000 base pairs, and many genes have only one or two
introns (http://www.broad.mit.edu/annotation/fungi/aspergillus/genome).
Nonetheless fungal genes illustrate all the major features of eukaryotic gene
structure. In contrast human genes have a mean of 9 or 10 introns per gene
making the typical human gene very much longer and commonly allowing for
complex alternative splice reactions giving more than one protein product
(IHGSC, 2004).
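As a minimal sketch of how intron number and position follow from the exon coordinates given in an annotated accession, the Python fragment below treats introns as the gaps between consecutive exons. The coordinates are hypothetical values chosen for illustration and are not taken from any real accession; the function name `introns_from_exons` is likewise an assumption.

```python
from typing import List, Tuple

# 1-based, inclusive (start, end) positions, as in a sequence feature table.
Interval = Tuple[int, int]


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the introns implied by a list of exon intervals on one strand."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:  # a gap between two exons is an intron
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical three-exon gene (illustrative coordinates only).
example_exons = [(101, 400), (461, 900), (956, 1500)]
example_introns = introns_from_exons(example_exons)
intron_lengths = [end - start + 1 for start, end in example_introns]

print(example_introns)
print(intron_lengths)
```

A gene with two short introns, as is typical of the fungal examples, yields a two-item list here; a human gene with nine or ten introns would simply yield a longer one.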
Information concerning an individual gene is referenced as an “accession
number” on a publicly available database such as NCBI (National Center for
Biotechnology Information, http://www.ncbi.nlm.nih.gov/). Fungal accessions
were chosen to incorporate information relating to the entire gene structure,
generally including information on translation start and stop sites and intron
number and position, as well as links to complete annotations on the relevant
genome projects. Each student was randomly assigned a fungal accession
number from NCBI. The exercise was carried out over an entire term,
comprising 50% of the assessment on the final year undergraduate module
Gene Manipulation and was based largely upon a workshop approach. Each
student worked at their own pace on an individual accession number and
asked for staff assistance as required. The final assessment was based upon
a short report where the student accurately identified the gene, described the
major features of its structure, and compared its similarity at both the DNA
and encoded protein levels to its nearest relatives (Hooley et al., 2004).
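The comparison of similarity at the DNA and protein levels can be illustrated with a short sketch of percent identity between two sequences that have already been aligned. This is only an assumed, simplified measure: the function name and the toy sequences are hypothetical, gapped columns are ignored, and alignment tools differ in the denominator they use, so the values need not match those reported by any particular program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned sequences (hypothetical, for illustration only).
dna_identity = percent_identity("ATGGCC-TTA", "ATGGCTATTA")
protein_identity = percent_identity("MKV-LA", "MRVTLA")

print(round(dna_identity, 1), round(protein_identity, 1))
```

The same function serves for both DNA and protein alignments because it only compares characters column by column.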
Having used this exercise successfully at undergraduate level we then
considered how such an approach could be adopted for an audience of M
level students from a wide variety of geographical and academic backgrounds
(Table 1). The objective was to develop an M level data-mining module
suitable for students on non-specialist and specialist bioinformatics courses.
Table 1 Countries of origin of students registered on the M.Sc. DNA Data-mining module
Country of origin   2005/6   2006/7
UK                  3        2
India               16       12
Nigeria             2        2
Ghana               1        2
Cameroon            0        1
A Masters (M Level 4) Data-mining module
It is difficult to find a consensus on criteria that mark the move from
B.Sc. (Hons.) to M level, particularly in such a new area as bioinformatics. The
U.K.’s Quality Assurance Agency for Higher Education has attempted to
provide statements of outcomes as “threshold” or “good” for undergraduate
work. It has also developed benchmarks for three subject areas at M
level (business/management, engineering and pharmacy) and continues to
work towards drafting a generic statement of M level outcomes (QAA, 2006,
Bellingham pers. comm.). Hack and Kendall (2005) recently proposed some
useful learning outcomes to distinguish between undergraduate, postgraduate
(Masters in Bioscience and Masters in Bioinformatics) and PhD levels. These
focused largely upon the move from undergraduate to M level involving the
ability to produce biological models and a deeper understanding and
application of analytical models and their parameters. The EMBER package
used on the M.Sc. in Bioinformatics at the University of Manchester defines
several basic tutorials with a smaller number of advanced tutorials and case
studies.
One might expect M level material to be more challenging and open ended
than undergraduate level, perhaps with a stronger element of original thought
or analysis and incorporating the ability to negotiate the form of the particular
assessment under study. We could not assume that our M level students had
a common grounding in molecular biology (Table 1) so the teaching had to
consider an introductory approach which nonetheless led to work acceptable
by its conclusion at an M level. Only a small number of our students were
enrolled on a course in Molecular Biology and Bioinformatics which contains a
significant component of computing modules. The majority specialised in
Applied Microbiology and Biotechnology which has no compulsory computing
components. Hence, as pointed out by Hack and Kendall (2005), whilst we
can assume Masters students in Bioinformatics will be able to write
programs and develop databases, these skills are not expected in Masters
students studying other Bioscience subjects. Honts (2003) nonetheless reports
that the successful introduction of simple programming skills at undergraduate
level in Cell Biology courses at Drake University in the US was a useful
addition to teaching bioinformatics to non-specialist students.
The M level module ‘DNA Data-mining’ begins with three formal lectures that
cover revision of basic features of transcription and translation, the key tools
of similarity comparison focusing upon the Basic Local Alignment Search Tool
(BLAST, Altschul et al. 1990), the main genome databases and an
introduction to protein structure modelling. This is followed by several tutorial
exercises that utilise the application of BLAST in a simple worked example
from a fungal gene to illustrate the key principles. Each student then works on
an analysis of their own accession both in their own time and in a weekly
timetabled slot in the computer lab with a member of staff present. The overall
aim is for the student to provide as full an analysis as possible of an individual
DNA sequence and the protein it encodes. In contrast to the undergraduate
exercise, these sequences are selected from a collection of higher eukaryotic
accessions on NCBI which do not contain complete details of gene structure
and function. Students are supplied with a workbook and a support package
on the university intranet, the Wolverhampton on line framework (Wolf) that
logically takes them through the analytical tools that they may expect to use
with a suggested time planner. For example, this includes the application of
easily used resources such as PSIPRED (McGuffin et al., 2000), which allows
the prediction of protein secondary structure and of potential
transmembrane domains, and more advanced programs like GenTHREADER
(Jones, 1999), which attempts protein fold predictions. Multiple sequence
alignment programs for comparison of several DNA or protein sequences
together, e.g. ClustalX (available from the European Bioinformatics Institute, EBI,
http://www.ebi.ac.uk/) and 3D-Coffee (Wallace et al., 2005), may be employed.
Students are guided to specific tools freely available on some sites, such as the
EMBER tutorials (Attwood et al., 2005), initially available online and now also
on CD, or to less structured but more comprehensive resources like EBI. Some
reports then conclude with the construction of phylogenetic trees to explore the
evolutionary relationships of their accession (Baldauf, 2003; Hall, 2004).
Weekly surgery tutorial workshops are provided where staff are on hand to
deal with individual problems and suggest appropriate analytical approaches.
The final module assessment is based around the submission of a 5,000 word
report with Figures, Tables, Legends, etc. being additional to this limit to
encourage a concise reporting style.
Defining M Level Features
Table 2 compares details available from two separate example accession
numbers – one representing a relatively simple fungal gene and the other a
human DNA sequence. The entire gene structure can be modelled from a
single paper and the accession itself for the fungal gene. In contrast the
human gene gives multiple references and the sequence is mRNA derived
and therefore lacks genomic annotation.
Table 2 A comparison of data available from a selected fungal and a human accession number at NCBI
Feature                                   AF202995                         NM_152854
Origin                                    Complete genomic DNA clone       cDNA derived from mRNA
Species                                   Aspergillus nidulans             Homo sapiens
Gene product                              Zn finger transcription factor   CD40/Tumour necrosis factor receptor
Papers referenced                         1                                10
DNA sequence length (b.p.)                3814                             1554
Introns                                   2 with exact positions given     No detailed information but two alternative splice products mentioned
Upstream activating sequence (promoter)   Complete                         Incomplete
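For readers who prefer to retrieve such records programmatically rather than through the NCBI web pages used in the module, the following is a minimal sketch that builds a retrieval address for each accession in Table 2. It assumes NCBI's E-utilities `efetch` service, which is not part of the exercise as described; the helper names are hypothetical, and the code only constructs the addresses without contacting the server. The `NM_` prefix check reflects the RefSeq convention for mRNA records.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities efetch endpoint (not mentioned in the article).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_record_url(accession: str) -> str:
    """Build an efetch URL for the GenBank flat file of a nucleotide accession."""
    params = {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    return f"{EFETCH}?{urlencode(params)}"


def is_refseq_mrna(accession: str) -> bool:
    """RefSeq mRNA accessions carry the 'NM_' prefix."""
    return accession.upper().startswith("NM_")


for acc in ("AF202995", "NM_152854"):
    label = "RefSeq mRNA" if is_refseq_mrna(acc) else "other nucleotide record"
    print(acc, label, genbank_record_url(acc))
```

Flagging the mRNA-derived record early is a useful prompt that genomic details such as intron positions will have to be sought elsewhere.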
The student will have to access other databases (such as Ensembl:
http://www.ensembl.org/) to acquire gene structure details. This may involve
renewed BLAST searches on different sites as NCBI accession numbers may
not be recognised on a foreign database. For example, searches of the OMIM
(Online Mendelian Inheritance in Man) database will be required. One could
also choose higher eukaryotic accessions where no clear functions had been
ascribed and no papers were referenced. The following features then mark a
move from an undergraduate exercise to an M level.
• Use of accession numbers that relate only to non-genomic gene sequences
– i.e. cDNAs generated from reverse transcription of mRNA molecules. Some
initiative was therefore needed in employing a wider range of search tools
and databases to collate analyses of complete gene sequences and their
encoded products.
• Use of higher eukaryotic genes i.e. from animals and plants. These may have
a very complex structure showing greater size and a large number of introns
with alternative splicing reactions also possible to generate more than one
potential protein product. Students were given a wide range of choice of
accession number, often reflecting their own wider interests or research
project work.
• Performing phylogenetic analysis on the DNA or protein obtained would be
viewed as indicative of excellence at undergraduate level. However at M level
this was seen as a standard form of analysis to undertake.
• Students are encouraged to use more than one analytical approach to
scrutinise any inconsistencies in predictions, for example of secondary
structures of proteins encoded by their genes. This fits Hack and Kendall’s
(2005) suggestion of the need for a good understanding of the underlying
parameters of models and their statistical bases.
• The mark scheme rewards curiosity – where students have followed an
unusual or ingenious method of analysis, perhaps using less conventional
databases or altering search tools away from their default settings. Criticisms
of the shortcomings of the tools are encouraged. Whilst the tools and
websites introduced in the early weeks are relatively straightforward to use,
other analytical tools require a more sophisticated level of understanding by
the student. This supports another of Hack and Kendall’s (2005) criteria for M
level implying the use of a wider range of analytical tools.
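The idea of scrutinising inconsistencies between predictions can be sketched as a position-by-position comparison of two secondary structure predictions for the same protein. The three-state strings below (H = helix, E = strand, C = coil) are hypothetical and do not come from any real tool output; the function name is an assumption made for illustration.

```python
from typing import List, Tuple


def compare_predictions(pred_a: str, pred_b: str) -> Tuple[float, List[int]]:
    """Return the fraction of agreeing positions and the 1-based positions that differ."""
    if len(pred_a) != len(pred_b) or not pred_a:
        raise ValueError("predictions must be non-empty and cover the same residues")
    disagreements = [
        i for i, (a, b) in enumerate(zip(pred_a, pred_b), start=1) if a != b
    ]
    agreement = 1 - len(disagreements) / len(pred_a)
    return agreement, disagreements


# Hypothetical predictions from two different tools for a 12-residue peptide.
tool_one = "CCHHHHHHCCEE"
tool_two = "CCHHHHCCCCEE"
agreement, positions = compare_predictions(tool_one, tool_two)

print(f"agreement: {agreement:.0%}; residues to re-examine: {positions}")
```

The listed positions are where a student would return to the underlying model parameters or try a third method before drawing conclusions.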
Presently we have only chosen conventional protein-encoding genes for
analysis so avoiding genes encoding rRNA, tRNA or micro RNA (Thadani and
Tammi 2006) as the final product. However as our knowledge of the RNA
world increases, teaching a data-mining approach to the analysis of such
genes could follow a similar pattern.
Table 3 Web sites and programs used by a student who undertook a data-mining task at both
undergraduate (fungal gene) and M (human gene) levels
Program/database                                 Description                               URL (Uniform Resource Locator)
Undergraduate assignment
Rps-Blast (NCBI)                                 Conserved domain search                   www.ncbi.nlm.nih.gov/BLAST
Blastn (NCBI)                                    DNA sequence homology search              www.ncbi.nlm.nih.gov/BLAST
Blastx (NCBI)                                    Translated DNA sequence homology search   www.ncbi.nlm.nih.gov/BLAST
Postgraduate assignment
Rps-Blast (NCBI)                                 Conserved domain search                   www.ncbi.nlm.nih.gov/BLAST
Blastn (NCBI)                                    DNA sequence homology search              www.ncbi.nlm.nih.gov/BLAST
Blastx (NCBI)                                    Translated DNA sequence homology search   www.ncbi.nlm.nih.gov/BLAST
InterPro database (EBI)                          Taxonomic coverage analysis               www.ebi.ac.uk/interpro
ClustalW (EBI)                                   Phylogeny analysis                        www.ebi.ac.uk
Ensembl database                                 Primary structure                         www.ensembl.org
PSIPRED                                          Secondary structure prediction            bioinf.cs.ucl.ac.uk/psipred/psiform.html
MEMSAT 2                                         Transmembrane topology prediction         bioinf.cs.ucl.ac.uk/psipred/psiform.html
GenTHREADER                                      Tertiary structure prediction             bioinf.cs.ucl.ac.uk/psipred/psiform.html
UniProt (Universal Protein Database)             Domains                                   www.pir.uniprot.org/
CATH Protein Structure Classification Database   Domain architecture                       cathwww.biochem.ucl.ac.uk/
An example of how M level can be observed is by comparing the work of one
of the authors (I.J.C.) who undertook these bioinformatics assignments both
as an undergraduate and postgraduate student. For the undergraduate
exercise (on the honours year module ‘Gene Manipulation’), the 1000 word
limit resulted in an 11-page report. The M level (DNA Data-mining module)
5000 word limit resulted in a 34-page report; 12 and 24 references were cited
respectively. Table 3 outlines the databases and programs that the student
used. This highlights the far greater degree of independent investigation
undertaken by the student at M level.
Evaluation
The student-centred format of the module quickly allows the student to find
their existing level, identify their shortcomings and seek help on a one to one
basis with staff. Conversely as suggested by Campbell (2003), gifted students
can rise to the challenge to provide work of exceptional quality. The chances
of plagiarism are reduced by each student being independently assigned a
separate sequence for analysis with weekly sessions with staff providing an
“informal viva” environment so that personal student input and overall
understanding are monitored. Whilst some supporting literature references are
expected in the report, the bulk of the text represents the student’s own
unique analysis so reducing the “cut and paste” culture. This workshop
approach is therefore quite demanding of staff time when compared to a
conventional lecture format.
Table 4 A summary of module evaluation forms from three separate iterations of the M level DNA
Data-mining module (24 forms returned from 58 distributed)
Statement                                            Responses
                                                     Excessive   About right   Light   No response
The amount of directed reading                       10          12            1       1
The volume of assessment                             6           18            0       0
                                                     High        Medium        Low
The degree of difficulty compared to other modules   10          14            0
                                                     Yes         No            No response
Did you find tutorials/workshops helpful?            19          3             2
Did you find the learning experience stimulating?    18          6             0
Would you recommend the module to other students?    19          3             2
M level modules at the University of Wolverhampton are scored on a non-linear
grade scheme, A – D for pass marks, E for a marginal fail and F for a
fail. Of 58 students taking the module over 3 separate iterations in 3 years, the
mean grade has been C for each year with pass rates of 83%, 84%
and 95%. The numbers of students enrolled on the M.Sc. in Molecular Biology
and Bioinformatics were too small to compare statistically with the major
cohort studying the M.Sc. in Applied Microbiology and Biotechnology but it
was noted that these students, as one might expect, tended to score higher
than the average (for example in one year : A, A and B, compared with a
module mean of C). Table 4 summarises the responses of students to a
voluntary questionnaire at the end of the module. Interestingly they were
aware of the heavy load of outside reading required with approaching half the
respondents considering this excessive. Nonetheless the actual assessment
loading was considered reasonable by a large majority of students. Just under
half the students found the degree of difficulty high compared to other M level
modules (most of which comprised a traditional diet of teacher led lectures
and tutorials). Most encouragingly a large majority of students found the
learning experience stimulating and would recommend the module to their
peers.
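The proportions described above can be checked with a few lines of arithmetic. The sketch below simply converts selected counts from Table 4 into percentages of the 24 returned forms; the dictionary labels are shorthand introduced here, not wording from the questionnaire.

```python
RESPONDENTS = 24   # forms returned (Table 4 caption)
DISTRIBUTED = 58   # forms distributed (Table 4 caption)

# Selected counts copied from Table 4; labels are shorthand for this sketch.
table4_counts = {
    "directed reading: excessive": 10,
    "volume of assessment: about right": 18,
    "difficulty vs other modules: high": 10,
    "learning experience stimulating: yes": 18,
    "would recommend module: yes": 19,
}

percentages = {
    label: round(100 * count / RESPONDENTS, 1)
    for label, count in table4_counts.items()
}
response_rate = round(100 * RESPONDENTS / DISTRIBUTED, 1)

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
print(f"response rate: {response_rate}%")
```

On these figures about 42% of respondents found the reading load excessive and the same share rated the difficulty as high, while 75% found the experience stimulating and about 79% would recommend the module; fewer than half of the distributed forms were returned, which should be borne in mind when interpreting the results.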
Table 5 Performance on M level modules accessed by the 2005/2006 cohort to compare numbers of
students and grades on DNA Data-mining with four other modules assessed by conventional exams
and continuous coursework. Total module numbers vary depending upon some flexibility in individual
student programmes with the Gene Technology module including a large number of additional
students studying the M.Sc. in Biomedical Sciences
                          Grade
Module title              A    B    C    D    E    F
Genes & Genomes           0    4    6    8    3    6
DNA Data-mining           3    6    4    7    0    1
Masters Lab. Techniques   3    7    9    3    2    1
Research Methods          0    12   8    4    2    4
Gene Technology           6    17   22   5    6    2
Table 5 shows the relative performance of students on the DNA Data-mining
module compared with: Masters Laboratory Techniques assessed by
continuous assessment of course work; Research Methods, which uses a
multi component, task-based assessment regime; Genes and Genomes
culminating in a substantial piece of coursework handed in at the end of the
course; and Gene Technology, which employs a traditional three hour
examination. The performance of students on DNA Data-mining compares
favourably with those modules that use a variety of conventional assessment
tools.
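One way to make this comparison concrete is to compute, for each module in Table 5, the share of students achieving a pass grade (A to D, following the grade scheme described earlier). The sketch below is an illustrative calculation on the published counts only; the function name is an assumption and no statistical test is implied.

```python
from typing import Sequence

GRADES = ("A", "B", "C", "D", "E", "F")
PASS_GRADES = ("A", "B", "C", "D")  # E is a marginal fail, F a fail

# Grade counts copied from Table 5 (2005/2006 cohort).
table5 = {
    "Genes & Genomes": (0, 4, 6, 8, 3, 6),
    "DNA Data-mining": (3, 6, 4, 7, 0, 1),
    "Masters Lab. Techniques": (3, 7, 9, 3, 2, 1),
    "Research Methods": (0, 12, 8, 4, 2, 4),
    "Gene Technology": (6, 17, 22, 5, 6, 2),
}


def pass_rate(counts: Sequence[int]) -> float:
    """Percentage of students with a pass grade (A-D) among all graded students."""
    by_grade = dict(zip(GRADES, counts))
    passed = sum(by_grade[g] for g in PASS_GRADES)
    return 100 * passed / sum(counts)


rates = {module: round(pass_rate(counts), 1) for module, counts in table5.items()}
for module, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{module}: {rate}%")
```

On these counts DNA Data-mining has the highest share of pass grades of the five modules (20 of 21 students), although the cohort sizes differ considerably between modules.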
Conclusions
There is disagreement over the ability to provide useful benchmark
statements of a generic nature at M level or for subject specific outcomes at
this level. For example some awards may be based around professional
bodies’ requirements and M.Phil. programmes will have a more distinctive
research flavour than taught programmes (QAA 2006). Nonetheless we
suggest that within individual exercises or modules M level achievements can
be discriminated. We propose that a useful criterion to apply to teaching data-
mining at undergraduate versus M level relies upon the taxonomic origin and
hence the complexity of the genes used in exercises, particularly for students
that are not on specialist bioinformatics programmes. Fungal genes can
provide ideal and simple models of eukaryotic gene structure where
remarkably complete analyses can be accomplished with access to a
minimum number of websites. Genes from higher eukaryotes can often
provide more challenging projects better suited to M level work that require a
deeper understanding of gene and protein structure as well as the concepts of
phylogenetic relationships. They also allow the student to express the ability
to think creatively in the use of a diverse range of websites and databases.
Acknowledgements
We would like to thank Dr Eldridge Buultjens of Abertay University for his
many helpful suggestions and support in the establishment of our teaching
programmes in bioinformatics. We thank the University of Wolverhampton
Celt initiative for support.
Communicating Author
Dr Paul Hooley, School of Applied Sciences, University of Wolverhampton,
Wolverhampton, West Midlands. WV1 1SB. Tel: 01902 322667 Fax: 01902 322714
Email: p.hooley@wlv.ac.uk
References
Altschul S.F., Gish W., Miller W., Myers E.W. and Lipman D.J. (1990) Basic
local alignment search tool. Journal of Molecular Biology 215, 403–410
Attwood T.K., Selimas I., Buis R., Altenburg R., Herzog R., Ledent V., Ghita
V., Fernandes P., Marques I. and Brugman M. (2005) Report on the
EMBER project – a European multimedia bioinformatics educational
resource. Bioscience Education e-Journal 6
http://www.bioscience.heacademy.ac.uk/journal/vol6/Beej-6-4.htm (last
accessed 16/03/2007)
Baldauf S.L. (2003) Phylogeny for the faint of heart: a tutorial. Trends in
Genetics 19, 345–351
Campbell A.M. (2003) Public access for teaching genomics, proteomics and
bioinformatics. Cell Biology Education 2, 98–111
Hack C. and Kendall G. (2005) Bioinformatics education in the UK: are we
educating scientists or training technicians? Centre for Bioscience Bulletin
15, 10–11
Hall B.G. (2004) Phylogenetic Trees Made Easy – A How-To Manual.
Sinauer Associates Inc.; US
Hooley P., Burns A.T.H. and Whitehead M.P. (2004) Fungal gene sequences
make excellent models for teaching data-mining. The Mycologist 18,
118–124
Honts J.E. (2003) Evolving strategies for the incorporation of bioinformatics
within the undergraduate cell biology curriculum. Cell Biology Education 2,
233–247
International Human Genome Sequencing Consortium (2004) Finishing the
euchromatic sequence of the human genome. Nature 431, 931–945
Jones D.T. (1999) GenTHREADER: An efficient and reliable protein fold
recognition method for genomic sequences. Journal of Molecular Biology
287, 797–815
McGuffin L.J., Bryson K. and Jones D.T. (2000) The PSIPRED protein
structure prediction server. Bioinformatics 16, 404–405
QAA (2006) Securing and maintaining academic standards: benchmarking M
level programmes.
http://www.qaa.ac.uk/academicinfrastructure/benchmark/masters/MlevelbenchmarkingFeb06.pdf
(accessed 16/03/2007)
Thadani R. and Tammi M.T. (2006) BMC Bioinformatics 7 (Suppl. 5), S20
Wallace I.M., Blackshields G. and Higgins D.G. (2005) Multiple sequence
alignments. Current Opinion in Structural Biology 15, 261–266
 
How To Write A Cause And Effect Essay - IELTS ACHI
How To Write A Cause And Effect Essay - IELTS ACHIHow To Write A Cause And Effect Essay - IELTS ACHI
How To Write A Cause And Effect Essay - IELTS ACHICarrie Tran
 
Dltk Custom Writing Paper DLTKS Custom Writing Pa
Dltk Custom Writing Paper DLTKS Custom Writing PaDltk Custom Writing Paper DLTKS Custom Writing Pa
Dltk Custom Writing Paper DLTKS Custom Writing PaCarrie Tran
 
Well Written Article (600 Words) - PHDessay.Com
Well Written Article (600 Words) - PHDessay.ComWell Written Article (600 Words) - PHDessay.Com
Well Written Article (600 Words) - PHDessay.ComCarrie Tran
 
Fantastic Harvard Accepted Essays Thatsnotus
Fantastic Harvard Accepted Essays ThatsnotusFantastic Harvard Accepted Essays Thatsnotus
Fantastic Harvard Accepted Essays ThatsnotusCarrie Tran
 
Free Printable Lined Paper For Letter Writing - A Line D
Free Printable Lined Paper For Letter Writing - A Line DFree Printable Lined Paper For Letter Writing - A Line D
Free Printable Lined Paper For Letter Writing - A Line DCarrie Tran
 
WritingJobs.Net - With Them To The Top Job In Three Possible Steps
WritingJobs.Net - With Them To The Top Job In Three Possible StepsWritingJobs.Net - With Them To The Top Job In Three Possible Steps
WritingJobs.Net - With Them To The Top Job In Three Possible StepsCarrie Tran
 
003 Professional College Essay Writers Example Ad
003 Professional College Essay Writers Example Ad003 Professional College Essay Writers Example Ad
003 Professional College Essay Writers Example AdCarrie Tran
 
Master Thesis Writing Help Why Use Our Custom Mast
Master Thesis Writing Help  Why Use Our Custom MastMaster Thesis Writing Help  Why Use Our Custom Mast
Master Thesis Writing Help Why Use Our Custom MastCarrie Tran
 
Surprising How To Write A Short Essay Thatsnotus
Surprising How To Write A Short Essay  ThatsnotusSurprising How To Write A Short Essay  Thatsnotus
Surprising How To Write A Short Essay ThatsnotusCarrie Tran
 
Example Position Paper Mun Harvard MUN India 14 P
Example Position Paper Mun  Harvard MUN India 14 PExample Position Paper Mun  Harvard MUN India 14 P
Example Position Paper Mun Harvard MUN India 14 PCarrie Tran
 
005 Essay Example How To Start English Formal Forma
005 Essay Example How To Start English Formal Forma005 Essay Example How To Start English Formal Forma
005 Essay Example How To Start English Formal FormaCarrie Tran
 

More from Carrie Tran (20)

Reflective Essay About Leadership Essay On Lea
Reflective Essay About Leadership Essay On LeaReflective Essay About Leadership Essay On Lea
Reflective Essay About Leadership Essay On Lea
 
31 Classification Of Essay Examp. Online assignment writing service.
31 Classification Of Essay Examp. Online assignment writing service.31 Classification Of Essay Examp. Online assignment writing service.
31 Classification Of Essay Examp. Online assignment writing service.
 
How To Write A Paper Introduction. How To Write An Essay Introducti
How To Write A Paper Introduction. How To Write An Essay IntroductiHow To Write A Paper Introduction. How To Write An Essay Introducti
How To Write A Paper Introduction. How To Write An Essay Introducti
 
Impressive Admission Essay Sample Thatsnotus
Impressive Admission Essay Sample ThatsnotusImpressive Admission Essay Sample Thatsnotus
Impressive Admission Essay Sample Thatsnotus
 
Printable Blank Sheet Of Paper - Get What You Nee
Printable Blank Sheet Of Paper - Get What You NeePrintable Blank Sheet Of Paper - Get What You Nee
Printable Blank Sheet Of Paper - Get What You Nee
 
Fortune Telling Toys Spell Writing Parchment Paper Lig
Fortune Telling Toys Spell Writing Parchment Paper LigFortune Telling Toys Spell Writing Parchment Paper Lig
Fortune Telling Toys Spell Writing Parchment Paper Lig
 
How To Write An Article Paper 2. Online assignment writing service.
How To Write An Article Paper 2. Online assignment writing service.How To Write An Article Paper 2. Online assignment writing service.
How To Write An Article Paper 2. Online assignment writing service.
 
Writing A Personal Philosophy Essay. Online assignment writing service.
Writing A Personal Philosophy Essay. Online assignment writing service.Writing A Personal Philosophy Essay. Online assignment writing service.
Writing A Personal Philosophy Essay. Online assignment writing service.
 
Oxford Admission Essay Sample. Online assignment writing service.
Oxford Admission Essay Sample. Online assignment writing service.Oxford Admission Essay Sample. Online assignment writing service.
Oxford Admission Essay Sample. Online assignment writing service.
 
How To Write A Cause And Effect Essay - IELTS ACHI
How To Write A Cause And Effect Essay - IELTS ACHIHow To Write A Cause And Effect Essay - IELTS ACHI
How To Write A Cause And Effect Essay - IELTS ACHI
 
Dltk Custom Writing Paper DLTKS Custom Writing Pa
Dltk Custom Writing Paper DLTKS Custom Writing PaDltk Custom Writing Paper DLTKS Custom Writing Pa
Dltk Custom Writing Paper DLTKS Custom Writing Pa
 
Well Written Article (600 Words) - PHDessay.Com
Well Written Article (600 Words) - PHDessay.ComWell Written Article (600 Words) - PHDessay.Com
Well Written Article (600 Words) - PHDessay.Com
 
Fantastic Harvard Accepted Essays Thatsnotus
Fantastic Harvard Accepted Essays ThatsnotusFantastic Harvard Accepted Essays Thatsnotus
Fantastic Harvard Accepted Essays Thatsnotus
 
Free Printable Lined Paper For Letter Writing - A Line D
Free Printable Lined Paper For Letter Writing - A Line DFree Printable Lined Paper For Letter Writing - A Line D
Free Printable Lined Paper For Letter Writing - A Line D
 
WritingJobs.Net - With Them To The Top Job In Three Possible Steps
WritingJobs.Net - With Them To The Top Job In Three Possible StepsWritingJobs.Net - With Them To The Top Job In Three Possible Steps
WritingJobs.Net - With Them To The Top Job In Three Possible Steps
 
003 Professional College Essay Writers Example Ad
003 Professional College Essay Writers Example Ad003 Professional College Essay Writers Example Ad
003 Professional College Essay Writers Example Ad
 
Master Thesis Writing Help Why Use Our Custom Mast
Master Thesis Writing Help  Why Use Our Custom MastMaster Thesis Writing Help  Why Use Our Custom Mast
Master Thesis Writing Help Why Use Our Custom Mast
 
Surprising How To Write A Short Essay Thatsnotus
Surprising How To Write A Short Essay  ThatsnotusSurprising How To Write A Short Essay  Thatsnotus
Surprising How To Write A Short Essay Thatsnotus
 
Example Position Paper Mun Harvard MUN India 14 P
Example Position Paper Mun  Harvard MUN India 14 PExample Position Paper Mun  Harvard MUN India 14 P
Example Position Paper Mun Harvard MUN India 14 P
 
005 Essay Example How To Start English Formal Forma
005 Essay Example How To Start English Formal Forma005 Essay Example How To Start English Formal Forma
005 Essay Example How To Start English Formal Forma
 

Recently uploaded

Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxUnboundStockton
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)Dr. Mazin Mohamed alkathiri
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 

Recently uploaded (20)

Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docx
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 

Assigning Level in Data-mining Exercises.pdf

Volume 9: June 2007
http://www.bioscience.heacademy.ac.uk/journal/vol9/beej-9-4.pdf

Research Article

Assigning Level in Data-mining Exercises

Paul Hooley, Ian J. Chilton, Daron A. Fincham, Alan T. Burns and Michael P. Whitehead
School of Applied Sciences, University of Wolverhampton
Date received: 14/02/07 Date accepted: 16/03/07

Abstract
There is currently much interest in ascribing outcomes to Masters (M) level programmes. It is particularly difficult to define M level outcomes in bioinformatics for students on non-specialist programmes. An approach is described that attempts to discriminate undergraduate from M level in a data-mining exercise. Differentiation of level is based upon the taxonomic origin of a DNA sequence, the relative increase in gene complexity from lower to higher eukaryote and the initiative required to use a wider range of databases and analytical tools.

Keywords: Masters, descriptors of level, data-mining, bioinformatics

Introduction
Vast quantities of original research data, often generated by the genome projects, are available to the public via a variety of websites and can provide intriguing opportunities for novel teaching activities (Campbell, 2003). These now include user-friendly packages to teach bioinformatics, notably the European Multimedia Bioinformatics Educational Resource, EMBER (Attwood et al., 2005). Data-mining uses computers to extract and analyse information to generate useful biological hypotheses and insights. We have previously described an exercise in data-mining that formed part of a final year undergraduate module. This used fungal DNA sequence data as the starting point for final year undergraduate (level 3) students on a range of B.Sc (Hons.) programmes to explore some of the tools available for bioinformatics (Hooley et al. 2004 available on British Mycological Society sponsored website; Fungi 4 Schools, http://www.fungi4schools.org/).
Fungi have relatively short genes; for example, Aspergillus nidulans has a mean gene length of less than 2000 base pairs, and many genes have only one or two introns (http://www.broad.mit.edu/annotation/fungi/aspergillus/genome). Nonetheless, fungal genes illustrate all the major features of eukaryotic gene structure. In contrast, human genes have a mean of 9 or 10 introns per gene, making the typical human gene very much longer and commonly allowing for complex alternative splicing that gives more than one protein product (IHGSC, 2004). Information concerning an individual gene is referenced by an “accession number” on a publicly available database such as NCBI (National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/). Fungal accessions were chosen to incorporate information relating to the entire gene structure,
generally including information on translation start and stop sites and intron number and position, as well as links to complete annotations on the relevant genome projects. Each student was randomly assigned a fungal accession number from NCBI. The exercise was carried out over an entire term, comprising 50% of the assessment on the final year undergraduate module Gene Manipulation, and was based largely upon a workshop approach. Each student worked at their own pace on an individual accession number and asked for staff assistance as required. The final assessment was based upon a short report in which the student accurately identified the gene, described the major features of its structure, and compared its similarity at both the DNA and encoded protein levels to its nearest relatives (Hooley et al., 2004).

Having used this exercise successfully at undergraduate level, we then considered how such an approach could be adopted for an audience of M level students from a wide variety of geographical and academic backgrounds (Table 1). The objective was to develop an M level data-mining module suitable for students on non-specialist and specialist bioinformatics courses.

Table 1 Countries of origin of students registered on the MSc. DNA Data-mining module

Country of origin    2005/6    2006/7
UK                   3         2
India                16        12
Nigeria              2         2
Ghana                1         2
Cameroon             0         1

A Masters (M Level 4) Data-mining module
It is difficult to find a common consensus for criteria that mark the move from B.Sc. (Hons.) to M level, particularly in such a new area as bioinformatics. The U.K.’s Quality Assurance Agency for higher education have attempted to provide statements of outcomes as “threshold” or “good” for undergraduate work.
They have also developed benchmarks for three subject areas at M level (business/management, engineering and pharmacy) and continue to work towards drafting a generic statement of M level outcomes (QAA, 2006, Bellingham pers. comm.).

Hack and Kendall (2005) recently proposed some useful learning outcomes to distinguish between undergraduate, postgraduate (Masters in Bioscience and Masters in Bioinformatics) and PhD levels. These focused largely upon the move from undergraduate to M level involving the ability to produce biological models and a deeper understanding and application of analytical models and their parameters. The EMBER package used on the M.Sc. in Bioinformatics at the University of Manchester defines several basic tutorials with a smaller number of advanced tutorials and case studies. One might expect M level material to be more challenging and open-ended than undergraduate level, perhaps with a stronger element of original thought or analysis and incorporating the ability to negotiate the form of the particular assessment under study.

We could not assume that our M level students had a common grounding in molecular biology (Table 1) so the teaching had to
consider an introductory approach which nonetheless led to work acceptable by its conclusion at an M level. Only a small number of our students were enrolled on a course in Molecular Biology and Bioinformatics, which contains a significant component of computing modules. The majority specialised in Applied Microbiology and Biotechnology, which has no compulsory computing components. Hence, as pointed out by Hack and Kendall (2005), whilst we can assume Masters students in Bioinformatics will be able to write programs and develop databases, these skills are not expected in Masters students studying other Bioscience subjects. Honts (2003) nevertheless reports that the introduction of simple programming skills at undergraduate level in Cell Biology courses at Drake University in the US was a useful addition to teaching bioinformatics to non-specialist students.

The M level module ‘DNA Data-mining’ begins with three formal lectures that cover revision of basic features of transcription and translation, the key tools of similarity comparison focusing upon the Basic Local Alignment Search Tool (BLAST, Altschul et al. 1990), the main genome databases and an introduction to protein structure modelling. This is followed by several tutorial exercises that apply BLAST in a simple worked example from a fungal gene to illustrate the key principles. Each student then works on an analysis of their own accession, both in their own time and in a weekly timetabled slot in the computer lab with a member of staff present. The overall aim is for the student to provide as full an analysis as possible of an individual DNA sequence and the protein it encodes. In contrast to the undergraduate exercise, these sequences are selected from a collection of higher eukaryotic accessions on NCBI which do not contain complete details of gene structure and function.
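As an illustration of how an assigned accession might be retrieved programmatically rather than through the NCBI web pages, the following minimal sketch builds a download URL for a single record. It assumes the NCBI E-utilities EFetch service, which is not part of the module described here; the function name `efetch_url` is hypothetical, and the example accession NM_152854 is the human sequence listed in Table 2.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities EFetch service (an assumption of this
# sketch; the module itself used the NCBI web interface).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, db="nuccore", rettype="gb"):
    """Build a URL that returns one nucleotide record as plain text.

    `efetch_url` is a hypothetical helper: it only assembles the query
    string and performs no network access itself.
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Human cDNA accession taken from Table 2 of the article.
    print(efetch_url("NM_152854"))
```

Opening the printed URL in a browser, or passing it to `urllib.request.urlopen`, should return the GenBank flat file for the accession; a student would still need to interpret the record and follow it into other databases by hand.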
Students are supplied with a workbook and a support package on the university intranet, the Wolverhampton on line framework (Wolf), that logically takes them through the analytical tools that they may expect to use, with a suggested time planner. This includes easily used resources such as PSIPRED (McGuffin et al., 2000), which allows the prediction of protein secondary structure and of potential transmembrane domains, and more advanced programs like GenTHREADER (Jones, 1999), which attempts protein fold predictions. Multiple sequence alignment programs for comparing several DNA or protein sequences together, e.g. ClustalX (available from the European Bioinformatics Institute, EBI, http://www.ebi.ac.uk/) and 3D-Coffee (Wallace et al., 2005), may be employed. Students are guided to specific tools freely available on some sites, such as the EMBER tutorials (Attwood et al., 2005), initially available on line and now also as a CD, or to less structured but more comprehensive resources like EBI. Some reports then conclude with the design of phylogenetic trees to explore the evolutionary relationships of their accession (Baldauf, 2003, Hall, 2004). Weekly surgery tutorial workshops are provided where staff are on hand to deal with individual problems and suggest appropriate analytical approaches. The final module assessment is based around the submission of a 5,000 word report, with Figures, Tables, Legends, etc. being additional to this limit, to encourage a concise reporting style.
Defining M Level Features
Table 2 compares details available from two separate example accession numbers: one representing a relatively simple fungal gene and the other a human DNA sequence. The entire gene structure can be modelled from a single paper and the accession itself for the fungal gene. In contrast the human gene gives multiple references and the sequence is mRNA derived and therefore lacks genomic annotation.

Table 2 A comparison of data available from a selected fungal and a human accession number at NCBI

Feature                                    AF202995                          NM_152854
Origin                                     Complete genomic DNA clone        cDNA derived from mRNA
Species                                    Aspergillus nidulans              Homo sapiens
Gene Product                               Zn finger transcription factor    CD40/ Tumour necrosis factor receptor
Papers referenced                          1                                 10
DNA sequence length b.p.                   3814                              1554
Introns                                    2 with exact positions given      No detailed information but two alternative splice products mentioned
Upstream activating sequence (promoter)    Complete                          Incomplete

The student will have to access other databases (such as Ensembl: http://www.ensembl.org/) to acquire gene structure details. This may involve renewed BLAST searches on different sites, as NCBI accession numbers may not be recognised on a foreign database. For example, searches of the OMIM (Online Mendelian Inheritance in Man) database will be required. One could also choose higher eukaryotic accessions where no clear functions had been ascribed and no papers were referenced.

The following features then mark a move from an undergraduate exercise to an M level.

• Use of accession numbers that relate only to non-genomic gene sequences, i.e. cDNAs generated from reverse transcription of mRNA molecules. Some initiative was therefore needed in employing a wider range of search tools and databases to collate analyses of complete gene sequences and their encoded products.
• Use of higher eukaryotic genes i.e. from animals and plants.
  These may have a very complex structure showing greater size and a large number of introns with alternative splicing reactions also possible to generate more than one potential protein product. Students were given a wide range of choice of accession number, often reflecting their own wider interests or research project work.
• Performing phylogenetic analysis on the DNA or protein obtained would be viewed as indicative of excellence at undergraduate level. However at M level this was seen as a standard form of analysis to undertake.
• Students are encouraged to use more than one analytical approach to scrutinise any inconsistencies in predictions, for example of secondary
  structures of proteins encoded by their genes. This fits Hack and Kendall’s (2005) suggestion of the need for a good understanding of the underlying parameters of models and their statistical bases.
• The mark scheme rewards curiosity, where students have followed an unusual or ingenious method of analysis, perhaps using less conventional databases or altering search tools away from their default settings. Criticisms of the shortcomings of the tools are encouraged.

Whilst the tools and websites introduced in the early weeks are relatively straightforward to use, other analytical tools require a more sophisticated level of understanding by the student. This supports another of Hack and Kendall’s (2005) criteria for M level implying the use of a wider range of analytical tools.

Presently we have only chosen conventional protein-encoding genes for analysis, so avoiding genes encoding rRNA, tRNA or micro RNA (Thadani and Tammi 2006) as the final product. However as our knowledge of the RNA world increases, teaching a data-mining approach to the analysis of such genes could follow a similar pattern.
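The contrast drawn above between a genomic accession, which records exact intron positions, and a cDNA accession, which records none, can be illustrated with a short sketch. It assumes the exon coordinates are given in the GenBank-style `join(a..b,c..d)` location notation; the coordinates used below are invented for illustration and are not taken from AF202995 or NM_152854, and the helper names `parse_join` and `introns` are hypothetical.

```python
import re


def parse_join(location):
    """Return sorted (start, end) exon intervals from a location string.

    Handles simple forms such as 'join(201..350,411..980)' or '1..1554'.
    Partial markers ('<', '>') and 'complement(...)' are not handled.
    """
    exons = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]
    return sorted(exons)


def introns(exons):
    """Return the gaps between consecutive exons (1-based, inclusive)."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


if __name__ == "__main__":
    # Invented genomic-style annotation with three exons, hence two introns.
    genomic = parse_join("join(201..350,411..980,1042..1500)")
    print(introns(genomic))

    # A cDNA-derived record has a single uninterrupted interval, so no intron
    # positions can be recovered from the accession alone.
    cdna = parse_join("1..1554")
    print(introns(cdna))
```

For a cDNA accession the function returns an empty list, which is the situation that obliges the M level student to turn to a genome database for the gene structure.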
Table 3 Web sites and programs used by a student who undertook a data-mining task at both undergraduate (fungal gene) and M (human gene) levels

Program/database                                 Description                                URL (Uniform Resource Locator)

Undergraduate assignment
Rps-Blast (NCBI)                                 Conserved domain search                    www.ncbi.nlm.nih.gov/BLAST
Blastn (NCBI)                                    DNA sequence homology search               www.ncbi.nlm.nih.gov/BLAST
Blastx (NCBI)                                    Translated DNA sequence homology search    www.ncbi.nlm.nih.gov/BLAST

Postgraduate assignment
Rps-Blast (NCBI)                                 Conserved domain search                    www.ncbi.nlm.nih.gov/BLAST
Blastn (NCBI)                                    DNA sequence homology search               www.ncbi.nlm.nih.gov/BLAST
Blastx (NCBI)                                    Translated DNA sequence homology search    www.ncbi.nlm.nih.gov/BLAST
InterPro database (EBI)                          Taxonomic coverage analysis                www.ebi.ac.uk/interpro
ClustalW (EBI)                                   Phylogeny analysis                         www.ebi.ac.uk
Ensembl database                                 Primary structure                          www.ensembl.org
PSIPRED                                          Secondary structure prediction             bioinf.cs.ucl.ac.uk/psipred/psiform.html
MEMSAT 2                                         Transmembrane topology prediction          bioinf.cs.ucl.ac.uk/psipred/psiform.html
GenTHREADER                                      Tertiary structure prediction              bioinf.cs.ucl.ac.uk/psipred/psiform.html
UniProt (Universal Protein Database)             Domains                                    www.pir.uniprot.org/
CATH Protein Structure Classification Database   Domain architecture                        cathwww.biochem.ucl.ac.uk/
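The difference in breadth between the two assignments in Table 3 can be made explicit with a small set comparison. This is only a sketch of one way to summarise the table; the tool names are copied from Table 3, and the variable names are chosen for illustration.

```python
# Tools listed in Table 3 for the undergraduate (fungal gene) assignment.
undergraduate = {"Rps-Blast", "Blastn", "Blastx"}

# Tools listed in Table 3 for the postgraduate (human gene) assignment.
postgraduate = {
    "Rps-Blast", "Blastn", "Blastx",
    "InterPro", "ClustalW", "Ensembl", "PSIPRED",
    "MEMSAT 2", "GenTHREADER", "UniProt", "CATH",
}

# Tools common to both levels and tools that appear only at M level.
shared = sorted(undergraduate & postgraduate)
m_level_only = sorted(postgraduate - undergraduate)

if __name__ == "__main__":
    print(f"Undergraduate: {len(undergraduate)} tools")
    print(f"Postgraduate:  {len(postgraduate)} tools")
    print("Used only at M level:", ", ".join(m_level_only))
```

On these figures the student used all three undergraduate tools again at M level and added eight further databases and programs, covering domains, alignment, gene structure and protein structure prediction.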
An example of how M level can be observed is by comparing the work of one of the authors (I.J.C.) who undertook these bioinformatics assignments both as an undergraduate and postgraduate student. For the undergraduate exercise (on the honours year module ‘Gene Manipulation’), the 1000 word limit resulted in an 11-page report. The M level (DNA Data-mining module) 5000 word limit resulted in a 34-page report; 12 and 24 references were cited respectively. Table 3 outlines the databases and programs that the student used. This highlights the far greater degree of independent investigation undertaken by the student at M level.

Evaluation
The student-centred format of the module quickly allows the student to find their existing level, identify their shortcomings and seek help on a one to one basis with staff. Conversely, as suggested by Campbell (2003), gifted students can rise to the challenge to provide work of exceptional quality. The chances of plagiarism are reduced by each student being independently assigned a separate sequence for analysis, with weekly sessions with staff providing an “informal viva” environment so that personal student input and overall understanding are monitored. Whilst some supporting literature references are expected in the report, the bulk of the text represents the student’s own unique analysis, so reducing the “cut and paste” culture. This workshop approach is therefore quite demanding of staff time when compared to a conventional lecture format.
Table 4 A summary of module evaluation forms from three separate iterations of the M level DNA Data-mining module (24 forms returned from 58 distributed)

Statement | Excessive | About Right | Light | No Response
The amount of directed reading | 10 | 12 | 1 | 1
The volume of assessment | 6 | 18 | 0 | 0

Statement | High | Medium | Low
The degree of difficulty compared to other modules | 10 | 14 | 0

Statement | Yes | No | No Response
Did you find tutorials/workshops helpful? | 19 | 3 | 2
Did you find the learning experience stimulating? | 18 | 6 | 0
Would you recommend the module to other students? | 19 | 3 | 2
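The counts in Table 4 are easier to compare once expressed as proportions of the 24 returned forms. The following is a minimal sketch of that arithmetic, assuming the counts as transcribed from Table 4; the names `TABLE_4`, `percentages` and `response_rate` are illustrative and are not part of the original study.

```python
# Response counts transcribed from Table 4
# (24 forms returned from 58 distributed).
FORMS_RETURNED = 24
FORMS_DISTRIBUTED = 58

# Hypothetical container for the Table 4 counts: statement -> option -> count.
TABLE_4 = {
    "The amount of directed reading": {
        "Excessive": 10, "About Right": 12, "Light": 1, "No Response": 1,
    },
    "The volume of assessment": {
        "Excessive": 6, "About Right": 18, "Light": 0, "No Response": 0,
    },
    "The degree of difficulty compared to other modules": {
        "High": 10, "Medium": 14, "Low": 0,
    },
    "Did you find tutorials/workshops helpful?": {
        "Yes": 19, "No": 3, "No Response": 2,
    },
    "Did you find the learning experience stimulating?": {
        "Yes": 18, "No": 6, "No Response": 0,
    },
    "Would you recommend the module to other students?": {
        "Yes": 19, "No": 3, "No Response": 2,
    },
}


def percentages(counts):
    """Convert option counts for one statement into percentages (1 d.p.)."""
    total = sum(counts.values())
    return {option: round(100 * n / total, 1) for option, n in counts.items()}


def response_rate(returned, distributed):
    """Percentage of distributed forms that were returned (1 d.p.)."""
    return round(100 * returned / distributed, 1)


if __name__ == "__main__":
    print(f"Response rate: {response_rate(FORMS_RETURNED, FORMS_DISTRIBUTED)}%")
    for statement, counts in TABLE_4.items():
        print(statement, percentages(counts))
```

On these figures, 10 of 24 respondents (about 42%) rated both the directed reading as excessive and the difficulty as high, and the questionnaire itself was returned by roughly 41% of the 58 students, which is worth bearing in mind when generalising from the responses.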
M level modules at the University of Wolverhampton are scored on a non-linear grade scheme: A – D for pass marks, E for a marginal fail and F for a fail. Of 58 students taking the module over 3 separate iterations in 3 years, the mean grade has been C for each year with a mean pass mark of 83%, 84% and 95%. The numbers of students enrolled on the M.Sc. in Molecular Biology and Bioinformatics were too small to compare statistically with the major cohort studying the M.Sc. in Applied Microbiology and Biotechnology, but it was noted that these students, as one might expect, tended to score higher than the average (for example in one year: A, A and B, compared with a module mean of C).

Table 4 summarises the responses of students to a voluntary questionnaire at the end of the module. Interestingly, they were aware of the heavy load of outside reading required, with 10 of the 24 respondents (approaching half) considering this excessive. Nonetheless the actual assessment loading was considered about right by a large majority of students (18 of 24). Just under half the students (10 of 24) found the degree of difficulty high compared to other M level modules (most of which comprised a traditional diet of teacher-led lectures and tutorials). Most encouragingly, a large majority of students found the learning experience stimulating (18 of 24) and would recommend the module to their peers (19 of 24).

Table 5 Performance on M level modules accessed by the 2005/2006 cohort to compare numbers of students and grades on DNA Data-mining with four other modules assessed by conventional exams and continuous coursework. Total module numbers vary depending upon some flexibility in individual student programmes, with the Gene Technology module including a large number of additional students studying the M.Sc. in Biomedical Sciences

Module Title | Grade A | B | C | D | E | F
Genes & Genomes | 0 | 4 | 6 | 8 | 3 | 6
DNA Data-mining | 3 | 6 | 4 | 7 | 0 | 1
Masters Lab. Techniques | 3 | 7 | 9 | 3 | 2 | 1
Research Methods | 0 | 12 | 8 | 4 | 2 | 4
Gene Technology | 6 | 17 | 22 | 5 | 6 | 2

Table 5 shows the relative performance of students on the DNA Data-mining module compared with: Masters Laboratory Techniques, assessed by continuous assessment of course work; Research Methods, which uses a multi-component, task-based assessment regime; Genes and Genomes, culminating in a substantial piece of coursework handed in at the end of the course; and Gene Technology, which employs a traditional three hour examination. The performance of students on DNA Data-mining compares favourably with those modules that use a variety of conventional assessment tools.

Conclusions

There is disagreement over the ability to provide useful benchmark statements of a generic nature at M level or for subject specific outcomes at this level. For example, some awards may be based around professional bodies’ requirements, and M.Phil. programmes will have a more distinctive research flavour than taught programmes (QAA 2006). Nonetheless we
suggest that within individual exercises or modules M level achievements can be discriminated. We propose that a useful criterion to apply to teaching data-mining at undergraduate versus M level relies upon the taxonomic origin, and hence the complexity, of the genes used in exercises, particularly for students who are not on specialist bioinformatics programmes. Fungal genes can provide ideal and simple models of eukaryotic gene structure where remarkably complete analyses can be accomplished with access to a minimum number of websites. Genes from higher eukaryotes can often provide more challenging projects better suited to M level work, which require a deeper understanding of gene and protein structure as well as the concepts of phylogenetic relationships. They also allow the student to demonstrate the ability to think creatively in the use of a diverse range of websites and databases.

Acknowledgements

We would like to thank Dr Eldridge Buultjens of Abertay University for his many helpful suggestions and support in the establishment of our teaching programmes in bioinformatics. We thank the University of Wolverhampton Celt initiative for support.

Communicating Author

Dr Paul Hooley, School of Applied Sciences, University of Wolverhampton, Wolverhampton, West Midlands. WV1 1SB. Tel: 01902 322667 Fax: 01902 322714 Email: p.hooley@wlv.ac.uk

References

Altschul S.F., Gish W., Miller W., Myers E.W. and Lipman D.J. (1990) Basic local alignment search tool. Journal of Molecular Biology 215, 403–410
Attwood T.K., Selimas I., Buis R., Altenburg R., Herzog R., Ledent V., Ghita V., Fernandes P., Marques I. and Brugman M. (2005) Report on the EMBER project – a European multimedia bioinformatics educational resource. Bioscience Education e-Journal 6 http://www.bioscience.heacademy.ac.uk/journal/vol6/Beej-6-4.htm (last accessed 16/03/2007)
Baldauf S.L. (2003) Phylogeny for the faint of heart: a tutorial. Trends in Genetics 19, 345–351
Campbell A.M. (2003) Public access for teaching genomics, proteomics and bioinformatics. Cell Biology Education 2, 98–111
Hack C. and Kendall G. (2005) Bioinformatics education in the UK: are we educating scientists or training technicians? Centre for Bioscience Bulletin 15, 10–11
Hall B.G. (2004) Phylogenetic Trees Made Easy – A How-To Manual. Sinauer Associates Inc.; US
Hooley P., Burns A.T.H. and Whitehead M.P. (2004) Fungal gene sequences make excellent models for teaching data-mining. The Mycologist 18, 118–124
Honts J.E. (2003) Evolving strategies for the incorporation of bioinformatics within the undergraduate cell biology curriculum. Cell Biology Education 2, 233–247
International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature 431, 931–945
Jones D.T. (1999) GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences. Journal of Molecular Biology 287, 797–815
McGuffin L.J., Bryson K. and Jones D.T. (2000) The PSIPRED protein structure prediction server. Bioinformatics 16, 404–405
QAA (2006) Securing and maintaining academic standards: benchmarking M level programmes. http://www.qaa.ac.uk/academicinfrastructure/benchmark/masters/MlevelbenchmarkingFeb06.pdf (accessed 16/03/2007)
Thadani R. and Tammi M.T. (2006) BMC Bioinformatics 7 (Suppl. 5):S20
Wallace I.M., Blackshields G. and Higgins D.G. (2005) Multiple sequence alignments. Current Opinion in Structural Biology 15, 261–266