âOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
Â
Epidata presentation course for heath science
1. WALLAGA UNIVERSITY INSTITUTE OF
HEALTH SCIENCES
DEPARTMENT OF PUBLIC HEALTH
BY: JIRA WAKOYA (BSc, MPH/EPIBIO)
STREAM OF EPIDEMIOLOGYAND
BIOSTATISTICS
COMPUTER AND APPLICATION SOFTWARE
3. What is referencing ?
⢠Tell the readers the source of the ideas used in
assignment, research, etcâŚ
⢠Why it is necessary?
ďźShows the reader the source then can create solid
argument
ďźCredits the original work/findings of the research
ďźAvoid plagiarism to avoid academic penalties
4. What needs to be referenced
ďśOnce you get:
⢠ideas
⢠facts
⢠words
⢠theories
⢠and interpretation from sources should be
referenced
5. What is the deference between
citation and reference?
6. What Is EndNote
⢠EndNote is a software program available online.
⢠EndNote can store, recall and cite bibliographic information
when composing research papers.
⢠EndNote allows writers to easily insert citations while at the
same time creating bibliographies in the style of their choice:
⢠(such as vancover, Harvard, MLA, APA, Chicago, Turabian, and
most others) using Microsoft Word, Apple Pages '09,
OpenOffice.org Writer 3 and Mathematica 8,
7. uses
⢠To create library
⢠To import and export citation and pdf
⢠saves time of the author
⢠For citation
⢠For creating the interest of bibliography
⢠For management of citation
⢠multipurpose
⢠In generally, endnote is appreciated citation
management by scholars and managers
8. Short Facts about EndNote Web
⢠At this time, you cannot easily export citations
from databases other than ISI Web of Science
and EBSCOHost platforms directly into
EndNote Web.
⢠Maximum storage capacity in EndNote Web is
10,000 citations.
9. Short Facts about EndNote WebâŚ
⢠Use of Connection Files to search directly from EndNote Web is limited
to Web of Science (Science Citation Index and Social Science Citation
Index) and freely available databases, which have been
Âť identified as:
Âť Any institutional library catalog
Âť Canada Dept. of Foreign Affairs
Âť Center for Research Libraries
Âť PubMed
Âť U.S. Military Academy
Âť U.S. Natl. Institutes of Health
Âť U.S. Naval Academy
Âť Wayne State University Catalog
Âť World Bank
10. What could you do with Endnotes?
⢠Citation
⢠Attaching pdf file to their citation from where it
can be available
⢠Manual insertion of new citation for unpublished
reports and researches (new file editing and
writing)
⢠Storing pdf with their citation/
reference/biblography
11.
12.
13.
14.
15.
16.
17. Objective
⢠Design the questionnaire and create a coding
sheet.
⢠Set up the database on your computer.
⢠Enter data
⢠export data to other statistical software
18. EpiData
⢠EpiData is a Windows program for creating
data entry forms and entering data.
⢠It can be downloaded free of charge from
http://www.epidata.dk/ .
19. Why use EpiData?
⢠EpiData is a small and portable program.
⢠The setup file is less than 1 MB in size.
⢠The program and all of its configuration files
are contained in a single directory which can
be located anywhere on your computer.
⢠EpiData does not interfere with your
computerâs system settings.
20. .
⢠EpiData is entirely free of charge.
⢠EpiData uses the same file format as Epi Info.
It can also import data from plain text, Stata
and dBase, and export to plain text, dBase,
Excel, Stata, SPSS and SAS formats.
21. Variable names
ďś A field or variable contains responses to particular questions.
ďś Each variable in the database has a unique name, such as
âidnumâ, âsexâ or âageâ.
ďś In EpiData, a variable name can be up to 10 characters long,
can contain letters or numbers, and must begin with a letter.
ďś You should be consistent in making variable names
uppercase or lowercase.
ďś Lowercase is recommended if you will be analyzing your data
in Stata.
ďś Variable names should be meaningful and consistent.
22. Variable namesâŚ
There are two common methods for creating variable names:
1. The variable name is a common word based on the question
text. Examples âageâ, âsexâ, âbloodtypeâ.
â˘These are easy to remember and understand, but in
questionnaires with many variables it can become difficult to
generate unique variable names within the database systemâs
constraints.
2. Most questionnaires will have numbered questions, so the
variable names can be based on the question number, e.g. q1,
q2âŚ
â˘In EpiData, variable names must begin with a letter, so you can
use a system like q1, or a1, a2, b1, b2 if the questionnaire has
sections identified by letter
23. Coding sheets
⢠You should always create a coding sheet (also called
a codebook) for your questionnaire and database,
and it is a good idea to begin this process when you
are writing the questionnaire, rather than
afterwards.
⢠A coding sheet is used to keep track of the all the
variables in your questionnaire.
⢠The following information should be provided for
each variable:
24. .
⢠Variable name used in the database Corresponding
text/question in the questionnaire
⢠Type of variable (numeric, categorical or string)
Values allowed
⢠Other special requirements (for example: âmust be
uniqueâ) and branching options (for example: âif
response is 1, go to question 8â)
25. The coding sheet can be created in Excel, Word (using tables) or
any other format that can use tables.
Variable Name Description Type Values
allowed
Other
requirement
s
Idnum ID number Numeric Numeric 1-500 Must enter
Must unique
Sex Sex Categorical 1=male
2=female
1,2
Age Age in years Numeric 6-18
Sample coding sheet
26. Unique identifiers
⢠Unique identifiers are used to refer to specific records in
your database without using identifying information such
as names.
⢠The unique identifier is usually a number assigned by the
researcher.
⢠The identifying variable should be the first or one of the
first variables to appear in your database.
⢠It is difficult to manage your data without unique
identifiers.
⢠Usually only one variable is used as the identifier, but it
may be useful to use a composite variable, e.g.School ID +
Student ID.
⢠In EpiData, this can be done using check files.
27. Categorical variables
⢠Categorical variables are variables which have a fixed
set of possible responses, such as sex, country of
birth or blood type. It should only be possible to
choose one correct response.
⢠Categorical variables are commonly represented in a
database by a number, for
⢠example: Sex: 1=male, 2=female
⢠Blood group: A=1,B=2,O=3,AB=4
28. Continuous and discrete variables
⢠Continuous and discrete variables contain
numbers only, e.g. height or weight
⢠You will often have reasonable limits or
ranges for continuous variables, When setting
up a database you can specify these limits to
reduce the chance of data entry error.
29. ⢠Many measurements come in units such as
days, grams or millilitres.
⢠Decide on the most appropriate unit (grams
or kilograms, minutes or hours) and specify
this unit on the coding sheet, questionnaire
and data entry form.
⢠All entries for one variable must use the same
unit.
30. String variables
⢠String variables are variables that can contain text and
numbers.
⢠As it is difficult to perform mathematical or statistical
operations on strings, they should not be used if a numeric
variable can be used instead.
⢠For instance, You might a question that asks for country of
birth.
⢠The paper questionnaire can have a space for the country to
be filled in, but before the questionnaires are entered in the
database, you can give each country a numeric code which is
entered in the database.
⢠String variables usually contain data that is too complicated to
categorize, such as long comments
31. Missing data
⢠Missing data is represented by special codes
to indicate that something is missing.
⢠All fields should have data in them unless they
are skipped because of a conditional jump.
32. ⢠It will only be possible for these fields to be blank if
they are skipped. Some variables, such as ID
numbers and eligibility criteria, should never be
missing.
⢠For numeric variables, missing data is conventionally
represented by 8,88,888,9,99,999,âŚ
⢠The missing code must never be a valid response for
that variable
Missing data
33. Queries
⢠Your database should allow âquery codesâ to
indicate that a response has been queried and
is being followed up. We often use â8â (or
â88â, â888â, etc.) to indicate a query, or ââ2â
throughout.
⢠When the data entry operator finds a strange
response, a query code should be entered and
the offending questionnaire set aside with a
PostâIt note or similar item to indicate the
query.
34. .
⢠The researchers can contact the respondent for
clarification, or otherwise decide what response
should be entered.
35. The EpiData interface
When you start EpiData, at the top of the screen you
will see the:
⢠Menu bar: all EpiData commands are available
from this menu.
⢠Workflow toolbar with the most essential
commands in logical order.
36. 1. Define data
⢠Creating the data form
â A questionnaire file contains field names, field
definitions and other text.
â It is used to generate a data entry form, which has
the extension New.qes file.
â There are two ways you can create the
questionnaire file: you can type it up from scratch,
or if you already have the questionnaire in
electronic format, you can use the text from that
file as a basis for the EpiData questionnaire,
Open.qes file.
37. Starting from scratch
⢠To create a new questionnaire from scratch, do
one of the following:
⢠From the workflow toolbar, select 1. select project
â New â title â dataset.
38. Using an existing file
⢠If the questionnaire is in Microsoft Word
format
1.In Word, select File â Save As⌠from the
menu bar.
2. In the âSave as typeâ box, select âText Onlyâ.
The file extension is now .txt.You can change
the name of the file if you like.
3.Click the Save button.
4.Close the file in Word.
39. 6. Open the file in EpiData by selecting File â
Open from the menu bar. Make sure that âAllâ
is selected in the âFiles of typeâ box.
7. Select File â Save As⌠from the menu bar.
8. In the âFile nameâ box, change the file
extension from txt to qes.
You can now edit the questionnaire file as
needed in EpiDataâs editor
40. Variable Types
⢠Text
Text data types are useful for storing text or numeric
information such as names and addresses
⢠Text (Upper Case)
A special type of text data type is the upper case text. This
data type will store whatever is
entered as upper case (capital) letters, regardless of the way
you enter the data. For example, if
you type âwayneâ (in all lower case letters) into a field that
stores data as upper case text,
EpiData will store the item as âWAYNEâ in the dataset.
41. Variable Types
⢠Numeric
â Numeric variables store numbers (e.g, age,
birthweight, etc.). They can be used for storing
categorical or continuous variables,
â If you know that you will be performing
mathematical operations with a particular field, it
must be of the numeric field type in order to do
the operations (using EpiData Analysis).
42. ⢠Date
â The date field type can store dates in two different formats:
â mm/dd/yyyy (American format)
â dd/mm/yyyy (European format).
⢠Todayâs Date
â You can create a variable that automatically stores todayâs date.
⢠Yes/No
â This data type is useful for storing binary categorical data such as âYesâ
or âNoâ (which can also be stored as â1â or â0â).
⢠Auto ID Number
â Auto ID Number field type is a unique number that is assigned to each
observation entered into a data file. This unique number usually starts at
1 and is assigned to the first record of the dataset and is incremented by
one for every new record that is entered thereafter.
43. ⢠SOUNDEX
â Soundex is a special type of text variable that applies SOUNDEX coding
rules to text data as it is entered. The code is supposed to be a
representation of the text as it sounds (rather than the way it is
spelled). The SOUNDEX code consists of one uppercase letter, a
hyphen, followed by three numbers.
Phone Number
NOTE: EpiData does not support PHONENUM variable types, as Epi Info
does. However,telephone numbers can be entered into ordinary text
fields.
Time EpiData does not support time fields. However, the help file states
that there are two functions,
Time2Num and Num2Time, that allow one to convert numeric data to
time data and vice-versa.
44. Format of the questionnaire file
⢠The questionnaire file contains lines of text
that either define a data entry field or are
ignored by the database. A line is recognised
as a field line if it contains special characters
used to represent a data entry field. You can
have more than one field on one line.
45. ⢠Variable names can be up to 10 characters
long, can contain only letters and numbers,
and must start with a letter.
⢠There are two ways to specify the variable
name.
⢠You can change this behaviour in the menu
File â Options â Create data file â How to
generate field names.
46. a) First word in question is field name: The first
word on the line (up to 10 characters before
the first space) becomes the variable name.
b) Automatic field names:
Each line containing a variable should follow this
format:
(Field name) (Field label) (Field definition)
47. ⢠Variable labels describe individual variables.
EpiData automatically assigns variable labels
based on the text that appears before the
field definition.
48. Helping you create field
definitions
a)The field pick list
⢠To use the pick list, open the .qes file that you
want to edit, position the cursor at the place
where you want to enter a field definition, and
do one of the following:
⢠Press Ctrl+Q on the keyboard
⢠Use the menu Edit â Field Pick List
⢠Click the âField pick listâ button
⢠click on the Close button
49. b) the code writer
⢠The code writer automatically completes field
definitions when you start typing field
characters.
⢠To turn the code writer on or off, do one of
the following
⢠Press Ctrl+W on the keyboard
⢠Use the menu Edit â Code Writer
⢠Click the âCode writerâ button on the editor
toolbar
50. Formatting issues
⢠take care that the text of questions and
comments does not include any characters
that have special meaning to EpiData, such as
field definition characters orâ@â (used to
force a tab character â<â (less than) character
⢠which indicates the start of a date, Boolean or
uppercase text field.
⢠If the text in your questionnaire contains this
character, replace it with the text âless thanâ
or âLTâ.
51. ⢠to keep the questionnaire file tidy. One useful
commandis âAlign Fieldsâ.
⢠This will attempt to line up the field
definitions in the file so that each field
definition starts from the same position
⢠1. Place the cursor on a line which has the
field definition positioned where you would
like it.
⢠2. Use the menu Edit ď¨Align Fields.
52. For example:
⢠If the cursor is positioned on the first line, the
following form
⢠v1 A small text ####
⢠v2 Other text <A > v3 ###.#
⢠v4 Text ###
Becomes
⢠v1 A small text ####
⢠v2 Other text <A > v3 ###.#
⢠v4 Text ###
53. Previewing the data form
⢠Press Ctrl+T on the keyboard
⢠Use the menu Data File â Preview Data Form
⢠Click the âPreview data formâ button
54. Creating the data entry file
⢠Now we create the file where data is actually
entered
⢠On the workflow toolbar, click 2. Make Data
File
⢠From the menu bar, select Data in/out â
New Data File
55. Check files â checking validity of
data
⢠Check files are plain text files with special
commands. They can be created and edited in
one of two ways: using an interactive
interface, or editing the file manually. Names
of check files end with the extension .chk and
the first part of the name must be the same as
⢠the corresponding data file.
56. Types of check available in
EpiData
⢠Unique: Each record must have a unique
value for this field. This check is often used for
identification fields.
⢠Must enter: This field must have a value; it
cannot be missing. (Use a missing code if the
⢠value really is missing. See the discussion
under âMissing dataâ.)
⢠Range/Legal values: Limit the values that can
be entered in this field.
57. ⢠Jumps: Jumps are used to move to different
fields.
⢠The Repeat command means that when you
start a new record, any field which is set to
repeat will already contain the value that was
entered in that field in the previous record.
58. The interactive check interface
⢠On the workflow toolbar, click 3. Checks
⢠From the menu bar, select Checks ď¨Add /
Revise
59. Entering data
⢠On the workflow toolbar, click 4. Enter Data
⢠From the menu bar, select Data in/out ď¨ Enter
Data
60. Editing data
⢠Open the âFindâ
⢠dialogue by pressing Ctrl+F or using the
⢠menu Goto ď¨ Find Record.
61. Exporting data
⢠You can export your EpiData database to the
following formats:
⢠plain text
⢠dBase
⢠Excel
⢠Stata
⢠SPSS
⢠SAS
62. ⢠Begin the export process by clicking one of the
following:
⢠On the workflow toolbar, click 6. Export Data
⢠From the menu bar,
select Data in/out Export
63. Limitations of EpiData
⢠Questionnaire files cannot exceed 999 lines of
text.
⢠If you have a large database that exceeds this
length, you can break it up into different files.
⢠A string variable cannot have more than 80
characters:
⢠if you anticipate long text comments, you can
break them up into multiple comment
variables.