2. Since this is such a big topic, I have
decided to break this lecture up into a
number of smaller powerpoints
This is part 1 on
Collecting Data
Organizing Data
Analysing Data
3. Overview
This unit forms the bulk of the Year 11 IPT
course
The CSTA, in its teaching program,
allocates the whole of term 2 for this unit
However, this is a little ambitious
Twelve to fourteen weeks is a more achievable
time frame, given the numerous
interruptions to year 11
4. Overview
During this unit, students study in detail the
seven information processes
Note: Although these processes are presented
to students as distinct entities, in reality, this
may not always be the case
Often these processes will overlap each other
They also may not occur in the order in which
they are presented in the syllabus
Students must be aware of computer-based
(automatic) and non-computer-based (manual)
methods for each process
6. Defining data
This refers to the need to clarify the problem
To begin with, this involves interviews and
observations to identify issues and goals
Open ended questions are used so that an
overview of a situation is obtained
The responses are reviewed by the project team
and management
Result: An area of focus is identified
From this starting point, surveys are developed
(using closed questions) to gather data of
greater relevance to the focus area
7. Identifying the data source
Primary data - is data that is collected first
hand from surveys, questionnaires,
observations, etc
It is usually the most accurate, but also the
most costly and time-consuming
Secondary data – is data collected by
someone else (like the ABS, Gallup,
ACNielsen, etc)
This is cheaper and quicker but not always
exactly what you want
8. Gathering the data
Manually – people complete forms and
the data is then entered via a
keyboard
Automatically – via an electronic device,
e.g. scanner, microphone or even an
automated traffic counter
Data can also be collected via a web page
9. Hardware used for data collection
This is a fairly broad area and includes devices
like: different types of keyboards, mice, track
pads, trackballs, light pens, graphics tablets,
touch screens, microphones, scanners, digital
cameras, digital video cameras, etc.
This can be covered by setting an assignment
where students have to research the operation
of two devices
how it works
what it is typically used for
any software used by the device to aid in the
collection of data
10. Hardware used for data collection
There needs to be sufficient detail presented
e.g. for devices like scanners and digital
cameras it is important that students mention the
role of the Charge-Coupled Device (CCD) and
the analogue to digital converter (ADC) chip
The CCD is a grid of light sensitive sensors that
generate electrical signals when light falls on them
These signals are then converted by the ADC chip
into digital signals
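The conversion step can be sketched in a few lines of Python. This is only an illustration, not how any particular scanner works: the function name `adc_convert`, the 0 to 5 volt range and the 8-bit resolution are all assumptions chosen for the example.

```python
def adc_convert(voltage: float, v_max: float = 5.0, bits: int = 8) -> int:
    """Map an analogue voltage in [0, v_max] to an integer code of `bits` bits."""
    levels = 2 ** bits                        # e.g. 256 levels for 8 bits
    clamped = min(max(voltage, 0.0), v_max)   # keep the signal within range
    # Scale to the available levels and round to the nearest whole number
    return int(clamped / v_max * (levels - 1) + 0.5)

# One row of hypothetical CCD sensor readings (in volts), dark to bright
ccd_row = [0.0, 1.25, 2.5, 5.0]
digital_row = [adc_convert(v) for v in ccd_row]
print(digital_row)  # [0, 64, 128, 255]
```

Students can change `bits` to see how a higher resolution ADC gives more distinct levels for the same analogue signal.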
11. Scanner
[Diagram: light source in scanner head -> light -> document -> reflected light -> CCD sensor -> analogue data -> ADC chip -> digital data]
12. Digital Camera
[Diagram: light -> camera lens -> CCD (a grid of light sensors) -> analogue data -> ADC chip -> digital data]
13. Software used for data collection
The operating system is the most
important piece of software as it
essentially runs the entire computer
It also plays a major role in accepting data
from input devices
It is important for students to be able to
distinguish between Graphical User
Interfaces (GUI) and Command Line
Interfaces
14. Software used for data collection
It is also worthwhile talking about the boot
process, but only in very general terms
e.g. booting a Windows PC
When a computer is first powered-up, certain
programs stored permanently in ROM are
activated
One of the programs does a diagnostic check
on the computer. This information is
displayed on the screen before Windows
loads
15. Software used for data collection
After the diagnostic check is complete,
a boot loader called “NT loader” is activated (strictly, it
is read from the hard drive rather than stored in ROM)
This program goes out to the hard drive and
looks for the Windows operating system
software. If it finds it, Windows is loaded into
RAM and activated
Keep it simple
16. Non-computer procedures
One weakness with the current syllabus is that the
importance of good survey design is not given
enough emphasis
The link between survey design and the
identified focus area of data collection is critical
Students need experience at brainstorming
issues and devising a series of questions that
will provide meaningful data
In order to obtain useful data, there should be a
majority of closed questions
17. Non-computer procedures
There are whole books devoted to survey design;
however, I emphasize three types of response:
Numerical (the respondent gives a number)
Likert scales e.g. Always, Mostly, Sometimes, Rarely, Never.
(there are many variations)
Categorical: e.g. The state of your birth is: NSW, Qld, Vic, etc
The questions that students develop should make use of
these survey design techniques
Don’t forget that open questions are still fine to use, but
limit them
I save old survey forms and analyse their structure with
students e.g. Australian Lifestyle Survey
Encourage students to use checkboxes in their surveys
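One way to show students why closed questions give more usable data is to turn the responses into numbers. The sketch below is an illustration only: the 1 to 5 scoring and the names `LIKERT_SCALE` and `likert_score` are my assumptions, not part of any survey standard.

```python
# Closed responses from a Likert question, ordered from lowest to highest
LIKERT_SCALE = ["Never", "Rarely", "Sometimes", "Mostly", "Always"]

def likert_score(response: str) -> int:
    """Convert a Likert response to a score from 1 (Never) to 5 (Always)."""
    return LIKERT_SCALE.index(response) + 1

# Hypothetical answers from four respondents
responses = ["Always", "Mostly", "Sometimes", "Mostly"]
scores = [likert_score(r) for r in responses]
average = sum(scores) / len(scores)
print(scores, average)  # [5, 4, 3, 4] 4.0
```

An open question gives free text that cannot be averaged or counted this easily, which is one reason to limit them.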
18. Non-computer procedures
Surveys are important because the data is
often incorporated in databases
Be aware though that not all data
collection involves a survey e.g. making a
student newsletter will involve interviews
and observations but not necessarily a
survey
19. Organizing
This involves
Arranging
Representing
Formatting
data for use by other information processes
Often, data is organised as part of the
collection process
20. Organizing
Remember, there are five different types
of data:
Text
Numbers
Image
Audio
Video
These can be organized in a number of
ways:
21. Text
Includes punctuation signs, symbols,
spaces, etc
Most text is converted into binary using
ASCII encoding
EBCDIC encoding is used less today
With Word Art, the text is actually
organized as a graphic
Text can also be ‘hypertext’ i.e. linked text
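A short Python sketch can make the ASCII idea concrete. The function name `to_ascii_bits` is just a label for this example; it shows each character as the 8-bit pattern of its ASCII code.

```python
def to_ascii_bits(text: str) -> list:
    """Return the 8-bit binary pattern of each character's ASCII code."""
    # text.encode("ascii") gives the code numbers, e.g. "H" -> 72
    return [format(code, "08b") for code in text.encode("ascii")]

print(to_ascii_bits("Hi!"))  # ['01001000', '01101001', '00100001']
```

Note that the punctuation mark gets its own code, just like the letters.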
22. Numbers
Numbers can be organized as text, but then we
cannot do any calculations with them
Numbers are most useful when organized
using non-text formats and placed in a
table-like structure (such as a
spreadsheet)
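The difference is easy to demonstrate. In this small Python sketch (the variable names and values are made up for illustration), "adding" two numbers stored as text simply joins the characters, while converting them to a numeric format gives a real sum.

```python
# Two test marks stored as text
mark_one_text = "12"
mark_two_text = "30"

# With text, "+" joins the characters rather than adding the values
joined = mark_one_text + mark_two_text

# Converting to a numeric format allows a real calculation
total = int(mark_one_text) + int(mark_two_text)

print(joined)  # 1230
print(total)   # 42
```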
23. Images
These are organised as:
Bit maps (aka raster graphics), or
Vectors
The difference between the two lies in how
data about the image is stored in memory
24. Images - Bitmaps
Data is stored about the colour and intensity of every pixel (picture
element) on the screen in a ‘frame buffer’ which could be part of
main RAM (on-board video) or on a video card
For each pixel there is a corresponding memory location. The
amount of data we store for each pixel determines the ‘colour depth’
of the image and the number of colours available
E.g. 1 bit => 2^1 = 2 colours for each pixel i.e. on or off, monochrome
e.g. black and white images
‘8 bit colour’ => 2^8 = 256 colours for each pixel
‘16 bit colour’ => 2^16 = 65536 colours
Requires a large amount of RAM and very fast processing
Suitable for photographic images
Difficult to move part of an image without affecting the rest of the
image
Resizing can result in pixelation, ‘stair-casing’, etc
Created by ‘paint’ programs, e.g. Microsoft Paint, Photoshop
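The colour depth arithmetic can be checked with a short Python sketch. The function names are my own, and the 1024 x 768 screen size is only an example; the calculation ignores any compression or file header.

```python
def colours_available(bits_per_pixel: int) -> int:
    """Number of different colours a pixel can have: 2 to the power of the bit depth."""
    return 2 ** bits_per_pixel

def bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Memory needed for an uncompressed bitmap, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8

for depth in (1, 8, 16, 24):
    print(depth, "bit:", colours_available(depth), "colours,",
          bitmap_bytes(1024, 768, depth), "bytes")
```

At 24 bits per pixel the same 1024 x 768 image needs 2,359,296 bytes, which illustrates why bitmaps place a heavy demand on memory.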
25. Images - Vector
The graphic is composed of objects- such as rectangles,
circles, lines, etc
For each object, all that is stored in memory is the
starting and ending coordinates, object type, line
thickness, fill colour/pattern, etc
Uses a lot less memory and data is processed faster
Individual objects may be selected and manipulated
without affecting the rest of the image
Objects can be resized without loss of detail but
individual pixels cannot be edited, only whole objects
Created by ‘draw’ programs e.g. Microsoft Word Draw
Tools, AppleWorks Draw
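To show how little a vector graphic needs to store, here is a rough Python sketch of a single object. The `Rectangle` class and its fields are hypothetical, loosely following the properties listed above; real draw programs store more than this.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Rectangle:
    # Everything stored for the object: coordinates plus a few style properties
    x1: float
    y1: float
    x2: float
    y2: float
    line_thickness: float = 1.0
    fill_colour: str = "white"

    def scaled(self, factor: float) -> "Rectangle":
        """Resize by multiplying the coordinates; no pixel detail is lost."""
        return replace(self,
                       x1=self.x1 * factor, y1=self.y1 * factor,
                       x2=self.x2 * factor, y2=self.y2 * factor)

small = Rectangle(10, 20, 100, 60, fill_colour="blue")
large = small.scaled(2)
print(large)
```

Resizing only changes four numbers, whereas a bitmap would have to recalculate every pixel.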
26. Audio
MIDI – Musical Instrument Digital Interface
Data is in the form of ‘note information’ for
the attached instrument e.g. the pressure
and duration of every note strike
Small file sizes
Cannot produce speech
Editing requires knowledge of music
Suits synthesizers
27. Audio
Waveform files (MP3, WAV, etc)
Samples are taken of the sound and saved as a file
By playing back the measurements the original sound
wave is recreated
Sample rate – the number of samples of the sound wave
per second
Sample size – the number of data bits used to store
each sample of the sound
The greater the sample rate and size the better the
quality of the play back sound
Many sound files are compressed to save storage space
e.g. mp3
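The effect of sample rate and sample size on storage can be worked out directly. This Python sketch assumes uncompressed audio; the function name and the CD-style figures (44,100 samples per second, 16 bits, 2 channels) are used only as an example.

```python
def raw_audio_bytes(sample_rate: int, sample_size_bits: int,
                    channels: int, seconds: int) -> int:
    """Storage for uncompressed audio: samples per second x bits per sample x channels x time."""
    total_bits = sample_rate * sample_size_bits * channels * seconds
    return total_bits // 8

# One minute of stereo sound at 44,100 samples per second, 16 bits per sample
print(raw_audio_bytes(44100, 16, 2, 60))  # 10584000
```

Roughly 10 MB for one minute of sound shows why compressed formats are so widely used.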
28. Video
Storing visual and auditory data by taking a
number of samples
Each sample is called a frame
Each frame contains data describing the light
intensity and colour of all of the pixels that make
up the CCD (and also the screen) of the camera
Huge demand on storage, hence many
compression formats e.g. mpeg, QuickTime
(overlaps with ‘storing and retrieving’ and
‘transmitting and receiving’ processes)
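A quick calculation shows the scale of the storage demand. The Python sketch below assumes uncompressed frames and ignores the audio track; the function name and the 640 x 480, 25 frames per second figures are illustrative only.

```python
def raw_video_bytes(width: int, height: int, bits_per_pixel: int,
                    frames_per_second: int, seconds: int) -> int:
    """Storage for uncompressed video frames (audio not included)."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    total_frames = frames_per_second * seconds
    return bytes_per_frame * total_frames

# Ten seconds of 640 x 480 video in 24 bit colour at 25 frames per second
print(raw_video_bytes(640, 480, 24, 25, 10))  # 230400000
```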
29. Organizing – In general…
How data is organised really depends on
the subsequent information processes that
are going to be applied to the data
e.g. A story may be organised as text (.doc),
however it may also be organized as a
graphic file (.pdf) or an audio file (.wav)
30. File Formats
A good clue as to how a document is
organized is given by the file extension
e.g. A file named “pc102.jpg” is an image file
because it has a .jpg extension.
Just by knowing this we can infer that the
graphic:
Uses 24 bit (16.7 million) colour,
Is probably a photograph,
Uses a high, lossy compression,
Is probably being used for the internet
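The same kind of inference can be sketched in Python. The lookup table `FORMAT_NOTES` and the function `describe` are made up for this example and cover only a handful of extensions; the descriptions are short summaries, not full definitions of each format.

```python
import os

# A small, hypothetical lookup table of extensions and how the data is organised
FORMAT_NOTES = {
    ".jpg": "image (bitmap, lossy compression, 24 bit colour)",
    ".wav": "audio (waveform samples)",
    ".mid": "audio (MIDI note information)",
    ".doc": "text (word processor document)",
}

def describe(filename: str) -> str:
    """Use the file extension as a clue to how the data is organised."""
    extension = os.path.splitext(filename)[1].lower()
    return FORMAT_NOTES.get(extension, "unknown format")

print(describe("pc102.jpg"))  # image (bitmap, lossy compression, 24 bit colour)
```

An extension is only a clue: renaming a file does not change how its data is actually organised.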
31. File Formats
As an in-class exercise I get my students
to research various file formats and what
they are used for (there is a huge number)
Ware and Grover’s book “Information
Processes and Technology – Preliminary
Course” has some good information on
this aspect of the course
32. Software for Organising Data
As you’d expect, most of the common applications
software can be used to organise data into a desired
format
e.g. Text – Word processor and DTP software
There are other important software tools used to convert
data from one format to another, e.g. “Graphic
Converter” on the Macs
Data tables can be created in a number of ways:
In a word processor
Using web authoring software
Using a database
Using a spreadsheet
33. Organising – social & ethical issues
If data is not organised properly then the old
acronym ‘G.I.G.O.’ (garbage in, garbage out) will
apply.
The importance of data organisation can be
stressed to students by describing the Y2K
phenomenon
Although Y2K is ancient history, its ‘worst case
scenarios’ serve the purpose of illustrating how
badly organised data can have deleterious
effects on humans
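The Y2K problem itself fits in a few lines of Python. In this sketch (the function names and dates are invented for illustration), storing only the last two digits of the year gives a sensible age in 1999 and a nonsensical one in 2000.

```python
def age_two_digit(birth_yy: int, current_yy: int) -> int:
    """Age calculated from years stored as two digits, e.g. 65 for 1965."""
    return current_yy - birth_yy

def age_four_digit(birth_year: int, current_year: int) -> int:
    """Age calculated from years stored in full."""
    return current_year - birth_year

print(age_two_digit(65, 99))         # 34   (1999: works)
print(age_two_digit(65, 0))          # -65  (2000 stored as 00: garbage out)
print(age_four_digit(1965, 2000))    # 35   (correct)
```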
34. End of Tools - Part 1
Please Open:
Lecture_5_Tools_Part_2_.ppt