SlideShare a Scribd company logo
1 of 8
Download to read offline
Get HIT140 Foundations of Data
Science Assignment Help
HIT140 Foundations of Data Science
Diagnosing Parkinson‟s Disease using voice sample data analysis
According to the U.S. National Institute of Neurological Disorders and Stroke, Parkinson‟s
Disease (PD) is a „movement disorder of the nervous system that gets worse over
time‟.1
Common symptoms include tremors, rigidity, bradykinesia, and postural instability.
Notable figures who suffered from PD include the legendary boxer Muhammad Ali, the former
U.S. president George H.W. Bush, the Back to the Future actor Michael J. Fox, the heavy metal
legend Ozzy Osbourne, and the late Pope John Paul II.2
This Photo by Unknown Author is licensed under CC BY-NC-ND
Sadly, there is currently no known cure for PD. There are also no specific diagnostic tests for the
disease. Blood and laboratory tests, as well as brain scans, have been used to rule out other
disorders that may cause the symptoms. These diagnostics are invasive with rigorous procedures
that may potentially cause further stress for a person living with Parkinson‟s Disease (PPD).
There is growing interest in developing new, non-invasive methods for diagnosing PD. One
possible approach is to analyse a person‟s speech data obtained from multiple types of sound
recordings. PPD usually experiences dysphonia, which leads to reduced loudness, breathiness,
roughness, and
1. National Institute of Neurological Disorders and Stroke (2023).
https://www.ninds.nih.gov/health- information/disorders/parkinsons-disease
2. https://www.parkinson.org/understanding-parkinsons/statistics/notable-figures
exaggerated vocal tremors. These symptoms can be diagnosed in a non-invasive way by
analysing various frequency-based characteristics in a person‟s voice.
This group assessment project focuses on analysing a set of acoustic measurement data extracted
from the voice samples of PPD and, where indicated, from those who were healthy. For these
subjects, physicians also carried out medical examinations on the PPD and assigned them an
appropriate Unified Parkinson‟s Disease Rating Scale (UPDRS) score to indicate the disease
severity and progression.
Datasets
There are two datasets in this project, downloadable from Learnline.
Dataset 1
Filename: po1_data.txt
This dataset contains data collected from 40 study subjects, of which 20 were individuals living
with Parkinson‟s Disease (PPD) and the remaining 20 were healthy individuals (non-PPD
group).3
The mean age of the PPD group is 64.86 years old (standard deviation: 8.97). The mean
age of the non-PPD group is
62.55 years old (standard deviation: 10.79).
In this study, each subject was asked to record 26 different voice samples. These samples include
the recording of saying 3 sustained vowels (“a”, “o”, “u”), 10 numbers (1 to 10), 9 different
words, and 4 rhymed short sentences. This dataset includes all these voice samples but does not
indicate the specific recording it represents (i.e. we do not know whether a voice sample comes
from recording a sustained vowel or a short sentence). Please note this limitation.
A set of 26 acoustic features were further extracted from each voice sample using a free acoustic
analysis software called Praat.4
The variables in this dataset represent these features and are
given in Table 1.
Table 1. Description of variables in the po1_data.txt dataset.
Column
No.
Measurement
category
Description
1 Subject identifier This number identifies a study subject
2 Jitter Jitter in %
3 Jitter Absolute jitter in microseconds
4 Jitter Jitter as relative amplitude perturbation (r.a.p.)
5 Jitter Jitter as 5-point period perturbation quotient (p.p.q.5)
6 Jitter
Jitter as average absolute difference of differences between jitter
cycles (d.d.p.)
7 Shimmer Shimmer in %
8 Shimmer Absolute shimmer in decibels (dB)
9 Shimmer Shimmer as 3-point amplitude perturbation quotient (a.p.q.3)
10 Shimmer Shimmer as 5-point amplitude perturbation quotient (a.p.q.5)
 Sakar, B.E. et al. (2013). https://ieeexplore.ieee.org/abstract/document/6451090
 https://www.fon.hum.uva.nl/praat/
11 Shimmer Shimmer as 11-point amplitude perturbation quotient (a.p.q.11)
12 Shimmer
Shimmer as average absolute differences between consecutive differences between the
amplitudes of shimmer cycles (d.d.a.)
13 Harmonicity Autocorrelation between NHR and HNR
14 Harmonicity Noise-to-Harmonic ratio (NHR)
15 Harmonicity Harmonic-to-Noise ratio (HNR)
16 Pitch Median pitch
17 Pitch Mean pitch
18 Pitch Standard deviation of pitch
19 Pitch Minimum pitch
20 Pitch Maximum pitch
21 Pulse Number of pulses
22 Pulse Number of periods
23 Pulse Mean period
24 Pulse Standard deviation of period
25 Voice Fraction of unvoiced frames
26 Voice Number of voice breaks
27 Voice Degree of voice breaks
28 UPDRS
The Unified Parkinson‟s Disease Rating Scale (UPDRS) score that is assigned to the
subject by a physician via a medical examination to determine the severity and
progression of Parkinson‟s disease.
29
PD
indicator
Value “1” indicates a subject suffering from PD. Value “0” indicates a healthy subject.
Here are some explanations about the measurement categories. 5
The jitter variables measure the
variation in the frequency of the sound. The shimmer variables measure the variation in the
amplitude of the sound. The harmonicity variables are related to vocal quality and assess the
noise in the sound. The pitch variables measure how high or low is the sound based on the
frequency of vibration of the sound waves produced. The pulse variables measure the glottal
pulse, which are variances in voice quality affected by manipulating the vocal cords when
speaking. The voice variables measure the extent to which a subject has trouble maintaining
vocal cord vibration when saying a sustained vowel.
Dataset 2
Filename: po2_data.csv
This second dataset contains a range of voice measurements and basic demographic profiles
from 42 subjects with early-stage Parkinson‟s disease. There are no non-PPD subjects in this
dataset. Note that these subjects originate from a different study and are different from the
subjects described in the first dataset. Each record in this dataset corresponds to a set of features
extracted from a voice sample of a subject. The number of voice samples recorded from a subject
varies between 101 and 168 voice samples.
Table 2 gives the list of variables in this dataset. The motor UPDRS score could range from 0 to
108, with 0 indicating Parkinson‟s Disease symptom-free and 108 denoting severe motor
impairment. The total UPDRS represents the level of total disability of PPD and ranges from 0 to
176, where 0 denoting a healthy condition and 176 for total disabilities for untreated patients.
 Costanzo, P. and Orphanou, K. (2022). https://arxiv.org/abs/2206.03716
Table 2. Description of variables in the po2_data.csv dataset
Column Name
Measurement
category
Description
subject# Subject identifier This number identifies a study subject
Age Demography The subject‟s age at data collection
sex Demography The subject‟s gender (“0” for male, “1” for female)
test_time Test detail
The time since subject recruitment into the study. The integer part
is the number of days since recruitment.
motor_updrs UPDRS
The subject‟s motor UPDRS score assigned by a physician;
linearly interpolated.
total_updrs UPDRS
The subject‟s total UPDRS score assigned by a physician; linearly
interpolated.
jitter(%) Jitter Jitter in %
jitter(abs) Jitter Absolute jitter in microseconds
jitter(rap) Jitter Jitter as relative amplitude perturbation (r.a.p.)
jitter(ppq5) Jitter Jitter as 5-point period perturbation quotient (p.p.q.5)
jitter(ddp) Jitter
Jitter as average absolute difference of differences between jitter
cycles (d.d.p.)
shimmer(%) Shimmer Shimmer in %
shimmer(abs) Shimmer Absolute shimmer in decibels (dB)
shimmer(apq3) Shimmer Shimmer as 3-point amplitude perturbation quotient (a.p.q.3)
shimmer(apq5) Shimmer Shimmer as 5-point amplitude perturbation quotient (a.p.q.5)
shimmer(apq11) Shimmer Shimmer as 11-point amplitude perturbation quotient (a.p.q.11)
shimmer(dda) Shimmer
Shimmer as average absolute differences between consecutive
differences between the amplitudes of shimmer cycles (d.d.a.)
nhr Harmonicity Noise-to-Harmonic ratio (NHR)
hnr Harmonicity Harmonic-to-Noise ratio (HNR)
rpde Complex Recurrence Period Density Entropy
dfa Complex Detrended Fluctuation Analysis
ppe Complex Pitch Period Entropy
Here are some explanations about the additional measurement categories in this
dataset. 6
The Recurrence Period Density Entropy variable measures the ability of the vocal
folds to sustain simple vibrations. The Detrended Fluctuation Analysis describes the extent of
turbulent noise in the speech signal. The Pitch Period Entropy measures the impaired control of
stable pitch during sustained phonation.
Project objectives
Given the speech datasets above, your team is tasked with determining the extent to which the
severity
of Parkinson‟s disease can be non-invasively diagnosed (i.e. predicted) using voice sample data.
 Tsanas, A. et al. (2009). https://www.nature.com/articles/npre.2009.3920.1
Project objective 1
The first project objective requires that your team determine a set of salient variables (features)
that could be used to distinguish people with PD from those that are healthy.
This objective must be addressed in Assessment 2. It evaluates your team‟s ability to apply
descriptive and inferential statistical analyses, as well as data wrangling.
You must only use the dataset file po1_data.txt dataset for this objective. Please see additional
information on Learnline under Assessment 2.
Project objective 2
The second project objective requires that your team predict the motor and the total UPDRS
scores
assigned by a physician to people with Parkinson‟s Disease.
This objective must be addressed in Assessment 3. It evaluates the overall data science skills that
your team learned in this unit, including data exploration, data visualisation methods, and linear
regression modelling.
You must at least use the po2_data.csv dataset for this objective. Further consideration and
experimentation with the po1_data.txt dataset to validate your predictive models is essential for
scoring any higher grade. Please see additional information on Learnline under Assessment 3.
Additional notes
Any creative derivation of new variables from these existing variables or data transformation is
permitted. This is known as feature engineering and is a common data science practice.
However, you must not alter the original values of these datasets. You must maintain ethical
conduct in the course of your data analyses. If in doubt, please consult the Unit Coordinator.

More Related Content

Similar to HIT140 Foundations of Data Science Assignment Help

Current trends in audiological practices and implications for developing coun...
Current trends in audiological practices and implications for developing coun...Current trends in audiological practices and implications for developing coun...
Current trends in audiological practices and implications for developing coun...Alexander Decker
 
POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...
POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...
POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...ijbesjournal
 
autistic traits in individuals with normal intelligence
autistic traits in individuals with normal intelligenceautistic traits in individuals with normal intelligence
autistic traits in individuals with normal intelligencekhalid mansour
 
Bridging the gap between emotion and health
Bridging the gap between emotion and healthBridging the gap between emotion and health
Bridging the gap between emotion and healthAndrew Kemp
 
Evaluation of cardiac responses to stress in healthy individuals
Evaluation of cardiac responses to stress in healthy individualsEvaluation of cardiac responses to stress in healthy individuals
Evaluation of cardiac responses to stress in healthy individualsChiranjeevi JIPMER Puducherry
 
Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification Yayah Zakaria
 
Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification IJECEIAES
 
Brief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery Centre
Brief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery CentreBrief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery Centre
Brief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery Centresemualkaira
 
Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...
Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...
Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...ijscai
 
A study on impact of alcohol among young
A study on impact of alcohol among youngA study on impact of alcohol among young
A study on impact of alcohol among youngijcseit
 
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISEARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISijscai
 
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISEARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISijscai
 
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISEARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISijscai
 
Music Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara Rajendran
Music Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara RajendranMusic Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara Rajendran
Music Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara RajendranTara Rajendran
 
Comfort and loudness measures
Comfort and loudness measuresComfort and loudness measures
Comfort and loudness measuresbethfernandezaud
 
Sound and Acoustic
Sound and AcousticSound and Acoustic
Sound and AcousticAiman Daniel
 

Similar to HIT140 Foundations of Data Science Assignment Help (20)

Telemonitoring of Four Characteristic Parameters of Acoustic Vocal Signal in...
Telemonitoring of Four Characteristic Parameters of Acoustic  Vocal Signal in...Telemonitoring of Four Characteristic Parameters of Acoustic  Vocal Signal in...
Telemonitoring of Four Characteristic Parameters of Acoustic Vocal Signal in...
 
Current trends in audiological practices and implications for developing coun...
Current trends in audiological practices and implications for developing coun...Current trends in audiological practices and implications for developing coun...
Current trends in audiological practices and implications for developing coun...
 
POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...
POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...
POWER SPECTRAL ANALYSIS OF EEG AS A POTENTIAL MARKER IN THE DIAGNOSIS OF SPAS...
 
autistic traits in individuals with normal intelligence
autistic traits in individuals with normal intelligenceautistic traits in individuals with normal intelligence
autistic traits in individuals with normal intelligence
 
Bridging the gap between emotion and health
Bridging the gap between emotion and healthBridging the gap between emotion and health
Bridging the gap between emotion and health
 
Bahrain paper 3
Bahrain paper 3Bahrain paper 3
Bahrain paper 3
 
Evaluation of cardiac responses to stress in healthy individuals
Evaluation of cardiac responses to stress in healthy individualsEvaluation of cardiac responses to stress in healthy individuals
Evaluation of cardiac responses to stress in healthy individuals
 
Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification
 
Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification Improved Algorithm for Pathological and Normal Voices Identification
Improved Algorithm for Pathological and Normal Voices Identification
 
Brief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery Centre
Brief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery CentreBrief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery Centre
Brief Report: OSA Evaluations for the Anaesthesiologist, Surgeon, Surgery Centre
 
Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...
Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...
Early Detection of Parkinson Disease through Biomedical Speech and Voice Anal...
 
A study on impact of alcohol among young
A study on impact of alcohol among youngA study on impact of alcohol among young
A study on impact of alcohol among young
 
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISEARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
 
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISEARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
 
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSISEARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
EARLY DETECTION OF PARKINSON DISEASE THROUGHBIOMEDICAL SPEECH AND VOICE ANALYSIS
 
Music Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara Rajendran
Music Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara RajendranMusic Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara Rajendran
Music Therapy's Impact in Palliative Care| IAPCON2024| Dr. Tara Rajendran
 
Neurofeedback Research Summary
Neurofeedback Research SummaryNeurofeedback Research Summary
Neurofeedback Research Summary
 
Kumar2011
Kumar2011Kumar2011
Kumar2011
 
Comfort and loudness measures
Comfort and loudness measuresComfort and loudness measures
Comfort and loudness measures
 
Sound and Acoustic
Sound and AcousticSound and Acoustic
Sound and Acoustic
 

Recently uploaded

POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 

Recently uploaded (20)

POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

HIT140 Foundations of Data Science Assignment Help

  • 1. Get HIT140 Foundations of Data Science Assignment Help HIT140 Foundations of Data Science Diagnosing Parkinson‟s Disease using voice sample data analysis According to the U.S. National Institute of Neurological Disorders and Stroke, Parkinson‟s Disease (PD) is a „movement disorder of the nervous system that gets worse over time‟.1 Common symptoms include tremors, rigidity, bradykinesia, and postural instability. Notable figures who suffered from PD include the legendary boxer Muhammad Ali, the former U.S. president George H.W. Bush, the Back to the Future actor Michael J. Fox, the heavy metal legend Ozzy Osbourne, and the late Pope John Paul II.2 This Photo by Unknown Author is licensed under CC BY-NC-ND Sadly, there is currently no known cure for PD. There are also no specific diagnostic tests for the disease. Blood and laboratory tests, as well as brain scans, have been used to rule out other disorders that may cause the symptoms. These diagnostics are invasive with rigorous procedures that may potentially cause further stress for a person living with Parkinson‟s Disease (PPD). There is growing interest in developing new, non-invasive methods for diagnosing PD. One possible approach is to analyse a person‟s speech data obtained from multiple types of sound recordings. PPD usually experiences dysphonia, which leads to reduced loudness, breathiness, roughness, and 1. National Institute of Neurological Disorders and Stroke (2023). https://www.ninds.nih.gov/health- information/disorders/parkinsons-disease 2. https://www.parkinson.org/understanding-parkinsons/statistics/notable-figures exaggerated vocal tremors. These symptoms can be diagnosed in a non-invasive way by analysing various frequency-based characteristics in a person‟s voice. This group assessment project focuses on analysing a set of acoustic measurement data extracted from the voice samples of PPD and, where indicated, from those who were healthy. For these subjects, physicians also carried out medical examinations on the PPD and assigned them an appropriate Unified Parkinson‟s Disease Rating Scale (UPDRS) score to indicate the disease severity and progression. Datasets There are two datasets in this project, downloadable from Learnline.
  • 2. Dataset 1 Filename: po1_data.txt This dataset contains data collected from 40 study subjects, of which 20 were individuals living with Parkinson‟s Disease (PPD) and the remaining 20 were healthy individuals (non-PPD group).3 The mean age of the PPD group is 64.86 years old (standard deviation: 8.97). The mean age of the non-PPD group is 62.55 years old (standard deviation: 10.79). In this study, each subject was asked to record 26 different voice samples. These samples include the recording of saying 3 sustained vowels (“a”, “o”, “u”), 10 numbers (1 to 10), 9 different words, and 4 rhymed short sentences. This dataset includes all these voice samples but does not indicate the specific recording it represents (i.e. we do not know whether a voice sample comes from recording a sustained vowel or a short sentence). Please note this limitation. A set of 26 acoustic features were further extracted from each voice sample using a free acoustic analysis software called Praat.4 The variables in this dataset represent these features and are given in Table 1. Table 1. Description of variables in the po1_data.txt dataset. Column No. Measurement category Description 1 Subject identifier This number identifies a study subject 2 Jitter Jitter in % 3 Jitter Absolute jitter in microseconds 4 Jitter Jitter as relative amplitude perturbation (r.a.p.) 5 Jitter Jitter as 5-point period perturbation quotient (p.p.q.5) 6 Jitter Jitter as average absolute difference of differences between jitter
  • 3. cycles (d.d.p.) 7 Shimmer Shimmer in % 8 Shimmer Absolute shimmer in decibels (dB) 9 Shimmer Shimmer as 3-point amplitude perturbation quotient (a.p.q.3) 10 Shimmer Shimmer as 5-point amplitude perturbation quotient (a.p.q.5)  Sakar, B.E. et al. (2013). https://ieeexplore.ieee.org/abstract/document/6451090  https://www.fon.hum.uva.nl/praat/ 11 Shimmer Shimmer as 11-point amplitude perturbation quotient (a.p.q.11) 12 Shimmer Shimmer as average absolute differences between consecutive differences between the amplitudes of shimmer cycles (d.d.a.) 13 Harmonicity Autocorrelation between NHR and HNR 14 Harmonicity Noise-to-Harmonic ratio (NHR) 15 Harmonicity Harmonic-to-Noise ratio (HNR) 16 Pitch Median pitch 17 Pitch Mean pitch
  • 4. 18 Pitch Standard deviation of pitch 19 Pitch Minimum pitch 20 Pitch Maximum pitch 21 Pulse Number of pulses 22 Pulse Number of periods 23 Pulse Mean period 24 Pulse Standard deviation of period 25 Voice Fraction of unvoiced frames 26 Voice Number of voice breaks 27 Voice Degree of voice breaks 28 UPDRS The Unified Parkinson‟s Disease Rating Scale (UPDRS) score that is assigned to the subject by a physician via a medical examination to determine the severity and progression of Parkinson‟s disease. 29 PD indicator Value “1” indicates a subject suffering from PD. Value “0” indicates a healthy subject. Here are some explanations about the measurement categories. 5 The jitter variables measure the variation in the frequency of the sound. The shimmer variables measure the variation in the amplitude of the sound. The harmonicity variables are related to vocal quality and assess the noise in the sound. The pitch variables measure how high or low is the sound based on the
  • 5. frequency of vibration of the sound waves produced. The pulse variables measure the glottal pulse, which are variances in voice quality affected by manipulating the vocal cords when speaking. The voice variables measure the extent to which a subject has trouble maintaining vocal cord vibration when saying a sustained vowel. Dataset 2 Filename: po2_data.csv This second dataset contains a range of voice measurements and basic demographic profiles from 42 subjects with early-stage Parkinson‟s disease. There are no non-PPD subjects in this dataset. Note that these subjects originate from a different study and are different from the subjects described in the first dataset. Each record in this dataset corresponds to a set of features extracted from a voice sample of a subject. The number of voice samples recorded from a subject varies between 101 and 168 voice samples. Table 2 gives the list of variables in this dataset. The motor UPDRS score could range from 0 to 108, with 0 indicating Parkinson‟s Disease symptom-free and 108 denoting severe motor impairment. The total UPDRS represents the level of total disability of PPD and ranges from 0 to 176, where 0 denoting a healthy condition and 176 for total disabilities for untreated patients.  Costanzo, P. and Orphanou, K. (2022). https://arxiv.org/abs/2206.03716 Table 2. Description of variables in the po2_data.csv dataset Column Name Measurement category Description subject# Subject identifier This number identifies a study subject Age Demography The subject‟s age at data collection sex Demography The subject‟s gender (“0” for male, “1” for female) test_time Test detail The time since subject recruitment into the study. The integer part is the number of days since recruitment. motor_updrs UPDRS The subject‟s motor UPDRS score assigned by a physician;
  • 6. linearly interpolated. total_updrs UPDRS The subject‟s total UPDRS score assigned by a physician; linearly interpolated. jitter(%) Jitter Jitter in % jitter(abs) Jitter Absolute jitter in microseconds jitter(rap) Jitter Jitter as relative amplitude perturbation (r.a.p.) jitter(ppq5) Jitter Jitter as 5-point period perturbation quotient (p.p.q.5) jitter(ddp) Jitter Jitter as average absolute difference of differences between jitter cycles (d.d.p.) shimmer(%) Shimmer Shimmer in % shimmer(abs) Shimmer Absolute shimmer in decibels (dB) shimmer(apq3) Shimmer Shimmer as 3-point amplitude perturbation quotient (a.p.q.3) shimmer(apq5) Shimmer Shimmer as 5-point amplitude perturbation quotient (a.p.q.5) shimmer(apq11) Shimmer Shimmer as 11-point amplitude perturbation quotient (a.p.q.11) shimmer(dda) Shimmer Shimmer as average absolute differences between consecutive differences between the amplitudes of shimmer cycles (d.d.a.)
  • 7. nhr Harmonicity Noise-to-Harmonic ratio (NHR) hnr Harmonicity Harmonic-to-Noise ratio (HNR) rpde Complex Recurrence Period Density Entropy dfa Complex Detrended Fluctuation Analysis ppe Complex Pitch Period Entropy Here are some explanations about the additional measurement categories in this dataset. 6 The Recurrence Period Density Entropy variable measures the ability of the vocal folds to sustain simple vibrations. The Detrended Fluctuation Analysis describes the extent of turbulent noise in the speech signal. The Pitch Period Entropy measures the impaired control of stable pitch during sustained phonation. Project objectives Given the speech datasets above, your team is tasked with determining the extent to which the severity of Parkinson‟s disease can be non-invasively diagnosed (i.e. predicted) using voice sample data.  Tsanas, A. et al. (2009). https://www.nature.com/articles/npre.2009.3920.1 Project objective 1 The first project objective requires that your team determine a set of salient variables (features) that could be used to distinguish people with PD from those that are healthy. This objective must be addressed in Assessment 2. It evaluates your team‟s ability to apply descriptive and inferential statistical analyses, as well as data wrangling. You must only use the dataset file po1_data.txt dataset for this objective. Please see additional information on Learnline under Assessment 2. Project objective 2
  • 8. The second project objective requires that your team predict the motor and the total UPDRS scores assigned by a physician to people with Parkinson‟s Disease. This objective must be addressed in Assessment 3. It evaluates the overall data science skills that your team learned in this unit, including data exploration, data visualisation methods, and linear regression modelling. You must at least use the po2_data.csv dataset for this objective. Further consideration and experimentation with the po1_data.txt dataset to validate your predictive models is essential for scoring any higher grade. Please see additional information on Learnline under Assessment 3. Additional notes Any creative derivation of new variables from these existing variables or data transformation is permitted. This is known as feature engineering and is a common data science practice. However, you must not alter the original values of these datasets. You must maintain ethical conduct in the course of your data analyses. If in doubt, please consult the Unit Coordinator.