Enhancing Psychotherapy Treatment by Analyzing Alliance Ruptures through Gaze Detection: A Statistical Head Pose Analysis Approach
1. Enhancing Psychotherapy Treatment by Analyzing Alliance Ruptures through Gaze Detection: A Statistical Head Pose Analysis Approach
By Muhammad Zbeedat
Master’s student – University of Haifa
Supervised by: Prof. Ilan Shimshoni and Prof. Hagit Hel-Or
2. Meet our team
Ruptures Detection
Muhammad Zbeedat – Master’s student and member of the Computational Human Behavior Lab.
Ilan Shimshoni – Professor in the department of Information Systems; formerly served as chair of the department.
Hagit Hel-Or – Faculty member in the department of Computer Science and head of the Computational Human Behavior Lab.
Sigal Zilcha-Mano – Licensed clinical psychologist and full professor of clinical psychology in the department of psychology; manager of the psychotherapy lab.
Tohar Dolev-Amit – Doctoral graduate in clinical psychology; researcher and lab manager in Prof. Sigal Zilcha-Mano’s psychotherapy lab.
Tal BenDavid-Sela – Doctoral candidate in clinical psychology; researcher in Prof. Sigal Zilcha-Mano’s psychotherapy lab.
3. Introduction
• In psychotherapy, an alliance rupture is a deterioration in the alliance, manifested by a lack of collaboration between patient and therapist on tasks or goals, or a strain in the emotional bond.
• Ruptures have been identified in 91%–100% of sessions. Ruptures have the potential to either undermine the treatment or enhance it.
4. Introduction – Contd.
• Ruptures may be categorized into two main subtypes:
• In Withdrawal ruptures, patients either move away from the therapist and the treatment, or move toward the therapist in a way that denies the patient’s own experience.
• In Confrontational ruptures, patients move against the therapist or the work of therapy. Confrontational ruptures may include complaints about the therapist or the treatment.
5. Research Question
How can ruptures be detected in a recorded psychotherapy session using Human Action Recognition techniques?
Identifying ruptures is a critical stage in reaching resolution; a resolution process enables the patient and therapist to renew or strengthen their emotional bond, and to begin or resume collaborating on the tasks and goals of therapy.
6. Introduction – Contd.
• The primary objective of this study is to monitor and identify such ruptures throughout recorded therapy sessions using three different cameras.
• To achieve this goal:
• Human Action Recognition techniques were employed, with a specific emphasis on Gaze Detection through statistical head pose analysis.
• Additionally, we utilized some Facial Action Units features. Facial Action Units (AUs) are a way to describe human facial expressions.
7. Rupture Resolution Process
Detection – detect Withdrawal or Confrontational ruptures.
Analysis – reevaluate the therapy session, focusing on the rupture sections (segments).
Resolution – a resolution process enables the patient and therapist to renew or strengthen their emotional bond.
Strategy – activate resolution strategies, such as changing the task or disclosing the therapist’s internal experience of the rupture.
Success – the goal is a successful treatment at the end, minimizing failures.
8. The Rupture Resolution Rating System (3RS)
The Rupture Resolution Rating System (3RS) is the gold-standard observer manual for detecting ruptures. It was applied to our recorded sessions.
• An observational system for coding rupture markers and resolution.
• The coders received six months of training with an experienced coder.
• Each session was coded by a pair of coders, drawn from a pool of 8 undergraduate students in psychology.
• To examine rupture occurrence, ruptures were coded in 5-minute segments.
• Identified ruptures were coded as a Confrontation (CF) or Withdrawal (WD).
• The coded 5-minute segments were then aggregated into one overall rupture score per patient per session.
9. 3RS - Rupture markers
Withdrawal rupture markers:
Denial – denial
MinResponse – minimal response
AbstrComm – abstract communication
AvStoryShiftT – avoidant storytelling or shifting topic
Deferential – deferential or appeasing
ContAffectSplit – content-affect split
Selfcrithopeless – self-criticism or hopelessness
Confrontational rupture markers:
ComplTherapist – complaints about the therapist
Rejectform – rejecting the formulation
ComplActivity – complaints about activities
ComplParameter – complaints about parameters
ComplProgress – complaints about progress
Ptdefendsself – patient defends self
Controlpressure – control or pressure
Coders analyzed each segment, looked for these markers, and gave each marker a value between 0 and 3 (0 – no sign, 1 – low intensity, 2 – high intensity, 3 – very high intensity).
10. Ground Truth Manual Coding Labels
• WD/CF – the overall mean of the withdrawal/confrontational rupture markers.
• WD2/CF2 – the count of withdrawal/confrontational high-intensity rupture markers (2 and above).
• WD_binaryhigh/CF_binaryhigh – coded as:
• 0 – no rupture or low intensity (WD2/CF2 = 0; all markers scored 0 or 1)
• 1 – high or very high rupture intensity (WD2/CF2 > 0; at least one marker scored 2 or 3)
• WD_binarylow/CF_binarylow – coded as:
• 0 – no rupture (WD/CF ~ 0)
• 1 – low, high, or very high rupture intensity (WD/CF > 0; at least one marker scored 1 or above)
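The label derivation above can be sketched in a few lines. The function and variable names below are illustrative (not from the study code); the input is the list of 0–3 intensity scores a coder assigned to the markers of one 5-minute segment.

```python
# Sketch of the ground-truth label derivation for one segment.
# `markers` holds the 0-3 intensity scores for the withdrawal (or
# confrontational) markers of that segment.

def derive_labels(markers):
    wd = sum(markers) / len(markers)          # WD/CF: overall mean intensity
    wd2 = sum(1 for m in markers if m >= 2)   # WD2/CF2: count of high-intensity markers
    binary_high = 1 if wd2 > 0 else 0         # at least one marker scored 2 or 3
    binary_low = 1 if wd > 0 else 0           # at least one marker scored 1 or above
    return wd, wd2, binary_high, binary_low

# Example: seven withdrawal markers coded for one segment
print(derive_labels([0, 1, 0, 2, 0, 0, 3]))
```

The same derivation applies symmetrically to the confrontational markers.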
11. Challenges
Sensitive data – the analysis of recorded sessions was conducted within the confines of the psychotherapy labs of the University of Haifa.
Unbalanced data – most of the segments didn’t include ruptures. Handled by creating the new binarylow labels and by using SMOTE for oversampling.
Experiment setup – how accurate are the features for Action Units, Gaze, and Head Pose, given the camera locations in the clinics?
Lack of dominant features – we extracted features only from images/frames; other features, such as voice from the video or the text, could help reach better performance.
Noisy data – extracted features are noisy by nature. Handled by averaging features across smaller units in each segment, so that they are not flattened.
12. Method - Experiment setup
Three cameras were utilized to record sessions: one focused on the therapist, another on the patient, and a third captured both individuals.
Cameras were positioned at a distance from the faces of the patient and therapist, and they were not directly facing them. This setup posed some challenges.
Note: the therapist and patient in the scene are psychotherapy lab actors.
13. Method - Participants
96 patients between the ages of 18–60, with major depressive disorder, from the pilot and main trial phases of a Randomized Controlled Trial (RCT), participated in this study.
The whole therapy series for each patient (about 16 sessions) was videotaped, but only three sessions (2, 4, and 8) were coded manually for ruptures using the 3RS.
14. Method - Features extraction
For the extraction of features from the recorded therapy sessions, the study employed a Computer Vision open-source tool named OpenFace (but any other tool can be used). This tool offers an array of capabilities:
• Head position assessment
• Facial Action Units detection
• Eye tracking and facial landmark detection, among others.
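As a minimal sketch of consuming OpenFace output: the FeatureExtraction tool writes one CSV row per frame, with head rotation in radians (`pose_Rx`, `pose_Ry`) and AU intensities (`AUxx_r`) among its documented columns. The tiny inline CSV below is fabricated example data, not output from our sessions.

```python
import csv
import io
import math

# Read an OpenFace-style per-frame CSV and convert head rotations to
# yaw/pitch in degrees. Some OpenFace versions pad headers with spaces,
# hence the strip() on column names.
csv_text = """frame, pose_Rx, pose_Ry, AU06_r
1, 0.10, -0.35, 1.2
2, 0.12, -0.30, 0.8
"""

rows = []
for row in csv.DictReader(io.StringIO(csv_text)):
    row = {k.strip(): float(v) for k, v in row.items()}
    row["yaw_deg"] = math.degrees(row["pose_Ry"])    # rotation about Y: left/right
    row["pitch_deg"] = math.degrees(row["pose_Rx"])  # rotation about X: up/down
    rows.append(row)

print([round(r["yaw_deg"], 1) for r in rows])
```

Per-frame yaw/pitch values like these feed the statistical head-pose analysis described in the following slides.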
15. Method - Calibrated Mode
3D coordinates of the eyes and face were used for computing the gaze vector originating from the patient’s eyes toward the therapist’s face. The objective was to identify a direct gaze if the vector alignment was sufficiently close. A lack of such alignment could indicate a potential rupture.
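The alignment check above can be sketched as an angle test between the measured gaze vector and the eye-to-face direction. The 3D points, the 15-degree threshold, and the helper names are illustrative assumptions, not the study’s actual parameters.

```python
import math

# A gaze counts as "direct" when the patient's gaze vector points
# sufficiently close to the direction from the patient's eyes to the
# therapist's face (angle below a chosen threshold).

def angle_between(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (nu * nv)))

def is_direct_gaze(eye_pos, gaze_vec, therapist_face, max_angle_deg=15.0):
    to_face = [t - e for t, e in zip(therapist_face, eye_pos)]
    return angle_between(gaze_vec, to_face) <= max_angle_deg

# Patient at the origin looking along +Z; therapist slightly off-axis
print(is_direct_gaze([0, 0, 0], [0, 0, 1], [0.1, 0.0, 1.0]))  # nearly aligned
print(is_direct_gaze([0, 0, 0], [0, 0, 1], [1.0, 0.0, 0.5]))  # looking away
```

A sustained lack of such alignment over a stretch of frames is what would flag a potential rupture.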
Uncalibrated Mode
Statistical analysis of head pose was adopted to determine direct gaze, replacing the previous geometric calculations.
16. Method
Facial Action Units features
Facial Action Units (AUs) are a way to describe human facial expressions. OpenFace can detect the intensity (on a scale from 0 to 5) of 17 AUs.
Head pose features
By detecting head pose we can detect whether the patient is looking straight at the therapist or not. In our approach, Yaw and Pitch values were used for statistical analysis of the head rotation.
Facial Action Coding System (FACS) - Guide
17. Method - How to set the Direct Gaze range based on Head Pose?
• Yaw/Pitch values were used to represent the head pose of the patient and to set a range of straight gaze towards the therapist, based on the Yaw/Pitch range that corresponds to the highest probabilities. (KDE – Kernel Density Estimation – was used to smooth the probability density estimation.)
• A max-probability cut threshold was used in determining the ranges of Yaw and Pitch. The chosen cut threshold is 20%.
• This range, however, exhibited variations across patients and sessions due to the diverse profiles of individuals.
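The range-setting step can be sketched as follows: smooth one session’s yaw (or pitch) values with a KDE, then keep the interval where the density is at least 20% of its peak. The synthetic yaw sample below stands in for a session’s per-frame values; it is not real data.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Estimate the direct-gaze yaw range for one session using KDE and a
# 20% max-probability cut threshold, as described above.
rng = np.random.default_rng(0)
yaw = rng.normal(loc=-20.0, scale=8.0, size=2000)  # fabricated session yaw values

kde = gaussian_kde(yaw)
grid = np.linspace(yaw.min(), yaw.max(), 500)
density = kde(grid)

cut = 0.20 * density.max()            # 20% of the peak density
in_range = grid[density >= cut]
yaw_lo, yaw_hi = in_range.min(), in_range.max()
print(f"direct-gaze yaw range: [{yaw_lo:.1f}, {yaw_hi:.1f}] degrees")
```

Frames whose yaw and pitch both fall inside their respective ranges are then counted as straight gaze; the range is recomputed per session, matching the observed per-patient variation.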
18. Method - Novel approach
• Sub-division of segments into smaller units (subsegments), spanning 30–60 seconds.
• This approach facilitated the identification of precise rupture instances within segments, while also preventing the feature flattening that could occur when calculating and averaging derived feature values over an entire segment.
19. Method - Subsegment size 60 sec
For example, when analyzing 60-second subsegments we identified a problematic one: that subsegment was marked with a low direct gaze. The red points are the frames inside that session’s yaw/pitch ranges within that subsegment.
20. Method - Subsegment size 30 sec
But when working with 30-second subsegments and breaking it into two, we identified that the first one was abnormal (only 3%), but the second had a direct gaze (81%).
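The per-subsegment percentage used above can be sketched directly: given per-frame yaw/pitch values and the session’s direct-gaze ranges, count the share of frames falling inside both ranges. The five-frame subsegment below is fabricated; the ranges reuse the example session ranges mentioned later in the notes (yaw -39 to -1.5, pitch 3 to 32).

```python
# Fraction of a subsegment's frames whose yaw AND pitch fall inside the
# session's direct-gaze ranges.

def straight_gaze_pct(yaw_frames, pitch_frames, yaw_range, pitch_range):
    inside = sum(
        1
        for y, p in zip(yaw_frames, pitch_frames)
        if yaw_range[0] <= y <= yaw_range[1] and pitch_range[0] <= p <= pitch_range[1]
    )
    return 100.0 * inside / len(yaw_frames)

yaw = [-20, -18, -50, -22, -60]   # fabricated 5-frame subsegment
pitch = [10, 12, 40, 11, -5]
print(straight_gaze_pct(yaw, pitch, yaw_range=(-39, -1.5), pitch_range=(3, 32)))
```

Subsegments whose percentage falls below a threshold (e.g., 60%) are the candidates flagged for rupture inspection.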
21. Method - Algorithm
(Diagram: the session is split into segments S1–S10, and each segment into units U1–U5.)
1. Over the course of the whole session, calculate the mean/std/sem of the Yaw/Pitch values.
2. Split each segment into smaller subsegments of 30 or 60 seconds and for each subsegment calculate:
• Mean/Std/Sem of the AU and Yaw/Pitch values.
• Distances between the yaw/pitch means of that subsegment and the whole session.
• Ratios between the yaw/pitch std/sem of that subsegment and the entire session.
• Distribution similarity (Z-test) between that subsegment and the entire session.
• Straight-gaze percentage of the patient looking towards the therapist.
3. For each segment, go through all its subsegments and compute the following features:
• Action Units
- maximum & minimum of the subsegment mean values inside the segment.
- mean of the STDs/SEMs of all subsegments inside the segment.
• Yaw/Pitch
- max & min of the Yaw/Pitch mean distances of all subsegments.
- mean of the Yaw/Pitch std/sem ratios of all subsegments.
- max & min of the Yaw/Pitch Z-tests of all subsegments.
• Straight-gaze percentage
- number of straight-gaze subsegments above a certain percentage (60%, 70%, 80%).
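The step-3 aggregation can be sketched for one Action Unit: reduce each subsegment to its mean and std, then take the max/min of the subsegment means and the mean of the subsegment stds as segment-level features. Data and names are illustrative.

```python
import statistics

# Aggregate one AU's per-frame intensities from subsegments to one segment.
# Taking the max/min of subsegment means preserves a short, intense burst
# that a whole-segment average would flatten.

def segment_au_features(subsegments):
    means = [statistics.mean(s) for s in subsegments]
    stds = [statistics.pstdev(s) for s in subsegments]
    return {
        "au_mean_max": max(means),
        "au_mean_min": min(means),
        "au_std_mean": statistics.mean(stds),
    }

# Three 30-second subsegments of AU06 intensities (fabricated)
subs = [[0.1, 0.1, 0.2], [2.4, 2.6, 2.0], [0.3, 0.2, 0.1]]
print(segment_au_features(subs))
```

Here the middle subsegment’s burst survives as `au_mean_max`, whereas averaging all nine frames together would dilute it.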
22. Method - Features Analysis
In segment 3, a substantial decrease in the number of units with straight gaze was observed for all three threshold levels. Notably, this same segment was coded with a confrontational rupture of 57% intensity and a 30% withdrawal rupture intensity.
In segment 3, there was a significant increase in the Yaw Z-test measures (min/max), which indicates a large difference between this segment’s behavior and the entire session, related to the patient moving his head left/right.
In segment 3, there was also a significant increase in the Yaw means-distance measures (min/max), which likewise indicates a large difference between this segment’s behavior and the entire session.
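One common form of the Z-test mentioned above compares a subsegment’s mean yaw to the whole session’s: z = (mean_sub − mean_sess) / (sess_std / sqrt(n)). Whether this matches the study’s exact formula is an assumption; the data below is fabricated.

```python
import math
import statistics

# Z-score of a subsegment's mean yaw against the whole session's
# distribution; large |z| flags a subsegment whose head pose departs
# strongly from the session baseline (e.g., looking aside).

def yaw_z_score(sub_yaw, session_yaw):
    mean_sub = statistics.mean(sub_yaw)
    mean_sess = statistics.mean(session_yaw)
    sess_std = statistics.pstdev(session_yaw)
    return (mean_sub - mean_sess) / (sess_std / math.sqrt(len(sub_yaw)))

session = [-20 + (i % 7 - 3) for i in range(700)]  # yaw hovering around -20
looking_aside = [-55, -52, -58, -54, -56]          # subsegment turned away
print(round(yaw_z_score(looking_aside, session), 1))
```

A strongly negative z here mirrors the segment-3 behavior: the patient’s head turned far from the session’s typical yaw.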
23. Machine Learning model
• Classification machine learning models were trained and tested using the WD_binaryhigh/CF_binaryhigh and WD_binarylow/CF_binarylow ground truth labels.
• To ensure the reliability of our results, we took great care to ensure that sessions involving the same patient were exclusively included in either the training or the testing phase.
• Grid Search with Cross Validation was implemented to determine the optimal hyperparameters for each RandomForestClassifier associated with every ground truth label.
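The setup above can be sketched with scikit-learn: GroupKFold keyed on patient id keeps every patient’s sessions inside a single fold, and GridSearchCV tunes the forest. SMOTE oversampling (from imbalanced-learn) would be fit on the training folds only; it is omitted here to keep the sketch dependency-free. The data and the parameter grid are synthetic stand-ins.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, GroupKFold

# Synthetic per-segment features and a stand-in binary rupture label
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 6))
y = (X[:, 0] + 0.5 * rng.normal(size=120) > 0).astype(int)
patients = np.repeat(np.arange(30), 4)  # 30 patients, 4 segments each

# GroupKFold guarantees no patient appears in both train and test folds
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=GroupKFold(n_splits=5),
    scoring="accuracy",
)
search.fit(X, y, groups=patients)
print(search.best_params_, round(search.best_score_, 2))
```

With SMOTE added, the oversampler belongs inside an imblearn `Pipeline` so that synthetic minority samples are generated only from each training fold, never from held-out patients.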
24. Withdrawal ML model - Features Importance
AU06 AU07 AU12
25. Confrontation ML model - Features Importance
AU01 AU04 AU14
Facial Action Coding System (FACS) - Guide
26. Results - WD_binaryhigh
The RandomForestClassifier for WD_binaryhigh with SMOTE got a lower Test score, but it is more balanced (Test True Positive/Negative).
28. Results - CF_binaryhigh
Since we have a very low count of confrontation ruptures in our dataset, the results without oversampling are misleading!
29. Results - CF_binarylow
The Test score without oversampling is higher, but the True Positive rate is very low. When using SMOTE the Test score was reduced a bit, but the True Positive/Negative rates are more balanced.
30. Results - Summary
• For Withdrawal ruptures, we recommend using the ML model of the WD_binaryhigh ground truth label with SMOTE. ML accuracy reached 65% with balanced true positive/negative predictions.
• For Confrontational ruptures, we can’t use the CF_binaryhigh ground truth label: the data is extremely unbalanced for this label, Confrontational ruptures were rare, and the ML model always classified segments as non-rupture. Instead, we recommend using the ML model for the CF_binarylow ground truth label with SMOTE. The accuracy of this model reaches approximately 60%, and the true positive/negative predictions are balanced.
31. Future Work
This study primarily revolves around images (frames) extracted from recorded therapy sessions, but other features can be explored:
• Voice analysis. The main challenge is distinguishing whether the speaker was the therapist or the patient.
• Speech-to-Text tools. The main challenge is the concern over data privacy for tools that convert speech to text (especially the online ones).
• Voice Emotion Detection techniques. Emotion Detection (ED) is a subset of sentiment analysis, focused on extracting and analyzing emotions from text.
32. Summary and Conclusions
In the psychotherapy domain, the machine learning model’s accuracy was deemed acceptable. By integrating additional features like voice analysis and text mining of speech-generated text, the accuracy could be further enhanced.
Ruptures may be categorized into two main subtypes: withdrawal and confrontational ruptures:
In Withdrawal ruptures, patients either move away from the therapist and the treatment in a submissive manner or move toward the therapist in a way that denies the patient’s own experience.
In Confrontational ruptures, patients move against the therapist or the work of therapy. Confrontational ruptures may include complaints about the therapist or the treatment.
The primary objective of this study is to monitor such ruptures throughout recorded therapy sessions using three different cameras: one focused on the patient, another on the therapist, and a third capturing both from the side perspective. Our analysis primarily utilizes the camera focused on the patient.
Following a comprehensive review of the treatment session, the system aims to identify these ruptures. To achieve this goal:
Human Action Recognition techniques were employed, with a specific emphasis on Gaze Detection through statistical head pose analysis. This approach determines whether the patient maintains eye contact with the therapist, which might point to a rupture.
Additionally, we utilized some Facial Action Units features. Facial Action Units (AUs) are a way to describe human facial expressions.
96 patients between the ages of 18–60, with major depressive disorder, participated in this study. All therapy sessions were videotaped. Sessions 2, 4 and 8 were coded manually by coders using the 3RS.
Identified ruptures were coded as a Confrontation (CF) or Withdrawal (WD), and the clarity of the rupture was rated as:
- check minus (a weak or somewhat unclear example of the marker)
- a check (a solid example of the marker)
- a check plus (a very clear, "textbook" example of the marker).
WD/CF –
These labels have continuous quantitative values (not binary), so they can’t be used directly for rupture-existence classification. Nevertheless, they can be employed in regression-based machine learning models for predicting rupture intensity.
WD2/CF2 –
The count of markers with an intensity value of 2 or higher. As a result, markers with low intensity (values of 0 or 1) are eliminated, and only the high-intensity markers are considered in the count.
WD_binary/CF_binary:
These are binary labels indicating high-intensity ruptures.
If the WD2/CF2 label value is 0, WD_binary/CF_binary will be 0.
If at least one marker is identified with a high intensity of 2 or above, resulting in a WD2/CF2 value of 1 or more, then WD_binary/CF_binary is set to 1.
WD_binary1/CF_binary1:
These binary labels indicate low-intensity ruptures. They were actually created later on by us (not by the psychotherapy coders), in order to have balanced segments in the dataset by lowering the intensity threshold: even a minor rupture is considered a rupture.
Note: It’s referring to WD/CF labels and not WD2/CF2
Sensitive data:
- Excluding features that can be used to reconstruct and predict patient identity.
- Owing to data sensitivity, precautions were taken to ensure privacy. The analysis of these sessions was conducted within the confines of the psychotherapy labs of the University of Haifa, on a computer isolated from internet connectivity.
Three cameras were utilized to record sessions: one focused on the therapist, another on the patient, and a third captured both individuals.
Cameras were positioned at a distance from the faces of the patient and therapist, and they were not directly facing them. This setup posed challenges for some techniques (those reliant on eye tracking, for example).
Detection of facial action units faced difficulties as well, for the same reasons and due to other factors, such as individuals wearing glasses or patients obscuring their faces, especially during emotional reactions.
For the extraction of features from the recorded therapy sessions, the study employed a Computer Vision open-source tool named OpenFace. This tool offers an array of capabilities:
Head position assessment
Facial Action Units detection
Eye tracking and facial landmark detection, among others. The eye tracking and landmark features were excluded due to:
Inaccurate values corresponding to the lab setup and camera positions
Data sensitivity – to allow working on the extracted features outside the lab premises and to ensure the impossibility of reconstructing participants’ facial attributes or discerning any aspect of their identity.
Calibrated mode:
The calibration process aimed to establish correspondences between objects captured by cameras 1 & 2 (centered on patient/therapist) and their positions in the third camera’s view. This facilitated the translation of pixels from cameras 1 or 2 into real-world 3D coordinates.
3D coordinates of the eyes and face were used for computing the gaze vector originating from the patient’s eyes toward the therapist’s face. The objective was to identify a direct gaze if the vector alignment was sufficiently close. A lack of such alignment could indicate a potential rupture.
Uncalibrated mode:
Recognizing the limitations of the calibrated mode (due to camera limitations, such as the considerable distance between the participant’s faces and the angle at which their faces were captured), the study transitioned to an uncalibrated approach.
Statistical analysis of head pose was adopted to determine direct gaze, replacing the previous geometric calculations.
Facial Action Units features:
Facial Action Units (AUs) are a way to describe human facial expressions.
OpenFace can detect the intensity (on a scale from 0 to 5) of 17 AUs.
Facial Action Coding System (FACS) - Guide
AU01_r, AU02_r, AU04_r, AU05_r, AU06_r, AU07_r, AU09_r, AU10_r, AU12_r, AU14_r, AU15_r, AU17_r, AU20_r, AU23_r, AU25_r, AU26_r, AU45_r
Head pose features:
By detecting head pose (Up/Down/Left/Right), we can detect whether the patient is looking straight at the therapist or not.
Lack of straight eye contact with the therapist can point to a rupture (most likely a withdrawal one).
*** Considering the center of rotation at the mid-point of the patient’s head, Yaw is defined as the angle of rotation (in degrees) about the Y-axis from the standard frontal view (indicating right and left head movement).
Similarly, Pitch is defined as the angle of rotation (in degrees) about the X-axis (indicating up and down head movement). ***
A reduced variant of the OpenFace features excluded features linked to participant facial coordinates. Omitted features:
Gaze coordinates (x, y, z) denoting the direction vectors of both eyes in world coordinates
Eye landmarks, specifying the position of 2D landmarks within the eye region in pixels
Facial 2D & 3D landmarks
This measure ensured the impossibility of reconstructing participants’ facial attributes or discerning any aspect of their identity.
How to set the Direct Gaze range based on Head Pose?
In statistics, Kernel Density Estimation (KDE) is the application of kernel smoothing for probability density estimation. Using KDE we can obtain the probability density of the Yaw/Pitch values, and set a range of straight gaze based on the Yaw/Pitch range that corresponds to the highest probabilities.
This range, however, exhibited variations across patients and sessions due to the diverse profiles of individuals.
Max probability cut threshold was used in determining the ranges of Yaw and Pitch. The cut threshold that was chosen is 20%.
(a) KDE plot of Yaw values over the whole session. The Yaw direct-gaze range is between -39 and -1.5.
(b) KDE plot of Pitch values over the whole session. The Pitch direct-gaze range is between 3 and 32.
(c) 2D KDE plot of Yaw/Pitch. We can notice that in this session there are 2 density blocks; the small one was caused by noisy data at the beginning, while the patient had entered the room but hadn’t settled down yet.
- KDE works for any distribution, not only a normal one.
- For a given yaw value (e.g., 0.27), count all frames with approximately that value and divide by the total count of frames to obtain the probability density.
Novel approach:
Considering that ruptures do not necessarily transpire across an entire segment, the approach adopted involved sub-dividing segments into smaller units, spanning 30–60 seconds, in order to determine whether there was direct eye contact between patient and therapist.
This decision was guided by the understanding that ruptures are not typically sustained over extended periods and are unlikely to span the entirety of 5-minute segments. This approach facilitated the identification of precise rupture instances within segments, while also preventing feature flattening that could occur when calculating and averaging OpenFace-derived feature values over an entire segment.
Features were averaged over each unit, leading to the creation of new features based on maximum, minimum, and standard deviation of these average values across all units within a single segment. This approach served to mitigate the influence of noisy data and enhance precision when pinpointing the precise moment of a rupture occurrence.
For example, when analyzing 60-second units we identified a problematic unit: this unit was marked with a low direct gaze, and the red points in the figure are the frames inside the session’s yaw/pitch ranges
(some parameters are shown for that unit, like the mean, std, and Z-test, which indicates the similarity between this unit’s histogram and the whole session’s).
But when working with 30-second units and breaking it into two, we identified that the first one was normal (70% direct gaze out of all frames in that unit), but in the second it was very low (37%). When referring to the video of that session and watching that specific part, we identified a rupture: the patient looked down most of the time, seemed anxious, and held his forehead with his hand.
2. Averaging the values of the features that we got from OpenFace over a complete segment would be misleading: a rupture can happen in only one small unit of the segment, and when averaging the value of one feature over the whole segment, that feature will be flattened and no longer strong enough to identify a rupture. So, we split each segment into smaller units of 30 or 60 seconds and for each unit calculate:
1. This graph illustrates the count of units identified as exhibiting direct gaze between the patient and therapist across different segments of the session. Three distinct thresholds were considered: units with over 80%, 70%, and 60% of their frames within the prescribed straight yaw/pitch ranges.
Notably, in segment 3, a substantial decrease in the number of units with straight gaze was observed for all three threshold levels. This same segment was coded with a confrontational rupture of 57% intensity and a 30% withdrawal rupture intensity.
2. In segment 3, there was a significant increase in the Yaw Z-test measures (min/max), which indicates a large difference between this segment’s behavior and the entire session, related to the patient moving his head left/right.
On reviewing the recorded session and seeking clarification from the coders about segment 3, it was discovered that the patient, during an emotionally charged moment discussing his spouse, consistently looked aside.
3. The maximum and minimum yaw mean distances for all units within a segment, relative to the yaw mean of the entire session.
In segment 3, there was a significant increase in the Yaw means-distance measures (min/max), which indicates a large difference between this segment’s behavior and the entire session, related to the patient moving his head left/right.
1. Model scores comparison with different unit sizes and a max-probability cut threshold of 20%.
Note: units of size 30 seconds have higher True Positive predictions than units of size 60 seconds or 5 minutes.
2. Model scores comparison with different max-probability cut thresholds and a unit size of 30 seconds. Between 15% and 20% there is no big difference, but compared to 25% there is a slight improvement. The middle threshold (20%) was chosen.
- Explain about SMOTE while explaining about class_weight
Classification machine learning models were trained and tested on a dataset of engineered features derived from the OpenFace-extracted features, using the WD_binary/CF_binary and WD_binary1/CF_binary1 ground truth labels.
To ensure the reliability of our results, we took great care to ensure that sessions involving the same patient were exclusively included either in the training or testing phase. Specifically, we avoided a scenario where one session from a particular patient was used for training, while another session from the same patient was used for testing.
Grid Search with Cross Validation was implemented to determine the optimal hyperparameters for each RandomForestClassifier associated with every ground truth label.
AU23 was identified as dominant in the confrontation classifier exclusively. AU23 is associated with the Anger emotion, which includes AUs 4, 5, 7, and 23, as outlined in the (FACS) Visual Guidebook. The Anger emotion has a significant correlation with confrontation ruptures, reinforcing our assertion.
The blue line indicates the classifier’s confidence for True Positive predictions; the probabilities in this case should be higher than 50% (as in the training scores).
The orange line indicates the True Negative cases (segments that are not ruptures); the probabilities in this case should be less than 50%.
Our recommendation is to use the WD_binary label with SMOTE for Withdrawal rupture classification.
Oversampling in this case was redundant, since the data was balanced in the first place. This label indicates low-intensity ruptures, hence weak ruptures are counted too, and the rupture classes are balanced.
Our recommendation is to use the WD_binary label for Withdrawal rupture classification, and not the WD_binary1 label.
1. Since we have a very low count of confrontation ruptures in our dataset, the results without oversampling are misleading!
They show high scores, but actually the classifier detects everything as non-rupture, so the results with oversampling are more reliable.
Our recommendation is to use the CF_binary label with SMOTE for Confrontation rupture classification.
2. The prediction-confidence graphs are not ideal: the accuracy scores are high, but the True Positive/True Negative predictions are not balanced!
Our recommendation is to use the CF_binary label with SMOTE for Confrontation rupture classification.
Our recommendation is to use the CF_binary1 label with SMOTE for Confrontation rupture classification, over CF_binary with SMOTE (since the True Positive predictions are more balanced).
This study primarily revolves around images (frames) extracted from recorded therapy sessions. But other features can be explored:
Voice analysis. The central challenge was distinguishing whether the speaker was the therapist or the patient. Changes in the patient's voice tone were identified as potential indicators of rupture, with elevated tones possibly pointing to confrontation ruptures, while unusually low tones could suggest withdrawal ruptures.
Additionally, Voice Emotion Detection techniques hold promise. Emotion Detection (ED) stands as a subset of sentiment analysis, focused on extracting and analyzing emotions from text. Text mining and analysis can be harnessed for ED.
Furthermore, Speech-to-Text tools can make a significant contribution, providing context regarding the interactions between patients and therapists during treatment sessions. However, challenges persist in this realm. Many tools are tailored for the English language and might not be adaptable to other languages (such as Hebrew). Even if adaptability exists, the concern of data privacy looms over the utilization of online tools.
- Cry detection
- Hand movements…