Multi Target Prediction &
Side Information
25 September 2020
By
Zaaba Bin Ahmad
Multi Target Prediction
Multi-target prediction (MTP) is concerned with the simultaneous
prediction of multiple target outputs of diverse type, such as binary,
nominal, ordinal, or real-valued, from a set of input features.
This contrasts with single-target prediction (STP), where a single target
output is predicted based on a set of features describing an
instance.
Intro from our Bibliometric paper
Multi Target Prediction
Ref – 1,2
Why Multi Target Prediction?
Ref – 1
Multi Target Prediction
• MTP types without Side Info
• Multivariate regression (e.g., predicting whether a protein will bind to a set of
experimentally developed small molecules).
• Multi-task learning (e.g., predicting student marks in the final exam for a
typical high-school course).
• Multi-label classification (e.g., assigning appropriate category tags to
documents).
Ref – 1,2
Multi Target Prediction
MTP types with Side Info
• Multivariate regression (e.g., predicting whether a protein will bind to a set of
experimentally developed small molecules + a representation for the target
molecules).
• Multi-task learning (e.g., predicting student marks in the final exam for a
typical high-school course + information about the schools, such as geographical
location, qualifications of the teachers, reputation of the school).
• Multi-label classification (e.g., assigning appropriate category tags to
documents + a hierarchical structure).
• Multi-label classification with label ranking (possibly suitable for DAS)
Ref – 1
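As a rough illustration of the document-tagging example, a label hierarchy can serve as side information by making a predicted tag set consistent with it. The sketch below is only one simple way to use a hierarchy; the `PARENT` mapping, the tag names and the `add_ancestors` helper are hypothetical and do not come from the cited references.

```python
# Hypothetical label hierarchy (child -> parent); tag names are illustrative only.
PARENT = {
    "deep-learning": "machine-learning",
    "machine-learning": "computer-science",
    "databases": "computer-science",
}


def add_ancestors(predicted_tags, parent=PARENT):
    """Return the predicted tags plus every ancestor implied by the hierarchy."""
    closed = set(predicted_tags)
    for tag in predicted_tags:
        node = tag
        # Walk upwards until a root (a tag without a parent) is reached.
        while node in parent:
            node = parent[node]
            closed.add(node)
    return closed


print(sorted(add_ancestors({"deep-learning"})))
```

Here the hierarchy is applied after prediction; using it inside the learner itself would be a different design choice.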
Multi Target Prediction
[Figure panels: Multivariate regression; Multi-task learning; Multi-label classification; Multi-label classification with ranking]
Ref - 1
Multi Target Prediction
Data type
Ref - https://www.bigdataframework.org/data-types-structured-vs-unstructured-data/
Multi Target Prediction
Structured vs unstructured dataset?
Ref - https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
Multi Target Prediction
What is an unstructured dataset?
• Human-generated unstructured data
• Text files: word processing files, spreadsheets, presentations, emails.
• Email: largely text, but has some internal structure thanks to its metadata (e.g.
including the visible “to”, “from”, “date / time”, “subject” entered to send an email)
but also mixes in unstructured data via the message body. For this reason, email is
also referred to as semi-structured data.
• Social Media: like email, this is often semi-structured data, containing unstructured
data (e.g. a Tweet) but also structured data (e.g. the number of “Likes”, “retweets”,
“date”, “author” etc).
• Websites: YouTube, Instagram etc. contain lots of unstructured data, but also much
structured data, as described above for Twitter.
• Mobile data: text messages, locations.
• Communications: IMs, dictaphone recordings.
• Media: MP3, digital photos, audio recordings and video files.
• Business applications: MS Office documents, PDFs and similar.
Ref - https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
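To make the semi-structured nature of email concrete, the sketch below separates the structured header fields from the unstructured message body using Python's standard `email` package. The sample message and the `split_email` helper are made up for illustration.

```python
from email import message_from_string

# Made-up sample message: headers are structured, the body is free text.
RAW_EMAIL = """\
From: alice@example.com
To: bob@example.com
Subject: Exam results
Date: Fri, 25 Sep 2020 10:00:00 +0800

Hi Bob, the marks were better than I feared. Feeling relieved!
"""


def split_email(raw):
    """Return (structured header fields, unstructured body text)."""
    msg = message_from_string(raw)
    structured = {name: msg[name] for name in ("From", "To", "Subject", "Date")}
    # For a non-multipart message, get_payload() returns the body as a string.
    unstructured = msg.get_payload().strip()
    return structured, unstructured


headers, body = split_email(RAW_EMAIL)
print(headers["Subject"])
print(body)
```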
Multi Target Prediction
What is an unstructured dataset?
• Machine-generated unstructured data
• Common types of machine-generated unstructured data include:
• Satellite imagery: weather data, geographic forms, military movements.
• Scientific data: oil and gas exploration, space exploration, seismic imagery and
atmospheric data.
• Digital surveillance: CCTV.
Ref - https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
Multi Target Prediction
How is it done for structured and unstructured datasets?
• Convert unstructured to structured data types
• If text, then
• Tagging with metadata or part-of-speech tagging
• Dimensionality reduction (e.g. stemming or lemmatisation): identify the root word
of each actual word to reduce the size of the text data.
• Disambiguation: the use of contextual clues
• Sentiment analysis involves discerning subjective (as opposed to factual)
material and extracting various forms of attitudinal information: sentiment,
opinion, mood, and emotion.
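A minimal sketch of turning free text into structured features, covering only the root-word reduction and a lexicon-based sentiment score. The suffix list, the tiny sentiment lexicon and the function names are assumptions made for illustration; a real pipeline would use a proper stemmer and tagger.

```python
import re
from collections import Counter

# Illustrative assumptions: a crude suffix list and a tiny hand-made lexicon of stems.
SUFFIXES = ("ing", "ed", "s")
POSITIVE = {"happy", "relax", "good"}
NEGATIVE = {"sad", "worri", "stress"}


def stem(word):
    """Strip one common suffix, keeping at least four characters of the word."""
    for suffix in SUFFIXES:
        if suffix == "s" and word.endswith("ss"):
            continue  # do not turn "stress" into "stres"
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def to_features(text):
    """Convert raw text to a bag of stems plus a simple sentiment score."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(stem(token) for token in tokens)
    sentiment = sum(counts[w] for w in POSITIVE) - sum(counts[w] for w in NEGATIVE)
    return {"bag_of_words": dict(counts), "sentiment": sentiment}


print(to_features("Exams are stressing me, I feel stressed and worried."))
```

The resulting dictionary is a structured record that could be fed to a conventional learner as input features.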
Side Information
• Side information is of crucial importance for generalizing to novel targets that are
unobserved during the training phase
• e.g. a novel target molecule in the drug design example
• a novel tag in the document annotation example
• a novel course in the student grading example
Ref- 2, 12
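One simple way to see why side information helps with a novel target: if every target has a feature vector, a prediction for an unseen target can be borrowed from similar known targets. The sketch below uses a similarity-weighted average for the student grading example. It is an illustrative baseline, not a method taken from the cited references, and the course features and marks are invented.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_novel_target(known_scores, known_side_info, novel_side_info):
    """Predict a score for an unseen target from the scores of similar known targets."""
    weights = {
        target: max(cosine(vec, novel_side_info), 0.0)
        for target, vec in known_side_info.items()
    }
    total = sum(weights.values())
    if total == 0.0:
        # No similar target: fall back to the plain mean of the known scores.
        return sum(known_scores.values()) / len(known_scores)
    return sum(weights[t] * known_scores[t] for t in weights) / total


# Invented side information: each course described by (numeracy, literacy) weights.
course_features = {"algebra": [1.0, 0.0], "literature": [0.0, 1.0]}
student_marks = {"algebra": 80.0, "literature": 60.0}
print(predict_novel_target(student_marks, course_features, [3.0, 4.0]))
```

Without the course features there would be no basis for preferring one known course over another when predicting marks for the new course.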
Side Information
Forms of existing side information for unstructured dataset.
• Example from Product Recommendation System
Ref- 11
Side Information
Categorisation of side information for unstructured dataset.
• To resolve data sparsity and cold-start issues, side information is widely used in recommender
systems.
• From Facebook?
• Conversation Q&A (Textual)
• Status Update (Textual)
• Videos
• Image
• Audio
Ref – Slide Presentation from NTU, Singapore
DAS – A Multi Target Prediction Problem?
• DAS: depression, anxiety and stress
• Can be assessed using the Depression, Anxiety and Stress Scale - 21 Items (DASS-21)
• DASS-21 is a set of three self-report scales designed to measure the
emotional states of depression, anxiety and stress.
Ref- 14,15
Depression, Anxiety and Stress Scale - 21 Items
(DASS-21)
Ref- 13
• Depression: dysphoria, hopelessness, devaluation of life, self-deprecation, lack of interest/involvement, anhedonia, inertia
• Anxiety: autonomic arousal, skeletal muscle effects, situational anxiety, subjective experience of anxious affect
• Stress: levels of non-chronic arousal through difficulty relaxing, nervous arousal, being easily upset/agitated, irritable/over-reactive, impatient
• Scores for depression, anxiety and stress are calculated by summing the scores for the relevant
items.
• DASS-21 is based on a dimensional rather than a categorical conception of psychological disorder.
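The summing rule can be sketched as follows; the three resulting scores are the three targets of the DAS prediction problem. The item-to-scale key below follows the commonly published DASS-21 scoring template and is an assumption here; it should be checked against the manual (Ref 13) before use. Ratings are assumed to be on the 0 to 3 scale.

```python
# Assumed item-to-scale key (commonly published DASS-21 template; verify against the manual).
SCALE_ITEMS = {
    "depression": (3, 5, 10, 13, 16, 17, 21),
    "anxiety": (2, 4, 7, 9, 15, 19, 20),
    "stress": (1, 6, 8, 11, 12, 14, 18),
}


def score_dass21(responses):
    """Sum the item ratings per scale.

    responses: dict mapping item number (1..21) to a rating in 0..3.
    """
    if set(responses) != set(range(1, 22)):
        raise ValueError("expected ratings for items 1 to 21")
    if any(rating not in (0, 1, 2, 3) for rating in responses.values()):
        raise ValueError("each rating must be 0, 1, 2 or 3")
    return {
        scale: sum(responses[item] for item in items)
        for scale, items in SCALE_ITEMS.items()
    }


example = {item: 1 for item in range(1, 22)}
example[3] = 3  # a higher rating on one depression item
print(score_dass21(example))
```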
The idea is –
• Improving an MTP classification problem (DAS) utilizing a type of side
information (for both structured/unstructured data)
• ?
References
1. Waegeman, W., Dembczyński, K., & Hüllermeier, E. (2019). Multi-target prediction: a unifying view on problems and methods. Data Mining and Knowledge Discovery, 33(2),
293–324. https://doi.org/10.1007/s10618-018-0595-5
2. Spyromitros-Xioufis, E., Tsoumakas, G., Groves, W., & Vlahavas, I. (2016). Multi-target regression via input space expansion: treating targets as inputs. Machine Learning,
104(1), 55–98. https://doi.org/10.1007/s10994-016-5546-z
3. Tu, C. H., & Li, C. (2019). Multitarget prediction—A new approach using sphere complex fuzzy sets. Engineering Applications of Artificial Intelligence.
https://doi.org/10.1016/j.engappai.2018.11.004
4. Xing, L., Lesperance, M. L., Zhang, X., & Hancock, J. (2020). Simultaneous prediction of multiple outcomes using revised stacking algorithms. Bioinformatics.
https://doi.org/10.1093/bioinformatics/btz531
5. Rahim, M., Thirion, B., Bzdok, D., Buvat, I., & Varoquaux, G. (2017). Joint prediction of multiple scores captures better individual traits from brain images. NeuroImage.
https://doi.org/10.1016/j.neuroimage.2017.06.072
6. Jonschkowski, R., Höfer, S., & Brock, O. (2015). Patterns for Learning with Side Information. Retrieved from http://arxiv.org/abs/1511.06429
7. Vashishth, S., Joshi, R., Prayaga, S. S., Bhattacharyya, C., & Talukdar, P. (2018). RESIDE: Improving distantly-supervised neural relation extraction using side information.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018. https://doi.org/10.18653/v1/d18-1157
8. Farias, V. F., & Li, A. A. (2019). Learning preferences with side information. Management Science. https://doi.org/10.1287/mnsc.2018.3092
9. Hu, C., Rai, P., & Carin, L. (2017). Deep generative models for relational data with side information. 34th International Conference on Machine Learning, ICML 2017.
10. Kang, D., Dhar, D., & Chan, A. B. (2017). Incorporating side information by adaptive convolution. Advances in Neural Information Processing Systems.
11. Pourgholamali, F., Kahani, M., Bagheri, E., & Noorian, Z. (2017). Embedding unstructured side information in product recommendation. Electronic Commerce Research and
Applications. https://doi.org/10.1016/j.elerap.2017.08.001
12. Wang, Y., Xiang, Y., Zhang, J., Zhou, W., & Xie, B. (2014). Internet traffic clustering with side information. Journal of Computer and System Sciences.
https://doi.org/10.1016/j.jcss.2014.02.008
13. Lovibond, S. H., & Lovibond, P. F. (1995). Manual for the Depression Anxiety Stress Scales. Psychology Foundation of Australia. https://doi.org/10.1016/0005-7967(94)00075-U
14. Smoller, J. W. (2016). The Genetics of Stress-Related Disorders: PTSD, Depression, and Anxiety Disorders. Neuropsychopharmacology. https://doi.org/10.1038/npp.2015.266
15. Bener, A., Saleh, N., Bakir, A., & Bhugra, D. (2016). Depression, anxiety, and stress symptoms in menopausal Arab women: Shedding more light on a complex relationship.
Annals of Medical and Health Sciences Research. https://doi.org/10.4103/amhsr.amhsr_341_15
Currently working on..
• Current
• Bibliometric paper “Research trends on MTP and Side Info”
• Side Info categorisation – need criteria for sorting the types of side information
• Proof for “In the present study, the combination of Multi-target (MT) prediction
approaches and Machine Learning algorithms has not been evaluated as an effective
strategy to improve prediction performances of social media data (structured or
unstructured)”
• Find proof of “little focus on the unstructured dataset being used as side information”
• Not forgetting
• Data collection
• Proposal for DRP
Extra Notes
Multi Target Prediction
Other name for MTP-> Multi-source domain?
• Multi-source domain adaptation with graph embedding and adaptive label prediction
• Recently, deep methods with convolutional neural network (CNN) become popular in the community.
Hoffman, Mohri, and Zhang (2018) present new normalized solutions with strong theoretical
guarantees for the cross-entropy loss, which verifies the feasibility of utilizing CNNs on multi-source
scenario. Peng et al. (2019) match first-order moment in a deep network and achieve great
performance on a very large-scale dataset. Zhao et al. (2018) and Xu, Chen, Zuo, Yan, and Lin (2018)
further introduce adversarial learning in deep multi-source domain adaptation. Mancini et al. (2018)
present a novel deep model for automatically discovering latent domains within visual datasets. To sum
up, we can see that both moment matching and geometry alignment contribute to multi-source
domain adaptation in both shallow and deep models.
https://www-sciencedirect-com.ezaccess.library.uitm.edu.my/science/article/pii/S0306457320308621
Inductive Vs Transductive (Learning Approaches)
Multi Target Prediction
What about Supervised and Unsupervised Training in MTP?
Multi Target Prediction
What about Feature Selection strategy?
Side Information
• Other terms keep coming up
• Matrix co-factorization
• Nonnegative matrix factorization
• Composite Absolute Penalty
Multi Target Prediction
Methods proposed for unstructured datasets – textual data?
• Sentiment analysis
• Transfer learning using a pre-trained language model (BERT)
BERT
Ref - http://jalammar.github.io/illustrated-bert/
Joint prediction of multiple scores captures
better individual traits from brain images
SIMULTANEOUS PREDICTION OF MULTIPLE OUTCOMES
USING REVISED STACKING ALGORITHMS
