2. Multi Target Prediction
Multi-target prediction (MTP) is concerned with the simultaneous
prediction of multiple target outputs of diverse type, such as binary,
nominal, ordinal, or real-valued, from a set of input features.
This contrasts with single-target prediction (STP), where a single target
output is predicted based on a set of features describing an
instance.
Intro from our Bibliometric paper
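To make the STP/MTP contrast concrete, here is a minimal sketch in Python. The toy data, the target names and the `fit_baseline` helper are all hypothetical; the baseline ignores the input features and only shows that an MTP output is a tuple of targets of mixed type, whereas an STP output is a single value.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: every instance has THREE targets of different types
# (binary, nominal, real-valued). Input features are omitted because this
# constant baseline does not use them.
Y = [
    (1, "mild", 3.5),
    (0, "none", 1.0),
    (1, "mild", 4.0),
    (1, "severe", 6.5),
]

def fit_baseline(targets):
    """Fit one constant predictor per target column.

    Real-valued columns (all floats) get the mean; every other column gets
    its most frequent value. This is a reference point, not an MTP method.
    """
    model = []
    for column in zip(*targets):
        if all(isinstance(v, float) for v in column):
            model.append(mean(column))
        else:
            model.append(Counter(column).most_common(1)[0][0])
    return tuple(model)

# STP: only the first target column is predicted.
stp_prediction = fit_baseline([(y[0],) for y in Y])
# MTP: all target columns are predicted together as one output tuple.
mtp_prediction = fit_baseline(Y)
```

A real MTP method would additionally exploit the input features and the dependencies between the targets; the sketch only fixes the shape of the problem.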
5. Multi Target Prediction
• MTP types without Side Info
• Multivariate regression (e.g., predicting whether a protein will bind to a set of
experimentally developed small molecules).
• Multi-task learning (e.g., predicting student marks in the final exam for a
typical high-school course).
• Multi-label classification (e.g., assigning appropriate category tags to
documents).
Ref – 1,2
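As an illustration of the multi-label case (category tags for documents), the sketch below turns tag sets into a 0/1 indicator matrix, which is a common way to represent multi-label targets. The documents, tags and the `to_indicator_matrix` helper are hypothetical.

```python
# Hypothetical documents and their category tags.
doc_tags = {
    "doc1": {"sports", "health"},
    "doc2": {"politics"},
    "doc3": {"health", "politics", "science"},
}

def to_indicator_matrix(tag_sets):
    """Return (sorted label list, {document: 0/1 row over those labels})."""
    labels = sorted(set().union(*tag_sets.values()))
    rows = {
        doc: [1 if label in tags else 0 for label in labels]
        for doc, tags in tag_sets.items()
    }
    return labels, rows

labels, indicator = to_indicator_matrix(doc_tags)
```

Each column of the matrix is one binary target, so a document with several tags simply has several ones in its row.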
6. Multi Target Prediction
MTP types with Side Info
• Multivariate regression (e.g., predicting whether a protein will bind to a set of
experimentally developed small molecules + a representation for the target
molecules).
• Multi-task learning (e.g., predicting student marks in the final exam for a
typical high-school course + such as geographical location, qualifications of
the teachers, reputation of the school).
• Multi-label classification (e.g., assigning appropriate category tags to
documents + a hierarchical structure).
• Multi-label classification with label ranking (possibly suitable for DAS).
Ref – 1,
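One simple way a hierarchical structure can act as side information for document tagging is to propagate each tag to its ancestors. The sketch below assumes a hypothetical child-to-parent mapping (`PARENT`) and a hypothetical helper `with_ancestors`; it is one possible use of a hierarchy, not the method of the cited references.

```python
# Hypothetical tag hierarchy: child -> parent (None marks a root).
PARENT = {
    "football": "sports",
    "tennis": "sports",
    "sports": None,
    "oncology": "health",
    "health": None,
}

def with_ancestors(tags, parent=PARENT):
    """Expand a tag set so that every ancestor of a tag is also included."""
    expanded = set()
    for tag in tags:
        # Walk up the hierarchy until a root or an already-seen tag is reached.
        while tag is not None and tag not in expanded:
            expanded.add(tag)
            tag = parent.get(tag)
    return expanded

expanded_tags = with_ancestors({"football", "oncology"})
```

After expansion, a document tagged "football" also counts as a positive example for "sports", so sparse leaf tags share training signal with their parents.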
8. Multi Target Prediction
Data type
Ref - https://www.bigdataframework.org/data-types-structured-vs-unstructured-data/
9. Multi Target Prediction
Structured vs. unstructured datasets
Ref - https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
10. Multi Target Prediction
What is an unstructured dataset?
• Human-generated unstructured data
• Text files: word processing files, spreadsheets, presentations, emails.
• Email: largely text, but has some internal structure thanks to its metadata (e.g.
the visible “to”, “from”, “date / time” and “subject” fields filled in when sending an email),
mixed with unstructured data in the message body. For this reason, email is
also referred to as semi-structured data.
• Social media: like email, this is often semi-structured data, containing unstructured
data (e.g. a Tweet) but also structured data (e.g. the number of “Likes”, “retweets”,
“date”, “author” etc.).
• Websites: YouTube, Instagram etc. contain lots of unstructured data, but also much
structured data, as described above for Twitter.
• Mobile data: text messages, locations.
• Communications: IMs, dictaphone recordings.
• Media: MP3, digital photos, audio recordings and video files.
• Business applications: MS Office documents, PDFs and similar.
Ref - https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
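The semi-structured nature of email can be seen by parsing a message with Python's standard `email` package. The message text below is a made-up example; the sketch separates the header fields (structured) from the body (unstructured).

```python
from email import message_from_string

# Hypothetical raw message: header lines, a blank line, then free text.
raw_email = """\
From: alice@example.com
To: bob@example.com
Subject: Exam results
Date: Mon, 01 Mar 2021 09:30:00 +0000

Hi Bob, the marks for the final exam are in. Overall I am happy with them.
"""

msg = message_from_string(raw_email)

# Structured part: named header fields with a predictable format.
structured = {name: msg[name] for name in ("From", "To", "Subject", "Date")}

# Unstructured part: free text with no predefined schema.
unstructured_body = msg.get_payload().strip()
```

The header dictionary can go straight into a table, while the body still needs text processing before it can be used as features.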
11. Multi Target Prediction
What is an unstructured dataset?
• Machine-generated unstructured data
• Common types of machine-generated unstructured data include:
• Satellite imagery: weather data, geographic forms, military movements.
• Scientific data: oil and gas exploration, space exploration, seismic imagery and
atmospheric data.
• Digital surveillance: CCTV.
Ref - https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
12. Multi Target Prediction
How is it done for structured and unstructured datasets?
• Convert unstructured to structured data types
• If text, then:
• Tagging with metadata or part-of-speech tagging
• Dimensionality reduction (stemming or lemmatization): identify the root word of each
word to reduce the size of the text data
• Disambiguation: the use of contextual clues to select the intended meaning of a word
• Sentiment analysis: discerning subjective (as opposed to factual)
material and extracting various forms of attitudinal information: sentiment,
opinion, mood, and emotion.
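The text-to-structure steps above can be sketched with the standard library only. The suffix list and the `crude_stem` helper are deliberately naive stand-ins (an assumption for illustration) for a real stemmer or lemmatizer; the result is a structured record of root-word counts.

```python
import re
from collections import Counter

SUFFIXES = ("ing", "ed", "es", "s")  # crude, illustrative suffix list

def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    """Strip one common suffix; a stand-in for a real stemmer or lemmatizer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def bag_of_words(text):
    """Map free text to a structured {root word: count} record."""
    return Counter(crude_stem(token) for token in tokenize(text))

features = bag_of_words("I relaxed today. Relaxing helps; she relaxes daily.")
```

Here "relaxed", "relaxing" and "relaxes" collapse into the single feature "relax", which is the size reduction the dimensionality-reduction bullet refers to.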
13. Side Information
• Side information is of crucial importance for generalizing to novel targets that are unobserved
during the training phase, e.g.:
• a novel target molecule in the drug design example
• a novel tag in the document annotation example
• a novel course in the student grading example
Ref- 2, 12
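A minimal sketch of how side information lets a model say something about a target it never saw in training, using the student grading example. The course feature vectors, the marks and the similarity-weighted average in `predict_novel_target` are all assumptions chosen for illustration, not the method of the cited references.

```python
import math

# Hypothetical side information: one feature vector per course
# (share of maths content, share of writing content).
course_features = {
    "algebra": [0.9, 0.1],
    "physics": [0.8, 0.2],
    "literature": [0.1, 0.9],
}
# Marks one student obtained in the courses observed during training.
student_marks = {"algebra": 80.0, "physics": 70.0, "literature": 50.0}

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict_novel_target(novel_features, known_features, known_values):
    """Similarity-weighted average of the values of the known targets."""
    weights = {
        target: max(cosine(novel_features, feats), 0.0)
        for target, feats in known_features.items()
    }
    total = sum(weights.values())
    if total == 0:
        # No usable similarity: fall back to the plain average.
        return sum(known_values.values()) / len(known_values)
    return sum(weights[t] * known_values[t] for t in known_values) / total

# "statistics" was never observed in training, but its side information is known.
predicted_statistics = predict_novel_target([0.85, 0.15], course_features, student_marks)
predicted_poetry = predict_novel_target([0.1, 0.9], course_features, student_marks)
```

Without the course feature vectors there would be no basis for treating the novel course as closer to algebra than to literature, which is why side information matters for unseen targets.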
14. Side Information
Forms of existing side information for unstructured datasets.
• Example from Product Recommendation System
Ref- 11
15. Side Information
Categorisation of side information for unstructured datasets.
• To resolve data sparsity and cold-start issues, side information is widely used in recommender
systems.
• From Facebook?
• Conversation Q&A (textual)
• Status updates (textual)
• Videos
• Images
• Audio
Ref – Slide Presentation from NTU, Singapore
16. DAS – A Multi Target Prediction Problem?
• DAS: anxiety, depression and stress
• Can be diagnosed by using the Depression, Anxiety and Stress Scale - 21 Items (DASS-21)
• DASS-21 is a set of three self-report scales designed to measure the emotional states of
depression, anxiety and stress.
Ref- 14,15
17. Depression, Anxiety and Stress Scale - 21 Items (DASS-21)
Ref- 13
• Depression: dysphoria, hopelessness, devaluation of life, self-deprecation, lack of
interest/involvement, anhedonia, inertia
• Anxiety: autonomic arousal, skeletal muscle effects, situational anxiety, subjective
experience of anxious affect
• Stress: levels of chronic non-specific arousal, assessed through difficulty relaxing, nervous
arousal, and being easily upset/agitated, irritable/over-reactive and impatient
• Scores for depression, anxiety and stress are calculated by summing the scores for the relevant
items.
• DASS-21 is based on a dimensional rather than a categorical conception of psychological disorder.
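The summing rule can be sketched as follows. The item-to-subscale assignment and the 0 to 3 rating range are assumptions taken from the standard DASS-21 scoring key, not from these slides, and should be checked against the manual (Ref 13) before use; `score_dass21` is a hypothetical helper.

```python
# Item numbers per subscale: an ASSUMPTION based on the standard DASS-21
# scoring key, not stated in the slides. Verify against the manual.
SUBSCALE_ITEMS = {
    "depression": (3, 5, 10, 13, 16, 17, 21),
    "anxiety": (2, 4, 7, 9, 15, 19, 20),
    "stress": (1, 6, 8, 11, 12, 14, 18),
}

def score_dass21(responses):
    """Sum the item ratings of each subscale.

    `responses` maps item number (1-21) to a rating from 0 to 3.
    """
    if set(responses) != set(range(1, 22)):
        raise ValueError("expected ratings for items 1..21")
    if any(rating not in (0, 1, 2, 3) for rating in responses.values()):
        raise ValueError("each rating must be 0, 1, 2 or 3")
    return {
        scale: sum(responses[item] for item in items)
        for scale, items in SUBSCALE_ITEMS.items()
    }

# Hypothetical respondent: rating 1 on every item, except rating 3 on item 1.
example = {item: 1 for item in range(1, 22)}
example[1] = 3
scores = score_dass21(example)
```

The result is one score per scale for the same respondent, which is what makes DAS a natural fit for a multi-target formulation: three related targets predicted for one instance.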
18. The idea
• Improve an MTP classification problem (DAS) by utilizing a type of side
information (for both structured and unstructured data)
• ?
19. References
1. Waegeman, W., Dembczyński, K., & Hüllermeier, E. (2019). Multi-target prediction: a unifying view on problems and methods. Data Mining and Knowledge Discovery, 33(2),
293–324. https://doi.org/10.1007/s10618-018-0595-5
2. Spyromitros-Xioufis, E., Tsoumakas, G., Groves, W., & Vlahavas, I. (2016). Multi-target regression via input space expansion: treating targets as inputs. Machine Learning,
104(1), 55–98. https://doi.org/10.1007/s10994-016-5546-z
3. Tu, C. H., & Li, C. (2019). Multitarget prediction—A new approach using sphere complex fuzzy sets. Engineering Applications of Artificial Intelligence.
https://doi.org/10.1016/j.engappai.2018.11.004
4. Xing, L., Lesperance, M. L., Zhang, X., & Hancock, J. (2020). Simultaneous prediction of multiple outcomes using revised stacking algorithms. Bioinformatics.
https://doi.org/10.1093/bioinformatics/btz531
5. Rahim, M., Thirion, B., Bzdok, D., Buvat, I., & Varoquaux, G. (2017). Joint prediction of multiple scores captures better individual traits from brain images. NeuroImage.
https://doi.org/10.1016/j.neuroimage.2017.06.072
6. Jonschkowski, R., Höfer, S., & Brock, O. (2015). Patterns for Learning with Side Information. Retrieved from http://arxiv.org/abs/1511.06429
7. Vashishth, S., Joshi, R., Prayaga, S. S., Bhattacharyya, C., & Talukdar, P. (2018). RESIDE: Improving distantly-supervised neural relation extraction using side information.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018. https://doi.org/10.18653/v1/d18-1157
8. Farias, V. F., & Li, A. A. (2019). Learning preferences with side information. Management Science. https://doi.org/10.1287/mnsc.2018.3092
9. Hu, C., Rai, P., & Carin, L. (2017). Deep generative models for relational data with side information. 34th International Conference on Machine Learning, ICML 2017.
10. Kang, D., Dhar, D., & Chan, A. B. (2017). Incorporating side information by adaptive convolution. Advances in Neural Information Processing Systems.
11. Pourgholamali, F., Kahani, M., Bagheri, E., & Noorian, Z. (2017). Embedding unstructured side information in product recommendation. Electronic Commerce Research and
Applications. https://doi.org/10.1016/j.elerap.2017.08.001
12. Wang, Y., Xiang, Y., Zhang, J., Zhou, W., & Xie, B. (2014). Internet traffic clustering with side information. Journal of Computer and System Sciences.
https://doi.org/10.1016/j.jcss.2014.02.008
13. Lovibond, S. H., & Lovibond, P. F. (1995). Manual for the Depression Anxiety Stress Scales. In Psychology Foundation of Australia.
https://doi.org/10.1016/0005-7967(94)00075-U
14. Smoller, J. W. (2016). The Genetics of Stress-Related Disorders: PTSD, Depression, and Anxiety Disorders. Neuropsychopharmacology. https://doi.org/10.1038/npp.2015.266
15. Bener, A., Saleh, N., Bakir, A., & Bhugra, D. (2016). Depression, anxiety, and stress symptoms in menopausal Arab women: Shedding more light on a complex relationship.
Annals of Medical and Health Sciences Research. https://doi.org/10.4103/amhsr.amhsr_341_15
20. Currently working on
• Current
• Bibliometric paper “Research trends on MTP and Side Info”
• Side Info categorisation – need criteria for sorting the types of side information
• Proof for “In the present study, the combination of Multi-target (MT) prediction
approaches and Machine Learning algorithms has not been evaluated as an effective
strategy to improve prediction performances of social media data (structured or
unstructured)”
• Find proof of “little focus on the unstructured dataset being used as side information”
• Not forgetting
• Data collection
• Proposal for DRP
22. Multi Target Prediction
Another name for MTP: multi-source domain?
• Multi-source domain adaptation with graph embedding and adaptive label prediction
• Recently, deep methods with convolutional neural network (CNN) become popular in the community.
Hoffman, Mohri, and Zhang (2018) present new normalized solutions with strong theoretical
guarantees for the cross-entropy loss, which verifies the feasibility of utilizing CNNs on multi-source
scenario. Peng et al. (2019) match first-order moment in a deep network and achieve great
performance on a very large-scale dataset. Zhao et al. (2018) and Xu, Chen, Zuo, Yan, and Lin (2018)
further introduce adversarial learning in deep multi-source domain adaptation. Mancini et al. (2018)
present a novel deep model for automatically discovering latent domains within visual datasets. To sum
up, we can see that both moment matching and geometry alignment contribute to multi-source
domain adaptation in both shallow and deep models.
https://www-sciencedirect-com.ezaccess.library.uitm.edu.my/science/article/pii/S0306457320308621
26. Side Information
• Other terms keep coming up
• Matrix co-factorization
• Nonnegative matrix factorization
• Composite Absolute Penalty
27. Multi Target Prediction
Methods proposed for unstructured datasets – textual data?
• Sentiment analysis
• Transfer learning with BERT, a pre-trained language model