SCUOLA DI INGEGNERIA
INDUSTRIALE E DELL’INFORMAZIONE
Thesis Proposals
2024
Marco Brambilla
Data Science Lab
marco.brambilla@polimi.it
TOC
1. Proposals
2. References and pointers
Explainable AI
The final aim of the Explainable Artificial Intelligence (XAI) research field can be
summarised as
“Developing inherently explainable
systems and explainability techniques
that faithfully explicit the behaviour of
complex machine learning models
tailoring their explanation in an
understandable way for humans.”
Gamified Data Collection for NLP Explainability Tasks
Development of a gamified platform to collect structured human knowledge for
multiple, different NLP tasks.
NLP Task
Selection
Gamified Activity
Task #1
Gamified Activity
Task #N
Data
Structuring
Data Storing
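The flow on this slide (task selection, gamified activities, data structuring, data storing) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the platform's design: the record fields, the `structure` and `store` helpers, and the JSON Lines storage format are all hypothetical names introduced for the example.

```python
import io
import json
from dataclasses import dataclass, asdict

# Hypothetical record for one answer collected by a gamified activity.
@dataclass
class Annotation:
    task: str       # NLP task the activity belongs to, e.g. "sentiment"
    player_id: str
    item_id: str    # identifier of the text shown to the player
    answer: str     # the human knowledge collected, e.g. a chosen phrase
    score: int      # game points awarded, kept for later quality checks

def structure(task: str, raw: dict) -> Annotation:
    """Data structuring: normalise a raw game event into an Annotation."""
    return Annotation(
        task=task,
        player_id=str(raw["player"]),
        item_id=str(raw["item"]),
        answer=str(raw["answer"]).strip().lower(),
        score=int(raw.get("points", 0)),
    )

def store(annotations, sink) -> int:
    """Data storing: write one JSON object per line and return the count."""
    count = 0
    for ann in annotations:
        sink.write(json.dumps(asdict(ann)) + "\n")
        count += 1
    return count

# Made-up raw events from two players of the same activity.
raw_events = [
    {"player": 7, "item": "doc-1", "answer": "  Not Good ", "points": 10},
    {"player": 8, "item": "doc-1", "answer": "terrible"},
]
records = [structure("sentiment", event) for event in raw_events]
buffer = io.StringIO()  # stands in for a file or a database table
written = store(records, buffer)
```

Keeping the task name inside every record is one way to let a single store serve multiple, different NLP tasks.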
IMAGE ANALYSIS AND EXPLAINABILITY
► Development of NLP Techniques for Semantic Clustering of Words to be used as
labels in the context of explainability of image analysis.
► Development of Gamification and Machine Learning Techniques for Debugging and
Improvement of Image Classification Algorithms.
► Feasibility of crowdsourcing approaches for image classification in categories that
are very similar → double problem: human task complexity (learning cost, type and
quantity of learning), and ML task complexity (much more expensive training).
► How to define training for humans to ensure explainability of objects from very
similar classes?
► Feasibility of crowdsourcing approaches for tagging actions (on videos?)
► Automatic generation of explanations from crowdsourced tagging of relevance
heatmaps of image features for classification.
IMAGE ANALYSIS AND EXPLAINABILITY
► Study of single-class and cross-class explainability labels to see if the same labels on
different classes are generated by the same part of the network.
► Study of techniques that, given the extracted features (blackbox), classify
explainability labels.
► Classification of labels obtained from the crowd, starting from labels produced for
explainability, and generating classifier results.
► Crowdsourcing techniques for a priori image explanation, composing sets of
descriptive features of objects/concepts. Comparative study on image explainability
techniques.
► Extension of the work "A Flexible Metric-Based Approach to Assess Neural Network
Interpretability in Image Classification"
IMAGE ANALYSIS AND EXPLAINABILITY
► Testing and comparing different Grad-CAM and saliency-map methods beyond
the basic one.
► Testing the method using more complex datasets (e.g., PASCAL) and similar-class
datasets to assess explanations (e.g., shape vs. background).
► Validation of the final ranking of the models through human-in-the-loop approaches
(e.g., showing them the same explanation from different models and having them
order them) to compare it with the obtained scores.
► Study of the level of detail necessary for a classifier to correctly classify various
objects starting from the segmentation of a class (e.g., parachute).
► When training the model on the segmentation of a concept, is the shape enough to
understand the concept, or is extra detail needed? (e.g., a "soccer ball" shape alone
may not be enough, but with color it could work).
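For the human-in-the-loop validation point above, one possible way to compare a human ordering of the models with the ranking obtained from the scores is a rank correlation. The sketch below uses Kendall's tau (tau-a, assuming no ties); the choice of statistic, the model names and all numbers are assumptions made for illustration, not results or methods taken from the referenced work.

```python
from itertools import combinations

def ranks_from_scores(scores: dict) -> dict:
    """Turn scores into ranks, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

def kendall_tau(rank_a: dict, rank_b: dict) -> float:
    """Kendall's tau-a between two rankings of the same items (no ties)."""
    items = list(rank_a)
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        agreement = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    pairs = len(items) * (len(items) - 1) / 2
    return (concordant - discordant) / pairs

# Hypothetical interpretability scores and a hypothetical human ordering.
metric_scores = {"model_a": 0.81, "model_b": 0.64, "model_c": 0.72, "model_d": 0.40}
human_rank = {"model_a": 1, "model_b": 2, "model_c": 3, "model_d": 4}

metric_rank = ranks_from_scores(metric_scores)
tau = kendall_tau(metric_rank, human_rank)  # 1.0 = same order, -1.0 = reversed
```

With several human raters, the same function could be applied per rater and the values averaged; handling ties would require a different variant of the statistic.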
TEXT ANALYSIS EXPLAINABILITY
► Explainability of ML models (deep, LLM, …) on text processing and NLP.
► Linguistic and behavioural methods
Large Language Models - LLMs
► Design and use of LLMs
– Exploration of different LLMs, experiments and comparison
► Model refinement / verticalization
– Legal
– Tech
– Security
LLM and Deep Learning for Security
Development of a Multimedia Pipeline for Data Extraction and Annotation to Support Investigations: An Integrated Approach
with Apache NiFi and Streamlit
Problem:
In security investigations, multimedia material such as audio, video, and images may contain crucial information. However,
extracting such information in a structured and efficient manner poses a significant challenge.
Proposed Solution:
Create an automated pipeline for transforming multimedia material into annotations useful for investigative purposes. This
pipeline will integrate technologies like neural networks for speaker identification, automatic translation, summarization, entity
extraction, metadata extraction, and similar tasks. It will use Apache NiFi for managing data flows and Streamlit for the user
interface, allowing operators to view and use the generated annotations.
Technologies:
Kubernetes and Docker for scalability and maintainability. DevOps techniques for configuring and managing infrastructure.
Pipeline Construction: Use of Apache NiFi for managing data flows and integration with Streamlit for annotation visualization.
Final Use Case:
In the end, it will be possible to demonstrate how these extractions can create alerts that are useful for investigative authorities.
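The shape of such a pipeline, from raw media to annotations to alerts, can be sketched as a chain of stages. The code below is a toy under stated assumptions: it does not use Apache NiFi or Streamlit, every stage is a hypothetical plain-Python stub standing in for a neural model, and the watchlist-based alert rule is invented for the example.

```python
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]  # annotations accumulated for one media item

def transcribe(item: Item) -> Item:
    # Stub: a real stage would run speech-to-text on the media file.
    item["transcript"] = str(item.get("mock_transcript", ""))
    return item

def extract_entities(item: Item) -> Item:
    # Stub: capitalised tokens stand in for a neural entity extractor.
    words = item["transcript"].split()
    item["entities"] = sorted({w.strip(".,") for w in words if w[:1].isupper()})
    return item

def summarize(item: Item) -> Item:
    # Stub: the first sentence stands in for a summarisation model.
    item["summary"] = item["transcript"].split(".")[0]
    return item

def run_pipeline(item: Item, stages: List[Callable[[Item], Item]]) -> Item:
    """Apply each stage in order; every stage adds its own annotations."""
    for stage in stages:
        item = stage(item)
    return item

def alerts(item: Item, watchlist: set) -> List[str]:
    """Hypothetical alert rule: flag extracted entities found on a watchlist."""
    return [entity for entity in item["entities"] if entity in watchlist]

media = {"path": "call_001.wav",
         "mock_transcript": "Meeting with Rossi in Milan on Friday."}
annotated = run_pipeline(media, [transcribe, extract_entities, summarize])
raised = alerts(annotated, watchlist={"Rossi"})
```

Keeping each stage as a function from annotations to annotations is one design that lets stages be reordered or replaced independently, which is also how a flow manager would typically treat them.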
LLM and Deep Learning for Security
Using Large Language Models to Guide Investigative Decisions: Prioritizing Actions in a Sea of Options
Problem:
Operators in the field of security investigations often face a wide range of operational choices. Some of these options, such as accessing
specialized databases, can be costly and time-consuming. At the same time, not all actions have the same likelihood of leading to useful
results. The challenge is, therefore, to determine which operations to undertake to maximize effectiveness and reduce costs.
Proposed Solution:
The idea is to use a large language model trained to assess the various investigative options available and estimate their likelihood of
success. This way, the operator will be guided towards actions that are more likely to be fruitful, avoiding unnecessary expenses and
efforts.
Added Value:
Operational Efficiency: Saving time and resources by focusing on options with high probabilities of success.
Decision Support: Providing operators with a system that helps them make more informed decisions quickly and securely.
Workflow Optimization: The possibility of integrating the model with existing platforms, making the decision-making process smoother
and integrated.
Possible Steps:
Testing existing open-source models in zero-shot or multi-shot settings, and evaluating further training of language models for this purpose.
Evaluation of the system with real or simulated use cases.
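Once a model has produced a likelihood of success for each option, turning those estimates into a prioritized plan can be as simple as ranking by likelihood per unit of cost. The sketch below assumes the probabilities are already available (here they are hard-coded, hypothetical values rather than real model output) and uses a greedy rule under a budget, which is a heuristic and not guaranteed to be optimal.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Option:
    name: str
    p_success: float  # estimated likelihood of a useful result (hypothetical)
    cost: float       # cost in arbitrary units, e.g. hours or fees

def prioritize(options: List[Option], budget: float) -> List[Option]:
    """Greedy plan: take options by p_success / cost while the budget allows."""
    ranked = sorted(options, key=lambda o: o.p_success / o.cost, reverse=True)
    chosen, spent = [], 0.0
    for option in ranked:
        if spent + option.cost <= budget:
            chosen.append(option)
            spent += option.cost
    return chosen

# Made-up investigative options with made-up estimates.
options = [
    Option("query specialised database", 0.60, 8.0),
    Option("search open sources", 0.30, 1.0),
    Option("request phone records", 0.50, 5.0),
    Option("interview contact", 0.20, 4.0),
]
plan = prioritize(options, budget=10.0)
plan_names = [option.name for option in plan]
```

In this toy case the costly database query has the highest raw probability but is skipped, because cheaper actions give more expected value for the same budget.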
Generative approaches for Security
Large-Scale Simulation of Realistic Data for Testing National Security Analysis Tools in the Octostar Environment
Objective:
Develop an advanced simulation model to generate a realistic dataset representing the dynamics of the daily lives of 10 million people, to
be used as a testbed for the Octostar platform in the context of national security.
Methodology:
Collecting Open Datasets: Using open and anonymized datasets to model various aspects of urban life (e.g., traffic, economic
transactions, communications).
Model Creation: Using neural networks to generate different types of data, such as daily movements (e.g., home-work, school, shopping),
use of private vehicles or public transportation, banking transactions (e.g., withdrawals, online purchases), communications (e.g., calls,
messages, emails).
Computational Optimization: Methods such as parallelism and distributed computing to address the creation of 100-1000 billion records.
Model Validation: Comparison with real or simulated data from accredited sources to ensure accuracy.
Integration with Octostar: Importing generated data into the Octostar platform for testing and demonstrations.
Computational Requirements:
Generating large-scale data, estimating a total of 100-1000 billion records. Implementing efficient algorithms to minimize the time and
computational resources required.
Octostar will provide the required computational resources.
Scientific and Practical Value:
Provides a realistic dataset for testing national security algorithms and tools. Offers the opportunity to experiment with advanced
simulation methods and neural networks to generate realistic human behaviors.
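To give a feel for the scale, 100-1000 billion records over 10 million people works out to roughly 10,000-100,000 records per person. The sketch below does that arithmetic and shows one possible way to split generation into independent shards that could run in parallel. The sharding scheme, the event fields and the uniform random generator (a stand-in for the neural models mentioned above) are all assumptions made for illustration.

```python
import random

PEOPLE = 10_000_000
RECORDS_LOW, RECORDS_HIGH = 100 * 10**9, 1000 * 10**9

per_person_low = RECORDS_LOW // PEOPLE    # records per person, lower bound
per_person_high = RECORDS_HIGH // PEOPLE  # records per person, upper bound

EVENT_TYPES = ("movement", "transaction", "communication")

def generate_shard(shard, n_shards, people, events_per_person, seed=0):
    """Yield synthetic events for the people assigned to one shard.

    Person p belongs to shard p % n_shards. Each person has a private RNG
    seeded from (seed, person id), so the data does not depend on how many
    shards are used or on the order in which they run.
    """
    for person_id in range(shard, people, n_shards):
        rng = random.Random(seed * people + person_id)
        for _ in range(events_per_person):
            yield {
                "person": person_id,
                "type": rng.choice(EVENT_TYPES),
                "minute": rng.randrange(24 * 60),  # minute of the day
            }

# Tiny demo: 10 people, 2 events each, split over 3 shards.
demo = [event
        for shard in range(3)
        for event in generate_shard(shard, 3, people=10, events_per_person=2)]
```

Seeding per person rather than per shard is a deliberate choice here: it keeps runs reproducible even if the number of workers changes between runs.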
Other Topics
► Data science analysis
► Network analysis
► Robotic Process Automation (RPA)
► …
► (see second slide deck too)
Pointers
https://marco-brambilla.com/blog/
For examples of past theses: POLITESI website (search by advisor)
Big data and data science
https://marco-brambilla.com/2022/11/04/exploring-the-bi-verse-a-trip-across-the-digital-and-physical-ecospheres/
Explainability
https://marco-brambilla.com/2022/07/11/the-role-of-human-knowledge-in-explainable-ai/
https://marco-brambilla.com/2022/06/01/exp-crowd-gamified-crowdsourcing-for-ai-explainability/

M.Sc. Thesis Topics and Proposals @ Polimi Data Science Lab - 2024 - prof. Brambilla Marco

  • 1. SCUOLA DI INGEGNERIA INDUSTRIALE E DELL’INFORMAZIONE Thesis Proposals 2024 Marco Brambilla Data Science Lab marco.brambilla@polimi.it
  • 3. 3 Explainable AI The final aim of the Explainable Artificial Intelligence (XAI) research field can be summarised as “Developing inherently explainable systems and explainability techniques that faithfully explicit the behaviour of complex machine learning models tailoring their explanation in an understandable way for humans.”
  • 4. 4 Gamified Data Collection for NLP Explainability Tasks Development of a gamified platform to collect structured human knowledge for multiple, different NLP tasks. NLP Task Selection Gamified Activity Task #1 Gamified Activity Task #N Data Structuring Data Storing
  • 5. 5 IMAGE ANALYSIS AND EXPLAINABILITY ► Development of NLP Techniques for Semantic Clustering of Words to be used as labels in the context of explainability of image analysis. ► Development of Gamification and Machine Learning Techniques for Debugging and Improvement of Image Classification Algorithms. ► Feasibility of crowdsourcing approaches for image classification in categories that are very similar → double problem: human task complexity (learning cost, type and quantity of learning), and ML task complexity (much more expensive training). ► How to define training for humans to ensure explainability of objects from very similar classes? ► Feasibility of crowdsourcing approaches for tagging actions (on videos?) ► Automatic generation of explanations from crowdsourced tagging of relevance heatmaps of image features for classification.
  • 6. 6 IMAGE ANALYSIS AND EXPLAINABILITY ► Study of single-class and cross-class explainability label to see if the same labels on different classes are generated by the same part of the network. ► Study of techniques that, given the extracted features (blackbox), classify explainability labels. ► Classification of labels obtained from the crowd, starting from labels produced for explainability, and generating classifier results. ► Crowdsourcing techniques for a priori image explanation, composing sets of descriptive features of objects/concepts. Comparative study on image explainability techniques. ► Extension of the work "A Flexible Metric-Based Approach to Assess Neural Network Interpretability in Image Classification"
  • 7. 7 IMAGE ANALYSIS AND EXPLAINABILITY ► Testing and comparing using different GRAD-Cam and saliency map methods other than the basic one. ► Testing the method using more complex datasets (e.g., PASCAL) and similar-class datasets to assess explanations (e.g., shape vs. background). ► Validation of the final ranking of the models through human-in-the-loop approaches (e.g., showing them the same explanation from different models and having them order them) to compare it with the obtained scores. ► Study of the level of detail necessary for a classifier to correctly classify various objects starting from the segmentation of a class (e.g., parachute). ► Training the model with the segmentation of a concept, is it enough to understand the concept, or does it need extra detail? (e.g., "Soccer ball" shape alone may not be enough, but with color, it could work).
  • 8. 8 TEXT ANALYSIS EXPLAINABILITY ► Explainability of ML models (deep, LLM, …) on text processing and NLP. ► Linguistic and behavioural methods
  • 9. 9 Large Language Models - LLMs ► Design and use of LLMs – Exploration of different LLMs, experiments and comparison ► Model refinement / verticalization – Legal – Tech – Security
  • 10. 10 LLM and Deep Learning for Security Development of a Multimedia Pipeline for Data Extraction and Annotation to Support Investigations: An Integrated Approach with Apache NiFi and Streamlit Problem: In security investigations, multimedia material such as audio, video, and images may contain crucial information. However, extracting such information in a structured and efficient manner poses a significant challenge. Proposed Solution: Create an automated pipeline for transforming multimedia material into annotations useful for investigative purposes. This pipeline will integrate technologies like neural networks for speaker identification, automatic translation, summarization, entity extraction, metadata extraction, and similar tasks. It will use Apache NiFi for managing data flows and Streamlit for the user interface, allowing operators to view and use the generated annotations. Technologies: Kubernetes and Docker for scalability and maintainability. DevOps techniques for configuring and managing infrastructure. Pipeline Construction: Use of Apache NiFi for managing data flows and integration with Streamlit for annotation visualization. Final Use Case: In the end, it will be possible to demonstrate how these extractions can create alerts that are useful for investigative authorities.
  • 11. 11 LLM and Deep Learning for Security Using Large Language Models to Guide Investigative Decisions: Prioritizing Actions in a Sea of Options Problem: Operators in the field of security investigations often face a wide range of operational choices. Some of these options, such as accessing specialized databases, can be costly and time-consuming. At the same time, not all actions have the same likelihood of leading to useful results. The challenge is, therefore, to determine which operations to undertake to maximize effectiveness and reduce costs. Proposed Solution: The idea is to use a large language model trained to assess the various investigative options available and estimate their likelihood of success. This way, the operator will be guided towards actions that are more likely to be fruitful, avoiding unnecessary expenses and efforts. Added Value: Operational Efficiency: Saving time and resources by focusing on options with high probabilities of success. Decision Support: Providing operators with a system that helps them make more informed decisions quickly and securely. Workflow Optimization: The possibility of integrating the model with existing platforms, making the decision-making process smoother and integrated. Possible Steps: Testing 0-shot or multishot models of existing open-source models and evaluating further training of language models for this purpose. Evaluation of the system with real or simulated use cases.
  • 12. 12 Generative approaches for Security Large-Scale Simulation of Realistic Data for Testing National Security Analysis Tools in the Octostar Environment Objective: Develop an advanced simulation model to generate a realistic dataset representing the dynamics of the daily lives of 10 million people, to be used as a testbed for the Octostar platform in the context of national security. Methodology: Collecting Open Datasets: Using open and anonymized datasets to model various aspects of urban life (e.g., traffic, economic transactions, communications). Model Creation: Using neural networks to generate different types of data, such as daily movements (e.g., home-work, school, shopping), use of private vehicles or public transportation, banking transactions (e.g., withdrawals, online purchases), communications (e.g., calls, messages, emails). Computational Optimization: Methods such as parallelism and distributed computing to address the creation of 100-1000 billion records. Model Validation: Comparison with real or simulated data from accredited sources to ensure accuracy. Integration with Octostar: Importing generated data into the Octostar platform for testing and demonstrations. Computational Requirements: Generating large-scale data, estimating a total of 100-1000 billion records. Implementing efficient algorithms to minimize the time and computational resources required. Octostar will provide the required computational resources. Scientific and Practical Value: Provides a realistic dataset for testing national security algorithms and tools. Offers the opportunity to experiment with advanced simulation methods and neural networks to generate realistic human behaviors.
  • 13. 13 Other Topics ► Data science analysis ► Network analysis ► Robotic Process Automation (RPA) ► … ► (see second slide deck too)
  • 14. 14 Pointers https://marco-brambilla.com/blog/ For past theses examples: POLITESI WEBSITE (search by advisor) Big data and data science https://marco-brambilla.com/2022/11/04/exploring-the-bi-verse-a-trip-across-the-digital-and-physical-ecospheres/ Explainability https://marco-brambilla.com/2022/07/11/the-role-of-human-knowledge-in-explainable-ai/ https://marco-brambilla.com/2022/06/01/exp-crowd-gamified-crowdsourcing-for-ai-explainability/
  • 15. Thesis Proposals 2024 Marco Brambilla Data Science Lab marco.brambilla@polimi.it
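
As a rough illustration of the prioritization idea on slide 11 (estimating each investigative option's likelihood of success and weighing it against its cost), the sketch below ranks options by estimated success probability per unit cost. The option names, probabilities, costs, and the ranking criterion itself are hypothetical placeholders, not part of the proposal; in the proposed work the probabilities would come from a language model, which is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str         # hypothetical action label
    p_success: float  # estimated likelihood of a useful result (assumed model output)
    cost: float       # assumed cost in arbitrary units (time, money)


def rank_options(options):
    """Return options sorted by success probability per unit cost, best first."""
    for o in options:
        if not 0.0 <= o.p_success <= 1.0:
            raise ValueError(f"p_success out of range for {o.name!r}")
        if o.cost <= 0:
            raise ValueError(f"cost must be positive for {o.name!r}")
    return sorted(options, key=lambda o: o.p_success / o.cost, reverse=True)


# Made-up example values, for illustration only.
options = [
    Option("query specialized database", 0.60, 10.0),
    Option("review open sources", 0.30, 1.0),
    Option("request phone records", 0.80, 20.0),
]

ranked = rank_options(options)
for o in ranked:
    print(f"{o.name}: {o.p_success / o.cost:.3f} expected successes per unit cost")
```

Probability per unit cost is only one possible criterion; a thesis on this topic would more likely evaluate the choice of scoring rule against real or simulated use cases, as the slide suggests.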