Responsible & Safe AI
1 Aug 2024
Inaugural Ceremony
GSFCU ACM Student chapter
Vadodara, Gujarat
Ponnurangam Kumaraguru (“PK”)
#ProfGiri CS IIIT Hyderabad
Vice President, ACM India
TEDx Speaker
https://precog.iiit.ac.in/
/in/ponguru @ponguru
2
3
Know the Audience
Students / Faculty / Others
Have any of you attended my talk(s) before?
4
What is AI?
5
What is Responsible & Safe?
6
What is Responsible & Safe AI?
7
8
9
10
11
Observations?
12
13
14
15
16
17
18
19
20
https://translate.google.co.in/
21
https://translate.google.co.in/
22
https://translate.google.co.in/
23
https://translate.google.co.in/
Activity
Prompt any of these or other platforms and try to get a biased response; pick a bias other than gender bias
HINT: Students have come up with very nice prompts in the past
24
25
26
27
28
Guardrails
29
30
Jailbreak
What is an alignment problem?
31
What is an alignment problem?
32
https://youtu.be/yWDUzNiWPJA?si=wSDO4i_EMrHzHYDP
High-level instantiation: ‘RLHF’ pipeline
First step: instruction tuning!
Second + third steps: maximize reward
33
https://arxiv.org/pdf/2203.02155
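The reward that the second and third steps maximize comes from a reward model trained on human preferences. A minimal sketch, assuming the pairwise preference loss described in the linked paper (arXiv 2203.02155): the loss is small when the human-preferred response scores higher than the rejected one. The function name and the example scores are hypothetical and used only for illustration.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    Small when the preferred response scores well above the rejected one,
    large when the ordering is reversed.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward-model scores for two responses to the same prompt.
loss_when_ordering_is_right = pairwise_preference_loss(2.0, 0.0)
loss_when_ordering_is_wrong = pairwise_preference_loss(0.0, 2.0)
```

Minimizing this loss over many labelled comparisons pushes the reward model toward the human ranking; the policy is then optimized against that learned reward.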
Rogue AIs
We risk losing control over AIs as they become more capable.
Proxy gaming: YouTube / Instagram optimize for user engagement (the proxy), which can harm mental health
34
What is going on? 
https://www.youtube.com/watch?v=lnyuIHSaso8&t=75s
35
Questions?
36
Forms of unlearning
Exact unlearning
Approximate unlearning
Unlearning via differential privacy
Empirical unlearning, where data to be unlearned are precisely known (training examples)
Empirical unlearning, where data to be unlearned are underspecified (think “knowledge”)
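Exact unlearning, the first form listed above, can be sketched as retraining from scratch on the retained data, so the resulting model is identical to one that never saw the forgotten examples. This is a minimal illustration on a toy least-squares model; the data points and function names are hypothetical.

```python
from statistics import mean


def fit_slope_intercept(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def exactly_unlearn(points, forget_set):
    """Retrain from scratch on everything except the forget set."""
    retained = [p for p in points if p not in forget_set]
    return fit_slope_intercept(retained)


# Hypothetical training data; the last point is the one to be forgotten.
training_data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 9.0)]
forget_set = [(3.0, 9.0)]

original_model = fit_slope_intercept(training_data)
unlearned_model = exactly_unlearn(training_data, forget_set)
```

Full retraining is usually too expensive for large models, which is what motivates the approximate and empirical forms in the list.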
37
Graph Unlearning
What is it?
38
Graph Unlearning
39
Node feature unlearning
Node unlearning
Edge unlearning
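The three request types above differ in which part of the graph must be forgotten. The sketch below shows only the data side of each request on a toy undirected graph, under the assumption that node feature unlearning keeps the node and its edges while node unlearning removes the node together with its incident edges. A model trained on the original graph would still have to be retrained or updated, which is not shown. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    features: dict = field(default_factory=dict)  # node -> list of feature values
    edges: set = field(default_factory=set)       # undirected edges as frozenset({u, v})

    def add_node(self, node, feats):
        self.features[node] = list(feats)

    def add_edge(self, u, v):
        self.edges.add(frozenset((u, v)))


def forget_node_features(graph, node):
    """Node feature unlearning: drop the features, keep the node and its edges."""
    graph.features[node] = []


def forget_edge(graph, u, v):
    """Edge unlearning: drop one edge, keep both endpoints."""
    graph.edges.discard(frozenset((u, v)))


def forget_node(graph, node):
    """Node unlearning: drop the node, its features, and every incident edge."""
    graph.features.pop(node, None)
    graph.edges = {edge for edge in graph.edges if node not in edge}


def build_example_graph():
    graph = Graph()
    for name, feats in [("a", [1.0]), ("b", [2.0]), ("c", [3.0])]:
        graph.add_node(name, feats)
    graph.add_edge("a", "b")
    graph.add_edge("b", "c")
    return graph
```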
40
Interpretability
New techniques and paradigms for turning model weights and activations into concepts that humans can understand
https://en.wikipedia.org/wiki/Neural_network
https://en.wikipedia.org/wiki/Artificial_neural_network
41
Interpretability
Interpretability: Mechanistic
Reverse-engineer neural networks
Explaining neurons and connected circuits
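One simple way to start explaining what a neuron does is ablation: set its activation to zero and see how the output changes. The sketch below uses a tiny network with hand-set weights that computes XOR. It is an illustrative assumption, not a trained model and not a method taken from the talk, and all names are hypothetical.

```python
def relu(x):
    return max(0.0, x)


def tiny_network(x1, x2, ablate=None):
    """Two hidden ReLU neurons with hand-set weights; the output computes XOR.

    ablate: index of a hidden neuron to zero out (None leaves the network intact).
    """
    hidden = [
        relu(x1 + x2),        # neuron 0: fires when at least one input is on
        relu(x1 + x2 - 1.0),  # neuron 1: fires only when both inputs are on
    ]
    if ablate is not None:
        hidden[ablate] = 0.0
    return hidden[0] - 2.0 * hidden[1]


inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
baseline = [tiny_network(a, b) for a, b in inputs]
without_neuron_1 = [tiny_network(a, b, ablate=1) for a, b in inputs]
```

Ablating neuron 1 changes the output only for the input (1, 1), which suggests it acts as a "both inputs on" detector that cancels neuron 0. This is the kind of circuit-level explanation mechanistic interpretability looks for in much larger networks.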
42
43
44
45
46
47
Bias in LLMs
Current systems like ChatGPT employ guardrails and do not respond to biased content
Users on the Web leave out key context, which makes LLMs think the content is biased
This negatively affects user engagement
LLMs must be able to explore and ask more questions
Our work aims to make LLMs bias-aware – context resolves confusion!
48
Other Directions
Probing
Robustness
Jailbreaking
….
49
https://precog.iiit.ac.in/pages/publications.html
50
https://precog.iiit.ac.in/teaching/responsible-ai-nptel-f24/index.html
51
Search for: Ponnurangam Kumaraguru
https://www.linkedin.com/in/ponguru/
https://twitter.com/ponguru
Interested in working with us?
52
Full time Research Associates
PhD Students
Interns
53
https://precog.iiit.ac.in/
Acknowledgements
Precog members
Collaborators
54

Responsible & Safe AI at GSFC Univ Vadodara

Editor's Notes

  • #43 Neurons communicate through electrical currents called action potentials, which are either excitatory or inhibitory. Excitatory currents are those that prompt one neuron to share information with the next through an action potential, while inhibitory currents reduce the probability that such a transfer will take place.