Scholarly Document Processing
Research in the Age of AI
Min-Yen Kan
National University of Singapore
Slides @ http://bit.ly/kan-sdp22
17 Oct 2022, 3rd SDProc @ COLING 2022
Warning: This is a participatory keynote!
…that is, there is a pop quiz. You have been warned! 🤣🤣
Please access the poll at
http://pollev.com/knmnyn
Do skip the name registration
Fast and Slow
Kahneman, Thinking, Fast and Slow
Daniel Kahneman
System 1 | System 2
Fast | Slow
Automatic | Controlled
Intuitive | Analytical
Parallel | Serial
Associative | Logical
Neural Nets – System 1
Andrew Ng
System 1 | System 2
Fast | Slow
Automatic | Controlled
Intuitive | Analytical
Parallel | Serial
Associative | Logical
✏Your Turn: What do you think the loss function of research should be?
The Age of Accelerations
Friedman, Thank You for Being Late
His three accelerations
• Moore’s law
• Globalization
• Mother Nature
Kurzweil’s “Second half of the
chessboard”
Thomas Friedman
The Age of Accelerations
Friedman, Thank You for Being Late
Our three accelerants (take your pick)
• arXiv
• PapersWithCode
• (Semantic) Scholar
Kurzweil’s “Second half of the
chessboard”
Thomas Friedman
What about System 2?
Are there scholarly problems that
require more analytical, logical,
and sustained thinking?
Absolutely!
System 1 | System 2
Fast | Slow
Automatic | Controlled
Intuitive | Analytical
Parallel | Serial
Associative | Logical
A Brief History of Science
A fast primer
Photo Credit: 오힘찬 @ Wikimedia Commons (CC BY-SA 4.0)
From Astrology to Astronomy
Ptolemy Copernicus
Astronomy 2.0 and 3.0
Galileo and Kepler Newton
Kuhn – challenging accumulative growth
Kuhn, The Structure of Scientific Revolutions
Paradigm Shift
Normal Science
To think about: What age is SDP in now?
Thomas Kuhn
Science is a verb
… in the sense that it is a method (activity) involving the making of hypotheses, the design of experiments and the analysis of data. But a critical part of the scientific process is the conversation phase after the experimentation is done. Scientists share their findings with the broader community through publications or presentations at meetings. What happens next is a back-and-forth discussion including a critique of methods or interpretation, and a comparison with previous findings.
If there are flaws in the experimental design or interpretation, … scientists need to be willing to hear and respond to feedback. If there are conflicting results, it may require additional hypothesis making and experimentation. Only when the conversation runs its course do the conclusions become a part of accepted scientific understanding.
Steve Savage’s post on Science 2.0
Science in the Age of AI
Video Source: Video by RedEye450 from Pexels
Loss function of research
Beam search analogy
Accelerations make the gradient steeper
Overload favors System 1
Publish or Perish
Suboptimal local minima
What affordances does AI yield?
Better System 1!
e.g., Neural Architecture Search (NAS)
Figures from Ren et al. 2021 ACM Comput. Surv. 37(4)
System 1 and 2 work together
One way: System 1 brings data for System 2 to deliberate with
System 2 gives feedback (end-to-end) to System 1
Neither system is perfect but the whole is better than the parts
(multi-view learning)
Let’s connect it back to our societal research loss function
Scholarly Document Processing for System 2
Slowing down
Photo Credits: sizumaru @ Flickr
Challenges for System 2 SDP
1. Discovering Adjacent Possibles (Branch Out)
2. Uncovering Discrepancies (Dive Deep)
3. Finding Provenance (Travel Back)
1. Discovering Adjacent Possibles
Liquid Networks
The Slow Hunch
Serendipity
Exaptation
Steven Johnson
Johnson, Where Good Ideas Come From
Confirmation Bias in Recommender Systems
We train search and recommender systems, but on historical data
This results in confirmation bias (more like this)
But if we want to afford System 2 thinking, we want
serendipitous recommendation (to learn what we don’t know)
Need to capture multimodal evidence and laborious human assessment
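One way to picture the contrast between "more like this" and serendipitous recommendation is a re-ranker that blends a relevance score with a novelty bonus. The sketch below is only an illustration under stated assumptions: the function name `rerank_for_serendipity`, the topic-set representation, and the `novelty_weight` parameter are all hypothetical and not from the talk.

```python
from typing import Dict, List, Set


def rerank_for_serendipity(
    scores: Dict[str, float],
    topics: Dict[str, Set[str]],
    seen_topics: Set[str],
    novelty_weight: float = 0.5,
) -> List[str]:
    """Rank papers by relevance blended with topical novelty.

    scores: relevance per paper, e.g. from a model trained on historical data.
    topics: topic labels per paper (hypothetical representation).
    seen_topics: topics the reader has already engaged with.
    novelty_weight: 0.0 reproduces the pure "more like this" ranking.
    """

    def novelty(paper: str) -> float:
        paper_topics = topics[paper]
        if not paper_topics:
            return 0.0
        # Fraction of this paper's topics the reader has not seen yet.
        return len(paper_topics - seen_topics) / len(paper_topics)

    def blended(paper: str) -> float:
        return (1 - novelty_weight) * scores[paper] + novelty_weight * novelty(paper)

    return sorted(scores, key=blended, reverse=True)


scores = {"a": 0.9, "b": 0.8, "c": 0.5}
topics = {"a": {"parsing"}, "b": {"parsing", "summarization"}, "c": {"chemistry"}}
seen = {"parsing"}

historical = rerank_for_serendipity(scores, topics, seen, novelty_weight=0.0)
exploratory = rerank_for_serendipity(scores, topics, seen, novelty_weight=0.5)
```

With the weight at zero the ranking simply mirrors the historical scores; raising it pushes papers from unfamiliar topics upward. Choosing that weight well is exactly where human assessment would be needed.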
Next Gen Platforms & Toolkits (not everyone wants to do it globally and publicly)
For discoverability:
• Setting exploration criteria
• Reproducible search
• Suggesting alternative paths and terminologies
For discussion, collaboration and crediting:
• “Calm” for Scientists (arXiv off)
• MIT Deliberatorium
• Big Science initiatives
2. Uncovering Discrepancies
✏Your Turn: Is coffee bad for you?
2. Uncovering Discrepancies – Countering the Streetlight Effect
What happens next is a back-and-forth discussion including a critique of methods or interpretation, and a comparison with previous findings.
If there are flaws in the experimental design or interpretation, … scientists need to be willing to hear and respond to feedback.
Communities do not sufficiently report negative results
Difficult to organize discrepancies for systematic exploration, thus we cannot question the establishment
Related: Davies et al. Promoting inclusive metrics of success and
impact to dismantle a discriminatory reward system in science.
Photo by Guilherme Rossi @ Pexels
Aids for Paradigm Shifts
Systematic reviews for what doesn’t work
“Our techniques improve on Dataset X but less well on Y.”
Uncover choices left (un)stated by authors
“We compare against current relevant baselines [1, 2, 3]”
Machine reading of Limitations and Ethical Consideration sections
https://symplectic.co.uk/guest-blog/research-data-mechanics/
✏Your Turn: What about citation half-life? How is it changing?
3. Finding Provenance
Perhaps surprisingly, citation half-life has lengthened in most fields.
Does this mean that we are finding the right works?
Martín-Martín et al. Back to the past: on the shoulders of an academic search engine giant
Davis and Cochran Cited Half-Life of the Journal Literature
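For readers unfamiliar with the measure, a simplified reading of cited half-life is the median age of the works being cited. The sketch below assumes that simplification; the function name `cited_half_life` and the sample years are hypothetical, and published indicators typically interpolate over cumulative citation counts rather than taking a plain median.

```python
from statistics import median
from typing import Iterable


def cited_half_life(citing_year: int, cited_years: Iterable[int]) -> float:
    """Median age, in years, of the works cited from a given citing year.

    Simplified illustration: half of the citations point to works at most
    this many years old, and half to works at least this old.
    """
    ages = [citing_year - year for year in cited_years]
    if not ages:
        raise ValueError("need at least one cited work")
    return median(ages)


# Hypothetical reference list of a paper published in 2022.
half_life = cited_half_life(2022, [2021, 2020, 2015, 2010, 2000])
```

A lengthening half-life means this median age is growing over time, that is, reference lists reach further back on average.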
Aids for Finding Provenance
Paraphrase, terminology and simplification services in situ
(stay tuned for Head’s keynote)
Lower the barrier to communication: platforms that make it easier to discuss problems and to learn of follow-up research
Who cares about my research?
(Multi-hop) Trace terms and ideas back to their source
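Multi-hop tracing can be pictured as a walk over citation links that only passes through papers using the term in question. The sketch below is a toy illustration under that assumption; the function `trace_term`, the dictionary-based graph, and the paper identifiers are all hypothetical, and a real system would need term normalization and paraphrase matching rather than exact lookups.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def trace_term(
    start: str,
    term: str,
    cites: Dict[str, List[str]],
    terms: Dict[str, Set[str]],
    year: Dict[str, int],
) -> Optional[str]:
    """Follow citations from `start` through papers that mention `term`.

    Returns the earliest such paper reachable, or None if `start` itself
    does not use the term.
    """
    if term not in terms.get(start, set()):
        return None
    seen = {start}
    queue = deque([start])
    earliest = start
    while queue:
        paper = queue.popleft()
        if year[paper] < year[earliest]:
            earliest = paper
        for cited in cites.get(paper, []):
            # Only hop to cited papers that also use the term.
            if cited not in seen and term in terms.get(cited, set()):
                seen.add(cited)
                queue.append(cited)
    return earliest


cites = {"P3": ["P2", "X"], "P2": ["P1"], "P1": []}
terms = {"P3": {"attention"}, "P2": {"attention"}, "P1": {"attention"}, "X": {"parsing"}}
year = {"P3": 2020, "P2": 2017, "P1": 2014, "X": 2001}

source = trace_term("P3", "attention", cites, terms, year)
```

Here the trace hops from P3 to P2 to P1 and ignores X, which is older but never uses the term.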
We need to participate in Science!
This is the last activity, I promise!
http://pollev.com/knmnyn
✏Your Turn: Please use your own judgement to rank the three challenges presented
Conclusion: SDP needs to get involved in Science
Let’s be deliberate about our tools for science. Care to discuss?
Diversity and inclusion are also important for holistic progress in science.
Thanks to:
WING members:
George Huang Po-Wei
Yajing Yang
Abhinav Ramesh Kashyap
Muthu Kumar Chandrasekaran
Yanxia Qin
Animesh Prasad
Kazunari Sugiyama
Collaborators:
Min Song
Namhee Kim
Juyoung An
and many more previous WING members, and my family, and all of you who’ve attended physically and virtually to listen!
Thank you!
