SlideShare a Scribd company logo
SILVERCHAIR
Supporting AI: Best Practices for
Content Delivery Platforms
Jake Zarnegar
Chief Business Development Officer, Silverchair
SILVERCHAIR
Orientation to Today’s Talks
SILVERCHAIR
Our Work in Context Understanding
and improving our
worldThe Big Loop:
Research and
Education
Understanding and
improving research
communication and
education effectiveness
The Little Loop:
Publishing
We play a key role in
multiple learning and
improvement loops
Hypothesize
Do something
Generate data
Analyze/Interpret data
Incorporate learning
SILVERCHAIR
• It is our responsibility to identify and apply promising new technology
advancements to the “loops” we serve.
• Examples:
• Internet content distribution (!)
• Streaming video
• Dataset/computational code sharing
• Adaptive learning algorithms
• Emerging TDM and AI content analysis techniques
Embracing the New is Not Optional
SILVERCHAIR
• New technologies take a while to
be “fit to purpose” in particular
problem areas
• Hat tip to Flavius Chis, my Silverchair colleague
(who is a Romanian living in America) who used
this chart to help me understand our fundamental
difference in orientation to new things.
But… Caution is Warranted
SILVERCHAIR
Much like “the cloud,” “big data,” and “machine learning” before it, the term
“artificial intelligence” has been hijacked by marketers and advertising copywriters.
If the hype leaves you asking “What is A.I., really?,” don’t worry, you’re not alone. I
asked various experts to define the term and got different answers. The only thing
they all seem to agree on is that artificial intelligence is a set of technologies that
try to imitate or augment human intelligence.
To me, the emphasis is on augmentation, in which intelligent software helps us
interact and deal with the increasingly digital world we live in.
Om Malik, The New Yorker
http://www.newyorker.com/business/currency/the-hype-and-hope-of-artificial-intelligence
SILVERCHAIR
How Do AI Tools Augment Human Intelligence?
SILVERCHAIR Bloom, et al. 1956
SILVERCHAIR
• Software is far superior to
human experts at this level
of cognition
• Does not require AI methods
SILVERCHAIR Bloom, et al. 1956
SILVERCHAIR
Knowledge Discovery in Databases:
• Traditional knowledge discovery: “Insights are surfaced by highly-educated
human specialists finding, reading, and synthesizing the right few pieces of
information”
• TDM/AI-assisted knowledge discovery: “Insights are surfaced by specialist-
directed algorithms that perform higher-order cognitive analyses on
thousands, millions, or billions of pieces of information”
TDM/AI Methods are New Tools in An Expert’s Toolbox
SILVERCHAIR
http://news.mit.edu/2020/artificial-intelligence-identifies-new-antibiotic-0220
So We Can Just Sit Back and Watch the Magic Happen, Right?
SILVERCHAIR
• Online systems were initially designed to help human experts with
traditional knowledge discovery:
• Filtering/finding relevant content (bring it directly to them or the online
venues where they work)
• Assessing it quickly (visual display of titles, abstracts, tables/charts,
conclusions)
• Understanding it deeply (full narrative “story” of the research, visuals,
audio/video supplementation, citations to related research, PDFs to take
away and read later, etc.)
Not Quite…
SILVERCHAIR
SILVERCHAIR
To enable researchers (and publishers) to get the most value out of emerging
TDM/AI tools, we need to add and consistently evolve specifically-designed
content and data interfaces to our online content platforms.
New AI Tools = New Requirements for Content Platforms
SILVERCHAIR
Two Major Datasets
Content
• Research papers, datasets,
news, whitepapers, education
materials, courseware, etc.
• “Big Loop”
Interactions / Usage Activity
• Data generated from activity on
the platform (myriad basic and
complex measures).
• “Little Loop”
A modern online platform is a streaming content and analytics server
SILVERCHAIR
But how do TDM/AI projects actually work?
SILVERCHAIR
Knowledge Discovery in Databases (KDD) [1] is divided in four main phases: domain
exploration, data preparation, data mining, and interpretation of results.
1. The first phase is responsible for understanding the problem and what data will be
used in the knowledge discovery process.
2. The next phase selects, cleans, and transforms the data to a format that is
suitable for a specific data mining algorithm.
3. In the third phase, the chosen data mining algorithm performs some intelligent
techniques to discover patterns that can be of potential use.
4. The last phase is responsible for manipulating the extracted patterns to generate
interpretable knowledge for humans…
SILVERCHAIR
…Most of the research carried out in this area focus on the data mining phase, which
uses artificial intelligence algorithms like decision trees, artificial neural networks,
evolutionary computation, among others [2] to discover knowledge. On the other
hand, the data preparation phase, responsible for integration, cleaning, and
transformation of data, has not been the subject of much research. In fact, Pyle [3]
argues that “data preparation consumes 60 to 90% of the time needed to mine data
– and contributes 75 to 90% to the mining project’s success.”
From: Paulo M. Goncalves Jr. and Roberto S. M. Barros, "Automating Data Preprocessing with DMPML and
KDDML," 10th IEEE/ACIS International Conference on Computer and Information Science, 2011, DOI:
10.1109/ICIS.2011.23.
SILVERCHAIR
“I downloaded 2TB of Arxiv content last week but I can’t bring myself
to open it and start working on analyzing it because I know I have at
least 6 months of painstaking content cleanup & preparation ahead of
me before I can begin.”
--Mike M., Data Scientist, Fast Forward Labs
SILVERCHAIR
5 Platform Best Practices to Support AI Projects
SILVERCHAIR
• Most projects select content from multiple sources and add/mix with
internal proprietary data
• Even Facebook ingests third-party data to supplement their massive
consumer behavior database. (You’re not Facebook.)
• If your data is too hard to extract/work with, you may be bypassed for easier
alternatives
• Adopting this humble orientation will help you think less about your
preferences for format and delivery, and more about shared standards
1. Accept That You Don’t Have Everything They Need
SILVERCHAIR
• Publishing (especially research publishing) has a huge head start here, with
shared XML standards and many universal data structures
• Note that JSON format is preferred by many AI content analysis tools and
you may want to offer this format as well
• It is best if all of your content (including historical) is delivered in one
consistent format with consistent use of tagging structures
• The title, authors, abstract, where the conclusions are in the paper, etc.
2. Provide Consistent Content/Data Tagging
SILVERCHAIR
• Just letting bots crawl your site is not a programmatic interface. Web page
coding (primarily HTML, JS, CSS) is heavyweight -- built for visual layout and
interactions (traditional use case), although it can be enhanced with
embedded metadata.
• Best practice is to produce a separate programmatic interface that is more
lightweight and designed for concurrent, high-volume transactions.
3. Provide a Programmatic Interface
Of the Silverchair Platform’s last 500 million sessions, 380 million of them were
programmatic.
SILVERCHAIR
• Commercial Publisher Platforms:
https://www.elsevier.com/solutions/sciencedirect/support/api
• Aggregators:
https://www.dimensions.ai/dimensions-apis/
• Independent Platforms:
https://www.silverchair.com/news/silverchair-releases-analysis-and-content-
service/
• Rights/Licensing Organizations:
https://www.copyright.com/business/xmlformining/
API Examples to Examine
SILVERCHAIR
• Per our research and interviews, data scientists overwhelming desire to take
large amounts of content away and ingest it into their own systems (rather
than use your platform as a transactional dependency).
• Some APIs limit the amount that can be downloaded/taken away at one
time, applying old-school limits to the new techniques
• However, protection/vetting is necessary – this is a new type of relationship
4. Embrace Takeout & Delivery (vs. Eat-In)
SILVERCHAIR
• In a seeming paradox, ”some” of your data can be more valuable than “all” of
your data
• If you have data hooks and API tools for complex selection and filtering, the
AI or data scientist won’t need to develop tools of their own in their project
• Recently a licensing agent told me that signed a deal for more $ for a
specifically narrowed portion of a dataset than they quoted for the whole
dataset (not per object, more $ overall!)
5. Enable Complex Selection & Filtering
SILVERCHAIR
New Business Possibilities
SILVERCHAIR
“…data preparation consumes 60 to 90% of the time needed to mine data – and
contributes 75 to 90% to the mining project’s success”
From: Paulo M. Goncalves Jr. and Roberto S. M. Barros, "Automating Data Preprocessing with DMPML and
KDDML," 10th IEEE/ACIS International Conference on Computer and Information Science, 2011, DOI:
10.1109/ICIS.2011.23.
Remember the Business Need
SILVERCHAIR
• Full-text normalized JSON (or XML)
• Separate product/subscription for sale
• Enrich it further for complex filtering/selection or sell it “as is”
• Separate delivery mechanism (no human interface) but can piggyback on
existing content workflows
• Accesses a new class of customer w/deep pockets (AI creators or implementers)
• Requires new vetting/legal agreements
Consider Feeding AI Tools as a New Product
SILVERCHAIR
Or Create Your Own New Services Using AI
• Deliver new on-site content analysis tools for end users using a combination of
semantics & new text mining/AI tools (many of which are free to use from
Google/Amazon/Microsoft/others)
• Experiments, experiments, experiments (recommendation, auto-summarization,
analogues, image analysis, sentiment, prediction, etc.)
SILVERCHAIR
Thank You
Jake Zarnegar
Chief Business Development Officer, Silverchair

More Related Content

What's hot

Bridging Digital Humanities Research and Big Data Repositories of Digital Text
Bridging Digital Humanities Research and Big Data Repositories of Digital TextBridging Digital Humanities Research and Big Data Repositories of Digital Text
Bridging Digital Humanities Research and Big Data Repositories of Digital Text
Beth Plale
 
Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014
Beth Plale
 
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data FrameworkThe HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
Robert H. McDonald
 
Research Data Management at the University of Edinburgh
Research Data Management at the University of EdinburghResearch Data Management at the University of Edinburgh
Research Data Management at the University of Edinburgh
EDINA, University of Edinburgh
 
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Jisc
 
Research as infrastructure, Digital Humanities Congress, Sheffield 2012
Research as infrastructure, Digital Humanities Congress, Sheffield 2012Research as infrastructure, Digital Humanities Congress, Sheffield 2012
Research as infrastructure, Digital Humanities Congress, Sheffield 2012
University of South Australlia
 
RDM Programme at University of Edinburgh
RDM Programme at University of EdinburghRDM Programme at University of Edinburgh
RDM Programme at University of Edinburgh
Historic Environment Scotland
 
Linked Open Data for Digital Humanities
Linked Open Data for Digital HumanitiesLinked Open Data for Digital Humanities
Linked Open Data for Digital Humanities
Christophe Guéret
 
Relationship status: Libraries and linked data in Europe
Relationship status: Libraries and linked data in EuropeRelationship status: Libraries and linked data in Europe
Relationship status: Libraries and linked data in Europe
Diane Rasmussen Pennington
 
The liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycleThe liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycle
Celia Emmelhainz
 
Discovery event stuart lee (the humanities researcher)
Discovery event stuart lee (the humanities researcher)Discovery event stuart lee (the humanities researcher)
Discovery event stuart lee (the humanities researcher)
RDTF-Discovery
 
Open Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the HumanitiesOpen Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the Humanities
Open Knowledge Maps
 
Sands Fish - Knowing in the Age of Networked Knowledge
Sands Fish - Knowing in the Age of Networked KnowledgeSands Fish - Knowing in the Age of Networked Knowledge
Sands Fish - Knowing in the Age of Networked Knowledge
sandsfish
 
Bionic Info Pro - Taxonomies and Machine Learning SLA 2014
Bionic Info Pro - Taxonomies and Machine Learning SLA 2014Bionic Info Pro - Taxonomies and Machine Learning SLA 2014
Bionic Info Pro - Taxonomies and Machine Learning SLA 2014
Elaine Lasda
 
Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing
Library_Connect
 
Harnessing Collective Intelligence for Sustainable Development
Harnessing Collective Intelligence for Sustainable DevelopmentHarnessing Collective Intelligence for Sustainable Development
Harnessing Collective Intelligence for Sustainable Development
EDINA, University of Edinburgh
 
Rise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen RomboutsRise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen Rombouts
Library_Connect
 
Research into Practice case study 2: Library linked data implementations an...
	Research into Practice case study 2:  Library linked data implementations an...	Research into Practice case study 2:  Library linked data implementations an...
Research into Practice case study 2: Library linked data implementations an...
Hazel Hall
 
Evolution of e-Content Distribution: Ad Hoc to Standardization
Evolution of e-Content Distribution: Ad Hoc to StandardizationEvolution of e-Content Distribution: Ad Hoc to Standardization
Evolution of e-Content Distribution: Ad Hoc to Standardization
National Information Standards Organization (NISO)
 
Research Data Services at the University of Utah
Research Data Services at the University of UtahResearch Data Services at the University of Utah
Research Data Services at the University of Utah
Rebekah Cummings
 

What's hot (20)

Bridging Digital Humanities Research and Big Data Repositories of Digital Text
Bridging Digital Humanities Research and Big Data Repositories of Digital TextBridging Digital Humanities Research and Big Data Repositories of Digital Text
Bridging Digital Humanities Research and Big Data Repositories of Digital Text
 
Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014
 
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data FrameworkThe HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
 
Research Data Management at the University of Edinburgh
Research Data Management at the University of EdinburghResearch Data Management at the University of Edinburgh
Research Data Management at the University of Edinburgh
 
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ...
 
Research as infrastructure, Digital Humanities Congress, Sheffield 2012
Research as infrastructure, Digital Humanities Congress, Sheffield 2012Research as infrastructure, Digital Humanities Congress, Sheffield 2012
Research as infrastructure, Digital Humanities Congress, Sheffield 2012
 
RDM Programme at University of Edinburgh
RDM Programme at University of EdinburghRDM Programme at University of Edinburgh
RDM Programme at University of Edinburgh
 
Linked Open Data for Digital Humanities
Linked Open Data for Digital HumanitiesLinked Open Data for Digital Humanities
Linked Open Data for Digital Humanities
 
Relationship status: Libraries and linked data in Europe
Relationship status: Libraries and linked data in EuropeRelationship status: Libraries and linked data in Europe
Relationship status: Libraries and linked data in Europe
 
The liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycleThe liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycle
 
Discovery event stuart lee (the humanities researcher)
Discovery event stuart lee (the humanities researcher)Discovery event stuart lee (the humanities researcher)
Discovery event stuart lee (the humanities researcher)
 
Open Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the HumanitiesOpen Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the Humanities
 
Sands Fish - Knowing in the Age of Networked Knowledge
Sands Fish - Knowing in the Age of Networked KnowledgeSands Fish - Knowing in the Age of Networked Knowledge
Sands Fish - Knowing in the Age of Networked Knowledge
 
Bionic Info Pro - Taxonomies and Machine Learning SLA 2014
Bionic Info Pro - Taxonomies and Machine Learning SLA 2014Bionic Info Pro - Taxonomies and Machine Learning SLA 2014
Bionic Info Pro - Taxonomies and Machine Learning SLA 2014
 
Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing
 
Harnessing Collective Intelligence for Sustainable Development
Harnessing Collective Intelligence for Sustainable DevelopmentHarnessing Collective Intelligence for Sustainable Development
Harnessing Collective Intelligence for Sustainable Development
 
Rise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen RomboutsRise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen Rombouts
 
Research into Practice case study 2: Library linked data implementations an...
	Research into Practice case study 2:  Library linked data implementations an...	Research into Practice case study 2:  Library linked data implementations an...
Research into Practice case study 2: Library linked data implementations an...
 
Evolution of e-Content Distribution: Ad Hoc to Standardization
Evolution of e-Content Distribution: Ad Hoc to StandardizationEvolution of e-Content Distribution: Ad Hoc to Standardization
Evolution of e-Content Distribution: Ad Hoc to Standardization
 
Research Data Services at the University of Utah
Research Data Services at the University of UtahResearch Data Services at the University of Utah
Research Data Services at the University of Utah
 

Similar to Zarneger "Supporting AI: Best Practices for Content Delivery Platforms"

Cloudera Breakfast: Advanced Analytics Part II: Do More With Your Data
Cloudera Breakfast: Advanced Analytics Part II: Do More With Your DataCloudera Breakfast: Advanced Analytics Part II: Do More With Your Data
Cloudera Breakfast: Advanced Analytics Part II: Do More With Your Data
Cloudera, Inc.
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Dr. Sunil Kr. Pandey
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
Sarah Anna Stewart
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of Publishing
Anita de Waard
 
Database Essay
Database EssayDatabase Essay
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Denodo
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
Kristi Holmes
 
Intro to RDM
Intro to RDMIntro to RDM
Intro to RDM
Sarah Jones
 
Data Management and Horizon 2020
Data Management and Horizon 2020Data Management and Horizon 2020
Data Management and Horizon 2020
Sarah Jones
 
Internet of Things
Internet of ThingsInternet of Things
Internet of Things
Aniekan Akpaffiong
 
Democratizing AI with Apache Spark
Democratizing AI with Apache SparkDemocratizing AI with Apache Spark
Democratizing AI with Apache Spark
Spark Summit
 
Big Data
Big Data Big Data
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
Lee Dirks
 
Data and AI in education
Data and AI in educationData and AI in education
Data and AI in education
Jisc
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madness
semanticsconference
 
Engineering a Data Scientist
Engineering a Data ScientistEngineering a Data Scientist
Engineering a Data Scientist
Aron Ahmadia
 
Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.
Alexandru Iosup
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
PayamBarnaghi
 
BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...
BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...
BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...
Thomas Rones
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
Sri Ambati
 

Similar to Zarneger "Supporting AI: Best Practices for Content Delivery Platforms" (20)

Cloudera Breakfast: Advanced Analytics Part II: Do More With Your Data
Cloudera Breakfast: Advanced Analytics Part II: Do More With Your DataCloudera Breakfast: Advanced Analytics Part II: Do More With Your Data
Cloudera Breakfast: Advanced Analytics Part II: Do More With Your Data
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of Publishing
 
Database Essay
Database EssayDatabase Essay
Database Essay
 
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa...
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
 
Intro to RDM
Intro to RDMIntro to RDM
Intro to RDM
 
Data Management and Horizon 2020
Data Management and Horizon 2020Data Management and Horizon 2020
Data Management and Horizon 2020
 
Internet of Things
Internet of ThingsInternet of Things
Internet of Things
 
Democratizing AI with Apache Spark
Democratizing AI with Apache SparkDemocratizing AI with Apache Spark
Democratizing AI with Apache Spark
 
Big Data
Big Data Big Data
Big Data
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
 
Data and AI in education
Data and AI in educationData and AI in education
Data and AI in education
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madness
 
Engineering a Data Scientist
Engineering a Data ScientistEngineering a Data Scientist
Engineering a Data Scientist
 
Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
 
BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...
BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...
BIAM 410 Final Paper - Beyond the Buzzwords: Big Data, Machine Learning, What...
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
 

More from National Information Standards Organization (NISO)

Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
National Information Standards Organization (NISO)
 
Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"
National Information Standards Organization (NISO)
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
National Information Standards Organization (NISO)
 
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
National Information Standards Organization (NISO)
 
Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
National Information Standards Organization (NISO)
 
Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
National Information Standards Organization (NISO)
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
National Information Standards Organization (NISO)
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
National Information Standards Organization (NISO)
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
National Information Standards Organization (NISO)
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
National Information Standards Organization (NISO)
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
National Information Standards Organization (NISO)
 
Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"
National Information Standards Organization (NISO)
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
National Information Standards Organization (NISO)
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
National Information Standards Organization (NISO)
 

More from National Information Standards Organization (NISO) (20)

Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
 
Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
 
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
 
Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
 
Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
 

Recently uploaded

ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
TechSoup
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Fajar Baskoro
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
haiqairshad
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
adhitya5119
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
Wahiba Chair Training & Consulting
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
iammrhaywood
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
adhitya5119
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Denish Jangid
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
Jyoti Chand
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
Celine George
 
Cognitive Development Adolescence Psychology
Cognitive Development Adolescence PsychologyCognitive Development Adolescence Psychology
Cognitive Development Adolescence Psychology
paigestewart1632
 
Leveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit InnovationLeveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit Innovation
TechSoup
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
Nguyen Thanh Tu Collection
 

Recently uploaded (20)

ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
Cognitive Development Adolescence Psychology
Cognitive Development Adolescence PsychologyCognitive Development Adolescence Psychology
Cognitive Development Adolescence Psychology
 
Leveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit InnovationLeveraging Generative AI to Drive Nonprofit Innovation
Leveraging Generative AI to Drive Nonprofit Innovation
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
 

Zarneger "Supporting AI: Best Practices for Content Delivery Platforms"

  • 1. SILVERCHAIR Supporting AI: Best Practices for Content Delivery Platforms Jake Zarnegar Chief Business Development Officer, Silverchair
  • 3. SILVERCHAIR Our Work in Context Understanding and improving our worldThe Big Loop: Research and Education Understanding and improving research communication and education effectiveness The Little Loop: Publishing We play a key role in multiple learning and improvement loops Hypothesize Do something Generate data Analyze/Interpret data Incorporate learning
  • 4. SILVERCHAIR • It is our responsibility to identify and apply promising new technology advancements to the “loops” we serve. • Examples: • Internet content distribution (!) • Streaming video • Dataset/computational code sharing • Adaptive learning algorithms • Emerging TDM and AI content analysis techniques Embracing the New is Not Optional
  • 5. SILVERCHAIR • New technologies take a while to be “fit to purpose” in particular problem areas • Hat tip to Flavius Chis, my Silverchair colleague (who is a Romanian living in America) who used this chart to help me understand our fundamental difference in orientation to new things. But… Caution is Warranted
  • 6. SILVERCHAIR Much like “the cloud,” “big data,” and “machine learning” before it, the term “artificial intelligence” has been hijacked by marketers and advertising copywriters. If the hype leaves you asking “What is A.I., really?,” don’t worry, you’re not alone. I asked various experts to define the term and got different answers. The only thing they all seem to agree on is that artificial intelligence is a set of technologies that try to imitate or augment human intelligence. To me, the emphasis is on augmentation, in which intelligent software helps us interact and deal with the increasingly digital world we live in. Om Malik, The New Yorker http://www.newyorker.com/business/currency/the-hype-and-hope-of-artificial-intelligence
  • 7. SILVERCHAIR How Do AI Tools Augment Human Intelligence?
  • 9. SILVERCHAIR • Software is far superior to human experts at this level of cognition • Does not require AI methods
  • 11. SILVERCHAIR Knowledge Discovery in Databases: • Traditional knowledge discovery: “Insights are surfaced by highly-educated human specialists finding, reading, and synthesizing the right few pieces of information” • TDM/AI-assisted knowledge discovery: “Insights are surfaced by specialist- directed algorithms that perform higher-order cognitive analyses on thousands, millions, or billions of pieces of information” TDM/AI Methods are New Tools in An Expert’s Toolbox
  • 13. SILVERCHAIR • Online systems were initially designed to help human experts with traditional knowledge discovery: • Filtering/finding relevant content (bring it directly to them or the online venues where they work) • Assessing it quickly (visual display of titles, abstracts, tables/charts, conclusions) • Understanding it deeply (full narrative “story” of the research, visuals, audio/video supplementation, citations to related research, PDFs to take away and read later, etc.) Not Quite…
  • 15. SILVERCHAIR To enable researchers (and publishers) to get the most value out of emerging TDM/AI tools, we need to add and consistently evolve specifically-designed content and data interfaces to our online content platforms. New AI Tools = New Requirements for Content Platforms
  • 16. SILVERCHAIR Two Major Datasets Content • Research papers, datasets, news, whitepapers, education materials, courseware, etc. • “Big Loop” Interactions / Usage Activity • Data generated from activity on the platform (myriad basic and complex measures). • “Little Loop” A modern online platform is a streaming content and analytics server
  • 17. SILVERCHAIR But how do TDM/AI projects actually work?
  • 18. SILVERCHAIR Knowledge Discovery in Databases (KDD) [1] is divided in four main phases: domain exploration, data preparation, data mining, and interpretation of results. 1. The first phase is responsible for understanding the problem and what data will be used in the knowledge discovery process. 2. The next phase selects, cleans, and transforms the data to a format that is suitable for a specific data mining algorithm. 3. In the third phase, the chosen data mining algorithm performs some intelligent techniques to discover patterns that can be of potential use. 4. The last phase is responsible for manipulating the extracted patterns to generate interpretable knowledge for humans…
  • 19. SILVERCHAIR …Most of the research carried out in this area focus on the data mining phase, which uses artificial intelligence algorithms like decision trees, artificial neural networks, evolutionary computation, among others [2] to discover knowledge. On the other hand, the data preparation phase, responsible for integration, cleaning, and transformation of data, has not been the subject of much research. In fact, Pyle [3] argues that “data preparation consumes 60 to 90% of the time needed to mine data – and contributes 75 to 90% to the mining project’s success.” From: Paulo M. Goncalves Jr. and Roberto S. M. Barros, "Automating Data Preprocessing with DMPML and KDDML," 10th IEEE/ACIS International Conference on Computer and Information Science, 2011, DOI: 10.1109/ICIS.2011.23.
  • 20. SILVERCHAIR “I downloaded 2TB of Arxiv content last week but I can’t bring myself to open it and start working on analyzing it because I know I have at least 6 months of painstaking content cleanup & preparation ahead of me before I can begin.” --Mike M., Data Scientist, Fast Forward Labs
  • 21. SILVERCHAIR 5 Platform Best Practices to Support AI Projects
  • 22. SILVERCHAIR • Most projects select content from multiple sources and add/mix with internal proprietary data • Even Facebook ingests third-party data to supplement their massive consumer behavior database. (You’re not Facebook.) • If your data is too hard to extract/work with, you may be bypassed for easier alternatives • Adopting this humble orientation will help you think less about your preferences for format and delivery, and more about shared standards 1. Accept That You Don’t Have Everything They Need
  • 23. SILVERCHAIR • Publishing (especially research publishing) has a huge head start here, with shared XML standards and many universal data structures • Note that JSON format is preferred by many AI content analysis tools and you may want to offer this format as well • It is best if all of your content (including historical) is delivered in one consistent format with consistent use of tagging structures • The title, authors, abstract, where the conclusions are in the paper, etc. 2. Provide Consistent Content/Data Tagging
  • 24. SILVERCHAIR • Just letting bots crawl your site is not a programmatic interface. Web page coding (primarily HTML, JS, CSS) is heavyweight -- built for visual layout and interactions (traditional use case), although it can be enhanced with embedded metadata. • Best practice is to produce a separate programmatic interface that is more lightweight and designed for concurrent, high-volume transactions. 3. Provide a Programmatic Interface Of the Silverchair Platform’s last 500 million sessions, 380 million of them were programmatic.
  • 25. SILVERCHAIR • Commercial Publisher Platforms: https://www.elsevier.com/solutions/sciencedirect/support/api • Aggregators: https://www.dimensions.ai/dimensions-apis/ • Independent Platforms: https://www.silverchair.com/news/silverchair-releases-analysis-and-content- service/ • Rights/Licensing Organizations: https://www.copyright.com/business/xmlformining/ API Examples to Examine
  • 26. SILVERCHAIR • Per our research and interviews, data scientists overwhelming desire to take large amounts of content away and ingest it into their own systems (rather than use your platform as a transactional dependency). • Some APIs limit the amount that can be downloaded/taken away at one time, applying old-school limits to the new techniques • However, protection/vetting is necessary – this is a new type of relationship 4. Embrace Takeout & Delivery (vs. Eat-In)
  • 27. SILVERCHAIR • In a seeming paradox, ”some” of your data can be more valuable than “all” of your data • If you have data hooks and API tools for complex selection and filtering, the AI or data scientist won’t need to develop tools of their own in their project • Recently a licensing agent told me that signed a deal for more $ for a specifically narrowed portion of a dataset than they quoted for the whole dataset (not per object, more $ overall!) 5. Enable Complex Selection & Filtering
  • 29. SILVERCHAIR “…data preparation consumes 60 to 90% of the time needed to mine data – and contributes 75 to 90% to the mining project’s success” From: Paulo M. Goncalves Jr. and Roberto S. M. Barros, "Automating Data Preprocessing with DMPML and KDDML," 10th IEEE/ACIS International Conference on Computer and Information Science, 2011, DOI: 10.1109/ICIS.2011.23. Remember the Business Need
  • 30. SILVERCHAIR • Full-text normalized JSON (or XML) • Separate product/subscription for sale • Enrich it further for complex filtering/selection or sell it “as is” • Separate delivery mechanism (no human interface) but can piggyback on existing content workflows • Accesses a new class of customer w/deep pockets (AI creators or implementers) • Requires new vetting/legal agreements Consider Feeding AI Tools as a New Product
  • 31. SILVERCHAIR Or Create Your Own New Services Using AI • Deliver new on-site content analysis tools for end users using a combination of semantics & new text mining/AI tools (many of which are free to use from Google/Amazon/Microsoft/others) • Experiments, experiments, experiments (recommendation, auto-summarization, analogues, image analysis, sentiment, prediction, etc.)
  • 32. SILVERCHAIR Thank You Jake Zarnegar Chief Business Development Officer, Silverchair

Editor's Notes

  1. Weaknesses Over-filtering Interdisciplinary connections Scalability