YOUR IDEAS, OUR DATA
Towards a public data service at the National
Library of France
The Approach: Building a Portal
for our Service Portfolio
Opened 2017-11-23
http://api.bnf.fr
Different datasets,
Different ways to use them
One portal to bind them all
Built for the 2nd BnF Hackathon
Formats: XML, JSON, RDF
Data available through dumps
and APIs
Now: robust services for data
reuse (unlimited use)
2018:
- Towards a space for code
and data sharing
- Towards an API manager
Metadata from the BnF
Mission: collect, preserve and describe all publications disseminated
on the French territory since 1537
13.8 M publications
Books and serials
Movies
Albums
Music scores
Video games and software
Maps
Photographs and sketches
Beyond publications: authors, subjects, places related
to our collections
~2.1M authors, ~650k works, ~120k places, ~190k topics
(~CC-BY)
Described documents
(metadata)
Free
Digital documents
(metadata + data)
An Example for Digital Humanists
Gender studies: evaluate the proportion of women
writers for fiction
Pierre-Carl Langlais
http://scoms.hypotheses.org
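
As a rough illustration of the kind of analysis this slide describes, the sketch below computes the share of women authors per decade from a list of bibliographic records. The record layout (`year`, `gender`) and the sample values are assumptions made up for this example; they are not BnF data and do not reproduce the method used in the cited study.

```python
from collections import defaultdict


def share_of_women_by_decade(records):
    """Return {decade: share of records whose author is tagged 'female'}."""
    totals = defaultdict(int)
    women = defaultdict(int)
    for rec in records:
        decade = (rec["year"] // 10) * 10  # e.g. 1837 -> 1830
        totals[decade] += 1
        if rec["gender"] == "female":
            women[decade] += 1
    return {d: women[d] / totals[d] for d in sorted(totals)}


# Made-up records for illustration only; not BnF data.
sample = [
    {"year": 1831, "gender": "female"},
    {"year": 1835, "gender": "male"},
    {"year": 1838, "gender": "male"},
    {"year": 1842, "gender": "female"},
    {"year": 1847, "gender": "female"},
]

result = share_of_women_by_decade(sample)
print(result)
```

In practice the records would come from a metadata dump or API response, and the gender field would need to be derived from the author authority data.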
An Example for Literature Teachers
Promote women writers: find contemporaries of Pierre de Ronsard
https://resultats.hypotheses.org/1048
http://george2etexte.free.fr
And more…
Book commenting
https://www.babelio.com
Went to a concert? Find recordings, musical
scores, and press reviews of the work
http://auconcert.adimasc.io/#/work/13920002
Data from the Digital Library
~4.3M publications in Gallica: books, serials, manuscripts, maps,
images, music scores, audio media, video, ebooks…
Published with a wide range of APIs giving access to all facets
of digital objects through standardized protocols
Searching in digital data stores: OAI-PMH, SRU, SPARQL…
Accessing content: images (IIIF), text, OCR, ToC…
For professionals, researchers, digital humanists, makers…
http://gallica.bnf.fr
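
To make the search protocols above more concrete, here is a minimal sketch of how an SRU `searchRetrieve` request URL can be assembled. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) come from the SRU standard; the endpoint address and the `dc.title` index in the query are assumptions not stated in these slides and should be checked against the documentation on http://api.bnf.fr.

```python
from urllib.parse import urlencode

# Assumption: address of the SRU endpoint; not given in the slides.
SRU_BASE = "https://gallica.bnf.fr/SRU"


def sru_search_url(cql_query, start=1, maximum=10, base=SRU_BASE):
    """Build an SRU searchRetrieve URL for a CQL query (no request is sent)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "startRecord": start,       # 1-based index of the first record
        "maximumRecords": maximum,  # page size
    }
    return base + "?" + urlencode(params)


# The index name dc.title is an assumption used for illustration.
url = sru_search_url('dc.title all "journal des débats"', maximum=5)
print(url)
```

The function only builds the URL, so it can be tried offline; fetching and parsing the XML response is left to whichever HTTP and XML libraries are preferred.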
Examples: Digital Humanities
Mining the Heritage Newspapers Collection
for Research purposes
• Topic modeling
• Article recognition
Text & OCR API
Le Journal des débats politiques et littéraires, July 27, 1837
Pierre-Carl Langlais, http://scoms.hypotheses.org/
Examples: Digital Humanities
Mining the Heritage Newspapers Collection
for Research purposes
• Natural language processing
(e.g. named-entity
recognition, text virality)
Text & OCR API
Virality in French and Canadian newspapers, 19th c.
Pierre-Carl Langlais, http://scoms.hypotheses.org/
Examples: Digital Humanities
• Topic Modeling
• Article Recognition
• Natural Language Processing
• Data Mining
Data mining (e.g. looking
for illustrations)
OCR & IIIF APIs
Mining the Heritage Newspapers Collection
for Research purposes
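
Once OCR has located an illustration on a page, the IIIF Image API can return just that region. The sketch below builds such a URL following the IIIF pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The base URL and the identifier are assumptions for illustration (the identifier is a placeholder, not a real document), and the coordinates are invented.

```python
# Assumption: base URL of the IIIF image service; not stated in the slides.
IIIF_BASE = "https://gallica.bnf.fr/iiif"


def crop_region(x, y, width, height):
    """Format a pixel bounding box as an IIIF region string 'x,y,w,h'."""
    return f"{x},{y},{width},{height}"


def iiif_image_url(identifier, region="full", size="full", rotation=0,
                   quality="default", fmt="jpg", base=IIIF_BASE):
    """Build an IIIF Image API URL (no request is sent).

    Services implementing Image API 1.x expect quality="native"
    instead of "default".
    """
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Placeholder identifier and made-up bounding box, for illustration only.
page_id = "ark:/12148/EXAMPLE/f1"
url = iiif_image_url(page_id, region=crop_region(100, 200, 800, 600))
print(url)
```

Keeping the region formatting separate from the URL builder makes it easy to feed in bounding boxes taken from OCR output.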
Examples: Apps
Mining Gallica Contents to Build New Apps
• Hackathon 2016: GallicaCarte
(geo-location of documents)
All APIs
http://gallicastudio.bnf.fr
Examples: Apps
Mining Gallica Contents to Build New Apps
• Hackathon 2017: MusiViz
(dataviz of music pieces)
All APIs
http://gallicastudio.bnf.fr
Examples: Apps
Mining Gallica Contents to Build New Apps
• Deep learning approach
for illustration
classification and CBIR
(Content-Based Image
Retrieval)
• Hybrid search
(metadata + text + CBIR)
All APIs
http://gallicastudio.bnf.fr
Help us Enhance our Services!
Contacts available on
http://api.bnf.fr
sebastien.peyrard at bnf.fr
jean-philippe.moreux at bnf.fr
stephane.pillorget at bnf.fr

APIdays 2018 BnF API projects
