Elasticsearch: a key element of Invenio 3
Elasticsearch Meetup
Johnny Mariéthoz
Lausanne, 2017/03/10
About Me
12 years as computer scientist in machine learning
7 years as Invenio developer and instance maintainer
bass and double bass player
newbie as analog camera photographer
Library Network of Western Switzerland 2 Lausanne, 2017/03/10
Library Network of Western Switzerland
RERO: Library Network of Western Switzerland
220 libraries
academic libraries, heritage libraries, public libraries, school libraries or specialized libraries
50’000 students
5 cantons: FR, GE, JU, NE, VS
280’000 registered patrons
3 academic universities: Geneva, Fribourg, Neuchâtel
1 University of Applied Sciences
central office in Martigny
19 employees
Typical Data Centered Web Application
[Diagram. Box labels: Data; Web Server; Data Schema; Persistent Storage; Search Engine; PID Store; External Files; REST API; HTML Web Pages; GUI; Search Engines: Google, etc.; External Services; Access Rights; Files: Download / Preview for users; Other Formats; Browser apps; Desktop apps]
Commonly Needed Features
data management with versioning, validation and PID (Persistent Identifiers)
search engine
rights management (ACL, OAuth)
web page management with templates (search results and others, such as news, front-page, etc.)
url management (routing)
REST API generation
format conversion/migration
CLI utilities
data acquisition (html forms based editor)
Development
modular software architecture
easy new module creation
webassets management for less, sass, nodejs, etc.
asynchronous task management
unit testing and logging
i18n (translations)
web front-end and back-end
and many more...
History
digital library and document repository software
created by CERN
mature platform: first public release v0.0.9 in 2002
open source project
originated in high-energy physics
institutional repository: CERN Document Server
integrated library system: CERN Document Server
disciplinary repository: INSPIRE
open research data server: ZENODO
self-contained Python and MySQL web application until v1.x
transition with v2.x
completely rewritten for v3
Used Technologies
set of Python modules
includes module interaction mechanisms
delivered as a framework around state-of-the-art technologies
steep learning curve
Elasticsearch Integration
SQL only for persistent data, no more SQL queries during the HTTP request
data model using JSON, JSON-Schema and ES Mapping
sorting, facets and query configuration by type of object
uses the official Elasticsearch Python package: elasticsearch-dsl
CLI to create indexes, push mappings and index the data
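The split between persistent storage and search can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for the SQL database, a plain dictionary stands in for an Elasticsearch index, and the function names `create_record` and `search` are hypothetical, not part of Invenio.

```python
import json
import sqlite3

# Stand-in for the SQL database that holds the persistent copy of each record.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, json TEXT)")

# Stand-in for an Elasticsearch index; a real setup would send documents to a cluster.
search_index = {}


def create_record(data):
    """Write path: persist the record in SQL, then push a copy to the index."""
    cursor = db.execute("INSERT INTO records (json) VALUES (?)", (json.dumps(data),))
    db.commit()
    record_id = cursor.lastrowid
    search_index[record_id] = data
    return record_id


def search(term):
    """Read path: answered from the index only, no SQL query involved."""
    term = term.lower()
    return [
        record_id
        for record_id, doc in sorted(search_index.items())
        if term in json.dumps(doc).lower()
    ]


rid = create_record(
    {"album": "A Tribute to Jack Johnson", "artist": "Miles Davis", "year": 1970}
)
hits = search("miles")
```

The point of the sketch is the direction of data flow: writes go to SQL first and are then indexed, while search requests never touch the SQL table.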
JSON-Schema
JSON File
[{
    "album": "A Tribute to Jack Johnson",
    "artist": "Miles Davis",
    "genre": [
        "Jazz"
    ],
    "mime": "audio/mp3",
    "performers": [
        "Miles Davis"
    ],
    "tracks": [
        "Part 1",
        "Part 2"
    ],
    "year": 1970
}, ...]
Schema File
{
    "title": "Music Album",
    "type": "object",
    "properties": {
        "album": {
            "type": "string",
            "minLength": 1
        },
        "mime": {
            "type": "string",
            "minLength": 7,
            "pattern": "^audio/",
            "enum": [
                "audio/flac",
                "audio/mp2",
                "audio/mp3",
                "audio/mp4",
                "audio/vorbis"
            ]
        },
        ...
    },
    "required": ["album", "year"],
    "additionalProperties": false
}
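To make the validation keywords concrete, here is a minimal sketch that checks a record against a reduced version of the schema using only the Python standard library. It is a hand-written stand-in covering a few keywords, not the JSON Schema validator that Invenio relies on. The `year` property definition is an assumption, since the slide elides the remaining properties.

```python
import re

schema = {
    "title": "Music Album",
    "type": "object",
    "properties": {
        "album": {"type": "string", "minLength": 1},
        "mime": {
            "type": "string",
            "minLength": 7,
            "pattern": "^audio/",
            "enum": ["audio/flac", "audio/mp2", "audio/mp3", "audio/mp4", "audio/vorbis"],
        },
        "year": {"type": "integer"},  # assumption: not shown on the slide
    },
    "required": ["album", "year"],
    "additionalProperties": False,
}


def validate(record, schema):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"{name}: required property is missing")
    if schema.get("additionalProperties") is False:
        for name in record:
            if name not in properties:
                errors.append(f"{name}: additional property not allowed")
    for name, rules in properties.items():
        if name not in record:
            continue
        value = record[name]
        if rules.get("type") == "string":
            if not isinstance(value, str):
                errors.append(f"{name}: expected a string")
                continue
            if len(value) < rules.get("minLength", 0):
                errors.append(f"{name}: shorter than minLength")
            if "pattern" in rules and not re.search(rules["pattern"], value):
                errors.append(f"{name}: does not match pattern")
        if rules.get("type") == "integer":
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"{name}: expected an integer")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: not one of the allowed values")
    return errors


ok = validate(
    {"album": "A Tribute to Jack Johnson", "mime": "audio/mp3", "year": 1970}, schema
)
bad = validate({"album": "", "mime": "video/mp4", "label": "Columbia"}, schema)
```

The second record fails on several counts at once: `year` is missing, `label` is not a declared property, `album` is empty, and `mime` neither matches the pattern nor appears in the enumeration.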
ES Mapping
JSON File
[{
    "album": "A Tribute to Jack Johnson",
    "artist": "Miles Davis",
    "genre": [
        "Jazz"
    ],
    "mime": "audio/mp3",
    "performers": [
        "Miles Davis"
    ],
    "tracks": [
        "Part 1",
        "Part 2"
    ],
    "year": 1970
}, ...]
Mapping File
{
    "mappings": {
        "record-v1.0.0": {
            "date_detection": false,
            "numeric_detection": false,
            "properties": {
                "album": {
                    "type": "string",
                    "analyzer": "english",
                    "copy_to": "sort_album"
                },
                "sort_album": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "artist": {
                    "type": "string",
                    "analyzer": "standard",
                    "copy_to": "facet_artist"
                },
                "facet_artist": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                ...
            }
        }
    }
}
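The `copy_to` targets exist so that full-text search can run on the analyzed field while sorting and faceting use the untouched copy. A search request body taking advantage of this could look roughly like the sketch below; the helper name `build_search_body` and the aggregation name `artists` are hypothetical, and the body is only built as a dictionary, not sent to a cluster.

```python
import json


def build_search_body(text):
    """Build an Elasticsearch request body as a plain dictionary."""
    return {
        # Full-text match on the analyzed field.
        "query": {"match": {"album": text}},
        # Sort on the not_analyzed copy so whole titles are compared.
        "sort": [{"sort_album": {"order": "asc"}}],
        # Facet on the not_analyzed copy so "Miles Davis" stays one bucket.
        "aggs": {"artists": {"terms": {"field": "facet_artist", "size": 10}}},
    }


body = build_search_body("tribute")
print(json.dumps(body, indent=2))
```

Aggregating on `artist` directly would, with the standard analyzer, produce one bucket per token (for example `miles` and `davis`), which is why the facet uses `facet_artist` instead.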
Schema and Mapping
JSON-Schema: schemas/records/record-v1.0.0.json
should be included in the data ($schema)
used for data validation
serves as the documentation for humans
the name matters: it yields index_name: records-record-v1.0.0 and document_type: record-v1.0.0 for record indexing
Mapping: mappings/records/record-v1.0.0.json
can be set using a CLI
the name matters: it yields index_name: records-record-v1.0.0 with alias=records and document_type: record-v1.0.0 during index creation
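The naming convention above can be sketched as a small function that derives the alias, index name and document type from a schema or mapping path. This only mirrors the rule as stated on the slide; the function name `names_from_path` is hypothetical and is not Invenio's own implementation.

```python
from pathlib import PurePosixPath


def names_from_path(path):
    """Derive (alias, index_name, document_type) from a schema or mapping path."""
    p = PurePosixPath(path)
    alias = p.parent.name  # directory name, e.g. "records"
    document_type = p.stem  # file name without ".json", e.g. "record-v1.0.0"
    index_name = f"{alias}-{document_type}"
    return alias, index_name, document_type


alias, index_name, document_type = names_from_path(
    "mappings/records/record-v1.0.0.json"
)
```

Because the schema and the mapping share the same relative path, both resolve to the same index and document type.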
Configuration
Facets Configuration
from invenio_records_rest.facets import range_filter, terms_filter

RECORDS_REST_FACETS = dict(
    records=dict(
        aggs={
            'genre': dict(terms=dict(field='genre', size=10)),
            'years': dict(date_histogram=dict(
                field='year',
                interval='year',
                format='yyyy')
            )
        },
        filters=dict(
            genre=terms_filter('genre')
        ),
        post_filters=dict(
            years=range_filter(
                'year',
                format='yyyy',
                end_date_math='/y'),
        )
    )
)
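To show how such a configuration might translate into a request, the sketch below uses a simplified stand-in for a terms filter factory. `make_terms_filter` and `build_body` are hypothetical names for illustration only; they are not the Invenio functions and cover just the `aggs` and `filters` parts.

```python
def make_terms_filter(field):
    """Return a function that turns URL values into an Elasticsearch terms clause."""
    def build(values):
        return {"terms": {field: list(values)}}
    return build


facets = {
    "aggs": {"genre": {"terms": {"field": "genre", "size": 10}}},
    "filters": {"genre": make_terms_filter("genre")},
}


def build_body(url_args):
    """Combine the configured aggregations with filters selected in the URL."""
    clauses = [
        build(url_args[name])
        for name, build in facets["filters"].items()
        if name in url_args
    ]
    return {"query": {"bool": {"filter": clauses}}, "aggs": facets["aggs"]}


body = build_body({"genre": ["Jazz"]})
```

A request without a `genre` argument produces an empty filter list, so the aggregation is still computed over all matching records.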
Configuration
Sorting Configuration
RECORDS_REST_SORT_OPTIONS = dict(
    records=dict(
        bestmatch=dict(
            fields=['-_score'],
            title='Best match',
            default_order='asc',
            order=1,
        ),
        mostrecent=dict(
            fields=['-_created'],
            title='Most recent',
            default_order='asc',
            order=2,
        )
    )
)
RECORDS_REST_DEFAULT_SORT = dict(
    records=dict(query='bestmatch', noquery='mostrecent'),
)
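The sketch below illustrates one way such sort options could be resolved into Elasticsearch sort clauses. It assumes that a leading `-` on a field means descending order and that the `query` default applies when a search string is present; the helper names `to_sort_clause` and `pick_sort` are hypothetical and the logic is a simplification, not Invenio's own.

```python
SORT_OPTIONS = {
    "bestmatch": {"fields": ["-_score"]},
    "mostrecent": {"fields": ["-_created"]},
}
DEFAULT_SORT = {"query": "bestmatch", "noquery": "mostrecent"}


def to_sort_clause(field):
    """Translate a configured field into a sort clause ('-' prefix assumed to mean descending)."""
    if field.startswith("-"):
        return {field[1:]: {"order": "desc"}}
    return {field: {"order": "asc"}}


def pick_sort(query_string, requested=None):
    """Use the requested option if given, else the default for query / no query."""
    key = requested or DEFAULT_SORT["query" if query_string else "noquery"]
    return [to_sort_clause(field) for field in SORT_OPTIONS[key]["fields"]]


with_query = pick_sort("miles davis")
without_query = pick_sort("")
```

With a search string the results are ordered by relevance; when browsing without one, the most recently created records come first.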
Configuration
REST Configuration
from invenio_records_rest.query import es_search_factory
from invenio_search import RecordsSearch

RECORDS_REST_ENDPOINTS = dict(
    recid=dict(
        search_class=RecordsSearch,
        search_index='records',
        search_type=None,
        record_serializers={
            'application/json': ('invenio_records_rest.serializers'
                                 ':json_v1_response'),
        },
        search_serializers={
            'application/json': ('invenio_records_rest.serializers'
                                 ':json_v1_search'),
        },
        search_factory_imp=es_search_factory,
        list_route='/records/',
        item_route='/records/<pid(recid):pid_value>'
    )
)
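From a client's point of view, the two routes above give one URL for searching and one per record. The sketch below builds such URLs with the standard library. The host name is hypothetical, and the `q`, `sort`, `page` and `size` query parameters are assumed to be the ones the search endpoint accepts.

```python
from urllib.parse import urlencode

BASE_URL = "https://repository.example.org"  # hypothetical host


def search_url(query, sort=None, page=1, size=10):
    """URL for the list route ('/records/') with search parameters."""
    params = {"q": query, "page": page, "size": size}
    if sort:
        params["sort"] = sort
    return f"{BASE_URL}/records/?{urlencode(params)}"


def record_url(pid_value):
    """URL for the item route ('/records/<pid_value>')."""
    return f"{BASE_URL}/records/{pid_value}"


listing = search_url("miles davis", sort="mostrecent")
single = record_url(1)
```

A client would request these URLs with an `Accept: application/json` header, matching the serializers registered in the configuration.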
Demo
Conclusion
very generic and flexible tool
great open source community (many thanks to CERN)
easy to prototype and develop new applications and features
takes time to master (learning curve)
at the center of RERO’s future developments
Swiss open access research publications repository (SONAR)
new Integrated Library System (ILS) for public libraries (3-year project)
and many more projects...
References
RERO http://www.rero.ch
Invenio http://invenio-software.org/
Invenio Documentation
http://invenio.readthedocs.io
Elasticsearch https://www.elastic.co
CERN http://home.cern
JSON-LD http://json-ld.org/
JSON Schema http://json-schema.org/
ZENODO https://zenodo.org/
Library Network of Western Switzerland 21 Lausanne, 2017/03/10

More Related Content

What's hot

New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...Stefan Schmunk
 
When the Web of Linked Data Arrives
When the Web of Linked Data ArrivesWhen the Web of Linked Data Arrives
When the Web of Linked Data ArrivesRichard Wallis
 
Towards Integration of Web Data into a coherent Educational Data Graph
Towards Integration of Web Data into a coherent Educational Data GraphTowards Integration of Web Data into a coherent Educational Data Graph
Towards Integration of Web Data into a coherent Educational Data GraphBesnik Fetahu
 
Making social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked dataMaking social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked dataAlbert Meroño-Peñuela
 
2016 05-20-clariah-wp4
2016 05-20-clariah-wp42016 05-20-clariah-wp4
2016 05-20-clariah-wp4CLARIAH
 
BHL-Europe_MINERVA_20111116_hrainer
BHL-Europe_MINERVA_20111116_hrainerBHL-Europe_MINERVA_20111116_hrainer
BHL-Europe_MINERVA_20111116_hrainerHeimo Rainer
 
Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...
Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...
Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...CIGScotland
 
Richard Wallis Linked Data
Richard Wallis Linked DataRichard Wallis Linked Data
Richard Wallis Linked DataIncisive_Events
 
QB'er demonstration
QB'er demonstrationQB'er demonstration
QB'er demonstrationCLARIAH
 
Linked Data for Libraries: Great progress, but what is the benefit?
Linked Data for Libraries:  Great progress, but what is the benefit?Linked Data for Libraries:  Great progress, but what is the benefit?
Linked Data for Libraries: Great progress, but what is the benefit?Richard Wallis
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackMartin Voigt
 
Viaf and isni ifla 2013 08-16
Viaf and isni  ifla 2013 08-16Viaf and isni  ifla 2013 08-16
Viaf and isni ifla 2013 08-16Janifer Gatenby
 
Interactive exploration of complex relational data sets in a web - SemWeb.Pro...
Interactive exploration of complex relational data sets in a web - SemWeb.Pro...Interactive exploration of complex relational data sets in a web - SemWeb.Pro...
Interactive exploration of complex relational data sets in a web - SemWeb.Pro...Logilab
 

What's hot (18)

New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
New tasks, new roles: Libraries in the tension between Digital Humanities, Re...
 
When the Web of Linked Data Arrives
When the Web of Linked Data ArrivesWhen the Web of Linked Data Arrives
When the Web of Linked Data Arrives
 
Ceba geoportail
Ceba geoportailCeba geoportail
Ceba geoportail
 
Towards Integration of Web Data into a coherent Educational Data Graph
Towards Integration of Web Data into a coherent Educational Data GraphTowards Integration of Web Data into a coherent Educational Data Graph
Towards Integration of Web Data into a coherent Educational Data Graph
 
Making social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked dataMaking social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked data
 
2016 05-20-clariah-wp4
2016 05-20-clariah-wp42016 05-20-clariah-wp4
2016 05-20-clariah-wp4
 
BHL-Europe_MINERVA_20111116_hrainer
BHL-Europe_MINERVA_20111116_hrainerBHL-Europe_MINERVA_20111116_hrainer
BHL-Europe_MINERVA_20111116_hrainer
 
Csdh sbg clariah_intr01
Csdh sbg clariah_intr01Csdh sbg clariah_intr01
Csdh sbg clariah_intr01
 
Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...
Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...
Linked Data at BnF : We Made It Happen... Now What? / Mélanie Roche (Nationa...
 
DBPedia-past-present-future
DBPedia-past-present-futureDBPedia-past-present-future
DBPedia-past-present-future
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
Richard Wallis Linked Data
Richard Wallis Linked DataRichard Wallis Linked Data
Richard Wallis Linked Data
 
QB'er demonstration
QB'er demonstrationQB'er demonstration
QB'er demonstration
 
Linked Data for Libraries: Great progress, but what is the benefit?
Linked Data for Libraries:  Great progress, but what is the benefit?Linked Data for Libraries:  Great progress, but what is the benefit?
Linked Data for Libraries: Great progress, but what is the benefit?
 
Linked Data
Linked DataLinked Data
Linked Data
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data Stack
 
Viaf and isni ifla 2013 08-16
Viaf and isni  ifla 2013 08-16Viaf and isni  ifla 2013 08-16
Viaf and isni ifla 2013 08-16
 
Interactive exploration of complex relational data sets in a web - SemWeb.Pro...
Interactive exploration of complex relational data sets in a web - SemWeb.Pro...Interactive exploration of complex relational data sets in a web - SemWeb.Pro...
Interactive exploration of complex relational data sets in a web - SemWeb.Pro...
 

Similar to Elasticsearch: a key element of Invenio 3

A portrait of Europeana as a Linked Open Data case
A portrait of Europeana as a Linked Open Data caseA portrait of Europeana as a Linked Open Data case
A portrait of Europeana as a Linked Open Data caseAntoine Isaac
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...Jennifer Bowen
 
What to Expect of the LSST Archive: The LSST Science Platform
What to Expect of the LSST Archive: The LSST Science PlatformWhat to Expect of the LSST Archive: The LSST Science Platform
What to Expect of the LSST Archive: The LSST Science PlatformMario Juric
 
eNanoMapper database, search tools and templates
eNanoMapper database, search tools and templateseNanoMapper database, search tools and templates
eNanoMapper database, search tools and templatesNina Jeliazkova
 
Semantic Linking & Retrieval for Digital Libraries
Semantic Linking & Retrieval for Digital LibrariesSemantic Linking & Retrieval for Digital Libraries
Semantic Linking & Retrieval for Digital LibrariesStefan Dietze
 
Data Analytics and Visualisation with Tableau
Data Analytics and Visualisation with TableauData Analytics and Visualisation with Tableau
Data Analytics and Visualisation with TableauGergely Szécsényi
 
Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...
Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...
Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...Facultad de Informática UCM
 
Python for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandasPython for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandasWes McKinney
 
Slides 111017220255-phpapp01
Slides 111017220255-phpapp01Slides 111017220255-phpapp01
Slides 111017220255-phpapp01Ken Mwai
 
Maria Patterson - Building a community fountain around your data stream
Maria Patterson - Building a community fountain around your data streamMaria Patterson - Building a community fountain around your data stream
Maria Patterson - Building a community fountain around your data streamPyData
 
Ingredients for Semantic Sensor Networks
Ingredients for Semantic Sensor NetworksIngredients for Semantic Sensor Networks
Ingredients for Semantic Sensor NetworksOscar Corcho
 
Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data VisualizationLaura Po
 
Mapping the European(a) metadata landscape
Mapping the European(a) metadata landscapeMapping the European(a) metadata landscape
Mapping the European(a) metadata landscapeSally Chambers
 
Open Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UKOpen Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UKEDINA, University of Edinburgh
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenVladimir Alexiev, PhD, PMP
 
Building the Mother of All Collections: the future of the National Library's ...
Building the Mother of All Collections: the future of the National Library's ...Building the Mother of All Collections: the future of the National Library's ...
Building the Mother of All Collections: the future of the National Library's ...wcathro
 
VRA_2015_CatalogingRoundup_Seneff
VRA_2015_CatalogingRoundup_SeneffVRA_2015_CatalogingRoundup_Seneff
VRA_2015_CatalogingRoundup_SeneffHeather Seneff
 
lodlam summit session browsable linked data
lodlam summit session browsable linked datalodlam summit session browsable linked data
lodlam summit session browsable linked dataEnno Meijers
 

Similar to Elasticsearch: a key element of Invenio 3 (20)

A portrait of Europeana as a Linked Open Data case
A portrait of Europeana as a Linked Open Data caseA portrait of Europeana as a Linked Open Data case
A portrait of Europeana as a Linked Open Data case
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
 
What to Expect of the LSST Archive: The LSST Science Platform
What to Expect of the LSST Archive: The LSST Science PlatformWhat to Expect of the LSST Archive: The LSST Science Platform
What to Expect of the LSST Archive: The LSST Science Platform
 
NESSTAR: Preparing, viewing, analyzing, downloading
NESSTAR: Preparing, viewing, analyzing, downloadingNESSTAR: Preparing, viewing, analyzing, downloading
NESSTAR: Preparing, viewing, analyzing, downloading
 
eNanoMapper database, search tools and templates
eNanoMapper database, search tools and templateseNanoMapper database, search tools and templates
eNanoMapper database, search tools and templates
 
Semantic Linking & Retrieval for Digital Libraries
Semantic Linking & Retrieval for Digital LibrariesSemantic Linking & Retrieval for Digital Libraries
Semantic Linking & Retrieval for Digital Libraries
 
Data Analytics and Visualisation with Tableau
Data Analytics and Visualisation with TableauData Analytics and Visualisation with Tableau
Data Analytics and Visualisation with Tableau
 
Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...
Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...
Analyzing large multimedia collections in an urban context - Prof. Marcel Wor...
 
Python for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandasPython for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandas
 
Slides 111017220255-phpapp01
Slides 111017220255-phpapp01Slides 111017220255-phpapp01
Slides 111017220255-phpapp01
 
Maria Patterson - Building a community fountain around your data stream
Maria Patterson - Building a community fountain around your data streamMaria Patterson - Building a community fountain around your data stream
Maria Patterson - Building a community fountain around your data stream
 
Ingredients for Semantic Sensor Networks
Ingredients for Semantic Sensor NetworksIngredients for Semantic Sensor Networks
Ingredients for Semantic Sensor Networks
 
Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data Visualization
 
Mapping the European(a) metadata landscape
Mapping the European(a) metadata landscapeMapping the European(a) metadata landscape
Mapping the European(a) metadata landscape
 
Open Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UKOpen Repositories and Interoperability Challenges in UK
Open Repositories and Interoperability Challenges in UK
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
 
Sylva
SylvaSylva
Sylva
 
Building the Mother of All Collections: the future of the National Library's ...
Building the Mother of All Collections: the future of the National Library's ...Building the Mother of All Collections: the future of the National Library's ...
Building the Mother of All Collections: the future of the National Library's ...
 
VRA_2015_CatalogingRoundup_Seneff
VRA_2015_CatalogingRoundup_SeneffVRA_2015_CatalogingRoundup_Seneff
VRA_2015_CatalogingRoundup_Seneff
 
lodlam summit session browsable linked data
lodlam summit session browsable linked datalodlam summit session browsable linked data
lodlam summit session browsable linked data
 

Recently uploaded

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Recently uploaded (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Elasticsearch: a key element of Invenio 3

  • 1. Elasticsearch: a key element of Invenio 3 Elasticsearch Meetup Johnny Mariéthoz Lausanne, 2017/03/10
  • 2. About Me 12 years as computer scientist in machine learning 7 years as Invenio developer and instance maintainer bass and double bass player newbie as analog camera photographer Library Network of Western Switzerland 2 Lausanne, 2017/03/10
  • 3. Library Network of Western Switzerland Library Network of Western Switzerland 3 Lausanne, 2017/03/10
  • 4. RERO: Library Network of Western Switzerland 220 libraries academic libraries, heritage libraries, public libraries, school libraries or specialized libraries 50’000 students 5 cantons: FR, GE, JU, NE, VS 280’000 registered patrons 3 academic universities Geneva, Fribourg, Neuchâtel 1 University of Applied Sciences central office in Martigny 19 employees Library Network of Western Switzerland 4 Lausanne, 2017/03/10
  • 5. Typical Data Centered Web Application Data Web Server Data Schema Persistant Storage Search Engine PID Store External Files REST API HTML WEB Pages GUI Search Engines: Google, etc. External Services Access Rights Files Download / Preview for users Other Formats Browser apps Desktop apps Library Network of Western Switzerland 5 Lausanne, 2017/03/10
  • 6. Common Needed Features: data management with versioning, validation and PID (Persistent Identifiers); search engine; rights management (ACL, OAuth); web page management with templates (search results and others, such as news, front page, etc.); URL management (routing); REST API generation; format conversion/migration; CLI utilities; data acquisition (HTML-form-based editor)
  • 7. Development: modular software architecture; easy new module creation; webassets management for less, sass, nodejs, etc.; asynchronous task management; unit testing and logging; i18n (translations); web front-end and back-end; and many more...
  • 8. Library Network of Western Switzerland 8 Lausanne, 2017/03/10
  • 9. History: digital library and document repository software created by CERN; mature platform (first public release v0.0.9 in 2002); open source project originating in high-energy physics; institutional repository: CERN Document Server; integrated library system: CERN Document Server; disciplinary repository: INSPIRE; open research data server: ZENODO; self-contained Python/MySQL web application until 1.x; transition with v2.x; completely rewritten v3
  • 10. Used Technologies: set of Python modules; includes module interaction mechanisms; delivered as a framework around state-of-the-art technologies; steep learning curve
  • 11. +
  • 12. Elasticsearch Integration: SQL only for persistent data, no more SQL queries during the HTTP request; data model using JSON, JSON-Schema and ES Mapping; sorting, facets and query configuration by type of object; uses the official Elasticsearch Python package elasticsearch-dsl; CLI to create indexes, push mappings and index the data
  • 13. JSON-Schema
      JSON File:
        [{
          "album": "A Tribute to Jack Johnson",
          "artist": "Miles Davis",
          "genre": ["Jazz"],
          "mime": "audio/mp3",
          "performers": ["Miles Davis"],
          "tracks": ["Part 1", "Part 2"],
          "year": 1970
        }, ...]
      Schema File:
        {
          "title": "Music Album",
          "type": "object",
          "properties": {
            "album": {"type": "string", "minLength": 1},
            "mime": {
              "type": "string",
              "minLength": 7,
              "pattern": "^audio/",
              "enum": ["audio/flac", "audio/mp2", "audio/mp3", "audio/mp4", "audio/vorbis"]
            },
            ...
          },
          "required": ["album", "year"],
          "additionalProperties": false
        }
  • 14. ES Mapping
      JSON File:
        [{
          "album": "A Tribute to Jack Johnson",
          "artist": "Miles Davis",
          "genre": ["Jazz"],
          "mime": "audio/mp3",
          "performers": ["Miles Davis"],
          "tracks": ["Part 1", "Part 2"],
          "year": 1970
        }, ...]
      Mapping File:
        {
          "mappings": {
            "record-v1.0.0": {
              "date_detection": false,
              "numeric_detection": false,
              "properties": {
                "album": {"type": "string", "analyzer": "english", "copy_to": "sort_album"},
                "sort_album": {"type": "string", "index": "not_analyzed"},
                "artist": {"type": "string", "analyzer": "standard", "copy_to": "facet_artist"},
                "facet_artist": {"type": "string", "index": "not_analyzed"},
                ...
              }
            }
          }
        }
  • 15. Schema and Mapping. JSON-Schema (schemas/records/record-v1.0.0.json): should be referenced in the data ($schema); is used for data validation; serves as the documentation for humans; its name matters for record indexing, e.g. index_name: records-record-v1.0.0, document_type: record-v1.0.0. Mapping (mappings/records/record-v1.0.0.json): can be set using a CLI; its name matters during index creation, e.g. index_name: records-record-v1.0.0 with alias=records, document_type: record-v1.0.0
  • 16. Configuration: Facets Configuration
        # Filter factories provided by invenio-records-rest.
        from invenio_records_rest.facets import range_filter, terms_filter

        RECORDS_REST_FACETS = dict(
            records=dict(
                aggs={
                    'genre': dict(terms=dict(field='genre', size=10)),
                    'years': dict(date_histogram=dict(
                        field='year', interval='year', format='yyyy')),
                },
                filters=dict(
                    genre=terms_filter('genre'),
                ),
                post_filters=dict(
                    years=range_filter(
                        'year', format='yyyy', end_date_math='/y'),
                ),
            )
        )
  • 17. Configuration: Sorting Configuration
        RECORDS_REST_SORT_OPTIONS = dict(
            records=dict(
                bestmatch=dict(
                    fields=['-_score'],
                    title='Best match',
                    default_order='asc',
                    order=1,
                ),
                mostrecent=dict(
                    fields=['-_created'],
                    title='Most recent',
                    default_order='asc',
                    order=2,
                ),
            )
        )

        RECORDS_REST_DEFAULT_SORT = dict(
            records=dict(query='bestmatch', noquery='mostrecent'),
        )
  • 18. Configuration: REST Configuration
        # Search class and default search factory used by the endpoint.
        from invenio_records_rest.query import es_search_factory
        from invenio_search import RecordsSearch

        RECORDS_REST_ENDPOINTS = dict(
            recid=dict(
                search_class=RecordsSearch,
                search_index='records',
                search_type=None,
                record_serializers={
                    'application/json': ('invenio_records_rest.serializers'
                                         ':json_v1_response'),
                },
                search_serializers={
                    'application/json': ('invenio_records_rest.serializers'
                                         ':json_v1_search'),
                },
                search_factory_imp=es_search_factory,
                list_route='/records/',
                item_route='/records/<pid(recid):pid_value>',
            )
        )
  • 19. Demo
  • 20. Conclusion: very generic and flexible tool; great open source community (many thanks to CERN); easy to prototype and develop new applications and features; demands time to master (learning curve); at the center of RERO’s future developments: Swiss open access research publications repository (SONAR), new Integrated Library System (ILS) for public libraries (3-year project), and many more projects...
  • 21. References: RERO http://www.rero.ch; Invenio http://invenio-software.org/; Invenio Documentation http://invenio.readthedocs.io; Elasticsearch https://www.elastic.co; CERN http://home.cern; JSON-LD http://json-ld.org/; JSON Schema http://json-schema.org/; ZENODO https://zenodo.org/
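
Slide 13 shows a schema constraining the album records. As a rough illustration of what those keywords check, here is a minimal hand-rolled validator in plain Python. It is only a sketch: it covers just the keywords visible on the slide (type, minLength, pattern, enum, required, additionalProperties), the `validate` function is hypothetical, and the "year" property is an assumption because the slide elides the remaining properties. A real deployment would presumably rely on a full JSON Schema library instead.

```python
import re

# Schema from slide 13. The "year" entry is an assumption added for this
# sketch: the slide replaces the remaining properties with "...".
SCHEMA = {
    "title": "Music Album",
    "type": "object",
    "properties": {
        "album": {"type": "string", "minLength": 1},
        "mime": {
            "type": "string",
            "minLength": 7,
            "pattern": "^audio/",
            "enum": ["audio/flac", "audio/mp2", "audio/mp3",
                     "audio/mp4", "audio/vorbis"],
        },
        "year": {"type": "integer"},
    },
    "required": ["album", "year"],
    "additionalProperties": False,
}

# Only the two JSON types needed by this sketch.
TYPES = {"string": str, "integer": int}


def validate(record, schema):
    """Return a list of error messages (empty when the record is valid)."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"{name}: required property is missing")
    if schema.get("additionalProperties") is False:
        for name in record:
            if name not in properties:
                errors.append(f"{name}: additional property not allowed")
    for name, rules in properties.items():
        if name not in record:
            continue
        value = record[name]
        expected = TYPES.get(rules.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected is not None and (
            not isinstance(value, expected) or isinstance(value, bool)
        ):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if isinstance(value, str):
            if len(value) < rules.get("minLength", 0):
                errors.append(f"{name}: shorter than minLength")
            if "pattern" in rules and not re.search(rules["pattern"], value):
                errors.append(f"{name}: does not match pattern")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: not one of the allowed values")
    return errors


good = {"album": "A Tribute to Jack Johnson", "mime": "audio/mp3", "year": 1970}
bad = {"album": "", "mime": "video/mp4"}
print(validate(good, SCHEMA))  # no errors expected
for message in validate(bad, SCHEMA):
    print(message)
```

Because "additionalProperties" is false, any field that is not declared under "properties" is rejected, which is why the full record from the slide only validates once every one of its fields is declared in the schema.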
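
Slide 15 stresses that file names matter because the index name, alias and document type follow from them. The sketch below spells out that naming rule in Python, using only the example paths quoted on the slide. The `index_names` helper is hypothetical and is not part of Invenio; it merely assumes the rule is "parent directory plus file name without the .json extension".

```python
from pathlib import PurePosixPath


def index_names(path: str) -> dict:
    """Derive alias, document type and index name from a schema/mapping path.

    Hypothetical helper: it only mirrors the naming example on slide 15.
    """
    parsed = PurePosixPath(path)
    alias = parsed.parent.name   # "records"
    document_type = parsed.stem  # "record-v1.0.0" (drops ".json")
    return {
        "alias": alias,
        "document_type": document_type,
        "index_name": f"{alias}-{document_type}",
    }


names = index_names("schemas/records/record-v1.0.0.json")
print(names["index_name"])     # records-record-v1.0.0
print(names["document_type"])  # record-v1.0.0
print(names["alias"])          # records
```

The same function applied to mappings/records/record-v1.0.0.json yields identical names, which is the point of the convention: a schema and its mapping are paired through their shared path.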
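
The sort options on slide 17 name fields such as '-_score'. One plausible reading, assumed here rather than confirmed by the slides, is that a leading '-' flips the direction of that key relative to the requested order. The hypothetical `to_es_sort` helper below shows how such keys could be turned into the sort clauses of an Elasticsearch request body.

```python
def to_es_sort(fields, ascending=True):
    """Translate sort keys such as '-_score' into Elasticsearch sort clauses.

    Assumed convention (not stated on the slides): a leading '-' inverts
    the direction of that key relative to the requested order.
    """
    clauses = []
    for field in fields:
        key_ascending = ascending
        if field.startswith("-"):
            field = field[1:]
            key_ascending = not key_ascending
        clauses.append({field: {"order": "asc" if key_ascending else "desc"}})
    return clauses


# Trimmed copy of the options from slide 17 (titles and ordering omitted).
SORT_OPTIONS = {
    "bestmatch": {"fields": ["-_score"], "default_order": "asc"},
    "mostrecent": {"fields": ["-_created"], "default_order": "asc"},
}

option = SORT_OPTIONS["bestmatch"]
body = {"sort": to_es_sort(option["fields"], option["default_order"] == "asc")}
print(body)  # {'sort': [{'_score': {'order': 'desc'}}]}
```

Under this reading, 'Best match' with default_order='asc' sorts by descending score, so the most relevant records come first, and 'Most recent' lists the newest records first.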