B2SHARE REST API

www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
B2SHARE REST API hands-on
Hans van Piggelen, hans.vanpiggelen@surfsara.nl
Thursday July 6th, 2017
This work is licensed under the Creative
Commons CC-BY 4.0 licence

Today’s hands-on
Theory: (30 min)
The B2SHARE REST API
B2SHARE concepts, variables, metadata schemas
API: making requests, authentication and payloads
Publication workflow
Practice: (1 hour)
Simple examples
Hands-on exercises
Next hands-on: B2FIND

What is the B2SHARE REST API?
The B2SHARE REST API is a set of instructions to interact
with a B2SHARE service instance
The B2SHARE REST API:
Allows direct remote interaction with the B2SHARE service
without using a graphical user interface
Allows integration within application or data processing
workflows for automation of publishing tasks
Supports any programming language or operating system
that supports HTTP requests
There is (almost) no limitation on usage and number of calls,
even for unregistered users
To create new or modify existing content, registration is
required

Why use the B2SHARE REST API?
Using the B2SHARE API you can automate:
Creation of records
Uploading of files
Adding/changing metadata
Changing publication state
Retrieval of object data
Enable:
Precise replication of metadata into repository
Large file uploads
Implement publishing in your own workflow or application
Direct data ingest
Ease administration and overview of your records

What can I do with the B2SHARE API?
List (all) existing records and communities
Search for specific records and communities
Retrieval of community-specific information
Including community metadata schemas
Create new draft records
Upload files to draft records
Add metadata using metadata schemas
Publish draft records
Modification of the metadata of existing records
More to be added in future releases…

Important concepts of B2SHARE
Records:
Contain data files and associated metadata
Connected to a community which possibly maintains it
Metadata:
Set of common fixed metadata fields and custom metadata
blocks with additional fields
Governed by fixed and community metadata schemas
Communities:
Curate datasets which are part of the scientific domain or a
research project
Maintain their own metadata schemas and have community
administrators
States:
Current condition of a record, either draft, submitted or published
Can be changed through the API

B2SHARE request variables
B2SHARE defines several request variables that
function as identifiers for objects in B2SHARE
Used in most HTTP request addresses as part of the
path to access specific objects directly
Most important variables:
COMMUNITY_ID: identifier of a user community
RECORD_ID: identifier for a specific record, in any
state
FILE_BUCKET_ID: identifier for a set of files of a
specific record
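As a rough illustration of how these variables end up in request paths, the sketch below fills them into the endpoint templates used in the exercises of this hands-on. The helper name `endpoint`, the `TEMPLATES` table and the placeholder identifiers are assumptions made for this example only.

```python
# Endpoint templates copied from the exercises in this hands-on.
API_ROOT = "https://trng-b2share.eudat.eu/api"

TEMPLATES = {
    "community_schema": "/communities/{COMMUNITY_ID}/schemas/last",
    "record": "/records/{RECORD_ID}",
    "draft": "/records/{RECORD_ID}/draft",
    "file": "/files/{FILE_BUCKET_ID}/{FILE_KEY}",
}


def endpoint(name, **variables):
    """Fill the request variables into the named template (hypothetical helper)."""
    return API_ROOT + TEMPLATES[name].format(**variables)


# "abc123" is a placeholder, not a real record identifier.
print(endpoint("draft", RECORD_ID="abc123"))
```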

Draft records and versioning
Draft records:
Can be updated with new
files and metadata
Have publication state ‘draft’
Published records:
Cannot have files updated
anymore
Metadata updates possible
but discouraged
B2SHARE supports
versioning of records:
Existing published records
can be updated through their
draft counterpart
Creates new PIDs, bucket
IDs, links
Diagram: Draft (✔ new files, ✔ new records) -> Published (✗ new files, ✗ new records) -> Draft

Community metadata schemas
Every community defines its own metadata schema
using a hierarchical JSON Schema-based structure
Metadata schemas:
Contain descriptions, vocabularies and expected
structure and format of every metadata field,
including optional fields
May contain community-specific fields
Definitions are publicly available, e.g. EUDAT:

Example metadata field definitions

Making a request
Requests can be made by using a specific application
directly or by using a programming language that
supports making requests in code
Example applications:
GUI: any file transfer application or web interface
Command line: cURL, wget
Almost all programming languages support making
requests over HTTP
For more complex operations (like publication), a
dedicated interface or command line application is more
useful

HTTP requests
A specific call to a service through an API using a method
with address and parameters
An address is a URL:
URL = protocol + hostname + port + path
Protocol: always http:// or https://
Hostname: base address, e.g. b2share.eudat.eu
Port: sometimes required specifically, usually 80 or 443
Path: endpoint specification, e.g. /api/record/1
Optional parameters are additional options given to the
request and are added to the URL
On success the current state of the requested piece of
information is provided
The B2SHARE service accepts payloads along with an HTTP
request, e.g. text, binary data, files
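The URL anatomy above can be checked with Python's standard library. This is only a sketch: `build_url` is a hypothetical helper, and the explicit port in the sample URL is added purely to show where it sits.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Take a URL apart into the components named above.
url = "https://trng-b2share.eudat.eu:443/api/records?q=community:COMMUNITY_ID"
parts = urlsplit(url)
print(parts.scheme)    # protocol
print(parts.hostname)  # hostname
print(parts.port)      # port
print(parts.path)      # path
print(parts.query)     # optional parameters


def build_url(hostname, path, params=None, protocol="https", port=None):
    """Compose protocol + hostname + port + path, plus optional parameters."""
    netloc = hostname if port is None else "{}:{}".format(hostname, port)
    return urlunsplit((protocol, netloc, path, urlencode(params or {}), ""))


print(build_url("b2share.eudat.eu", "/api/records", {"page": 2}))
```

Note that `urlencode` percent-encodes reserved characters such as `:` in parameter values; a server decodes them again on receipt.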

HTTP request method
Different methods have different meanings, but it is up to
the service how to process them
Common methods:
GET: requests data from a specified resource
POST: submits data to be processed to a specified
resource
PUT: uploads a representation of the specified URI
PATCH: modify state of specified resource
DELETE: delete a specified resource

HTTP responses
Every request returns a status, header and message body, even
when an error occurred
Status line: status code and reason
Header: information on body content
Body: actual response text
Status codes:
1xx: Informational – Request received, continuing process
2xx: Success – Action was successfully received, understood,
and accepted
3xx: Redirection – Further action must be taken in order to
complete the request
4xx: Client Error – Request contains bad syntax or cannot be
fulfilled
5xx: Server Error – Server failed to fulfill an apparently valid
request
HTTP response is pure text, needs interpretation
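A small sketch of what "interpretation" can look like in Python: mapping a status code to its class and parsing a JSON body. The function name `status_class` and the sample body are made up for this illustration and are not B2SHARE output.

```python
import json

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a numeric status code to its class name from the list above."""
    if not 100 <= code <= 599:
        raise ValueError("not an HTTP status code: {}".format(code))
    return STATUS_CLASSES[code // 100]


# The body arrives as text; a JSON body must be parsed before use.
body_text = '{"message": "example body, not real B2SHARE output"}'
body = json.loads(body_text)
print(status_class(200), body["message"])
```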

HTTP request overview
Client (browser, CLI tool, or your app or workflow) sends an HTTP request to the server:
- Request method
- Header
- URL & parameters
- Authentication
- Payloads
Server returns an HTTP response to the client:
- Header
- Status code
- Response text

Authentication through the API
B2SHARE does not accept username and password
combinations; use tokens for authentication instead!
B2SHARE contains open and restricted data:
Public: all published records and metadata, most files
No access token required
Private: your draft records and files in private records
Only accessible using your access token as parameter
in HTTP request
Access tokens:
Automatically generated unique string of characters
attached to your account in B2SHARE
Only known by the owner, do not share with others!
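To make "access token as parameter" concrete, the sketch below appends the token to a request URL. The parameter name `access_token` is the one used in the examples of this hands-on; the helper `with_token` and the token value are placeholders for illustration.

```python
from urllib.parse import urlencode


def with_token(url, token):
    """Append the access token as a URL parameter (hypothetical helper)."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode({"access_token": token})


# Placeholder token; never print or commit a real one.
print(with_token("https://trng-b2share.eudat.eu/api/records", "PLACEHOLDER"))
```

With the requests package the same effect is usually obtained by passing `params={'access_token': token}` to the request call, as in the POST example of this hands-on.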

Full B2SHARE publication workflow
Publishing in B2SHARE using the API involves multiple
steps:
Identify a target community for your data
Retrieve the metadata schema definition of the
community
The submitted metadata will have to conform to
this schema
Create a draft record:
Upload files into the draft record (one by one)
Add metadata according to schema (possibly in
multiple steps)
Publish the record

Full publication workflow diagram
Create draft: POST /api/records -> draft record
Draft record: GET /api/records/RECORD_ID/draft
Add metadata: PATCH /api/records/RECORD_ID/draft
Add files: PUT /api/files/FILE_BUCKET_ID/FILE_NAME (checksum)
Commit: PATCH /api/records/RECORD_ID/draft
Needs approval?
NO -> published record (PID)
YES -> submitted record -> community approves -> published record (PID)

Adding metadata
Metadata is added to a record using the API:
Upon creation of a draft record in a POST request
Or by providing so-called JSON patches in a PATCH
request
Patches modify the current state of the metadata by either
changing, adding or removing fields and values
The structure of the data provided in the patch request must
strictly follow the
metadata schema of the
community
As many patches as necessary
can be applied before
publishing your draft record

Adding files
Files can be added during the draft phase of your new
record using a PUT request
Files are uploaded into the file bucket of the draft
record, not the draft record itself
Use the file bucket ID found in the metadata of the
draft record
Files are uploaded one-by-one in separate requests
As many file upload requests
as necessary can be made
before publishing your draft
record
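A minimal sketch of a single file upload, assuming the requests package is installed. The header values and the helper names `upload_url` and `upload_file` are assumptions for this example; the endpoint and the `access_token` parameter come from this hands-on.

```python
from urllib.parse import quote

API_ROOT = "https://trng-b2share.eudat.eu/api"


def upload_url(file_bucket_id, file_name):
    """Build the PUT address inside the draft record's file bucket."""
    return "{}/files/{}/{}".format(API_ROOT, file_bucket_id, quote(file_name))


def upload_file(file_bucket_id, path, file_name, token):
    """Send one file in one PUT request and return the response."""
    import requests  # third-party; imported here so upload_url works without it

    # Assumed header values for a binary upload.
    header = {"Accept": "application/json",
              "Content-Type": "application/octet-stream"}
    with open(path, "rb") as handle:
        return requests.put(upload_url(file_bucket_id, file_name),
                            data=handle,
                            params={"access_token": token},
                            headers=header)


# Placeholder bucket ID and file name.
print(upload_url("BUCKET_ID", "my image.png"))
```

Calling `upload_file` once per file matches the one-by-one upload described above.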

Publishing your draft record
Draft records are published by altering the value of the
publication state in the metadata
Once your record is published:
The included files cannot be changed anymore!
No new files can be added!
Metadata can be changed after publication, but will
create a new version of your published record
Persistent identifiers are automatically added on
commit
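The commit step can be sketched as a one-operation JSON patch sent with a PATCH request. The field name `publication_state`, the value `submitted` and the `application/json-patch+json` content type are assumptions not stated on these slides; check them against the B2SHARE documentation before relying on them.

```python
import json

API_ROOT = "https://trng-b2share.eudat.eu/api"


def publish_patch():
    """Serialized JSON patch that changes the publication state.

    Field name and value are assumptions, see the note above.
    """
    return json.dumps([
        {"op": "add", "path": "/publication_state", "value": "submitted"},
    ])


def publish(record_id, token):
    """Send the patch to the draft endpoint and return the response."""
    import requests  # third-party; imported here so publish_patch works without it

    url = "{}/records/{}/draft".format(API_ROOT, record_id)
    header = {"Content-Type": "application/json-patch+json"}  # assumed value
    return requests.patch(url, data=publish_patch(),
                          params={"access_token": token}, headers=header)


print(publish_patch())
```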

Simple examples
Protocol and host: https://trng-b2share.eudat.eu
Application: Python (using the command-line interface)
HTTP method: GET
import requests
Retrieve all existing records:
r = requests.get('https://trng-b2share.eudat.eu/api/records')
List all communities:
r = requests.get('https://trng-b2share.eudat.eu/api/communities')
Search for specific records of a community:
r = requests.get('https://trng-b2share.eudat.eu/api/records?q=community:COMMUNITY_ID')

Complex example
HTTP method: POST
Create draft record ‘My test upload’:
import json
import requests
# token holds your API access token (see "Saving your token to file")
header = {'Content-Type': 'application/json'}
metadata = {"titles": [{"title": "My test upload"}],
            "community": "e9b9792e-79fb-4b07-b6b4-b9c2bd06d095",
            "open_access": True}
parameters = {'access_token': token}
r = requests.post('https://trng-b2share.eudat.eu/api/records/',
                  params=parameters,
                  data=json.dumps(metadata),
                  headers=header)

JSON patch
A set of operations that alter an existing set of metadata
fields based on another set of fields
Loosely equals the difference between two sets
Operations: add, remove, replace, copy, move, test
Generated by jsonpatch package
Requires PATCH request to apply to record
Diagram: OLD metadata, NEW metadata, and the PATCH that is the difference between them

JSON patch example
[
{
"path": "/community_specific",
"value": {},
"op": "add"
},
{
"path": "/disciplines",
"value": [
"EUDAT Summer School"
],
"op": "add"
}
]
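To show how a patch like the one above relates to "the difference between two sets", here is a deliberately simplified stand-in for what the jsonpatch package does. It compares top-level fields only and emits just add, remove and replace; `make_top_level_patch` is a name invented for this sketch and is not part of jsonpatch.

```python
def _escape(key):
    """Escape a key for use in a JSON Pointer path ('~' and '/')."""
    return key.replace("~", "~0").replace("/", "~1")


def make_top_level_patch(old, new):
    """Diff two metadata dicts field by field (top level only)."""
    patch = []
    for key in sorted(set(old) | set(new)):
        path = "/" + _escape(key)
        if key not in new:
            patch.append({"op": "remove", "path": path})
        elif key not in old:
            patch.append({"op": "add", "path": path, "value": new[key]})
        elif old[key] != new[key]:
            patch.append({"op": "replace", "path": path, "value": new[key]})
    return patch


old = {"titles": [{"title": "My test upload"}]}
new = dict(old, community_specific={}, disciplines=["EUDAT Summer School"])
print(make_top_level_patch(old, new))
```

For these two dictionaries the result contains the same two add operations as the example above. Nested changes would need a recursive diff, which is one reason to prefer the real package.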

Today’s hands-on exercises
Get and store your API token
Retrieve single record information
Check metadata and included files
Download files and compare checksum
Retrieve existing communities
Retrieve community metadata schema
Investigate metadata schema structure
Create a new draft record
Upload files and metadata
Update and complete metadata
Publish record

General instructions
Create an API token on the B2SHARE training website
Requirements for each request:
Request URL and HTTP method (e.g. GET, PUT)
Optional:
Object identifiers (e.g. record, community)
Additional parameters (e.g. your token)
Data payloads (e.g. files or text)
Use requests package for HTTP requests
Use jsonpatch package to create metadata update
patches
B2SHARE API endpoint: /api

General instructions
Log in to 145.100.59.156
ssh <user>@145.100.59.156
Use Python or IPython as interface
Helpful links:
Exercises: https://hdl.handle.net/21.T12996/ESS2017-B2SHARE-API
Example image:
https://hdl.handle.net/21.T12996/ESS2017-Image.png
Backup token:
https://hdl.handle.net/21.T12996/token.txt
Ask questions anytime!

Getting your access token
Log in on B2SHARE and navigate to profile page:
Create a token by entering a new name:
Click on ‘New token’

Saving your token to file
Store the token in a file so it can be restored later:
$ echo "<your token>" > token.txt
Load the token in Python:
with open('token.txt', 'r') as f:
    token = f.read().strip()

Exercise 1a: single record retrieval
Endpoint: /api/records/<RECORD_ID>
Method: GET
Response status code: 200
RECORD_ID: 47077e3c4b9f4852a40709e338ad4620
Steps:
Create the URL
Retrieve the object data
Parse the response text
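One possible shape for these three steps is sketched below. The helper names are hypothetical and the sample text in the last lines is invented so the parsing step can be tried without a network connection; it is not real B2SHARE output.

```python
import json

API_ROOT = "https://trng-b2share.eudat.eu/api"
RECORD_ID = "47077e3c4b9f4852a40709e338ad4620"


def record_url(record_id):
    """Step 1: create the URL."""
    return "{}/records/{}".format(API_ROOT, record_id)


def parse_response(status_code, text):
    """Step 3: check the status code, then parse the response text."""
    if status_code != 200:
        raise RuntimeError("unexpected status code: {}".format(status_code))
    return json.loads(text)


# Step 2 with the requests package would be:
#   r = requests.get(record_url(RECORD_ID))
#   record = parse_response(r.status_code, r.text)
sample = parse_response(200, '{"id": "placeholder"}')
print(record_url(RECORD_ID), sample["id"])
```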

Exercise 1b: process record metadata
Use data of previous record
Steps:
Get the metadata field values
Investigate the file(s) contained
Check if open access and published
Get the file bucket ID, file key(s) and checksum(s)

Exercise 1c: download file and check
Endpoint: /api/files/<FILE_BUCKET_ID>/<FILE_KEY>
Method: GET
Steps:
Create the URL
Download files
Calculate checksum and compare
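The checksum step could look like the sketch below. It assumes the record metadata reports the checksum as `<algorithm>:<hex digest>` (for example `md5:...`); that format is an assumption, so adapt the parsing to what the record actually shows.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=65536):
    """Hash a downloaded file in chunks so large files fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, reported):
    """Compare a local file against a reported '<algorithm>:<hex>' value."""
    algorithm, _, expected = reported.partition(":")
    return file_checksum(path, algorithm) == expected.lower()
```

A call such as `matches('ESS2017-Image.png', reported_checksum)` returns `True` only when the downloaded bytes hash to the reported value; both argument values here are placeholders.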

Exercise 2a: communities retrieval
Endpoint: /api/communities
Method: GET
Steps:
Get all communities
Parse the response text
Locate the EUDAT community and its ID

Exercise 2b: EUDAT community records
Endpoint:
/api/records?q=community:<COMMUNITY_ID>
Method: GET
COMMUNITY_ID: <>
Steps:
Set required parameters for request
Determine number of records
Show first record
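A sketch of the last two steps, assuming the parsed search response has the layout `{"hits": {"total": N, "hits": [...]}}`. That layout is an assumption not shown on these slides, and the sample data is invented.

```python
def search_params(community_id):
    """Parameters for the request, e.g. requests.get(url, params=...)."""
    return {"q": "community:" + community_id}


def summarize_hits(result):
    """Return (number of records, first record or None)."""
    hits = result["hits"]
    records = hits["hits"]
    return hits["total"], (records[0] if records else None)


# Invented sample in the assumed layout.
sample = {"hits": {"total": 2, "hits": [{"id": "first"}, {"id": "second"}]}}
total, first = summarize_hits(sample)
print(total, first["id"])
```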

Exercise 2c:
EUDAT community metadata schema
Endpoint:
/api/communities/<COMMUNITY_ID>/schemas/last
Method: GET
COMMUNITY_ID: <>
Steps:
Determine number of metadata fields
Determine required fields
Determine community-specific fields
Determine metadata field structure
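Once the JSON Schema object has been located inside the response, the questions above reduce to reading its `properties` and `required` entries, which are standard JSON Schema keywords. Where that object sits inside the B2SHARE response is not shown on these slides, so the sketch takes it as input; the sample schema is invented.

```python
def describe_schema(schema):
    """Summarize a JSON Schema object that has 'properties' and 'required'."""
    properties = schema.get("properties", {})
    required = schema.get("required", [])
    return {
        "field_count": len(properties),
        "required": sorted(required),
        "optional": sorted(set(properties) - set(required)),
        "types": {name: spec.get("type") for name, spec in properties.items()},
    }


# Invented sample, not the EUDAT community schema.
sample_schema = {
    "properties": {
        "titles": {"type": "array"},
        "open_access": {"type": "boolean"},
        "description": {"type": "string"},
    },
    "required": ["titles", "open_access"],
}
print(describe_schema(sample_schema))
```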

Exercise 3a: create draft record
Endpoint: /api/records/
Method: POST
Steps:
Prepare header and payloads
Get API token
Get draft record ID
Check publication state
Get file bucket ID
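The last three steps read values from the parsed response of the POST request. The sketch assumes the response carries the record ID under `id`, the state under `metadata.publication_state`, and a file bucket link under `links.files` whose final path segment is the bucket ID. These key names are assumptions not shown on these slides, and the sample is invented.

```python
def draft_summary(draft):
    """Extract record ID, publication state and file bucket ID (assumed keys)."""
    return {
        "record_id": draft["id"],
        "publication_state": draft["metadata"]["publication_state"],
        "file_bucket_id": draft["links"]["files"].rstrip("/").split("/")[-1],
    }


# Invented sample in the assumed layout.
sample_draft = {
    "id": "RECORD_ID",
    "metadata": {"publication_state": "draft"},
    "links": {"files": "https://trng-b2share.eudat.eu/api/files/FILE_BUCKET_ID"},
}
print(draft_summary(sample_draft))
```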

Exercise 3b: add files
Endpoint: /api/files/<FILE_BUCKET_ID>/<FILE_KEY>
Method: PUT
FILE_BUCKET_ID: <draft record’s file bucket ID>
FILE_KEY: <your file name>
Steps:
Prepare header
Open file handle
Send file with request

Exercise 3c: add metadata
Endpoint: /api/records/<RECORD_ID>/draft
Method: PATCH
RECORD_ID: <draft record ID>
Steps:
Prepare header
Prepare JSON patch
Send patch with request

Exercise 3d: publish record
Endpoint: /api/records/<RECORD_ID>/draft
Method: PATCH
RECORD_ID: <draft record ID>
Steps:
Prepare header
Prepare JSON patch
Send patch with request
Check publication state of record
Check record in web browser

For more info: https://eudat.eu/services/b2share
B2SHARE User Documentation:
https://eudat.eu/services/userdoc/b2share
B2SHARE Training presentations:
https://www.eudat.eu/b2share-training-suite
B2SHARE hands-on training:
https://github.com/EUDAT-Training/B2SHARE-Training

Authors Contributors
Hans van Piggelen, SURFsara
Thank you!
