Our work on the EC-TEL paper data extraction.

My slides on the work we did on extracting data from the PDF files of the proceedings of the last 4 years of EC-TEL conferences.

Presentation Transcript

  • How we did it... ParsCit
  • How we did it... ParsCit, REST API
  • Lessons learned
    • data gathering from PDF is only OK for some data
    • a lot of cleanup work + complexity with distributed cleanup of data
    • future: more structured data as a starting point
  • What we want...
    • clean citation data
    • geographical data: author - affiliation links
    • structured data
    • ...
  • What might be helpful... [diagram: PDF, Author, Title]
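
To make the "structured data" wish a bit more concrete, here is a minimal sketch of what one structured record per paper could look like, with explicit author - affiliation links. This is not the format we used or one that ParsCit produces; the class names, the field names and the `author_affiliation_links` helper are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Affiliation:
    # Hypothetical fields: an institution name plus a country for the geographical view.
    name: str
    country: str


@dataclass
class Author:
    name: str
    affiliations: list[Affiliation] = field(default_factory=list)


@dataclass
class Paper:
    title: str
    year: int
    authors: list[Author] = field(default_factory=list)
    # Raw citation strings as extracted; cleaning them is a separate step.
    citations: list[str] = field(default_factory=list)


def author_affiliation_links(papers: list[Paper]) -> list[tuple[str, str, str]]:
    """Return the sorted, de-duplicated (author, affiliation, country) triples."""
    links = set()
    for paper in papers:
        for author in paper.authors:
            for affiliation in author.affiliations:
                links.add((author.name, affiliation.name, affiliation.country))
    return sorted(links)


if __name__ == "__main__":
    # Placeholder data, not taken from the EC-TEL proceedings.
    uni = Affiliation(name="Example University", country="Exampleland")
    paper = Paper(
        title="An Example Paper",
        year=2009,
        authors=[Author(name="A. Author", affiliations=[uni])],
    )
    print(author_affiliation_links([paper]))
```

With records like this as the starting point, the author - affiliation links can be read off directly instead of being reconstructed from PDF text.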