Our work on the EC-TEL paper data extraction.

My slides on our work extracting data from the PDF proceedings of the last four years of EC-TEL conferences.

Published in: Technology, Business

  • Transcript

    • 1. How we did it... ParsCit
    • 2. How we did it... ParsCit
    • 3. How we did it... ParsCit
    • 4. How we did it... ParsCit
    • 5. How we did it... ParsCit
    • 6. How we did it... ParsCit REST API
    • 7. Lessons learned
      • data gathering from PDF is only OK for some data
      • a lot of cleanup work, plus the complexity of distributing the data clean-up
      • future: more structured data as a starting point
    • 8. What we want...
      • clean citation data
      • geographical data: author - affiliation links
      • structured data
      • ...
    • 9. What might be helpful... [diagram labels: PDF, Author, Title]
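
As an illustration of the "structured data" wished for on slide 8, here is a minimal sketch of what a record with explicit author - affiliation links and citation strings might look like. It is not taken from the slides: the class names (`Affiliation`, `Author`, `Paper`), the fields, and the helper `authors_by_country` are all hypothetical, chosen only to show why such links make geographical queries straightforward.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Affiliation:
    """Hypothetical affiliation record; the country enables geographical queries."""
    name: str
    country: str


@dataclass
class Author:
    """Hypothetical author record with an explicit link to an affiliation."""
    name: str
    affiliation: Affiliation


@dataclass
class Paper:
    """Hypothetical paper record: title, year, linked authors, raw citation strings."""
    title: str
    year: int
    authors: List[Author] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)


def authors_by_country(papers: List[Paper]) -> Dict[str, List[str]]:
    """Group author names by the country of their affiliation."""
    grouped = defaultdict(set)
    for paper in papers:
        for author in paper.authors:
            grouped[author.affiliation.country].add(author.name)
    # Sort names so the result is deterministic.
    return {country: sorted(names) for country, names in grouped.items()}
```

With data in this shape, a question such as "which authors come from which country" becomes a simple loop, rather than a clean-up task on text extracted from PDF.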