How we did it...


ParsCit
REST API
Lessons learned

• data gathering from PDF is only OK for some data
• a lot of cleanup work + complexity with distributed cleanup of the data
• future: more structured data as a starting point
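
To give a flavour of the cleanup work mentioned above, here is a minimal sketch in Python. It is not the code used in the project: the function names (`normalize_name`, `match_key`, `group_variants`) and the sample names are hypothetical, and it only assumes that text extracted from PDFs tends to contain ligatures, stray whitespace and inconsistent accents.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_name(raw: str) -> str:
    """Clean one extracted string (hypothetical helper, not project code)."""
    # NFKC folds compatibility characters, e.g. the PDF ligature U+FB01 -> "fi".
    text = unicodedata.normalize("NFKC", raw)
    # Collapse runs of spaces, tabs and line breaks into a single space.
    return re.sub(r"\s+", " ", text).strip()


def match_key(name: str) -> str:
    """Build a loose comparison key: no accents, case-insensitive."""
    decomposed = unicodedata.normalize("NFKD", normalize_name(name))
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def group_variants(names):
    """Group spellings that probably refer to the same author."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample names, for illustration only.
    extracted = ["Stefi M\u00fcller", "stefi  muller", "Ste\ufb01 M\u00fcller\n"]
    for key, variants in group_variants(extracted).items():
        print(key, "<-", variants)
```

A key like this only catches the easy cases; initials, swapped name order and OCR errors would still need manual or distributed review, which is the complexity the slide refers to.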
What we want...

• clean citation data
• geographical data: author - affiliation links
• structured data
• ...
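
One possible shape for such structured data, sketched in Python. This is only an illustration of the wish list above: the class and field names (`Paper`, `Author`, `Affiliation`, `Citation`, `author_affiliation_links`) are assumptions, not an existing schema, and the sample values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Affiliation:
    name: str
    country: Optional[str] = None  # needed for the geographical view


@dataclass
class Author:
    name: str
    affiliations: List[Affiliation] = field(default_factory=list)


@dataclass
class Citation:
    raw: str  # the reference string as it appears in the paper
    title: Optional[str] = None
    year: Optional[int] = None


@dataclass
class Paper:
    title: str
    authors: List[Author] = field(default_factory=list)
    citations: List[Citation] = field(default_factory=list)


def author_affiliation_links(paper: Paper) -> List[Tuple[str, str]]:
    """Flatten a paper into (author, affiliation) pairs."""
    return [
        (author.name, aff.name)
        for author in paper.authors
        for aff in author.affiliations
    ]


if __name__ == "__main__":
    # Placeholder data, for illustration only.
    paper = Paper(
        title="Example paper",
        authors=[
            Author("A. Author", [Affiliation("Example University", "NL")]),
            Author("B. Author", [Affiliation("Example Institute", "DE")]),
        ],
        citations=[Citation(raw="C. Author (2010). Some cited work.", year=2010)],
    )
    print(author_affiliation_links(paper))
```

If the proceedings were delivered in a form like this, the author and affiliation links would not have to be guessed from PDF layout.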
What might be helpful...

[Diagram; recoverable labels: PDF, Author, Title]
Our work on the EC-TEL paper data extraction.

My slides on our work extracting data from the PDF files of the proceedings of the last four years of EC-TEL conferences.

Published in: Technology, Business