Mime Magic With Apache Tika

Uploaded on Nov 06, 2009

  • 7,234 views

Apache Tika aims to make it easier to extract metadata and structured text content from all kinds of files. Tika is a subproject of Apache Lucene, and it builds on libraries like Apache POI and Apache PDFBox to provide a powerful yet simple interface for parsing dozens of document formats. This makes Tika an ideal companion for Apache Lucene, or for any search engine that needs to index metadata and content from many different types of files. This presentation introduces Apache Tika and shows how it is being used in projects like Apache Solr and Apache Jackrabbit. You will learn how to integrate Tika with your application and how to configure and extend Tika to best suit your needs.
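The "MIME magic" in the title refers to recognising a file's type from the signature bytes at its start rather than from its name. As a rough illustration of that idea only, here is a minimal sketch in plain Java. It does not use Tika and is not Tika's implementation: the class name `MagicSniffer`, its small signature table, and the `detect` helper are all assumptions made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of magic-byte type detection (not Tika's code).
public class MagicSniffer {
    // Media type -> leading signature bytes. A real detector knows far more
    // types and also matches at offsets other than zero.
    private static final Map<String, byte[]> MAGIC = new LinkedHashMap<>();
    static {
        MAGIC.put("application/pdf", "%PDF-".getBytes(StandardCharsets.US_ASCII));
        MAGIC.put("image/png", new byte[] {(byte) 0x89, 'P', 'N', 'G'});
        MAGIC.put("application/zip", new byte[] {'P', 'K', 3, 4});
    }

    // Returns the first media type whose signature prefixes the given bytes,
    // or the generic binary type when nothing matches.
    public static String detect(byte[] head) {
        for (Map.Entry<String, byte[]> entry : MAGIC.entrySet()) {
            if (startsWith(head, entry.getValue())) {
                return entry.getKey();
            }
        }
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sample)); // prints "application/pdf"
    }
}
```

With Tika itself on the classpath, an application would not maintain such a table by hand; the `org.apache.tika.Tika` facade offers calls along the lines of `new Tika().detect(file)` and `new Tika().parseToString(file)` for type detection and text extraction.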

Uploaded via SlideShare as Microsoft PowerPoint

Usage Rights

© All Rights Reserved


Presentation Transcript