
Indexing your office documents with Elastic and FSCrawler
You have plenty of OpenOffice, Microsoft Office, PDF, and image documents, and you want to search both their metadata and their content. How can you do that?
In this talk, David will explain how Apache Tika can be used for that and how to combine this fantastic library with the Elastic Stack:
- Elasticsearch ingest-attachment plugin
- FSCrawler
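As a rough illustration of the first option, the sketch below builds the two HTTP requests typically involved when using the ingest-attachment processor: one that defines an ingest pipeline, and one that indexes a base64-encoded file through it. The pipeline name `attachment`, the index name `docs`, and the field name `data` are assumptions chosen for this example, and nothing is sent to a cluster; the script only prints the request bodies.

```python
import base64
import json

# Hypothetical names, chosen for illustration only.
PIPELINE_ID = "attachment"
INDEX = "docs"


def pipeline_request():
    """Build the request that defines a pipeline with the attachment processor."""
    body = {
        "description": "Extract text and metadata from a base64-encoded file",
        # The attachment processor reads the base64 content from the given field.
        "processors": [{"attachment": {"field": "data"}}],
    }
    return "PUT", f"/_ingest/pipeline/{PIPELINE_ID}", body


def index_request(doc_id, raw_bytes):
    """Build the request that indexes one file through the pipeline."""
    body = {"data": base64.b64encode(raw_bytes).decode("ascii")}
    return "PUT", f"/{INDEX}/_doc/{doc_id}?pipeline={PIPELINE_ID}", body


if __name__ == "__main__":
    requests_to_send = (
        pipeline_request(),
        index_request("1", b"Hello from a tiny text file"),
    )
    for method, path, body in requests_to_send:
        print(method, path)
        print(json.dumps(body, indent=2))
```

FSCrawler takes a different approach: instead of sending encoded files to Elasticsearch, it crawls a directory, runs the extraction itself, and indexes the result.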
Resources
- Demo: FSCrawler - This repository contains the code for the FSCrawler demo.
- Documentation: FSCrawler - The official FSCrawler documentation.