Is this possible with the SS Search?
I don't think it's possible.
You could build an SS wrapper around the Zend Lucene engine (http://framework.zend.com/manual/en/zend.search.lucene.html), as described in this blog post: http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/
This could get tricky though :)
I also think you're in the wrong forum with that question.
Also, I think you can fairly easily extract the PDF contents in onAfterWrite using `pdftotext` and chuck the text into the database as Content so the search can pick it up. This works pretty well.
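To make the `pdftotext` step a bit more concrete, here is a rough sketch of just the extraction part. It's in Python rather than SilverStripe PHP, and the function names (`build_pdftotext_command`, `extract_pdf_text`) and the `runner` parameter are made up for illustration. It assumes `pdftotext` is installed and on the PATH. In SilverStripe you'd do the equivalent shell call from onAfterWrite and save the result into Content.

```python
import subprocess


def build_pdftotext_command(pdf_path):
    """Build the pdftotext argument list; "-" sends the text to stdout."""
    # -q suppresses messages, -enc UTF-8 sets the output encoding.
    return ["pdftotext", "-q", "-enc", "UTF-8", str(pdf_path), "-"]


def extract_pdf_text(pdf_path, runner=subprocess.run):
    """Return the PDF's text, or "" if pdftotext is missing or fails.

    `runner` is a hypothetical hook so the call can be swapped out in tests;
    by default it is subprocess.run.
    """
    try:
        result = runner(
            build_pdftotext_command(pdf_path),
            capture_output=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return ""
    text = result.stdout.decode("utf-8", errors="replace")
    # Collapse whitespace runs (including page-break form feeds) so the
    # stored text is compact before it goes into the search column.
    return " ".join(text.split())
```

Returning an empty string on failure means a broken or encrypted PDF doesn't stop the record from being saved; it just won't be searchable by its contents.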
For a more robust and performant solution (rather than MySQL fulltext), have a look at the 'sphinx' module:
http://open.silverstripe.org/browser/modules/sphinx/trunk
It's a fairly new module, and requires you to use SilverStripe 2.4 alpha1, but the underlying technology (Sphinx search) is quite stable.
This changeset (committed a couple of days ago) explains how to work with PDFs in sphinx:
http://open.silverstripe.org/changeset/97360/modules/sphinx/trunk
BTW, it uses pdftotext as well :)
Hey Ingo that sounds really neat.
The sphinx binaries have to be installed on the server to use this, right?
Yeah, you need to install that on the server and run it as a daemon. You probably won't be able to run it on shared hosting. But it's worth the effort, very quick :)