PDF search query not working

This topic has 3 replies, 2 voices, and was last updated 6 years, 2 months ago by Ernest Marcinko.

Viewing 4 posts - 1 through 4 (of 4 total)

Author

Posts
April 28, 2020 at 6:55 pm #26975

ashleygonzalez68
Participant

Hello! I am new to using Ajax Search Pro and have followed the very detailed setup instructions and manuals. Everything seems to be working properly, but the search is not finding anything inside PDF documents. These documents have been run through OCR, so their text is searchable. One of the PDFs in question is located at http://172.105.155.18/flagler-tribune/flagler-tribune-issue-1918-dec-12/.

Searching for words taken directly from the PDF, such as CIGARS, doesn’t bring up the PDF in the results.

Am I missing a setting?

April 29, 2020 at 12:49 pm #26995

Ernest Marcinko
Keymaster

Hi,

Thank you for the details, it helped a lot.

I downloaded that PDF file and ran some tests on it. Some of the content is indeed accessible, but there are also lots of invalid/incomplete characters, and those caused major problems for both parsers.
It took a few hours, but I managed to make it work. It now extracts around 3000 words from that document. I still think some words may be missing, but it should be much better now.

I will make sure to include this improvement in the upcoming release.
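
For readers curious what cleaning up this kind of extracted text can look like, here is a minimal sketch in Python. It is not the plugin's actual fix (the plugin's code is not shown in this thread); the function name `extract_words` and the `min_length` parameter are hypothetical, chosen only for illustration. The idea is to normalize the raw text, replace invalid or unprintable characters with spaces, and keep only letter-based tokens for indexing.

```python
import re
import unicodedata


def extract_words(raw_text, min_length=2):
    """Return lowercase word tokens from noisy extracted PDF text.

    Hypothetical helper for illustration; not part of Ajax Search Pro.
    """
    # Normalize compatibility forms (for example the "fi" ligature) to plain letters.
    normalized = unicodedata.normalize("NFKC", raw_text)

    # Replace control, format, private-use, and unassigned characters
    # (Unicode categories starting with "C") and the U+FFFD replacement
    # character with spaces, so they cannot glue neighbouring words together.
    cleaned = "".join(
        " " if (unicodedata.category(ch)[0] == "C" or ch == "\ufffd") else ch
        for ch in normalized
    )

    # Keep runs of letters, optionally joined by an apostrophe or hyphen.
    tokens = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", cleaned)

    # Lowercase and drop very short fragments, which are common OCR noise.
    return [t.lower() for t in tokens if len(t) >= min_length]
```

With a cleanup step like this, a search for "cigars" can match even when the surrounding text contains stray null bytes or replacement characters, although words that OCR misread in the first place will still be missed.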

April 29, 2020 at 2:48 pm #27015

ashleygonzalez68
Participant

As you saw, that’s an OCR scan of a newspaper. I have weekly issues from Dec 1918 through Dec 1987 to upload. I did find a workaround: pasting the OCR text into the page in a 0 px font. But that would have to be done for 3000+ issues and would take a lot of volunteer hours on my end, which is why I purchased a search plugin to do that work for me. 😉 What can I do to replicate your fix while I wait for the updated plugin?

Thanks!

Ashley
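
If the 0 px workaround is used in the meantime, the manual pasting could in principle be scripted. The following is only a rough sketch, assuming the OCR text for each issue is already available as a plain string; the function name `hidden_text_block` is hypothetical, and how the resulting HTML gets into each post (and whether hidden text is acceptable for the site) is left open.

```python
import html


def hidden_text_block(ocr_text):
    """Wrap OCR text in a zero-size-font HTML block (hypothetical helper)."""
    # Collapse line breaks and repeated whitespace into single spaces.
    collapsed = " ".join(ocr_text.split())
    # Escape &, <, > and quotes so the OCR text cannot break the page markup.
    return '<div style="font-size:0px">' + html.escape(collapsed) + "</div>"
```

For example, `hidden_text_block("CIGARS &\n TOBACCO")` yields a single `<div>` containing `CIGARS &amp; TOBACCO`, which could then be appended to the post content for that issue.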

April 29, 2020 at 3:28 pm #27016

Ernest Marcinko
Keymaster

You cannot access this content.