Best method for including only certain PDF files in search.

Forum: Ajax Search Pro for WordPress Support

This topic contains 3 replies, has 2 voices, and was last updated by gztba 5 years, 9 months ago.

Viewing 4 posts - 1 through 4 (of 4 total)
    #18686
    gztba
    Participant

    Hello. I just purchased the plugin and think it’s great. However, I’m having trouble finding the best way to include only certain PDF files in the search. This is actually a three-part question, but first let me explain how I would like to use the search tool. We sell five products on our site, each of which will soon have its own support page with around 40 support topics that link to PDF files (so, 200 PDF files in total). I would like to put a separate search instance at the top of each of these product support pages and configure it so the search is limited to just that product’s 40 or so PDF files, and not the other 160. Here are my questions:

    1) I know that if these were pages/posts I could add tags or categories and limit the search to a specific tag/category. Media Library files like PDFs have a description field, which I know can be included in the search, but it doesn’t appear that I can add a specific word to the description field and limit the search to attachments whose description contains that word. Is that correct? I see that I can exclude attachments with certain IDs, but that would require excluding 160 IDs and isn’t very manageable (I didn’t see an option to include only certain IDs). I also see that I can limit which PDFs appear in a given search instance based on the user who uploaded them. So one option would be to create five dummy WP users (named after our five products), upload each product’s support PDFs as the matching user, and then configure that product’s search instance to use only PDFs uploaded by that user. I think this works, but I was wondering whether the plugin offers an easier, slicker way that doesn’t require creating a bunch of extra WP users.

    2) Is there a way to configure a search instance to search only PDFs in a given directory on the server, for PDFs that weren’t uploaded into the Media Library? We were considering uploading some PDFs via FTP into their own directory so we can give them the URL structure we want, but I didn’t see a way to target those files since they aren’t in the Media Library.

    3) Finally, PDFs, as I’m sure you know, have metadata that can be added to the PDF itself, such as Title, Author, Subject, and Keywords. Is there any way for the plugin to a) search this data (which I imagine could be easier than searching the content of the PDF itself, which we’re not having any luck with at the moment), and/or b) limit searches to just PDFs with a specific word in one of these PDF metadata fields?

    Thanks so much for your help!
    Adam

    #18691
    gztba
    Participant

    Hi again! I wanted to update my original three questions:

    1) I was able to add a category option to media files with some easy code in functions.php. Cool. I think that should take care of question #1, with one catch. When I create a category, assign it to a PDF, go to the “Advanced” > “Exclude Results” tab in the plugin, and add that category to the “INCLUDE posts only from selected categories” field (then save and refresh the page with the search instance), the results include both the single PDF to which I assigned the category and all other PDFs in the media library, UNLESS I assign those other PDFs some other category. Is it possible to include in search results ONLY items with a given category assigned, even if not all PDFs are assigned a category? If not, I’ll just go back to the old, unrelated PDFs and give them an Uncategorized category.

    2) I’m still interested if this is possible (searching only files in a specific directory on the server).

    3) I’m still interested if this is possible (searching ONLY metadata entered in the PDF file itself), but I did want to report that searching within the PDF content is now working for me. Earlier I made the mistake of selecting the Index Table Engine on the Sources tab but not on the Attachments tab. Selecting it on the Attachments tab did the trick.

    Thanks — I look forward to your responses when back online!

    Adam

    #18704
    Ernest Marcinko
    Keymaster

    Hi!

    Thank you for the update and the details; they help me a lot.

    1. In this case, you need to turn off the option shown here: https://i.imgur.com/C0LuL76.png
    Turning it off will exclude items that are not assigned to any category.

    2. That is not possible. The files need to be uploaded via the media library so that they are registered in the database.

    3. There is a way to index the PDF file contents; you can find all the details in this documentation section: Searching file contents
    The parser scripts should be able to extract this metadata as well as the content from the PDF files. If the PDFs are not secured, encrypted or protected, it will very likely work. Making exclusions based on this metadata is, however, not possible.
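To make point 1 concrete, here is a small Python model of the behaviour described in this thread. It is only a sketch and not the plugin's code: the `Attachment` class, the `filter_by_category` function, and the `allow_uncategorized` flag are hypothetical names, and the flag merely stands in for the option in the screenshot, whose real label is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    """A media library item; `categories` stays empty when none is assigned."""
    title: str
    categories: set = field(default_factory=set)


def filter_by_category(items, include, allow_uncategorized):
    """Keep items that share at least one category with `include`.

    `allow_uncategorized` is a hypothetical stand-in for the option in the
    screenshot: when True, items with no category at all also pass.
    """
    kept = []
    for item in items:
        if item.categories & include:
            kept.append(item)  # matches an included category
        elif not item.categories and allow_uncategorized:
            kept.append(item)  # has no category, and the option lets it through
    return kept


library = [
    Attachment("Product A setup.pdf", {"product-a"}),
    Attachment("Product B setup.pdf", {"product-b"}),
    Attachment("Old brochure.pdf"),  # no category assigned
]

# Option on: the uncategorized PDF slips into the Product A results.
with_option_on = [a.title for a in filter_by_category(library, {"product-a"}, True)]
# Option off: only PDFs explicitly assigned to "product-a" remain.
with_option_off = [a.title for a in filter_by_category(library, {"product-a"}, False)]
```

With the flag on, the uncategorized PDF appears next to the Product A file, which matches what Adam observed. With it off, only the explicitly categorized file remains, and PDFs assigned to another category are excluded in both cases.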

    Best,
    Ernest Marcinko

    #18723
    gztba
    Participant

    Thanks! Your response to #1 did the trick. All is good now with the search configuration. I appreciate the quick and clear responses to all three questions.
