Stop words not being excluded from search results?


This topic contains 10 replies, has 2 voices, and was last updated by thebizpixie 1 year, 1 month ago.

  • #41344
    thebizpixie
    Participant

    Hi Ernest,

    My client noticed recently that when they do a search for something like “what is change management”, the results returned – in both the live results and the search results page – include results based on the term “what is”, even though “what” and “is” are in the stop words list in my index table settings.

    My expectation is that those stop words would not be indexed for my content, nor would they be used as matching terms in search results. Is this correct?

    If so, is it possible that the stop words list is not being applied somehow to my index? And what would you suggest to test and / or address this?

    If not, what else might be going on here and how can I fix it?

    Thanks,

    Nikki

    #41486
    thebizpixie
    Participant

    Hi Ernest,

    I’d love your thoughts on what’s going on here and how I can improve the results.

    My client recently purchased another six months of support, so I hope that helps.

    Thanks,

    Nikki

    #41566
    thebizpixie
    Participant

    Hi Ernest,

    I hope everything is OK at your end. Can you please let me know what else I need to provide to you to get a response on this?

    Thanks,

    Nikki

    #41654
    Ernest Marcinko
    Keymaster

    Hi,

    I am very sorry, I must have missed your ticket by accident. I am not sure why I didn’t get notifications.

    If the words are in the stop words list, it is very unlikely that they are still indexed. First, make sure that stop words are enabled, then re-index so that everything is up to date.
    After that, make sure that the index table engine is actually enabled. This step is often missed, in which case the regular engine is still used.

    Best,
    Ernest Marcinko

    If you like my products, don't forget to rate them on codecanyon :)


    #41743
    thebizpixie
    Participant

    Hi Ernest,

    Thanks for your reply. Hopefully you get notifications this time!

    I have followed all three steps you suggested. I did find that the index table engine was not enabled, but stop words were definitely enabled in the index.

    After enabling the index table engine and reindexing just to be safe, I’m still having the same problem (see attached screenshot).

    Any other ideas as to what I can try to get this working?

    Thanks,

    Nikki

    #41750
    Ernest Marcinko
    Keymaster

    I can’t tell for sure from the screenshot alone, but it looks like those results are matching the other keywords (or partially matching the other words), even if the matches are not highlighted. Searching for “is” and “what” can still partially match other words that contain these strings, such as “whatnot” or “louis”.

    Try enforcing a stricter keyword logic: https://i.imgur.com/hmWob1d.png
    That will allow only full keyword matches, and only when a result matches all of the keywords, in any order.
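
    To illustrate the difference between partial and full keyword matching, here is a minimal sketch in Python. It is not the plugin's code: the function names and the `whole_words_only` flag are made up for this example, and a plain substring test stands in for partial matching.

```python
import re


def words(text):
    """Lowercase the text and split it into whole words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(content, phrase, whole_words_only):
    """True if every word of the phrase is found in the content, in any order.

    whole_words_only=True  -> a phrase word must equal a content word.
    whole_words_only=False -> a phrase word may be a substring of a content word.
    """
    content_words = words(content)
    for term in words(phrase):
        if whole_words_only:
            hit = term in content_words
        else:
            hit = any(term in w for w in content_words)
        if not hit:
            return False
    return True


# Hypothetical post content, using the example words from this thread.
content = "Louis covers whatnot in change management"

# Partial matching: "what" hits "whatnot" and "is" hits "louis".
partial = matches(content, "what is change management", whole_words_only=False)

# Full keyword matching: "what" and "is" are not whole words in the content.
strict = matches(content, "what is change management", whole_words_only=True)

# Word order does not matter, only that every keyword is present.
reordered = matches(content, "management change", whole_words_only=True)

print(partial, strict, reordered)
```

    Under these assumptions, the same post matches the full phrase with partial matching but not with full keyword matching.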

    Best,
    Ernest Marcinko



    #41883
    thebizpixie
    Participant

    The difficulty I have is that the results are different for “what is change management”:
    https://sacsconsult.com.au/?s=what+is+change+management&asp_active=1&p_asid=1&p_asp_data=1&filters_initial=1&filters_changed=0&qtranslate_lang=0&current_page_id=38557

    versus just “change management”:
    https://sacsconsult.com.au/?s=change+management&asp_active=1&p_asid=1&p_asp_data=1&filters_initial=1&filters_changed=0&qtranslate_lang=0&current_page_id=38557

    If “what” and “is” were truly being ignored, I would have expected the results to be pretty much the same.

    But the first set of results returns 14 articles, many of which are not about change.

    Whereas the second set of results returns 51 articles, more of which are about change, at least in the initial rows of results.

    Is it possible that by using the terms “what” and “is” in the search terms that certain articles are being excluded from the results?

    The first set is not an exact subset of the second, but it does seem close.

    I tried enabling exact match keyword logic as per your screenshot, but then the first search returned NO results, and the second 47.

    I’m not keen to enforce exact matching just yet, as this may not be the best user experience, given the likely search terms I know my visitors will use. I’d rather figure out what’s going on here with the stop words if we can.

    Would you mind having a look at the two above sets of search results and seeing if you can discern what might be going on?

    I’m happy for you to have access to the back end if you want to better understand my settings.

    Thank you.

    Nikki

    #41892
    Ernest Marcinko
    Keymaster

    If “what” and “is” were truly being ignored, I would have expected the results to be pretty much the same.

    Keyword exceptions on the indexed content do not mean that the words are ignored in the search phrase. The words can still fuzzy match, it only means, that these words were not indexed to search for. Keywords that contain these ignored words as a part are still indexed.
    The exceptions are applied to the results, not to the phrases.

    If you want to fully ignore words from the phrase entered, you can also do that here: https://i.imgur.com/pUBgE7F.png

    That will yield exactly the same results in both cases.

    Best,
    Ernest Marcinko



    #41902
    thebizpixie
    Participant

    Thanks for the explanation. I’ve added the words to the exceptions list, and can see that the results are now the same across both search inputs.

    I’m not sure I fully understand what you mean by “The words can still fuzzy match, it only means, that these words were not indexed to search for.”

    What is the difference between a word not being indexed to search for and it fuzzy matching? Does this mean that the whole word won’t match, but word subset matches will still be returned, or does it mean something else?

    And is that why the results were fewer in the first set, because the other articles did not contain the string “what” or “is” somewhere within the words of the content?

    I love the power of your search, but I’d like to better understand it so I’m using the settings appropriately.

    Thank you.

    #41908
    Ernest Marcinko
    Keymaster
    What is the difference between a word not being indexed to search for and it fuzzy matching? Does this mean that the whole word won’t match, but word subset matches will still be returned, or does it mean something else?
    
    And is that why the results were fewer in the first set, because the other articles did not contain the string “what” or “is” somewhere within the words of the content?

    Exactly that 🙂 Exceptions in the index table settings are applied during indexing. Whenever the plugin indexes a post, it extracts the keywords, and if a keyword matches an exception, it is removed and not indexed. This only happens for whole words, otherwise it would make a big mess. So words containing the exceptions are still indexed, e.g. “louis” contains “is”, but it is still indexed.

    On the other hand, the exceptions for the phrases work from the “other end”: when the user types in “what is change management”, the actual search phrase becomes “change management”. “What” and “is” are not even considered, because they are removed before the search. Therefore a post containing the keyword “louis” wouldn’t match, while in the other case it potentially could.
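
    The two kinds of exceptions described above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not how the plugin is implemented: `build_index`, `search` and `ignore_in_phrase` are hypothetical names, substring matching stands in for fuzzy matching, and every remaining phrase word is assumed to be required (AND logic).

```python
import re

STOP_WORDS = {"what", "is"}  # assumed list, taken from the words in this thread


def tokenize(text):
    """Lowercase the text and split it into whole words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts, stop_words=STOP_WORDS):
    """Index-time exceptions: drop whole-word stop words, keep everything else."""
    return {
        post_id: {w for w in tokenize(body) if w not in stop_words}
        for post_id, body in posts.items()
    }


def search(index, phrase, ignore_in_phrase=frozenset()):
    """Return ids of posts where every remaining phrase word is a substring
    of at least one indexed keyword."""
    terms = [w for w in tokenize(phrase) if w not in ignore_in_phrase]
    if not terms:
        return []
    return sorted(
        post_id
        for post_id, keywords in index.items()
        if all(any(t in kw for kw in keywords) for t in terms)
    )


# Hypothetical posts for the example.
posts = {
    1: "What is change management",
    2: "Louis explains whatnot about change management",
    3: "Change management basics",
}
index = build_index(posts)

# Exceptions applied at index time only: "what" and "is" are still searched
# for, and can only hit keywords that contain them ("whatnot", "louis").
index_only = search(index, "what is change management")

# Exceptions also applied to the phrase: the search becomes "change management".
phrase_too = search(index, "what is change management", ignore_in_phrase=STOP_WORDS)

print(index_only, phrase_too)
```

    In this sketch the first search returns fewer posts than the second, because “what” and “is” still have to match something, while the second search behaves exactly like a search for “change management”.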

    Best,
    Ernest Marcinko



    #41966
    thebizpixie
    Participant

    Great, thanks for the clarification.

    That should help us improve our search experience going forwards.
