Reply To: Some issues with Index table search of custom fields

#35383
Ernest Marcinko
Keymaster

Thank you!

I understand your reasoning, and thanks for sharing the code as well; I have seen this one many times before. I know their method works perfectly for your case, but there are many other cases where it is not desired at all. Just an example:
The product description includes the string “AB:123-44_42;567”. When the user enters this string exactly, they ideally want this product as the first result. The same goes for entering it without the punctuation (“AB1234442567”) or with spaces (“AB 123 44 42 567”).
This seems straightforward, but it actually brings up many issues. If the punctuation and other special characters are removed from both the input and the index, many undesired results may appear when entering “AB:123-44_42;567”. Basically, running this:

$a = preg_replace( '/[[:punct:]]+/u', apply_filters( 'relevanssi_default_punctuation_replacement', ' ' ), $a );

...will ruin the original information, so we need to process the text first, and that processing is what creates the “junk”.
Of course, there is still some real “junk” we need to address, such as punctuation and some special characters at the end of words, which should be removed where possible; that part is not working the way it should.
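To make the trade-off concrete, here is a minimal sketch of the two approaches. It is only an illustration and not the actual index table code: the function names `sketch_index_variants` and `sketch_strip_punctuation` are hypothetical, and the choice of which variants to keep is an assumption made for this example.

```php
<?php
// Hypothetical helper for illustration only, not the plugin's real indexer.
// Keeps the original token and adds extra forms (the "junk") so that
// several ways of typing the same code can still match it.
function sketch_index_variants( string $token ): array {
    // Trim punctuation from both ends only; inner separators are kept.
    $trimmed = preg_replace( '/^[[:punct:]]+|[[:punct:]]+$/u', '', $token );

    // Split on inner punctuation runs to get the individual parts.
    $parts = preg_split( '/[[:punct:]]+/u', $trimmed, -1, PREG_SPLIT_NO_EMPTY );

    // Original form, compact form without punctuation, then each part.
    $variants = array_merge( array( $trimmed, implode( '', $parts ) ), $parts );

    return array_values( array_unique( $variants ) );
}

// The plain approach: every punctuation run becomes a space,
// so the original form of the token is no longer available.
function sketch_strip_punctuation( string $text ): string {
    return trim( preg_replace( '/[[:punct:]]+/u', ' ', $text ) );
}

$token = 'AB:123-44_42;567.';

// Prints: AB:123-44_42;567 | AB1234442567 | AB | 123 | 44 | 42 | 567
echo implode( ' | ', sketch_index_variants( $token ) ), "\n";

// Prints: AB 123 44 42 567
echo sketch_strip_punctuation( $token ), "\n";
```

In this sketch the first function stores seven entries for a single token, which is the extra index size mentioned above, while the second stores only the five short parts and can no longer tell “AB:123-44_42;567” apart from any other text that contains the same parts.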

This is just a very basic example; there are many other, more complex cases. All of these were reported to us by many customers, so we had to build a working solution that is transparent to the users.
Of course, indexing “junk” data is not ideal, but the size of the index table has almost never been reported as an issue. We constantly run test servers with millions of posts, where the index table reaches 10 GB, and a basic server with 1 core and 512 MB of RAM can handle a query in a fraction of a second. Besides, the overhead of the “junk” is extremely variable. In some cases there is almost none, in others there is a lot; it depends very much on the original data.

All I wanted to say with this is that we decided to sacrifice some database space for much better search accuracy. If users start reporting issues with the index table, we will start making changes and gradually add options to turn these features off.

I would love to keep the method the way it is now, but do you think that adding some options to control how punctuation (and other special characters) is handled would be helpful? Would that work for you?