Search in content file

This topic contains 3 replies, has 2 voices, and was last updated by Ernest Marcinko 4 years, 6 months ago.

Viewing 4 posts - 1 through 4 (of 4 total)
  • #25521
    essenciadigital74
    Participant

    Hello,

    Even after following all the steps to index the contents of PDF files, I can’t get it to work.

    The file shows up in the index, but it is not found when I search for its content.

    Example:
    http://ebook20voegol.essencia.digital/mais-de-100-cidades-estao-em-situacao-de-emergencia-apos-chuvas-em-mg-_-minas-gerais-_-g1/

    #25531
    Ernest Marcinko
    Keymaster

    Hi!

    May I ask which search instance (name or ID) is used for searching attachments? Make sure that searching in attachments is enabled and that the index table engine is selected for searching: https://i.imgur.com/SCzV6Bd.png

    Best,
    Ernest Marcinko

    If you like my products, don't forget to rate them on codecanyon :)


    #25533
    essenciadigital74
    Participant

    Hello,

    Even with this configuration, the search still does not find the content.

    #25536
    Ernest Marcinko
    Keymaster

    Hi,

    Okay, I have checked the index table and tried to debug the extracted contents. There seems to be something wrong with either the PDF encryption or the parser, I am not sure which. I tried multiple scripts to get the contents, but none of them worked, so it might be some sort of PDF encoding issue.

    Anyway, I noticed that most of the text is present, but there are double spaces here and there between the words, and some random characters.

    There might be a way to work around that with custom code, but I am not sure. Try adding this custom code to the functions.php file in your theme/child theme directory. Before editing, please make sure to have a full site back-up, just in case!

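    // Hook into the index string pre-processing filter.
    add_filter("asp_indexing_string_pre_process", "asp_fix_indexing_string_pre_process", 10, 1);
    function asp_fix_indexing_string_pre_process($s) {
        // Only touch strings that contain more than 10 double spaces.
        if ( substr_count($s, "  ") > 10 ) {
            $s = str_replace('  ', '||||', $s); // mark double spaces with a placeholder
            $s = str_replace(' ', '', $s);      // remove the remaining single spaces
            $s = str_replace('||||', ' ', $s);  // turn the placeholders into single spaces
        }
        return $s;
    }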

    Once the code is added, please try to re-create the index table. There is a small chance that some complete words will be indexed from the PDF files.
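
    To see what the replacement steps do, here is a minimal standalone sketch that runs without WordPress or the plugin. It assumes the broken extraction looks like letter-spaced text (single spaces between letters, double spaces between words), which is a guess based on the symptoms described above, not something confirmed from the PDF. The function names demo_collapse_letter_spacing and demo_letter_space and the sample sentence are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical standalone copy of the filter logic (no WordPress needed).
function demo_collapse_letter_spacing(string $s): string {
    // Same threshold as the filter: act only on strings with more than 10 double spaces.
    if (substr_count($s, '  ') > 10) {
        $s = str_replace('  ', '||||', $s); // mark double spaces with a placeholder
        $s = str_replace(' ', '', $s);      // remove the remaining single spaces
        $s = str_replace('||||', ' ', $s);  // turn the placeholders into single spaces
    }
    return $s;
}

// Hypothetical helper that simulates the assumed broken extraction:
// letters separated by one space, words separated by two spaces.
function demo_letter_space(string $text): string {
    $words  = explode(' ', $text);
    $spaced = array_map(fn($w) => implode(' ', str_split($w)), $words);
    return implode('  ', $spaced);
}

// Made-up sample sentence with 13 words, so it contains 12 double spaces.
$clean  = 'more than one hundred cities are in a state of emergency after rains';
$broken = demo_letter_space($clean);

echo $broken, PHP_EOL;
echo demo_collapse_letter_spacing($broken), PHP_EOL;
```

    Under that assumption the second line printed is the original sentence again. Strings with 10 or fewer double spaces are returned unchanged, so normally extracted text should not be affected. If the extracted text happens to contain the placeholder |||| itself, those characters would also be turned into spaces.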

    Best,
    Ernest Marcinko



