Hi Sean,
Thank you for all the details; they helped a lot.
Okay, this was a tough one. I found that all the pages were built with some sort of custom page builder. Interestingly, the page content in the database contains some of the page builder's internal modules, including HTML (buttons etc.) stored as HTML-entity-encoded text. I am not sure why they are doing that.
The escaped HTML is basically treated as plain text. Even though the plugin does HTML removal and other cleanup, it does not recognize this content as HTML, because it is escaped. To work around this issue to some extent, I added this code to the functions.php file in your theme directory:
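add_filter('asp_results', 'asp_filter_invalid_html', 10, 1);
// Decode the entity-encoded HTML in each result, then strip the resulting tags.
function asp_filter_invalid_html( $results ) {
    foreach ( $results as $r ) {
        // Each result is an object, so this updates the original entry.
        $r->content = strip_tags( html_entity_decode( $r->content ) );
    }
    return $results;
}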
This may help a bit, but it is definitely not perfect. The encoded HTML should not be there in the first place; this code forces the HTML entities to be decoded and then strips the resulting tags.
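To illustrate why the escaped markup slips through, here is a minimal standalone sketch. The sample string is made up for the example (it is not taken from your database), and the variable names are only illustrative:

```php
<?php
// Hypothetical sample: a button stored as entity-encoded HTML.
$escaped = '&lt;a class=&quot;button&quot;&gt;Read more&lt;/a&gt;';

// strip_tags() alone finds no real tags, so the text is left untouched.
$before = strip_tags( $escaped );

// Decoding first turns the entities back into real tags, which can then be stripped.
$after = strip_tags( html_entity_decode( $escaped ) );

echo $before, "\n"; // &lt;a class=&quot;button&quot;&gt;Read more&lt;/a&gt;
echo $after, "\n";  // Read more
```

The same two steps are what the filter above applies to each result's content.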