� Appearing in HTML content



10 Posts   41445 Views

timcole

Community Member, 32 Posts

28 May 2011 at 12:46am

I recently upgraded a site from 2.3.x to 2.4.3. As part of the process I made a copy of the site, including the MySQL database, and then performed the upgrade, made the additional mods my client required, and applied some fixes to make my custom modules work correctly with 2.4. I then copied the database again (via a MySQL dump) to ensure I had the latest version of the content (as my client needed to be updating the site daily while I worked on the new version).

Ever since then I have noticed odd strings such as Ã� appearing in the HTML area fields. They seem to appear next to or in place of special characters such as &nbsp; and ©. They appear when clicking "Save" or "Save and Publish", and they seem to build up with each save. This means some pages (after being saved a few times) have strings of these "junk" characters in them, e.g. Ã�ÂÃ�ÂÃ�ÂÃ�ÂÃ�Â

I've tried looking into issues surrounding htmlentities and PHP, and I've also searched the forums on the TinyMCE web site. So far I've not been able to come up with a fix.

Can anyone help or at least steer me in the right direction?

swaiba

Forum Moderator, 1899 Posts

28 May 2011 at 1:23am

I've been logging my progress with this annoying one...

http://www.silverstripe.org/general-questions/show/16915

martimiz

Forum Moderator, 1391 Posts

28 May 2011 at 10:08pm

Edited: 28/05/2011 10:09pm

I've had that happen just recently, after an upgrade from 2.3 to 2.4. In mysite/_config.php there's a line that sets the character set for the db connection to utf8. After I replaced it with latin1, all was well again.

//MySQLDatabase::set_connection_charset('utf8');
MySQLDatabase::set_connection_charset('latin1');

Obviously this feels like it should be a temporary fix, but it works while trying to figure out the best course of action...

timcole

Community Member, 32 Posts

29 May 2011 at 4:26am

Thanks, I tried changing the line but things are exactly the same. Junk characters by the truckload when I save the pages.

timcole

Community Member, 32 Posts

30 May 2011 at 12:19am

Just found that the site was in dev mode... I thought I changed that when I made it live but I guess I missed it... Anyway, that seems to have fixed it, though I have left in the line martimiz suggested.

Thanks!

martimiz

Forum Moderator, 1391 Posts

30 May 2011 at 2:19am

Apparently utf-8 processing in 2.3 wasn't quite consistent:

Read this post: http://silverstripe.org/migrating-a-site-to-silverstripe/show/5849
See this ticket: http://open.silverstripe.org/ticket/3746

So the encoding of some characters in a 2.3 database might very well be multibyte but not quite utf8? Some of these multibyte characters might be recognized by 2.4 as two separate characters, which are then both converted to utf8 when you (re)publish a page? This could result in a non-breaking space being converted into a Â followed by a non-breaking space. On the other hand, some characters are suddenly interpreted correctly if you set the charset to latin1. Things like this confuse me a lot...
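
To make that guess concrete, here is a minimal Python sketch of the suspected mechanism. It is not SilverStripe code, and the function name mis_save is made up for illustration. It assumes the mix-up is exactly UTF-8 bytes being read back as Latin-1 and then stored as UTF-8 again, which nobody in this thread has confirmed.

```python
def mis_save(text: str) -> str:
    """Simulate one bad save (hypothetical helper, assumption only):
    the text is written as UTF-8 bytes, and those bytes are then
    read back as if each byte were one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def code_points(text: str) -> list:
    """List each character's code point so invisible characters show up."""
    return [hex(ord(ch)) for ch in text]


nbsp = "\u00a0"            # the non-breaking space behind &nbsp;
once = mis_save(nbsp)      # U+00C2 (Â) followed by the original U+00A0
twice = mis_save(once)     # U+00C3 (Ã), U+0082, U+00C2 (Â), U+00A0

for label, value in (("original", nbsp), ("one save", once), ("two saves", twice)):
    print(label, len(value), code_points(value))
```

Under that assumption the string doubles in length on every save, and the second round produces an Ã, an unprintable control character, and an Â, which would fit the Ã�Â runs described at the top of the thread.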

timcole

Community Member, 32 Posts

30 May 2011 at 2:46am

That certainly seems consistent with what I have experienced. I now have over 500 pages, some of which have errors and some of which don't. Although it appears that no more of these characters are being introduced, some were saved in the intervening period. I know I could try to remove them using an SQL query directly on the database, but I am very loath to do that lest I make more problems for myself!
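
For what it's worth, the clean-up logic can be tried outside the database first. The following is a rough Python sketch rather than the SQL query mentioned above. It assumes the junk comes from UTF-8 text being re-read as Latin-1 one or more times, and undo_double_encoding is a hypothetical helper name, not part of SilverStripe or MySQL.

```python
def undo_double_encoding(text: str, max_rounds: int = 10) -> str:
    """Peel off repeated UTF-8-read-as-Latin-1 layers (assumed cause).

    Each round turns the characters back into bytes via Latin-1 and
    tries to read those bytes as UTF-8. It stops as soon as that is
    no longer possible, or when a round changes nothing.
    """
    for _ in range(max_rounds):
        try:
            repaired = text.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # not (or no longer) double-encoded; keep what we have
        if repaired == text:
            break  # plain ASCII round-trips unchanged
        text = repaired
    return text


damaged = "\u00c3\u0082\u00c2\u00a0"   # a non-breaking space after two bad saves
print(repr(undo_double_encoding(damaged)))
```

This works on a whole string at a time, so a page that mixes correctly stored accented characters with junk would be left untouched and would need finer-grained handling. MySQL's latin1 is also closer to Windows-1252 than to strict Latin-1, so results may differ for some characters. Anything like this is best tried on a copy of the database.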

Phill

Community Member, 81 Posts

3 June 2011 at 10:37pm

I've tried the MySQLDatabase::set_connection_charset('latin1'); fix, which at first appeared to solve the problem, but I have found it throws up a new error if there is a &nbsp; in the content. TinyMCE inserts these in several cases, mainly into empty table cells. I've included a screenshot of the error.

Any ideas for a way around this without hacking the core to change   to  ?
