[TYPO3-core] RFC: replacement for rawurlencode/unescape

Martin Kutschker martin.kutschker-n0spam at no5pam-blackbox.net
Sun Oct 8 18:26:32 CEST 2006


As I posted a long time ago, the rawurlencode/unescape pair doesn't 
cope well with various charsets *. I have stumbled across it again while 
preparing the Core for UTF8 filenames.

After changing escape() to encodeURIComponent() in top.launchView (JS of 
the frameset) everything was fine. It was fine because the data was 
already in UTF8, so the launched window did not need to do any further 
decoding. But we need extra encoding and decoding when our charset is 
not UTF8, as encodeURIComponent() always encodes to UTF8 and 
decodeURIComponent() always decodes from UTF8, whatever the charset of 
the HTML page. The reason is that officially a URL is either ASCII or 
UTF8.
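
A minimal sketch of the difference, assuming a JavaScript engine that 
still provides the legacy escape() (current browsers and Node.js do). 
The characters are written as \u escapes so the result does not depend 
on the encoding of the source file; the variable names are only 
illustrative.

```javascript
// "ä" (U+00E4) fits into Latin1, "€" (U+20AC) does not.
const aUmlaut = '\u00e4';
const euro = '\u20ac';

// escape() emits the Latin1 byte for "ä" and a non-standard %uXXXX
// sequence for anything above U+00FF.
const legacyUmlaut = escape(aUmlaut);        // '%E4'
const legacyEuro = escape(euro);             // '%u20AC'

// encodeURIComponent() always emits the UTF8 byte sequence.
const utf8Umlaut = encodeURIComponent(aUmlaut);  // '%C3%A4'
const utf8Euro = encodeURIComponent(euro);       // '%E2%82%AC'
```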

So I suggest we change all PHP code that generates URL-encoded strings 
so that it prepares them for decodeURIComponent(), and vice versa. In 
effect this means that any string prepared for JS has to be converted to 
UTF8 before going through rawurlencode. In JS we only need to replace 
unescape() with decodeURIComponent().
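
To illustrate the JS side, here is a small sketch of what happens to a 
string that PHP has converted to UTF8 and passed through rawurlencode. 
The literal '%C3%A4' stands in for such a string (UTF8 "ä"); the 
variable names are made up for the example.

```javascript
// Assumed output of rawurlencode() for a UTF8-encoded "ä".
const fromPhp = '%C3%A4';

// decodeURIComponent() interprets the bytes as UTF8 and yields one character.
const decoded = decodeURIComponent(fromPhp);   // '\u00e4'

// unescape() treats every %XX as a separate Latin1 character instead.
const misdecoded = unescape(fromPhp);          // '\u00c3\u00a4'

// A Latin1-encoded "ä" (%E4) is not valid UTF8, so decodeURIComponent()
// throws a URIError. This is why PHP must convert to UTF8 first.
let latin1Rejected = false;
try {
  decodeURIComponent('%E4');
} catch (e) {
  latin1Rejected = e instanceof URIError;
}
```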

When we receive data from JS that has been encoded with 
encodeURIComponent() (instead of escape()), we need to convert the data 
from UTF8 to the current charset after applying rawurldecode. This is 
mostly the case for wizards.
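
The following sketch shows which raw bytes reach the server in each 
case. percentBytes() is a hypothetical helper written only for this 
illustration (it is not part of TYPO3 or of JavaScript); it mimics the 
byte-level result of rawurldecode for plain ASCII and %XX input.

```javascript
// Hypothetical helper: list the byte values carried by a percent-encoded
// string, i.e. roughly what rawurldecode() would hand to PHP.
function percentBytes(encoded) {
  const bytes = [];
  for (let i = 0; i < encoded.length; i++) {
    if (encoded[i] === '%') {
      // Two hex digits follow the percent sign.
      bytes.push(parseInt(encoded.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      bytes.push(encoded.charCodeAt(i));
    }
  }
  return bytes;
}

const aUmlaut = '\u00e4'; // "ä"

// escape(): a single Latin1 byte, usable as-is on a Latin1 backend.
const viaEscape = percentBytes(escape(aUmlaut));                 // [0xE4]

// encodeURIComponent(): two UTF8 bytes, which a non-UTF8 backend has to
// convert to its own charset after rawurldecode.
const viaComponent = percentBytes(encodeURIComponent(aUmlaut));  // [0xC3, 0xA4]
```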

To ease the transition I am in favour of a helper in $LANG that 
encapsulates the necessary conversions and en-/decoding.


* escape() only works with Latin1. In fact, even data in UTF8 will be 
converted to Latin1 before encoding (Firefox).
