Bugs in stringByAddingPercentEscapesUsingEncoding?

I found something wrong with NSString’s stringByAddingPercentEscapesUsingEncoding: method.

Take a look at this screenshot.

[Screenshot: percent-escape-bug]

As you can see, I wanted to convert a clipDescription which holds “AT&T Giants HD 1080i 50 LGOP”.

When this was processed by calling stringByAddingPercentEscapesUsingEncoding:, it returned “AT&T%20Giants%20HD%201080i%2050%20LGOP”. So the & is not percent-escaped. When this string is then handled by an XML tool such as tidy or xmllint, the tool complains that the string is malformed.

So I had to write my own percent-escaping routine.

At first I thought this was a bug, but then I thought twice. I guess the method is just meant for a different purpose.
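To make the difference concrete, here is a minimal sketch. It is written in Python rather than Cocoa, with urllib.parse.quote standing in for the escaping step, so it only imitates the behavior described above and is not Apple's implementation. The variable names are made up for illustration.

```python
from urllib.parse import quote

clip_description = "AT&T Giants HD 1080i 50 LGOP"

# Assumption: treating '&' as a safe character mimics the result the post
# reports for stringByAddingPercentEscapesUsingEncoding:.
url_style = quote(clip_description, safe="&")

# A stricter escape with no extra safe characters: '&' becomes %26.
strict = quote(clip_description, safe="")

print(url_style)  # AT&T%20Giants%20HD%201080i%2050%20LGOP
print(strict)     # AT%26T%20Giants%20HD%201080i%2050%20LGOP
```

The second form is roughly what a hand-written escaping routine would need to produce when the & is part of the content and not a delimiter.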

For example, a URL string can contain the & character. Is it OK to send it as it is, or should it be percent-escaped? I didn’t test it, but I noticed that many web sites have URLs with the & character not percent-escaped. So stringByAddingPercentEscapesUsingEncoding: is probably meant for handling URL strings.
(ADDED: I noticed that some people reach this post from StackOverflow, so I am updating this part to avoid confusion or a wrong impression. The & in those web site URLs separates the parameters passed to the URL. When the GET method is used, parameters are joined with & characters. If that & were percent-escaped, it would mean something totally different, so the method doesn’t percent-escape & characters. It would be better if Apple had named the method something like URLstringByAddingPercentEscapesUsingEncoding to convey its purpose more clearly.)

However, it would be much nicer if they explained in the documentation what the method is meant to be used for.
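The delimiter-versus-content distinction can be sketched as follows. Again this is Python, not Cocoa, and build_query is a hypothetical helper written only for this illustration. Each parameter is escaped on its own and then joined with literal & characters, so an & inside a value survives the round trip.

```python
from urllib.parse import quote, parse_qs

def build_query(params):
    """Escape each name and value separately, then join with literal '&'."""
    return "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in params.items()
    )

query = build_query({"clip": "AT&T Giants", "format": "HD 1080i"})
print(query)            # clip=AT%26T%20Giants&format=HD%201080i
print(parse_qs(query))  # {'clip': ['AT&T Giants'], 'format': ['HD 1080i']}

# Escaping an already assembled query string also escapes the delimiter,
# so the receiver sees one parameter instead of two.
blind = quote("param1=dance&param2=jump", safe="=")
print(blind)            # param1=dance%26param2=jump
print(parse_qs(blind))  # {'param1': ['dance&param2=jump']}
```

A method that receives only the finished string cannot tell which of the two cases it is looking at, which is presumably why it leaves & alone.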

6 responses to this post.

  1. the & char in a url is a delimiter for each argument in a REST call.
    so the url domain.com/test.php?param1=dance&param2=jump

    will result (in php) in:
    $_GET[‘param1’] contains “dance”
    $_GET[‘param2’] contains “jump”

    if you’d escape the &-sign you would come up with something like:

    $_GET[‘param1’] contains “dance&param2=jump”
    $_GET[‘param2’] would not be defined

    so blindly escaping the & is not a good idea.

    just percent-escape each parameter’s content on its own (with your custom escaping method to filter out any rogue &’s :), then build a url-string with these escaped parameters and use stringByAddingPercentEscapesUsingEncoding: on this url-string. (though the stringByAddingPercentEscapesUsingEncoding: shouldn’t be necessary then.)

    as far as i know stringByAddingPercentEscapesUsingEncoding: is only meant to build valid escaped urls from your string. it can’t know if your &-signs are meant as delimiters or as content, so it assumes you know what you’re doing and thus your &’s must be delimiters. so it doesn’t escape them.


    • Posted by jongampark on March 22, 2009 at 9:00 AM

      Thank you for your comment!

      Yeah.. the & is a delimiter in a URL, but what I was not sure of was whether it is also OK to percent-escape the delimiter in a URL. It was about 10 years ago that I often wrote CGI scripts. I don’t remember exactly, but I think I saw the & escaped, with web browsers and web servers still interpreting the percent-escaped & correctly as a delimiter. Actually, I checked Wikipedia about this, and it mentions the same issue indirectly, but it was not clear.

      Anyway, Apple’s documentation should clearly mention the purpose of the method. I searched for this issue on Google, and many people had asked about it; some even said that Apple people acknowledged that it was a bug. Then they were wrong! :)

      Anyway, thank you very much again for clarifying this!
      It is constructive!! ;)


  2. oh, one small addition: if you’re just building local “urls” to files and you don’t call any scripts over these urls, not escaping the & should be fine. (as it is a legit char in a url … it just has some special meaning to the mainstream of browsers, servers and server side scripts)

    but then i’d escape either way – just to be sure :)


  3. Posted by John Doe on April 25, 2009 at 10:29 PM

    Did you know about CFURLCreateStringByAddingPercentEscapes?


  4. […] Escape Sequence 2 11 06 2009 Before I wrote about bug in stringByAddingPercentEscapesUsingEncoding. A log still says that there are some people who visit my site from a MacRumors forum. So, I would […]


  5. I really like when people are expressing their opinion and thought. So I like the way you are writing
