Also, from what I can tell, we still spit out html entered by users, otherwise forums woundn't work very well. How is that safe? Because we filter _on the way in_. Just add 'nohtml' to variables that shouldn't contain html, where the database wouldn't otherwise barf on that type of input.
I've read, several times, the reasoning for using noquote. Mostly it was a rush job to get some new code out to a single client. There was no history, no legacy code to deal with. There were no other users to consider. But several other things were stated in the discussion: unquoted user strings work 95% of the time, and using noquote as a fix, breaks existing code. Instead of tracking down the 5% of the remaining problem, a quick fix was introduced, that would produce more visible errors.
Of course the main problem is that the noquote fix introduces its own version of a hard to find problem: over-quoting.
This whole problem is caused by sloppy development. Creating a list of how to avoid this under/over quote issue would be a good start. But XSS isn't going away any time soon, until a new HTML standard is drafted, and until all current browsers are removed from use.
Here's a simple example. I'm running 5.0.0b1a at 127.0.0.1:8000. In my home directory I create an html file which contains, among other things:
I haven't logged in in a while, so my session is expired on that server. When I load file:///home/tom/test.html, the access log gets two new entries:
127.0.0.1 - - [24/Dec/2003:09:56:03 -0800] "GET /admin/ HTTP/1.1" 302 364 "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830"
127.0.0.1 - - [24/Dec/2003:09:56:11 -0800] "GET /register/?return%5furl=http%3a%2f%2f127%2e0%2e0%2e1%3a8000%2fadmin%2f HTTP/1.1" 200 5964 "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830"
Now I login to the website using my browser, and reload the file, in another browser window here is the result:
127.0.0.1 - - [24/Dec/2003:10:02:08 -0800] "GET /admin/ HTTP/1.1" 200 3185 "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020830"
Huh? I'm using the noquote patch, and yet I just fell victim to XSS. This is why I say it is social engineering. All you have to do is figure out how to get your victim to follow a link, or maybe just read your email, if they allow their email client to load images.