Nearly every web application allows its users to upload files. If not handled carefully, these files can be used for cross-site scripting attacks against other visitors of the site who use Microsoft Internet Explorer.
The problem is that under certain circumstances IE executes HTML or JavaScript code hidden in image files. This is because there are several ways to determine the content type of a file: the filename extension, the Content-Type HTTP header field, the first few bytes with known patterns (the so-called signature), or MIME-sniffing, which analyses the first 256 bytes. MIME-sniffing was introduced in IE 4, but it is carried out only if the user requests the URL of the file directly, so MIME-sniffing won’t be done for image tags (IMG) in HTML.
MIME-sniffing was originally a feature and has been around for a while, but it has become a problem now that more and more sites allow their users to upload files. If a file’s extension, signature and Content-Type differ, IE will determine the MIME type from its first 256 bytes. So if an uploaded image contains HTML and/or JavaScript code and the user clicks on a link to download the file, IE will execute that code.
You can try this out yourself in this Heise Security article.
Microsoft has identified the problem (already!) and plans to fix it in IE 8. However, IE 6 and IE 7 are still in heavy use. To fend off these attacks, the Heise Security article proposes several options. One of the most effective is to analyse file uploads closely by reading the first 256 bytes and rejecting the file if there are HTML tags in it. To find HTML tags you will probably use a regular expression. Here is an example regex to identify matching opening and closing tags. This is meant as a starting point, because the attacker may omit the closing tag. It is nevertheless an effective countermeasure, because the attacker will most likely give up if the first try doesn’t work.
Three lines of code may already fend this attack method off:
File.open("security_logo_en.jpg", "rb") do |f|
  # Reject the upload if its first 256 bytes contain a pair of HTML tags.
  puts "reject file" if f.read(256) =~ /<(.)+>(.)*<\/(.)+>/i
end