On 5/22/07, Brian Eaton <email@example.com> wrote:
> What surprises me is that not all codepage conversion libraries are
> doing the same thing with this data. I've tested a few, and some of
> them are canonicalizing full-width unicode to ASCII equivalents, and
> others are not. Where we run into trouble is where one component
> doing input validation uses one technique for canonicalization, and
> another component trying to do the actual work is using a different
> technique. Figuring out exactly what different application platforms
> are doing would help to figure out how much of a problem this poses in
> the real world.
> Somebody ought to put together a test suite for this, just to see what
> different vendors have done.
Funny you should say that. :) That's exactly one of the things we are working on, but specifically from a "software defect with security implications" perspective. What you probably need are unit-test-style suites that ram a huge character set through the different encoding types and see what happens. I am focusing on testing a small subset of that (primarily metacharacter transforms) across a lot of software, as efficiently as possible. Anyway...
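As a rough sketch of the kind of unit test I mean (Python here, using the stdlib's NFKC normalization as a stand-in canonicalizer; the payload and the naive filter are hypothetical):

```python
import unicodedata

# Full-width forms (U+FF1C, U+FF1E) look harmless to a naive ASCII
# filter, but NFKC canonicalization maps them to ASCII metacharacters.
payload = "\uFF1Cscript\uFF1E"  # "<script>" written with full-width brackets

assert "<" not in payload  # a byte/char-level "<" filter passes it

canonical = unicodedata.normalize("NFKC", payload)
assert canonical == "<script>"  # after canonicalization it is live markup
```

Whether a given library performs this mapping at all is exactly the inconsistency Brian describes: some codepage conversion layers fold full-width to ASCII, others pass it through untouched.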
This subject gets really confusing because people mean many different things when they say "encoding attack" or "encoding bypass". The two common meanings are:
1. Encoding an attack so it evades a detection or filtering layer (IDS/IPS/WAF signature evasion).
2. Encoding an attack so that some decoding step in the target software turns it into something the target parser will act on.
#2 gets confusing because people myopically focus on the parser/interpreter
that is the *target* of the attack, and debate that parser's ability to
decode a given input encoding type... which may have nothing to do with the intermediary functions & transforms performed on the attack data on the way to the target parser. Things like canonicalization, or normalizing data for full-text searching,
are examples of key intermediary transforms performed upon one's data.
This leaves the sub-points:
2.1 What parser are you targeting?
2.2 What encoding types will that parser interpret/execute?
2.3 What intermediary decoding/canonicalization steps will *all* software
involved in the transaction, prior to the target parser, take?
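A minimal sketch of why sub-point 2.3 bites (Python, with urllib.parse.unquote standing in for whatever decoder each component in a hypothetical pipeline uses): a validator that checks the data after one decode, sitting in front of a component that decodes it again.

```python
from urllib.parse import unquote

# Classic double-decode pattern: the front-end validator rejects "../",
# but a back-end component URL-decodes the path a second time.
raw = "%252e%252e%2fsecret"  # "%2e%2e/secret" after one decode

once = unquote(raw)          # what the validator sees
assert "../" not in once     # validator happily passes it through

twice = unquote(once)        # what the back end actually acts on
assert twice == "../secret"  # the traversal sequence reappears
```

Each component behaves "correctly" in isolation; the defect only exists in the composition.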
Sub-point 2.3 is a real bear, eh? People under-estimate this one. I've received
several direct inquiries from folks who usually ask some form of the question:
:: Which of these is responsible for the issue? ::
+ Is it the client? (Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7)
+ Is it the protocol? (Content-Type: text/html; charset=utf-8; charset=iso-8859-1, etc)
+ Is it the web server? (IIS Hex URL & Unicode decode/double-decode issues)
+ Is it the framework? (.NET's Hex canonicalization issue from 2005)
+ Is it the language? (glued together open source PHP crap; huge monolithic J2EE projects)
+ Is it our custom code? (insert random canonicalization library; add a random canonicalization step to your software for a situational normalization issue you ran into... but make it global for all data passing through those functions)
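To illustrate the web-server entry in that list: the IIS Unicode traversal bugs came from a lenient UTF-8 decoder accepting non-shortest-form ("overlong") encodings. A quick Python check shows why strict decoders refuse them (the byte sequence here is the well-known overlong "/"):

```python
# Overlong UTF-8: 0xC0 0xAF is a non-shortest-form encoding of "/".
# Strict decoders (like Python's) reject it, but historically lenient
# decoders accepted it -- the root of the IIS Unicode decode issues.
overlong_slash = b"\xc0\xaf"

try:
    overlong_slash.decode("utf-8")
    accepted = True
except UnicodeDecodeError:
    accepted = False

assert accepted is False  # strict decoders refuse non-shortest forms
```

A component that accepted the overlong form would see a "/" where a byte-level filter upstream saw nothing of the sort.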
One part of this, then, is clearly defining your target.
The other part is evaluating the transforms performed on your data, and
the transforms & canonicalization your software is *capable* of. We can
directly deduce this in some situations, I believe, given a valid data type
and the ability to correlate output, but in cases where we are targeting
a parser internal to the system (e.g., a SQL interpreter) this will have to be inferred from some state change, or context change, which is going to be very difficult to do in an automated fashion with any reliability.
But, definitely, that problem is being worked on.
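One way to sketch the output-correlation idea (purely illustrative; the function and its probe set are hypothetical): send several encoded variants of a marker character and see which ones the system reflects back in canonical form.

```python
from urllib.parse import unquote

def detect_canonicalization(send):
    """send(payload) -> the string the system reflects back.
    Reports which encoded variants of '<' come back as a literal '<',
    revealing which decode/canonicalization steps sit in the path."""
    variants = {
        "fullwidth":  "\uff1c",   # full-width '<' (U+FF1C)
        "url":        "%3c",
        "double-url": "%253c",
    }
    return [name for name, v in variants.items() if "<" in send(v)]

# Simulate a target whose pipeline URL-decodes exactly once:
print(detect_canonicalization(unquote))  # -> ['url']
```

This only works when the transform's output is observable; for a parser buried inside the system, you're back to inferring from state changes.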
I think this is a classic case where run-time black-box analysis is essential.
There is simply no way a source code audit, controls audit, or binary analysis
is going to find the majority of issues in this case (when evaluating
production software deployments), because they are usually the result of emergent behaviors of complex, glued-together systems with many different components (including even things like firewalls/IPS that may "fix" or "re-code"
protocols in transit, etc., assuming they even understand the protocol).

--
Arian Evans
software security stuff

"Diplomacy is the art of saying "Nice doggie" until you can find a rock." -- Will Rogers
_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/