[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86emul/fuzz: adjust canonicalization in sanitize_input()




> On Mar 29, 2019, at 4:14 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> 
>>>> On 29.03.19 at 16:42, <George.Dunlap@xxxxxxxxxx> wrote:
>>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>> On 29.03.19 at 16:14, <George.Dunlap@xxxxxxxxxx> wrote:
>>>> FAOD:
>>>> 1. I don’t oppose this, but
>>>> 2. I don’t support it either; however,
>>>> 3. I don’t think my Ack is necessary.
>>> 
>>> Well, preferably I would address your concerns despite 3. So could
>>> you clarify what you would suggest instead? Keep things as they
>>> are? Drop all canonicalization? I've basically tried to find a middle
>>> ground between the two extremes.
>> 
>> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t 
>> want my silence interpreted as a nack.
>> 
>> I don’t think it will help fuzzing to remove canonicalization of ebp; it may 
>> help to have it in.  In fact I’d prefer to CANONICALIZE_MAYBE() more 
>> registers.
> 
> But canonicalization removes potentially interesting bits from fuzzed
> input, which is liable to be relevant if a register is used for other than
> a base address in an effective address calculation. As an example,
> take BSR: You'd remove the possibility to get results in the range
> [48,62]. Or take the XSA-195 case: The memory range covered by
> BT{,C,R,S} is dramatically much smaller when the register holding the
> bit offset first got canonicalized.
> 
> Granted the canonicalization is conditional, so it wouldn't be making it
> entirely impossible to get into such a state, but since fuzzing is all
> about likelihood, we'd like to avoid reducing our chances of hitting
> interesting cases.

OK, since you want to discuss it, let’s discuss it. :-)

Just to summarize everything:

“Canonical” addresses must have bits 48-63 identical to bit 47.  This means the 
chance of any “random” 64-bit number being canonical is 1 in 65536 (2^16).

A large number of registers (rax, rbp, &c) *can* be used as memory addresses; a 
handful of registers (esp, rip) are designated to be used primarily for that 
purpose. 

It is asserted that the canonicity* of any registers has no effect on the 
codepaths the emulator takes; it simply does calculations and passes the 
results to various callbacks (e.g., insn_fetch, read, write, &c).  It is these 
callbacks which, in Xen, check the calculated linear address for canonicity and 
return an error for non-canonical addresses.

[* This is the form of this word my spell-checker doesn’t red-underline]

However, the whole point of testing is to find places where your assumptions 
are violated.  If the emulator ever *did* behave differently for canonical and 
non-canonical addresses, or near the boundary of canonicity, we’d want those 
behaviors to be tested.  In fact, it’s *possible* that the emulator already 
does have this kind of behavior in it, and we just haven’t realized it.

Fuzzing is probabilistic; theoretically, if we run the fuzzer long enough, 
we’ll hit all “interesting” inputs.   But practically speaking, “focusing” the 
fuzzing such that “interesting” inputs are found more quickly seems like a good 
idea.

AFL’s fuzzing takes two modes:
* “Deterministic” fuzzing, where individual bits are flipped, bytes / words / 
&c are incremented and decremented, bytes / words / &c are replaced with 
“interesting” numbers like 0, 1, -1, and so on
* “Random” fuzzing, where completely random changes are made: random values 
added, random bits of the file removed, random bits of the file duplicated, &c

At the moment, all registers can be modified at start to any random 64-bit 
value.  But there’s also a bitfield near the beginning that controls whether 
some of those registers (namely rip, rsp, and rbp) are “canonicalized”: If the 
register’s bitfield is 1, bits 48-63 will be set to the value of bit 47; 
otherwise the register will be left alone.  This allows AFL to choose whether 
the register will be canonicalized or not.

Effectively, a “maybe canonicalized” register has a 50% chance (plus change) of 
being “canonical”, and a 50% chance (minus a bit) of being non-canonical.

This canonicalization has several effects.

First of all, it means that any interesting “canonical” addresses (if they 
exist) has approximately a 65536x higher chance of being selected;  or to put 
it a different way, AFL will find any interesting “canonical” addresses 65536x 
faster than otherwise.

On the other hand, any interesting non-canonical addresses (or non-address 
values) have a 50% lower chance of being selected; AFL will take twice as long 
to find an interesting “non-canonical” / non-address values address as it would 
otherwise.

The other effect, at the moment, is to add another branch to fuzz-emulate.c.  
This means that the very act of adding a select-able bit to canonicalize (or 
not) a register adds another path that AFL will consider “interesting”; given 
two otherwise-identical inputs, A and A’, where one is canonicalized and the 
other not, AFL will consider them two different test cases, *even if they go 
through the same paths in x86_emulate()*.  It will keep them both in its 
“queue", and will spend time mutating them both.

For random fuzzing, even if x86_emulate() is the same, I don’t think this will 
be wasted effort: 5 seconds fuzzing A and then 5 seconds fuzzing A’ will be the 
same as 10 seconds of fuzzing A.  It may skew AFL’s priorities a bit by fuzzing 
something for 10 seconds rather than 5, but it still has a chance of finding 
new data.

For deterministic fuzzing, if x86_emulate() is the same, then there is 
duplicated work: the 5 seconds spent fuzzing A’ will go through exactly the 
same sets of mutations as the 5 seconds spent fuzzing A.

(It may be possible to exclude fuzz-emul.c from instrumentation, in which case 
this effect would go away.)

Now, we have two possibilities: Either canonicity does not and never will 
affect x86_emulate(); or it may, either now or later.

And we have a few choices: We maybe-canonicalize no registers; we 
maybe-canonicalize only registers that are almost always used as an address; or 
we maybe-canonicalize more (maybe all) registers.

Let us call these C0, C1, and C2 (for “canonicity matters, and we canonicalize 
none, some, or most, respectively"), and N0, N1, and N2 (for “canonicity 
doesn’t mater, and we canonicalize none, some, or most respectively”).  (And 
remember, N vs C is a fact about the world that we don’t know; and 0-2 are 
options we can choose.)

N0: Everything is the fastest: “interesting” non-address values are found the 
fastest, AFL isn’t fooled into double-random-fuzzing some entries, AFL doesn’t 
waste time double-deterministic-fuzzing entries.

N1: “Interesting” non-address values take twice as long to find; but since the 
registers in question are used almost exclusively for addresses, this is less 
important.  Canonicity has no effect, so it doesn’t matter whether we choose 
canonical or non-canonical values for the registers.  We do spend twice as long 
random-fuzzing entries (for some benefit), and we twice as much time 
deterministically fuzzing some entries to no benefit; but only when fuzzing 
these address-mostly registers.

N2: “Interesting” non-address values take twice as long to find.  We spend 
twice as long random-fuzzing entries (to some benefit), and twice as long 
deterministically fuzzing some entries (to no benefit).

C0: Finding “interesting” non-address values is as fast as N0.  However, 
finding “interesting” address values take 65536 times as long.  Extra time 
spent fuzzing finds useful differences.

C1: “Interesting” non-address values for address-mostly registers take twice as 
long to find; but “interesting” address-values for mostly-address registers 
take about the same amount of time to find (i.e., 65536x faster than C0).  
“Interesting” address values for other registers take 65536x times as long to 
find.  Again, extra fuzzing time is spent usefully.

C2: “Interesting” non-address values for all registers take twice as long to 
find; but “interesting” address values for all registers take that same amount 
of time.  Again, extra fuzzing time is spent usefully.

If N is true, and we go with option 2, it will take on average 2 hours to find 
an “interesting" value that would have taken 1 hour with option 0.

If C is true, and we go with 0, it will take on average 65536 hours (7.48 
years) to find an “interesting” value that would have taken 1 hour in option 2.

I think the chances of C are less than 50%; but I think it's certainly higher 
than 1 in 65536.

So given that we don’t know whether we’re in case N or C, from a cost/benefits 
analysis, I think option 1 is a minimum; and that 2 is an obvious choice — 
particularly if we can get rid of the “wasted” fuzzing, perhaps by not 
instrumenting fuzz-emul.c.

Obviously there’s a lot of hand-waving and guessing here.  My understanding is 
that in practice, ebp is *mostly* used as an address and not a general-purpose 
register, which is why I maybe-canonicalized it in the first place, and would 
rather it remain so (option 1).  But I can see the arguments for classifying it 
as a general-purpose register; so as long as rip and rsp are 
maybe-canonicalized, I won’t spend a lot of time arguing against this patch.

This was very long — I hope it makes sense. :-)

Peace,
 -George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.