[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86emul/fuzz: adjust canonicalization in sanitize_input()


  • To: George Dunlap <George.Dunlap@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 29 Mar 2019 20:40:15 +0000
  • Autocrypt: addr=andrew.cooper3@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFLhNn8BEADVhE+Hb8i0GV6mihnnr/uiQQdPF8kUoFzCOPXkf7jQ5sLYeJa0cQi6Penp VtiFYznTairnVsN5J+ujSTIb+OlMSJUWV4opS7WVNnxHbFTPYZVQ3erv7NKc2iVizCRZ2Kxn srM1oPXWRic8BIAdYOKOloF2300SL/bIpeD+x7h3w9B/qez7nOin5NzkxgFoaUeIal12pXSR Q354FKFoy6Vh96gc4VRqte3jw8mPuJQpfws+Pb+swvSf/i1q1+1I4jsRQQh2m6OTADHIqg2E ofTYAEh7R5HfPx0EXoEDMdRjOeKn8+vvkAwhviWXTHlG3R1QkbE5M/oywnZ83udJmi+lxjJ5 YhQ5IzomvJ16H0Bq+TLyVLO/VRksp1VR9HxCzItLNCS8PdpYYz5TC204ViycobYU65WMpzWe LFAGn8jSS25XIpqv0Y9k87dLbctKKA14Ifw2kq5OIVu2FuX+3i446JOa2vpCI9GcjCzi3oHV e00bzYiHMIl0FICrNJU0Kjho8pdo0m2uxkn6SYEpogAy9pnatUlO+erL4LqFUO7GXSdBRbw5 gNt25XTLdSFuZtMxkY3tq8MFss5QnjhehCVPEpE6y9ZjI4XB8ad1G4oBHVGK5LMsvg22PfMJ ISWFSHoF/B5+lHkCKWkFxZ0gZn33ju5n6/FOdEx4B8cMJt+cWwARAQABtClBbmRyZXcgQ29v cGVyIDxhbmRyZXcuY29vcGVyM0BjaXRyaXguY29tPokCOgQTAQgAJAIbAwULCQgHAwUVCgkI CwUWAgMBAAIeAQIXgAUCWKD95wIZAQAKCRBlw/kGpdefoHbdD/9AIoR3k6fKl+RFiFpyAhvO 59ttDFI7nIAnlYngev2XUR3acFElJATHSDO0ju+hqWqAb8kVijXLops0gOfqt3VPZq9cuHlh IMDquatGLzAadfFx2eQYIYT+FYuMoPZy/aTUazmJIDVxP7L383grjIkn+7tAv+qeDfE+txL4 SAm1UHNvmdfgL2/lcmL3xRh7sub3nJilM93RWX1Pe5LBSDXO45uzCGEdst6uSlzYR/MEr+5Z JQQ32JV64zwvf/aKaagSQSQMYNX9JFgfZ3TKWC1KJQbX5ssoX/5hNLqxMcZV3TN7kU8I3kjK mPec9+1nECOjjJSO/h4P0sBZyIUGfguwzhEeGf4sMCuSEM4xjCnwiBwftR17sr0spYcOpqET ZGcAmyYcNjy6CYadNCnfR40vhhWuCfNCBzWnUW0lFoo12wb0YnzoOLjvfD6OL3JjIUJNOmJy RCsJ5IA/Iz33RhSVRmROu+TztwuThClw63g7+hoyewv7BemKyuU6FTVhjjW+XUWmS/FzknSi dAG+insr0746cTPpSkGl3KAXeWDGJzve7/SBBfyznWCMGaf8E2P1oOdIZRxHgWj0zNr1+ooF /PzgLPiCI4OMUttTlEKChgbUTQ+5o0P080JojqfXwbPAyumbaYcQNiH1/xYbJdOFSiBv9rpt TQTBLzDKXok86LkCDQRS4TZ/ARAAkgqudHsp+hd82UVkvgnlqZjzz2vyrYfz7bkPtXaGb9H4 Rfo7mQsEQavEBdWWjbga6eMnDqtu+FC+qeTGYebToxEyp2lKDSoAsvt8w82tIlP/EbmRbDVn 7bhjBlfRcFjVYw8uVDPptT0TV47vpoCVkTwcyb6OltJrvg/QzV9f07DJswuda1JH3/qvYu0p vjPnYvCq4NsqY2XSdAJ02HrdYPFtNyPEntu1n1KK+gJrstjtw7KsZ4ygXYrsm/oCBiVW/OgU g/XIlGErkrxe4vQvJyVwg6YH653YTX5hLLUEL1NS4TCo47RP+wi6y+TnuAL36UtK/uFyEuPy wwrDVcC4cIFhYSfsO0BumEI65yu7a8aHbGfq2lW251UcoU48Z27ZUUZd2Dr6O/n8poQHbaTd 6bJJSjzGGHZVbRP9UQ3lkmkmc0+XCHmj5WhwNNYjgbbmML7y0fsJT5RgvefAIFfHBg7fTY/i kBEimoUsTEQz+N4hbKwo1hULfVxDJStE4sbPhjbsPCrlXf6W9CxSyQ0qmZ2bXsLQYRj2xqd1 bpA+1o1j2N4/au1R/uSiUFjewJdT/LX1EklKDcQwpk06Af/N7VZtSfEJeRV04unbsKVXWZAk uAJyDDKN99ziC0Wz5kcPyVD1HNf8bgaqGDzrv3TfYjwqayRFcMf7xJaL9xXedMcAEQEAAYkC HwQYAQgACQUCUuE2fwIbDAAKCRBlw/kGpdefoG4XEACD1Qf/er8EA7g23HMxYWd3FXHThrVQ HgiGdk5Yh632vjOm9L4sd/GCEACVQKjsu98e8o3ysitFlznEns5EAAXEbITrgKWXDDUWGYxd pnjj2u+GkVdsOAGk0kxczX6s+VRBhpbBI2PWnOsRJgU2n10PZ3mZD4Xu9kU2IXYmuW+e5KCA vTArRUdCrAtIa1k01sPipPPw6dfxx2e5asy21YOytzxuWFfJTGnVxZZSCyLUO83sh6OZhJkk b9rxL9wPmpN/t2IPaEKoAc0FTQZS36wAMOXkBh24PQ9gaLJvfPKpNzGD8XWR5HHF0NLIJhgg 4ZlEXQ2fVp3XrtocHqhu4UZR4koCijgB8sB7Tb0GCpwK+C4UePdFLfhKyRdSXuvY3AHJd4CP 4JzW0Bzq/WXY3XMOzUTYApGQpnUpdOmuQSfpV9MQO+/jo7r6yPbxT7CwRS5dcQPzUiuHLK9i nvjREdh84qycnx0/6dDroYhp0DFv4udxuAvt1h4wGwTPRQZerSm4xaYegEFusyhbZrI0U9tJ B8WrhBLXDiYlyJT6zOV2yZFuW47VrLsjYnHwn27hmxTC/7tvG3euCklmkn9Sl9IAKFu29RSo d5bD8kMSCYsTqtTfT6W4A3qHGvIDta3ptLYpIAOD2sY3GYq2nf3Bbzx81wZK14JdDDHUX2Rs 6+ahAA==
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 29 Mar 2019 20:40:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 29/03/2019 19:20, George Dunlap wrote:
>
>> On Mar 29, 2019, at 4:14 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>
>>>>> On 29.03.19 at 16:42, <George.Dunlap@xxxxxxxxxx> wrote:
>>>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>>> On 29.03.19 at 16:14, <George.Dunlap@xxxxxxxxxx> wrote:
>>>>> FAOD:
>>>>> 1. I don’t oppose this, but
>>>>> 2. I don’t support it either; however,
>>>>> 3. I don’t think my Ack is necessary.
>>>> Well, preferably I would address your concerns despite 3. So could
>>>> you clarify what you would suggest instead? Keep things as they
>>>> are? Drop all canonicalization? I've basically tried to find a middle
>>>> ground between the two extremes.
>>> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t 
>>> want my silence interpreted as a nack.
>>>
>>> I don’t think it will help fuzzing to remove canonicalization of ebp; it 
>>> may 
>>> help to have it in.  In fact I’d prefer to CANONICALIZE_MAYBE() more 
>>> registers.
>> But canonicalization removes potentially interesting bits from fuzzed
>> input, which is liable to be relevant if a register is used for other than
>> a base address in an effective address calculation. As an example,
>> take BSR: You'd remove the possibility to get results in the range
>> [48,62]. Or take the XSA-195 case: The memory range covered by
>> BT{,C,R,S} is dramatically much smaller when the register holding the
>> bit offset first got canonicalized.
>>
>> Granted the canonicalization is conditional, so it wouldn't be making it
>> entirely impossible to get into such a state, but since fuzzing is all
>> about likelihood, we'd like to avoid reducing our chances of hitting
>> interesting cases.
> OK, since you want to discuss it, let’s discuss it. :-)
>
> Just to summarize everything:
>
> “Canonical” addresses must have bits 48-63 identical to bit 47.  This means 
> the chance of any “random” 64-bit number being canonical is 1 in 65536 (2^16).

As with everything, this is more complicated.  With 5-level paging
support in IceLake hardware, we now get a different type of canonical
address, where the sign extension is from bit 57, rather than 48.  In
practice, that now means that, depending on CR4.LA57 (which is the
control bit), we may have a 1 in 2^7 chance of being canonical.

>
> A large number of registers (rax, rbp, &c) *can* be used as memory addresses; 
> a handful of registers (esp, rip) are designated to be used primarily for 
> that purpose. 
>
> It is asserted that the canonicity* of any registers has no effect on the 
> codepaths the emulator takes; it simply does calculations and passes the 
> results to various callbacks (e.g., insn_fetch, read, write, &c).  It is 
> these callbacks which, in Xen, check the calculated linear address for 
> canonicity and return an error for non-canonical addresses.

Correct.  Registers are just numbers without any specific meaning.  It
is only when considering a resulting memory operand (which can be
derived from multiple registers and embedded constants) that canonicity*
is relevant.

As said before, we'll shortly gain a second type of canonical address,
but at the end of the day, all it amounts to is "is this a valid linear
address or not".

> [* This is the form of this word my spell-checker doesn’t red-underline]

[* My spell-checker doesn't like it, but meh - the meaning is clear]

> However, the whole point of testing is to find places where your assumptions 
> are violated.  If the emulator ever *did* behave differently for canonical 
> and non-canonical addresses, or near the boundary of canonicity, we’d want 
> those behaviors to be tested.  In fact, it’s *possible* that the emulator 
> already does have this kind of behavior in it, and we just haven’t realized 
> it.

Accepting this point, we do have a couple of places in x86_emulate.c
which check canonicity.

* validate_far_branch()
* protmode_load_seg()
* LIDT/LGDT
* SYSEXIT
* WR{FS,GS}BASE

validate_far_branch() is strictly on %rip.

protmode_load_seg() and LIDT/LGDT are both to do with the content about
to be loaded into a system segment.  WR{FS,GS}BASE is for loading into a
user segment.  These should arguably be deferred to the
->write_segment() hook.

The use in SYSEXIT is explicitly documented in the manual, but they
shouldn't be necessary.  The check for %rdx will be done as part of
validate_far_branch(), whereas SYSEXIT shouldn't care about %rcx, and be
able to load a non-canonical %rsp.  In some copious free time, I'll try
seeing how hardware actually behaves.

All of these uses result in a difference in behaviour which ALF is
liable to pick up on.

I'm afraid that I'm not sufficiently caffeinated to work out if/how this
affects the rest of your reasoning.

> Fuzzing is probabilistic; theoretically, if we run the fuzzer long enough, 
> we’ll hit all “interesting” inputs.   But practically speaking, “focusing” 
> the fuzzing such that “interesting” inputs are found more quickly seems like 
> a good idea.
>
> AFL’s fuzzing takes two modes:
> * “Deterministic” fuzzing, where individual bits are flipped, bytes / words / 
> &c are incremented and decremented, bytes / words / &c are replaced with 
> “interesting” numbers like 0, 1, -1, and so on
> * “Random” fuzzing, where completely random changes are made: random values 
> added, random bits of the file removed, random bits of the file duplicated, &c
>
> At the moment, all registers can be modified at start to any random 64-bit 
> value.  But there’s also a bitfield near the beginning that controls whether 
> some of those registers (namely rip, rsp, and rbp) are “canonicalized”: If 
> the register’s bitfield is 1, bits 48-63 will be set to the value of bit 47; 
> otherwise the register will be left alone.  This allows AFL to choose whether 
> the register will be canonicalized or not.
>
> Effectively, a “maybe canonicalized” register has a 50% chance (plus change) 
> of being “canonical”, and a 50% chance (minus a bit) of being non-canonical.
>
> This canonicalization has several effects.
>
> First of all, it means that any interesting “canonical” addresses (if they 
> exist) has approximately a 65536x higher chance of being selected;  or to put 
> it a different way, AFL will find any interesting “canonical” addresses 
> 65536x faster than otherwise.
>
> On the other hand, any interesting non-canonical addresses (or non-address 
> values) have a 50% lower chance of being selected; AFL will take twice as 
> long to find an interesting “non-canonical” / non-address values address as 
> it would otherwise.
>
> The other effect, at the moment, is to add another branch to fuzz-emulate.c.  
> This means that the very act of adding a select-able bit to canonicalize (or 
> not) a register adds another path that AFL will consider “interesting”; given 
> two otherwise-identical inputs, A and A’, where one is canonicalized and the 
> other not, AFL will consider them two different test cases, *even if they go 
> through the same paths in x86_emulate()*.  It will keep them both in its 
> “queue", and will spend time mutating them both.
>
> For random fuzzing, even if x86_emulate() is the same, I don’t think this 
> will be wasted effort: 5 seconds fuzzing A and then 5 seconds fuzzing A’ will 
> be the same as 10 seconds of fuzzing A.  It may skew AFL’s priorities a bit 
> by fuzzing something for 10 seconds rather than 5, but it still has a chance 
> of finding new data.
>
> For deterministic fuzzing, if x86_emulate() is the same, then there is 
> duplicated work: the 5 seconds spent fuzzing A’ will go through exactly the 
> same sets of mutations as the 5 seconds spent fuzzing A.
>
> (It may be possible to exclude fuzz-emul.c from instrumentation, in which 
> case this effect would go away.)
>
> Now, we have two possibilities: Either canonicity does not and never will 
> affect x86_emulate(); or it may, either now or later.
>
> And we have a few choices: We maybe-canonicalize no registers; we 
> maybe-canonicalize only registers that are almost always used as an address; 
> or we maybe-canonicalize more (maybe all) registers.
>
> Let us call these C0, C1, and C2 (for “canonicity matters, and we 
> canonicalize none, some, or most, respectively"), and N0, N1, and N2 (for 
> “canonicity doesn’t mater, and we canonicalize none, some, or most 
> respectively”).  (And remember, N vs C is a fact about the world that we 
> don’t know; and 0-2 are options we can choose.)
>
> N0: Everything is the fastest: “interesting” non-address values are found the 
> fastest, AFL isn’t fooled into double-random-fuzzing some entries, AFL 
> doesn’t waste time double-deterministic-fuzzing entries.
>
> N1: “Interesting” non-address values take twice as long to find; but since 
> the registers in question are used almost exclusively for addresses, this is 
> less important.  Canonicity has no effect, so it doesn’t matter whether we 
> choose canonical or non-canonical values for the registers.  We do spend 
> twice as long random-fuzzing entries (for some benefit), and we twice as much 
> time deterministically fuzzing some entries to no benefit; but only when 
> fuzzing these address-mostly registers.
>
> N2: “Interesting” non-address values take twice as long to find.  We spend 
> twice as long random-fuzzing entries (to some benefit), and twice as long 
> deterministically fuzzing some entries (to no benefit).
>
> C0: Finding “interesting” non-address values is as fast as N0.  However, 
> finding “interesting” address values take 65536 times as long.  Extra time 
> spent fuzzing finds useful differences.
>
> C1: “Interesting” non-address values for address-mostly registers take twice 
> as long to find; but “interesting” address-values for mostly-address 
> registers take about the same amount of time to find (i.e., 65536x faster 
> than C0).  “Interesting” address values for other registers take 65536x times 
> as long to find.  Again, extra fuzzing time is spent usefully.
>
> C2: “Interesting” non-address values for all registers take twice as long to 
> find; but “interesting” address values for all registers take that same 
> amount of time.  Again, extra fuzzing time is spent usefully.
>
> If N is true, and we go with option 2, it will take on average 2 hours to 
> find an “interesting" value that would have taken 1 hour with option 0.
>
> If C is true, and we go with 0, it will take on average 65536 hours (7.48 
> years) to find an “interesting” value that would have taken 1 hour in option 
> 2.
>
> I think the chances of C are less than 50%; but I think it's certainly higher 
> than 1 in 65536.
>
> So given that we don’t know whether we’re in case N or C, from a 
> cost/benefits analysis, I think option 1 is a minimum; and that 2 is an 
> obvious choice — particularly if we can get rid of the “wasted” fuzzing, 
> perhaps by not instrumenting fuzz-emul.c.
>
> Obviously there’s a lot of hand-waving and guessing here.  My understanding 
> is that in practice, ebp is *mostly* used as an address and not a 
> general-purpose register, which is why I maybe-canonicalized it in the first 
> place, and would rather it remain so (option 1).  But I can see the arguments 
> for classifying it as a general-purpose register; so as long as rip and rsp 
> are maybe-canonicalized, I won’t spend a lot of time arguing against this 
> patch.

There is a selection bias here.

It is true that, with -fno-omit-frame-pointer, %rbp is always an
address, but we aren't considering compiler-crafted code here.

To a first approximation, AFL will generate instruction fragments with a
uniform probability of which of the 8 registers are encoded (or 16 for
64bit, but lets take the simple path here).  Therefore, %rbp will be
encoded with a 1/8th probability, as will %rsp, and %rax, etc.

At this point, what matters is how many instructions use %rbp/%rsp as an
implicit memory operand.  %rsp is used by all stack operations, which is
a large quantity of the encoding space.

I can't think of any instruction which uses %rbp in this way. 
ENTER/LEAVE/PUSHA/POPA use/modify it, but only in its integer form - not
as a memory address.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.