[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 10/13] fuzz/x86_emulate: Make input more compact



On 10/04/2017 09:26 AM, Jan Beulich wrote:
>>>> On 25.09.17 at 16:26, <george.dunlap@xxxxxxxxxx> wrote:
>> @@ -22,13 +25,17 @@ int main(int argc, char **argv)
>>      setbuf(stdin, NULL);
>>      setbuf(stdout, NULL);
>>  
>> +    opt_compact = true;
> 
> How about giving the variable an initializer instead?

Actually, if we want fuzz-emul.c to be usable by itself (e.g., for the
Google ossfuz project), we *must* use a static initializer from within
fuzz-emul.c for it to have the correct defaults.  I'll change that...

> 
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -53,6 +53,15 @@ struct fuzz_state
>>  };
>>  #define DATA_OFFSET offsetof(struct fuzz_state, corpus)
>>  
>> +bool opt_compact;
>> +
>> +unsigned int fuzz_minimal_input_size(void)
>> +{
>> +    if ( opt_compact )
>> +        return sizeof(unsigned long) + 1;
> 
> What is this value choice based on / derived from? Oh, judging from
> code further down it may be one more than the size of the options
> field, in which case it should be sizeof(...->options) here.

What about renaming DATA_OFFSET to DATA_SIZE_FULL, and adding
DATA_SIZE_COMPACT?

Then is could be:

    return (opt_compact ? DATA_SIZE_COMPACT : DATA_SIZE_FULL) + 1;

>> @@ -647,9 +656,81 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>  {
>>      struct fuzz_state *s = ctxt->data;
>>  
>> -    /* Fuzz all of the state in one go */
>> -    if (!input_read(s, s, DATA_OFFSET))
>> -        exit(-1);
>> +    if ( !opt_compact )
>> +    {
>> +        /* Fuzz all of the state in one go */
>> +        if (!input_read(s, s, DATA_OFFSET))
> 
> Missing blanks.

Ack

> 
>> +            exit(-1);
>> +        return;
>> +    }
>> +
>> +    /* Modify only select bits of state */
>> +
>> +    /* Always read 'options' */
>> +    if ( !input_read(s, &s->options, sizeof(s->options)) )
>> +        return;
>> +    
>> +    while(1) {
> 
> Style. And for compatibility (read: no warnings) with as wide a range
> of compilers as possible, generally for ( ; ; ) is better to use.

I can do that; but would you mind explaining?  What kinds of compilers
don't like while(1)?

>> +        uint16_t offset;
>> +
>> +        /* Read 16 bits to decide what bit of state to modify */
>> +        if ( !input_read(s, &offset, sizeof(offset)) )
>> +            return;
> 
> Doesn't this suggest minimal input size wants to be one higher than
> what you currently enforce? And isn't the use of uint16_t here in
> conflict with the description talking about reading a byte every time?

Hmm, actually it rather implies that it should be one less... with the
new format there's no way to guarantee that the very first insn_fetch
will have any data to read.

>> +        /* 
>> +         * Then decide if it's "pointing to" different bits of the
>> +         * state 
>> +         */
>> +
>> +        /* cr[]? */
>> +        if ( offset < 5 )
> 
> ARRAY_SIZE()

Ack

>> +        {
>> +            if ( !input_read(s, s->cr + offset, sizeof(*s->cr)) )
>> +                return;
>> +            printf("Setting CR %d to %lx\n", offset, s->cr[offset]);
>> +            continue;
>> +        }
>> +        
>> +        offset -= 5;
> 
> Same here then.
> 
>> +        /* msr[]? */
>> +        if ( offset < MSR_INDEX_MAX )
> 
> Even here (and below) use of ARRAY_SIZE() may be better.
> 
>> +        {
>> +            if ( !input_read(s, s->msr + offset, sizeof(*s->msr)) )
>> +                return;
>> +            printf("Setting MSR i%d (%x) to %lx\n", offset,
>> +                   msr_index[offset], s->msr[offset]);
>> +            continue;
>> +        }
>> +
>> +        offset -= MSR_INDEX_MAX;
>> +
>> +        /* segments[]? */
>> +        if ( offset < SEG_NUM )
>> +        {
>> +            if ( !input_read(s, s->segments + offset, sizeof(*s->segments)) 
>> )
>> +                return;
>> +            printf("Setting Segment %d\n", offset);
>> +            continue;
>> +            
>> +        }
>> +
>> +        offset -= SEG_NUM;
>> +
>> +        /* regs? */
>> +        if ( offset < sizeof(struct cpu_user_regs)
>> +             && offset + sizeof(uint64_t) <= sizeof(struct cpu_user_regs) )
>> +        {
>> +            if ( !input_read(s, ((char *)ctxt->regs) + offset, 
>> sizeof(uint64_t)) )
>> +                return;
>> +            printf("Setting cpu_user_regs offset %x\n", offset);
>> +            continue;
>> +        }
>> +
>> +        /* None of the above -- take that as "start emulating" */
>> +        
>> +        return;
>> +    }
> 
> Having come here I wonder whether the use of "byte" in the description
> is right, and you mean "uint8_t offset" above, as you're far from
> consuming the entire 256 value range.

Isn't cpu_user_regs larger than 256 bytes?  And in any case, the offset
will become larger than 256 bytes one we include the FPU state.

> Additionally, was the order of elements here chosen for any specific
> reason? It would seem to me that elements having a more significant
> effect on emulation may be worth filling first, and I'm not convinced
> the "all CRs, all MSRs, all SREGs, all GPRs" order matches that.

I'm not aware of any particular order; it's probably some combination of
"the order they were in the cpu_regs struct" and "the order in which I
found it useful to add them".  Given that the input will be more or less
random, I don't think the order in the struct will have too much of an
impact on the order in which AFL explores them.

If you have an alternative suggestion for an order you think would be
more logical I'm happy to rearrange the structure.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.