Xen project Mailing List

Re: [Xen-devel] [PATCH v2 10/13] fuzz/x86_emulate: Make input more compact

From: George Dunlap <george.dunlap@xxxxxxxxxx>

Date: Thu, 5 Oct 2017 16:04:44 +0100

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Ian Jackson <ian.jackson@xxxxxxxxxx>

Delivery-date: Thu, 05 Oct 2017 15:06:32 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 10/04/2017 09:26 AM, Jan Beulich wrote: >>>> On 25.09.17 at 16:26, <george.dunlap@xxxxxxxxxx> wrote: >> @@ -22,13 +25,17 @@ int main(int argc, char **argv) >> setbuf(stdin, NULL); >> setbuf(stdout, NULL); >> >> + opt_compact = true; > > How about giving the variable an initializer instead? Actually, if we want fuzz-emul.c to be usable by itself (e.g., for the Google ossfuz project), we *must* use a static initializer from within fuzz-emul.c for it to have the correct defaults. I'll change that... > >> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c >> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c >> @@ -53,6 +53,15 @@ struct fuzz_state >> }; >> #define DATA_OFFSET offsetof(struct fuzz_state, corpus) >> >> +bool opt_compact; >> + >> +unsigned int fuzz_minimal_input_size(void) >> +{ >> + if ( opt_compact ) >> + return sizeof(unsigned long) + 1; > > What is this value choice based on / derived from? Oh, judging from > code further down it may be one more than the size of the options > field, in which case it should be sizeof(...->options) here. What about renaming DATA_OFFSET to DATA_SIZE_FULL, and adding DATA_SIZE_COMPACT? Then is could be: return (opt_compact ? DATA_SIZE_COMPACT : DATA_SIZE_FULL) + 1; >> @@ -647,9 +656,81 @@ static void setup_state(struct x86_emulate_ctxt *ctxt) >> { >> struct fuzz_state *s = ctxt->data; >> >> - /* Fuzz all of the state in one go */ >> - if (!input_read(s, s, DATA_OFFSET)) >> - exit(-1); >> + if ( !opt_compact ) >> + { >> + /* Fuzz all of the state in one go */ >> + if (!input_read(s, s, DATA_OFFSET)) > > Missing blanks. Ack > >> + exit(-1); >> + return; >> + } >> + >> + /* Modify only select bits of state */ >> + >> + /* Always read 'options' */ >> + if ( !input_read(s, &s->options, sizeof(s->options)) ) >> + return; >> + >> + while(1) { > > Style. And for compatibility (read: no warnings) with as wide a range > of compilers as possible, generally for ( ; ; ) is better to use. I can do that; but would you mind explaining? What kinds of compilers don't like while(1)? >> + uint16_t offset; >> + >> + /* Read 16 bits to decide what bit of state to modify */ >> + if ( !input_read(s, &offset, sizeof(offset)) ) >> + return; > > Doesn't this suggest minimal input size wants to be one higher than > what you currently enforce? And isn't the use of uint16_t here in > conflict with the description talking about reading a byte every time? Hmm, actually it rather implies that it should be one less... with the new format there's no way to guarantee that the very first insn_fetch will have any data to read. >> + /* >> + * Then decide if it's "pointing to" different bits of the >> + * state >> + */ >> + >> + /* cr[]? */ >> + if ( offset < 5 ) > > ARRAY_SIZE() Ack >> + { >> + if ( !input_read(s, s->cr + offset, sizeof(*s->cr)) ) >> + return; >> + printf("Setting CR %d to %lx\n", offset, s->cr[offset]); >> + continue; >> + } >> + >> + offset -= 5; > > Same here then. > >> + /* msr[]? */ >> + if ( offset < MSR_INDEX_MAX ) > > Even here (and below) use of ARRAY_SIZE() may be better. > >> + { >> + if ( !input_read(s, s->msr + offset, sizeof(*s->msr)) ) >> + return; >> + printf("Setting MSR i%d (%x) to %lx\n", offset, >> + msr_index[offset], s->msr[offset]); >> + continue; >> + } >> + >> + offset -= MSR_INDEX_MAX; >> + >> + /* segments[]? */ >> + if ( offset < SEG_NUM ) >> + { >> + if ( !input_read(s, s->segments + offset, sizeof(*s->segments)) >> ) >> + return; >> + printf("Setting Segment %d\n", offset); >> + continue; >> + >> + } >> + >> + offset -= SEG_NUM; >> + >> + /* regs? */ >> + if ( offset < sizeof(struct cpu_user_regs) >> + && offset + sizeof(uint64_t) <= sizeof(struct cpu_user_regs) ) >> + { >> + if ( !input_read(s, ((char *)ctxt->regs) + offset, >> sizeof(uint64_t)) ) >> + return; >> + printf("Setting cpu_user_regs offset %x\n", offset); >> + continue; >> + } >> + >> + /* None of the above -- take that as "start emulating" */ >> + >> + return; >> + } > > Having come here I wonder whether the use of "byte" in the description > is right, and you mean "uint8_t offset" above, as you're far from > consuming the entire 256 value range. Isn't cpu_user_regs larger than 256 bytes? And in any case, the offset will become larger than 256 bytes one we include the FPU state. > Additionally, was the order of elements here chosen for any specific > reason? It would seem to me that elements having a more significant > effect on emulation may be worth filling first, and I'm not convinced > the "all CRs, all MSRs, all SREGs, all GPRs" order matches that. I'm not aware of any particular order; it's probably some combination of "the order they were in the cpu_regs struct" and "the order in which I found it useful to add them". Given that the input will be more or less random, I don't think the order in the struct will have too much of an impact on the order in which AFL explores them. If you have an alternative suggestion for an order you think would be more logical I'm happy to rearrange the structure. -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.