[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN v1] xen/arm: io: Check ESR_EL2.ISV != 0 before searching for a MMIO handler


  • To: Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien.grall.oss@xxxxxxxxx>
  • From: Ayan Kumar Halder <ayan.kumar.halder@xxxxxxxxxx>
  • Date: Fri, 28 Jan 2022 12:08:02 +0000
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 149.199.80.198) smtp.rcpttodomain=xen.org smtp.mailfrom=xilinx.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=xilinx.com; dkim=none (message not signed); arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dAgwEsqbYyqq5pjJ0fZSpOvLz5DniqCuJcX2G2jjbqM=; b=QO6vgh5kbZCkhlXF0prrLnk9dajKdAwBE4oJovUei3ZzZrN/hXOn6FHqsfkgwURaqSLNpxdLEogs3NXd2E4A6f4tqcTxNSVCIh7Lh4TGG01i2sN3x0BGoNmkHGbkwwhi5SBb8XftXrVaIkfd6uadgoW/d5wL0ajYF1PulHCDpwNM2QIKLYShaYmbm8a6DzyDl7yToq9zn+jw3HK/OhrIJrABC1vKRwVpp567IgfLkKpKEISHhOozwcGTu2AqS13UsC8YBPSKwjsmEFORCJ/xaeQjMqJL4WBnI1d6EUDwQsfBF7aXD60jplyo9JU/NzAcKcWvj1xNGtIw7a/FMONYKw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=V7Vsri6HGJ7zoW49qKCbfIK+5mFlJC5IRhD9NLPO0gPDB7TVtlHKSfQNCnWo4WWME6oo/SxysxAlrK+2YYK3/jSNQWuGAOsBpnquzctOnVBXYMrZq/toll2poDW3Ju1dbM/lCObldoALzGg/5liygkNm0b80A0n3Tznhtj8WLQggAOSZKWUbUtpXsKNmvtkjA9JPDAQ4VZXZ/FAEqhUPnvAxC1AGOZt0I3TTTU1uCkXyo4eXZsaG6/smVEGsqMhupWylBLwjllHVwWHR+NcmS8ksuDb5lYOLRFj+CcImDZZQ2n33HqIZjJPJLkkQipdGuB6XnmqUp4q24cVxmictRQ==
  • Cc: Ayan Kumar Halder <ayan.kumar.halder@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, <stefanos@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>
  • Delivery-date: Fri, 28 Jan 2022 12:08:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Julien/Stefano,

Good discussion to learn about Xen (from a newbie's perspective). :)

I am trying to clarify my understanding. Some queries as below :-

On 28/01/2022 09:46, Julien Grall wrote:


On 28/01/2022 01:20, Stefano Stabellini wrote:
On Thu, 27 Jan 2022, Julien Grall wrote:
On Thu, 27 Jan 2022 at 23:05, Julien Grall <julien.grall.oss@xxxxxxxxx> wrote:

On Thu, 27 Jan 2022 at 22:40, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
I am with you on both points.

One thing I noticed is that the code today is not able to deal with
IO_UNHANDLED for MMIO regions handled by IOREQ servers or Xen MMIO
emulator handlers. p2m_resolve_translation_fault and try_map_mmio are
called after try_handle_mmio returns IO_UNHANDLED but try_handle_mmio is
not called a second time (or am I mistaken?)

Why would you need it? If try_mmio_fault() doesn't work the first time, then

Sorry I meant try_handle_mmio().

it will not work the second it.

I think I explained myself badly, I'll try again below.


Another thing I noticed is that currently find_mmio_handler and
try_fwd_ioserv expect dabt to be already populated and valid so it would
be better if we could get there only when dabt.valid.

With these two things in mind, I think maybe the best thing to do is to
change the code in do_trap_stage2_abort_guest slightly so that
p2m_resolve_translation_fault and try_map_mmio are called first when
!dabt.valid.

An abort will mostly likely happen because of emulated I/O. If we call
p2m_resolve_translation_fault() and try_map_mmio() first, then it means
the processing will take longer than necessary for the common case.

So I think we want to keep the order as it is. I.e first trying the MMIO
and then falling back to the less likely reason for a trap.

Yeah I thought about it as well. The idea would be that if dabt.valid is
set then we leave things as they are (we call try_handle_mmio first) but
if dabt.valid is not set (it is not valid) then we skip the
try_handle_mmio() call because it wouldn't succeed anyway and go
directly to p2m_resolve_translation_fault() and try_map_mmio().

If either of them work (also reading what you wrote about it) then we
return immediately.

Ok. So the assumption is data abort with invalid syndrome would mostly likely be because of a fault handled by p2m_resolve_translation_fault().

I think this makes sense. However, I am not convinced we can currently safely call try_map_mmio() before try_handle_mmio(). This is because the logic in try_map_mmio() is quite fragile and we may mistakenly map an emulated region.

By emulated region, you mean vgic.dbase (Refer https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/vgic-v2.c;h=589c033eda8f5e11af33c868eae2c159f985eac9;hb=0bdc43c8dec993258e930b34855853c22b917519#l702, which has not been mapped to the guest) and thus requires an MMIO handler.

Is my understanding correcr ?

If so, can Xen mantain a table of such emulated regions ? I am guessing that all emulated regions will have a mmio_handler. Then, before invoking try_map_mmio(), it can check the table.


Similarly, we can't call try_map_mmio() before p2m_resolve_translation_fault() because a transient fault may be
misinterpreted.

I think we may be able to harden try_map_mmio() by checking if the I/O region is emulated. But this will need to be fully thought through first.


If not, then we call decode_instruction from do_trap_stage2_abort_guest
and try again. The second time dabt.valid is set so we end up calling
try_handle_mmio() as usual.

With the approach below, you will also end up to call p2m_resolve_translation_fault() and try_map_mmio() a second time if try_handle_mmio() fails.


Just for clarity let me copy/paste the relevant code, apologies if it
was already obvious to you -- I got the impression my suggestion wasn't
very clear.



+again:
+        if ( is_data && hsr.dabt.valid )
         {
             enum io_state state = try_handle_mmio(regs, hsr, gpa);

             switch ( state )
             {
             case IO_ABORT:
                 goto inject_abt;
             case IO_HANDLED:
                 advance_pc(regs, hsr);
                 return;
             case IO_RETRY:
                 /* finish later */
                 return;
             case IO_UNHANDLED:
                 /* IO unhandled, try another way to handle it. */
                 break;
             }
         }

         /*
          * First check if the translation fault can be resolved by the
          * P2M subsystem. If that's the case nothing else to do.
          */
         if ( p2m_resolve_translation_fault(current->domain,
                                            gaddr_to_gfn(gpa)) )
             return;

         if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
             return;

+        if ( !hsr.dabt.valid )
One more thing I noticed, a stage 2 fault can also happen on an access made for a stage 1 translation walk. In this case, I think we don't want to decode the instruction.

So this would need to be !hsr.dabt.valid && !hsr.dabt.s1ptw. Depending on which patch we go with, this would also need to be adjusted in the other one as well.

This triggered me to check for the remaining bits as well. Refer DDI 0487G.b Armv8 Arm, "ISS encoding for an exception from a Data Abort", Page D13-3219

I guess we need to check the following :-

1. !hsr.dabt.valid

2. !hsr.dabt.s1ptw - Abort may be due to stage 1 translation table walk

3. !hsr.dabt.cache - Abort is due to cache maintenance or address translation instructions. We do not decode these instructions.

4. !hsr.dabt.eat - Abort is external


There is no need to check the following due to the reasons mentioned :-

1. hsr.dabt.dfsc - no need as we have already determined that it is a translation fault from EL0/EL1.

2. hsr.dabt.write - no need as the fault can be caused due to both read or write

3. hsr.dabt.fnv - no use for this in instruction decoding

4. hsr.dabt.sbzp0 - Bits[12:11] - We know that DFSC cannot be 0b010000 (FEAT_RAS), We may not check for FEAT_LS64 as from the instruction opcode, we can make out that it is not ST64BV, LD64B, ST64B or ST64BV0

                         Bit[13] - VCNR - The instruction opcode will tell us that it is not MRS/MSR instruction.

Please let me know if my understanding is correct. I can send out a v5 patch "Re: [XEN v4] xen/arm64: io: Decode ldr/str post-indexing instructions" will additional checks in place.

- Ayan


Cheers,




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.