Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs
>>>> If the actual SMI source is not related to some place in the NMI
>>>> handler code but was eg. due to some SMI timer, lowering NMI
>>>> watchdog frequency might not fix the issue completely, but lower
>>>> its reproducibility (perhaps to some very rare occurrences). So
>>>> it's better be sure what was the real source of SMI.
>>>>
>>>
>>> This *is* related to this instruction - it was confirmed
>>> empirically. Removing this instruction stops SMIs from occurring
>>> and effectively removes the issue leaving the frequency unchanged.
>>
>> Hmm, it would be interesting to know for what evil purpose does it
>> need to trap I/O port 61h.
>> BTW, on which motherboard model the issue was reproduced?
>>
>
>The issue has been reported for some Dell/Huawei Skylake platforms (one
>of them PowerEdge R740 to be precise) but I don't think the others are
>unaffected (the issue supposedly originates from Intel's reference
>code) - the default BIOS setup indeed matters.
Here is a bit of info you might find useful. I did some quick research on
my test system (Gigabyte GA-H270M-D3H) to confirm whether the BIOS traps
I/O port 61h (NMI status) and for what purpose.
It turns out it really does.
Moreover, it is the only fixed I/O port location trapped by SMI
I/O traps on this system. A few others are simply 'allocated' ones,
meaning the real I/O port address being trapped is chosen dynamically by
supplying Address=0 to the corresponding call to the EFI I/O Trap interface
function -- such I/O traps may be used as interfaces to an SMI handler
in a manner similar to the SW SMI interface.
The EFI module responsible for installing the port 61h SMI I/O trap is
PchInitSmm in my case. The related code is:
...
mov eax, 61h
lea r9, qword_5778
mov [rsp+98h+io_trap_ctx.io_address], ax
mov rax, cs:pIoTrapIF
lea r8, [rsp+98h+io_trap_ctx]
lea rdx, Port61h_IoTrapHandler
mov rcx, rax
mov [rsp+98h+io_trap_ctx.trap_type], ebp ; trap reads
mov [rsp+98h+io_trap_ctx.io_len], bp ; ebp=1
call qword ptr [rax]
...
The actual handler (named Port61h_IoTrapHandler in the above code) is
fairly lightweight and does a bit of useless black magic.
First, it loops over all CPUs to find the CPU whose read of the NMI
status port caused the trapped I/O operation.
Then it reads the original port 61h value and sets NMI_SC bit 4 to the
inverse of the state previously recorded for that CPU in a per-CPU
bitmap. The updated AL register value is then returned to the code that
read NMI_SC (by patching the RAX register value in the SMRAM saved state):
; ebp = 61h, rbx = CPU index
...
mov edx, ebp
in al, dx
mov r8, cs:bmNmiRefTogglesForCpus
mov rcx, rbx
mov edx, 1
shl edx, cl
mov r9, rbx
movsxd rcx, edx
mov dl, al
and al, 0EFh
xor r8, rcx
or dl, 10h
mov cs:bmNmiRefTogglesForCpus, r8
and r8, rcx
movzx ecx, al
movzx eax, dl
test r8, r8
mov edx, 1
cmovnz ecx, eax
lea rax, [rsp+58h+al_to_return]
lea r8d, [rdx+25h] ; EFI_SMM_SAVE_STATE_REGISTER_RAX
mov [rsp+58h+func_arg0], rax
mov rax, cs:pEFI_SMM_CPU_PROTOCOL_GUID_IF
mov [rsp+58h+al_to_return], cl
mov rcx, rax
call qword ptr [rax+8] ; WriteSaveState
...
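For readers who prefer C, the toggle logic in the listing above can be
sketched roughly as follows. This is my reading of the disassembly, not
the vendor's source: the names emulate_port61_read and nmi_ref_toggles
are made up for illustration, the hardware value is passed in as a
parameter instead of being read with IN, and the sketch assumes CPU
indices below 31 (the original builds its mask with a 32-bit shift that
is then sign-extended, which this version does not reproduce).

```c
#include <stdint.h>

/* Hypothetical stand-in for bmNmiRefTogglesForCpus: one bit per CPU. */
static uint64_t nmi_ref_toggles;

/*
 * Sketch of the value the handler hands back for a trapped read of
 * port 61h. hw_value is what the real port returned; cpu is the index
 * of the CPU that performed the read (assumed < 31 here).
 */
uint8_t emulate_port61_read(uint8_t hw_value, unsigned cpu)
{
    uint64_t mask = (uint64_t)1 << cpu;

    /* Flip this CPU's recorded state on every trapped read. */
    nmi_ref_toggles ^= mask;

    /* Bit 4 (REF_TOGGLE) of the result mirrors the new state. */
    if (nmi_ref_toggles & mask)
        return hw_value | 0x10;
    return hw_value & 0xEF;
}
```

Starting from a zeroed bitmap, successive reads on the same CPU return
bit 4 as 1, 0, 1, 0, ... regardless of what the hardware reported in
that bit, and each CPU alternates independently of the others.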
So the only purpose of this code is to emulate the REF_TOGGLE bit toggling
logic (simply by alternating ones and zeros on each NMI_SC read),
nothing more. It is a sort of workaround for legacy code that depends on
REF_TOGGLE toggling (the bit is now marked Reserved in the docs).
On this particular system the SMI I/O trap for port 61h does nothing
time-consuming and nothing really useful. That Dell system must have
something similar (thanks to the common EFI reference code from Intel that
Igor mentioned), which leaves the question of why reading port 61h is so
slow there.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel