xen-ia64-devel
[Xen-ia64-devel] [RFC] MCA handler support for Xen/ia64
Hi all,
This is a design memo of the MCA handler for Xen/ia64.
We would welcome reviews and comments.
1. Basic design
- The MCA/CMC/CPE handlers of Xen/ia64 reuse the Linux code
as much as possible.
- A CMC/CPE interruption is injected into dom0 for logging.
It is not injected into domU or domVTI.
- If the MCA interruption is a TLB check, the MCA handler
converts the MCA into a CMC interruption and injects it into dom0.
It is not injected into domU or domVTI.
- If the MCA interruption is not a TLB check, the MCA handler
does not try to recover, and Xen/ia64 reboots.
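
As a rough illustration only, the policy above can be written as a small
decision function. All identifiers below (mc_event, mc_action, mc_policy)
are hypothetical and do not exist in Xen or Linux. The tlb_check_only flag
reflects sections 2.2 and 2.3: only an MCA that consists of a TLB check
alone is turned into a CMC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: none of these identifiers come from Xen or Linux. */
enum mc_event { EV_CMC, EV_CPE, EV_MCA };

enum mc_action {
    ACT_INJECT_CMC_TO_DOM0,  /* forward to dom0 as a CMC, for logging */
    ACT_INJECT_CPE_TO_DOM0,  /* forward to dom0 as a CPE, for logging */
    ACT_REBOOT               /* no recovery attempt: reboot Xen/ia64 */
};

/* tlb_check_only is meaningful for EV_MCA only: true when the MCA
 * consists of a TLB check and nothing else. */
static enum mc_action mc_policy(enum mc_event ev, bool tlb_check_only)
{
    switch (ev) {
    case EV_CMC:
        return ACT_INJECT_CMC_TO_DOM0;
    case EV_CPE:
        return ACT_INJECT_CPE_TO_DOM0;
    case EV_MCA:
        return tlb_check_only ? ACT_INJECT_CMC_TO_DOM0 : ACT_REBOOT;
    }
    assert(!"unknown machine-check event");
    return ACT_REBOOT;
}
```

domU and domVTI never appear as a target in this function, which matches
the rule that only dom0 receives these interruptions.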
2. Detail design
2.1 Initialization of MCA handler
The processing sequence is basically as follows.
1) Clear the rendezvous check-in flag for all CPUs.
2) Register the rendezvous interrupt vector with SAL.
3) Register the wakeup interrupt vector with SAL.
4) Register the Xen/ia64 MCA handler with SAL.
5) Configure the CMCI/P vector and handler. Interrupts
for CMC are per-processor, so AP CMC interrupts are
set up in smp_callin() (smpboot.c).
6) Set up the MCA rendezvous interrupt vector.
7) Set up the MCA wakeup interrupt vector.
8) Set up the CPEI/P handler.
9) Initialize the areas set aside by Xen/ia64 to
buffer the platform/processor error states for
MCA/CMC/CPE handling.
10) Read the MCA error record for logging (by Dom0) if
Xen has been rebooted due to an unrecoverable MCA.
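
A minimal sketch of how the ten steps could be driven in order is shown
below. It is an assumption, not the proposed implementation: the step
table, the stub function and the choice to stop at the first failing step
are all hypothetical, and each real step would call into SAL or the
interrupt setup code instead of a stub.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*init_fn)(void);        /* returns 0 on success */

struct init_step {
    const char *what;                /* description, from the list above */
    init_fn fn;
};

static int steps_run;                /* counts stub invocations */

/* Stub standing in for every real initialization step. */
static int stub_ok(void) { steps_run++; return 0; }

static const struct init_step mca_init_steps[] = {
    { "clear rendezvous check-in flags",          stub_ok },
    { "register rendezvous vector with SAL",      stub_ok },
    { "register wakeup vector with SAL",          stub_ok },
    { "register Xen/ia64 MCA handler with SAL",   stub_ok },
    { "configure CMCI/P vector and handler",      stub_ok },
    { "set up MCA rendezvous interrupt vector",   stub_ok },
    { "set up MCA wakeup interrupt vector",       stub_ok },
    { "set up CPEI/P handler",                    stub_ok },
    { "initialize error state buffers",           stub_ok },
    { "read MCA record left by previous boot",    stub_ok },
};

#define N_STEPS (sizeof(mca_init_steps) / sizeof(mca_init_steps[0]))

/* Runs the steps in table order. Returns the index of the first step
 * that fails, or n when every step succeeded. */
static size_t run_init_steps(const struct init_step *steps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(steps[i].fn != NULL);
        if (steps[i].fn() != 0)
            return i;
    }
    return n;
}
```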
2.2 MCA handler (TLB error only)
The processing sequence is basically as follows.
1) Get the processor state parameter on exiting PALE_CHECK,
then purge the TRs and TCs and reload the TRs.
2) Call the ia64_mca_handler().
3) Wait for checkin of slave processors.
4) Wake up all the processors which are spinning in the
rendezvous loop.
5) Get the MCA error record
and keep a copy of it in Xen/ia64 for logging
by dom0.
6) Clear the MCA error record.
7) Inject a CMC external interrupt into dom0.
8) Set IA64_MCA_CORRECTED in the ia64_sal_os_state struct.
9) Return to the SAL and resume the interrupted processing.
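
Step 5) keeps a copy of the error record inside Xen/ia64 so that dom0 can
read it later. One possible shape for that buffer is sketched below. The
struct, the function names and the fixed SKETCH_RECORD_MAX size are
hypothetical; a real implementation would size the buffer from the value
reported by SAL_GET_STATE_INFO_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_RECORD_MAX 256   /* arbitrary size, for this sketch only */

struct saved_record {
    size_t len;                 /* number of valid bytes in data[] */
    int valid;                  /* non-zero while a record is held */
    unsigned char data[SKETCH_RECORD_MAX];
};

/* Copies a record into the slot. Returns 0 on success, -1 when the
 * record is missing or does not fit. */
static int save_record(struct saved_record *slot, const void *rec, size_t len)
{
    assert(slot != NULL);
    if (rec == NULL || len == 0 || len > sizeof(slot->data))
        return -1;
    memcpy(slot->data, rec, len);
    slot->len = len;
    slot->valid = 1;
    return 0;
}

/* Copies the held record to out. Returns its length, or 0 when no
 * record is held or out is too small. The slot stays valid. */
static size_t fetch_record(const struct saved_record *slot,
                           void *out, size_t out_len)
{
    assert(slot != NULL);
    if (!slot->valid || out == NULL || out_len < slot->len)
        return 0;
    memcpy(out, slot->data, slot->len);
    return slot->len;
}

/* Drops the held record once dom0 has logged it. */
static void clear_record(struct saved_record *slot)
{
    assert(slot != NULL);
    slot->valid = 0;
    slot->len = 0;
}
```

Keeping fetch and clear separate mirrors the split between the
SAL_GET_STATE_INFO and SAL_CLEAR_STATE_INFO calls that dom0 issues.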
2.3 MCA handler (TLB error together with other errors)
The processing sequence is basically as follows.
1) Get the processor state parameter on exiting PALE_CHECK,
then purge the TRs and TCs and reload the TRs.
2) Call the ia64_mca_handler().
3) Wait for checkin of slave processors.
4) Wake up all the processors which are spinning in the
rendezvous loop.
5) Get the MCA error record
and save it in Xen/ia64 for logging
by dom0 after reboot. [*1]
6) Return to the SAL and reboot Xen/ia64.
2.4 MCA handler (non-TLB error)
The processing sequence is basically as follows.
1) Get the processor state parameter on exiting PALE_CHECK.
2) Call the ia64_mca_handler().
3) Wait for checkin of slave processors.
4) Wake up all the processors which are spinning in the
rendezvous loop.
5) Get the MCA error record
and save it in Xen/ia64 for logging
by dom0 after reboot. [*1]
6) Return to the SAL and reboot Xen/ia64.
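
Steps 3) and 4), which are shared by sections 2.2 to 2.4, can be pictured
with the following single-threaded model. It is only a sketch under
assumptions: the flag values and function names are hypothetical, the
processor handling the MCA is called "monarch" here, and releasing a slave
is modelled by resetting its flag, whereas real code would send a wakeup
interrupt to each slave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { CHECKIN_NOTDONE = 0, CHECKIN_DONE = 1 };   /* hypothetical names */

/* True once every processor other than the monarch has checked in. */
static bool all_slaves_checked_in(const int *checkin, int ncpus, int monarch)
{
    assert(checkin != NULL && monarch >= 0 && monarch < ncpus);
    for (int cpu = 0; cpu < ncpus; cpu++)
        if (cpu != monarch && checkin[cpu] != CHECKIN_DONE)
            return false;
    return true;
}

/* Releases every checked-in slave from the rendezvous loop and
 * returns how many were released. */
static int release_slaves(int *checkin, int ncpus, int monarch)
{
    int released = 0;

    assert(checkin != NULL && monarch >= 0 && monarch < ncpus);
    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (cpu != monarch && checkin[cpu] == CHECKIN_DONE) {
            checkin[cpu] = CHECKIN_NOTDONE;
            released++;
        }
    }
    return released;
}
```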
2.5 CMC handler
The processing sequence is basically as follows.
1) Call ia64_mca_cmc_int_handler() from
__do_IRQ() in ia64_handle_irq().
2) Get the MCA error record
and save it in Xen/ia64 for logging
by dom0 after reboot. [*1]
3) Inject a CMC external interrupt into dom0.
2.6 CPE handler
Same as CMC.
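
The injection step of the CMC and CPE paths could look roughly like the
sketch below. Everything in it is an assumption made for illustration:
dom_kind, dom_sketch and inject_corrected_event() are hypothetical, and a
pending counter stands in for whatever mechanism finally delivers the
external interrupt to the guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a domain, for this sketch only. */
enum dom_kind { DOM0, DOMU, DOMVTI };

struct dom_sketch {
    enum dom_kind kind;
    unsigned int pending_cmc;   /* CMC events waiting for delivery */
    unsigned int pending_cpe;   /* CPE events waiting for delivery */
};

/* Marks a CMC (is_cpe == false) or CPE (is_cpe == true) as pending.
 * Only dom0 accepts the event; domU and domVTI are left untouched.
 * Returns true when the event was queued. */
static bool inject_corrected_event(struct dom_sketch *d, bool is_cpe)
{
    assert(d != NULL);
    if (d->kind != DOM0)
        return false;
    if (is_cpe)
        d->pending_cpe++;
    else
        d->pending_cmc++;
    return true;
}
```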
2.7 SAL emulation for Dom0/DomU/DomVTI
The following SAL emulation procedures are added.
- SAL_SET_VECTORS
- SAL_GET_STATE_INFO
- SAL_GET_STATE_INFO_SIZE
- SAL_CLEAR_STATE_INFO
- SAL_MC_SET_PARAMS
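
One way to route these calls is a single dispatch function, sketched
below. The enum is symbolic and deliberately carries no real SAL function
numbers, and the names sal_emul_fn and sal_emul_dispatch() as well as the
use of -1 for "not emulated" are choices of this sketch, not part of the
proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Symbolic IDs for this sketch; not the numeric SAL function IDs. */
enum sal_emul_fn {
    EMU_SAL_SET_VECTORS,
    EMU_SAL_GET_STATE_INFO,
    EMU_SAL_GET_STATE_INFO_SIZE,
    EMU_SAL_CLEAR_STATE_INFO,
    EMU_SAL_MC_SET_PARAMS,
    EMU_SAL_OTHER            /* any call not covered by this memo */
};

static unsigned int calls_handled;   /* emulated calls seen so far */

/* Returns 0 when the call is one of the five procedures added by this
 * design, -1 otherwise. A real handler would do the work of each
 * procedure in its case instead of only counting the call. */
static int sal_emul_dispatch(enum sal_emul_fn fn)
{
    switch (fn) {
    case EMU_SAL_SET_VECTORS:
    case EMU_SAL_GET_STATE_INFO:
    case EMU_SAL_GET_STATE_INFO_SIZE:
    case EMU_SAL_CLEAR_STATE_INFO:
    case EMU_SAL_MC_SET_PARAMS:
        calls_handled++;
        return 0;
    case EMU_SAL_OTHER:
        break;
    }
    return -1;
}
```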
Note:
[*1]: In practice, the MCA error record is read again after
Xen/ia64 has rebooted and is then logged by dom0.
Best regards,
Yutaka Ezaki
Masaki Kanno
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel