
Re: [Xen-devel] [PATCH 0/2] xen: credit2: fix vcpu starvation due to too few credits



On Thu, Mar 12, 2020 at 06:02:03PM +0100, Dario Faggioli wrote:
> On Thu, 2020-03-12 at 16:08 +0100, Roger Pau Monné wrote:
> > Thanks for looking into this, seems like an especially tricky issue
> > to tackle!
> > 
> It was tricky indeed! :-)
> 
> > On Thu, Mar 12, 2020 at 02:44:07PM +0100, Dario Faggioli wrote:
> > [...]
> > > For example, I have a trace showing that csched2_schedule() is
> > > invoked at t=57970746155ns. At t=57970747658ns (+1503ns) the s_timer
> > > is set to fire at t=57979485083ns, i.e., 8738928ns in future. That's
> > > because credit of snext is exactly that 8738928ns. Then, what I see
> > > is that the next call to burn_credits(), coming from
> > > csched2_schedule() for the same vCPU happens at t=60083283617ns.
> > > That is *a lot* (2103798534ns) later than when we expected and
> > > asked. Of course, that also means that delta is 2112537462ns, and
> > > therefore credits will sink to -2103798534!
> > 
> > Which timer does this hardware use? DYK if there's some relation
> > between the timer hardware used and the issue?
> > 
> Timers came to mind but I haven't checked yet.
> 
> FWIW, one thing I saw is that, without patches, my machine times out
> around...
> 
> [    2.364819] NET: Registered protocol family 16
> [    2.368018] xen:grant_table: Grant tables using version 1 layout
> [    2.372033] Grant table initialized
> [    2.377115] ACPI: bus type PCI registered
> [    2.380011] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
> [    2.384660] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
> [    2.388033] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
> [    2.499080] PCI: Using configuration type 1 for base access
> [    2.516768] ACPI: Added _OSI(Module Device)
> [    2.524006] ACPI: Added _OSI(Processor Device)
> [    2.536004] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    2.544003] ACPI: Added _OSI(Processor Aggregator Device)
> [    2.816022] ACPI: 4 ACPI AML tables successfully acquired and loaded
> [    2.852011] xen: registering gsi 9 triggering 0 polarity 0
> [    2.856021] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> 
> ... here, during dom0 boot.
> 
> [    2.871615] ACPI: Dynamic OEM Table Load:
> [    2.941945] ACPI: Interpreter enabled
> [    2.952021] ACPI: (supports S0 S3 S4 S5)
> [    2.960004] ACPI: Using IOAPIC for interrupt routing
> [    2.972031] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
> [    2.993032] ACPI: Enabled 6 GPEs in block 00 to 3F
> [    3.042478] ACPI: PCI Root Bridge [UNC1] (domain 0000 [bus ff])
> [    3.056010] acpi PNP0A03:02: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
> [    3.079707] acpi PNP0A03:02: _OSC: platform does not support [SHPCHotplug LTR]
> [    3.098999] acpi PNP0A03:02: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
> 
> What do you mean by "Which timer does this hardware use"?

Xen uses a hardware timer (HPET, PMTIMER or PIT IIRC) in order to get
interrupts at specified times. On my box I see, for example:

(XEN) Platform timer is 23.999MHz HPET

You should also see something along those lines. I was wondering if
there was some relation between the timer in use and the delay in
timer interrupts that you are seeing.
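
To put numbers on that delay, the figures in the quoted trace can be
re-derived with a small sketch. The timestamps are copied from the
trace; the variable names are made up for illustration, and the last
step assumes credits burn at one credit per nanosecond, which is what
the quoted numbers imply but is not checked against the scheduler code
here.

```python
# Timestamps in nanoseconds, copied from the quoted trace.
t_schedule = 57_970_746_155   # csched2_schedule() is invoked
t_timer_set = 57_970_747_658  # s_timer gets programmed
t_expected = 57_979_485_083   # when s_timer was asked to fire
t_actual = 60_083_283_617     # next burn_credits() for the same vCPU

# Hypothetical names; only the arithmetic comes from the trace.
setup_cost = t_timer_set - t_schedule  # time spent before arming the timer
credit = t_expected - t_schedule       # credit of snext when it was picked
lateness = t_actual - t_expected       # how late the next invocation was
delta = t_actual - t_schedule          # time charged by burn_credits()

# Assumption: 1 credit is burned per ns of runtime.
credit_after = credit - delta

print(f"setup cost:   {setup_cost} ns")
print(f"credit:       {credit} ns")
print(f"lateness:     {lateness} ns")
print(f"delta:        {delta} ns")
print(f"credit after: {credit_after}")
```

The result is that the lateness and the final (negative) credit have
the same magnitude, roughly 2.1 seconds, so the whole deficit comes
from the late invocation and not from the credit accounting itself.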

Thanks, Roger.
