
Re: XCP: signal -7 (Re: [Xen-devel] XCP: sr driver question wrt vm-migrate)



hi,

> Very strange indeed. -7 is SIGKILL (xapi reports signal numbers using OCaml's 
> encoding, not the kernel's). 
> 
> Firstly, is this 0.1.1 or 0.5-RC? If it's 0.1.1 could you retry it on 0.5 
> just to check if it's already been fixed?

It's 0.1.1.
I'm not sure when I can try 0.5-RC.

> 
> Secondly, can you check whether xapi has started properly on both machines 
> (i.e. the init script completed successfully)? I believe that if the init 
> script doesn't detect that xapi has started correctly it might kill it. This 
> is about the only thing we can think of that might cause the problem you 
> described.
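> 
> For what it's worth, a quick sanity check on each host could be something 
> like this (just one way to do it):
> 
>   pidof xapi                       # is the daemon actually running?
>   xe host-list params=name-label   # does the CLI get an answer from xapi?
>   tail -n 50 /var/log/messages     # any startup errors logged by xapi?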

I was running a script that repeatedly vm-migrates a VM between the two hosts.
The error showed up only after many successful vm-migrate runs,
so I don't think it's related to the init script.
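
For reference, the test loop is roughly the following (a simplified sketch,
not the exact script; the uuids are the ones that appear in the quoted log
below):

  #!/bin/sh
  set -ex    # hence the "+ date" / "+ xe ..." lines in the trace
  VM=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592
  S1=67b8b07b-8c50-4677-a511-beb196ea766f    # host s1
  S6=eea41bdd-d2ce-4a9a-bc51-1ca286320296    # host s6
  while true; do
      date
      xe vm-migrate live=true uuid=$VM host=$S1
      date
      xe vm-migrate live=true uuid=$VM host=$S6
  done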

YAMAMOTO Takashi

> 
> Cheers,
> 
> Jon
> 
> 
> On 18 Jun 2010, at 03:45, YAMAMOTO Takashi wrote:
> 
>> hi,
>> 
>> I got another error on vm-migrate.
>> The "signal -7" in the log seems interesting.  Does this ring a bell?
>> 
>> YAMAMOTO Takashi
>> 
>> + date
>> Thu Jun 17 21:51:44 JST 2010
>> + xe vm-migrate live=true uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592 
>> host=67b8b07b-8c50-4677-a511-beb196ea766f
>> Lost connection to the server.
>> 
>> /var/log/messages:
>> 
>> Jun 17 21:51:40 s1 ovs-cfg-mod: 
>> 00007|cfg_mod|INFO|-port.vif4958.1.vm-uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592
>> Jun 17 21:51:41 s1 xapi: [ warn|s1|2416799 unix-RPC|VM.pool_migrate 
>> R:832813c0722b|hotplug] Warning, deleting 'vif' entry from 
>> /xapi/4958/hotplug/vif/1
>> Jun 17 21:51:41 s1 xapi: [error|s1|90 xal_listen|VM (domid: 4958) 
>> device_event = ChangeUncooperative false D:0e953bd99071|event] device_event 
>> could not be processed because VM record not in database
>> Jun 17 21:51:47 s1 xapi: [ warn|s1|2417066 inet_rpc|VM.pool_migrate 
>> R:fee54e870a4e|xapi] memory_required_bytes = 1080033280 > memory_static_max 
>> = 1073741824; clipping
>> Jun 17 21:51:57 s1 xenguest: Determined the following parameters from 
>> xenstore:
>> Jun 17 21:51:57 s1 xenguest: vcpu/number:1 vcpu/affinity:0 vcpu/weight:0 
>> vcpu/cap:0 nx: 0 viridian: 1 apic: 1 acpi: 1 pae: 1 acpi_s4: 0 acpi_s3: 0
>> Jun 17 21:52:43 s1 scripts-vif: Called as "add vif" domid:4959 devid:1 
>> mode:vswitch
>> Jun 17 21:52:44 s1 scripts-vif: Called as "online vif" domid:4959 devid:1 
>> mode:vswitch
>> Jun 17 21:52:46 s1 scripts-vif: Adding vif4959.1 to xenbr0 with address 
>> fe:ff:ff:ff:ff:ff
>> Jun 17 21:52:46 s1 ovs-vsctl: Called as br-to-vlan xenbr0
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 00001|cfg|INFO|using 
>> "/etc/ovs-vswitchd.conf" as configuration file, 
>> "/etc/.ovs-vswitchd.conf.~lock~" as lock file
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 00002|cfg_mod|INFO|configuration changes:
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 
>> 00003|cfg_mod|INFO|+bridge.xenbr0.port=vif4959.1
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 
>> 00004|cfg_mod|INFO|+port.vif4959.1.net-uuid=9ca059b1-ac1e-8d3f-ff19-e5e74f7b7392
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 
>> 00005|cfg_mod|INFO|+port.vif4959.1.vif-mac=2e:17:01:b0:05:fb
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 
>> 00006|cfg_mod|INFO|+port.vif4959.1.vif-uuid=271f0001-06ca-c9ca-cabc-dc79f412d925
>> Jun 17 21:52:49 s1 ovs-cfg-mod: 
>> 00007|cfg_mod|INFO|+port.vif4959.1.vm-uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592
>> Jun 17 21:52:51 s1 xapi: [ info|s1|0 thread_zero||watchdog] received signal 
>> -7
>> Jun 17 21:52:51 s1 xapi: [ info|s1|0 thread_zero||watchdog] xapi watchdog 
>> exiting.
>> Jun 17 21:52:51 s1 xapi: [ info|s1|0 thread_zero||watchdog] Fatal: xapi died 
>> with signal -7: not restarting (watchdog never restarts on this signal)
>> Jun 17 21:55:11 s1 python: PERFMON: caught IOError: (socket error (111, 
>> 'Connection refused')) - restarting XAPI session
>> Jun 17 22:00:02 s1 python: PERFMON: caught socket.error: (111 Connection 
>> refused) - restarting XAPI session
>> Jun 17 22:04:48 s1 python: PERFMON: caught socket.error: (111 Connection 
>> refused) - restarting XAPI session
>> Jun 17 22:09:58 s1 python: PERFMON: caught socket.error: (111 Connection 
>> refused) - restarting XAPI session
>> Jun 17 22:14:52 s1 python: PERFMON: caught socket.error: (111 Connection 
>> refused) - restarting XAPI session
>> Jun 17 22:19:38 s1 python: PERFMON: caught socket.error: (111 Connection 
>> refused) - restarting XAPI session
>> 
>>> hi,
>>> 
>>> Thanks.  I'll take a look at the log if it happens again.
>>> 
>>> YAMAMOTO Takashi
>>> 
>>>> This is usually the result of a failure earlier on. Could you grep through 
>>>> the logs to get the whole trace of what went on? Best thing to do is grep 
>>>> for VM.pool_migrate, then find the task reference (the hex string 
>>>> beginning with 'R:' immediately after the 'VM.pool_migrate') and grep for 
>>>> this string in the logs on both the source and destination machines. 
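>>>> 
>>>> For example, something along these lines (the xapi lines show up in 
>>>> /var/log/messages on these hosts):
>>>> 
>>>>   grep VM.pool_migrate /var/log/messages
>>>>   # pick the R:... task reference out of the matching lines, then,
>>>>   # on both the source and the destination host:
>>>>   grep 'R:<task reference>' /var/log/messages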
>>>> 
>>>> Have a look through these, and if it's still not obvious what went wrong, 
>>>> post them to the list and we can have a look.
>>>> 
>>>> Cheers,
>>>> 
>>>> Jon
>>>> 
>>>> 
>>>> On 16 Jun 2010, at 07:19, YAMAMOTO Takashi wrote:
>>>> 
>>>>> hi,
>>>>> 
>>>>> After making my SR driver defer the attach operation as you suggested,
>>>>> I got migration working.  Thanks!
>>>>> 
>>>>> However, when repeating live migration between the two hosts for testing,
>>>>> I got the following error.  It doesn't seem very reproducible.
>>>>> Do you have any ideas?
>>>>> 
>>>>> YAMAMOTO Takashi
>>>>> 
>>>>> + xe vm-migrate live=true uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592 
>>>>> host=67b8b07b-8c50-4677-a511-beb196ea766f
>>>>> An error occurred during the migration process.
>>>>> vm: 23ecfa58-aa30-ea6a-f9fe-7cb2a5487592 (CentOS53x64-1)
>>>>> source: eea41bdd-d2ce-4a9a-bc51-1ca286320296 (s6)
>>>>> destination: 67b8b07b-8c50-4677-a511-beb196ea766f (s1)
>>>>> msg: Caught exception INTERNAL_ERROR: [ 
>>>>> Xapi_vm_migrate.Remote_failed("unmarshalling result code from remote") ] 
>>>>> at last minute during migration
>>>>> 
>>>>>> hi,
>>>>>> 
>>>>>> I'll try deferring the attach operation to vdi_activate.
>>>>>> Thanks!
>>>>>> 
>>>>>> YAMAMOTO Takashi
>>>>>> 
>>>>>>> Yup, vdi activate is the way forward.
>>>>>>> 
>>>>>>> If you advertise VDI_ACTIVATE and VDI_DEACTIVATE in the 
>>>>>>> 'get_driver_info' response, xapi will call the following during the 
>>>>>>> start-migrate-shutdown lifecycle:
>>>>>>> 
>>>>>>> VM start:
>>>>>>> 
>>>>>>> host A: VDI.attach
>>>>>>> host A: VDI.activate
>>>>>>> 
>>>>>>> VM migrate:
>>>>>>> 
>>>>>>> host B: VDI.attach
>>>>>>> 
>>>>>>> (VM pauses on host A)
>>>>>>> 
>>>>>>> host A: VDI.deactivate
>>>>>>> host B: VDI.activate
>>>>>>> 
>>>>>>> (VM unpauses on host B)
>>>>>>> 
>>>>>>> host A: VDI.detach
>>>>>>> 
>>>>>>> VM shutdown:
>>>>>>> 
>>>>>>> host B: VDI.deactivate
>>>>>>> host B: VDI.detach
>>>>>>> 
>>>>>>> so the disk is never activated on both hosts at once, but it does still 
>>>>>>> go through a period when it is attached to both hosts at once. So you 
>>>>>>> could, for example, check that the disk *could* be attached on the 
>>>>>>> vdi_attach SMAPI call, and actually attach it properly on the 
>>>>>>> vdi_activate call.
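>>>>>>> 
>>>>>>> As a very rough sketch, assuming a Python backend (the class name and 
>>>>>>> the underscore helpers below are made up, stand-ins for whatever your 
>>>>>>> driver already does):
>>>>>>> 
>>>>>>>   # advertised in the get_driver_info response, alongside whatever
>>>>>>>   # capabilities the driver already declares
>>>>>>>   CAPABILITIES = [ "VDI_ATTACH", "VDI_DETACH",
>>>>>>>                    "VDI_ACTIVATE", "VDI_DEACTIVATE" ]
>>>>>>> 
>>>>>>>   class MyVDI:
>>>>>>>       def vdi_attach(self, sr_uuid, vdi_uuid):
>>>>>>>           # The disk may still be attached on the source host at
>>>>>>>           # this point, so don't grab the volume yet: just check
>>>>>>>           # that we could, and return whatever attach normally
>>>>>>>           # returns.
>>>>>>>           self._check_volume_free(vdi_uuid)    # illustrative
>>>>>>>           return self._attach_response(vdi_uuid)
>>>>>>> 
>>>>>>>       def vdi_activate(self, sr_uuid, vdi_uuid):
>>>>>>>           # Only runs once the disk is inactive everywhere else,
>>>>>>>           # so the exclusive attach is safe to do here.
>>>>>>>           self._attach_volume(vdi_uuid)        # illustrative
>>>>>>> 
>>>>>>>       def vdi_deactivate(self, sr_uuid, vdi_uuid):
>>>>>>>           self._detach_volume(vdi_uuid)        # illustrative
>>>>>>> 
>>>>>>>       def vdi_detach(self, sr_uuid, vdi_uuid):
>>>>>>>           # nothing left to do; the work happened in deactivate
>>>>>>>           pass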
>>>>>>> 
>>>>>>> Hope this helps,
>>>>>>> 
>>>>>>> Jon
>>>>>>> 
>>>>>>> 
>>>>>>> On 7 Jun 2010, at 09:26, YAMAMOTO Takashi wrote:
>>>>>>> 
>>>>>>>> hi,
>>>>>>>> 
>>>>>>>> On vm-migrate, xapi attaches a VDI on the migrate-to host
>>>>>>>> before detaching it on the migrate-from host.
>>>>>>>> Unfortunately this doesn't work for our product, which doesn't
>>>>>>>> provide a way to attach a volume to multiple hosts at the same time.
>>>>>>>> Is VDI_ACTIVATE something I can use as a workaround?
>>>>>>>> Or do you have any other suggestions?
>>>>>>>> 
>>>>>>>> YAMAMOTO Takashi
>>>>>>>> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

