
Re: [Xen-devel] Lots of connections led oxenstored stuck



On 11 Aug 2014, at 01:35, Joe Jin <joe.jin@xxxxxxxxxx> wrote:

> On 08/08/14 17:37, Dave Scott wrote:
>> 
>> On 8 Aug 2014, at 09:35, Liuqiming (John) <john.liuqiming@xxxxxxxxxx> wrote:
>> 
>>> oxenstored uses "select" on its incoming sockets, so I don't think it can 
>>> handle more than 1024 socket connections. 
>> 
>> That’s true.
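The 1024 figure is FD_SETSIZE, the fixed capacity of the fd_set that select()
works on (1024 with glibc on Linux). A descriptor numbered FD_SETSIZE or higher
cannot be stored in an fd_set, whereas poll() takes an array of descriptors and
has no such cap. The C sketch below only illustrates that difference; it is not
oxenstored code (oxenstored is written in OCaml), and the helper names
fits_in_fd_set and wait_readable are made up for this example.

```c
#include <poll.h>
#include <sys/select.h>

/* Returns 1 if fd may safely be passed to FD_SET(), 0 otherwise.
 * Descriptors >= FD_SETSIZE do not fit in an fd_set. */
int fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Waits up to timeout_ms for fd to become readable using poll(),
 * which is not limited by FD_SETSIZE.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A select()-based loop would have to check fits_in_fd_set(fd) before every
FD_SET(fd, ...) and refuse or close connections that fail the check.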
> 
> The problem is that oxenstored no longer responds to any request, even
> after all the threads have exited. With my reproducer, once you have run
> it and all of its threads have exited, "xm list -l" still hangs.

OK so is this the behaviour you expect:

* root in dom0 opens many connections, until oxenstored is out of resources 
(where the most limited resource is currently file descriptors)
* root in dom0 closes the connections
* oxenstored recovers, and ‘xm list -l’ works again

Instead, you’re seeing oxenstored get into a stuck state that causes ‘xm list 
-l’ to block. Is this accurate?
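The open/close/recover sequence above could be exercised with something like
the following C sketch. It is only an outline under assumptions: the helper
names open_connections and close_connections are invented here, and the socket
path to pass in would be the /var/run/xenstored/socket used by the reproducer
quoted below.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Opens up to n stream connections to the Unix socket at path and stores
 * the descriptors in fds. Stops at the first failure (for example EMFILE
 * on the client side, or the server refusing the connection).
 * Returns how many connections were opened. */
int open_connections(const char *path, int *fds, int n)
{
    struct sockaddr_un addr;
    int i;

    if (strlen(path) >= sizeof(addr.sun_path))
        return 0;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    for (i = 0; i < n; i++) {
        fds[i] = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fds[i] < 0)
            break;
        if (connect(fds[i], (struct sockaddr *)&addr, sizeof(addr)) != 0) {
            close(fds[i]);
            break;
        }
    }
    return i;
}

/* Closes the first n descriptors in fds. */
void close_connections(const int *fds, int n)
{
    int i;

    for (i = 0; i < n; i++)
        close(fds[i]);
}
```

A test would call open_connections() with a large n, then
close_connections() on whatever was opened, and finally check that one fresh
connection (or ‘xm list -l’) still succeeds. Note that connect() on a blocking
socket may itself hang if the daemon has stopped accepting, so a real test
would probably want a timeout around it.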

Could you share your reproducer program?

Thanks,
Dave

> 
> Thanks,
> Joe
>> 
>> In the long term I’d like to use Lwt which internally uses libev and has a 
>> more scalable event loop.
>> 
>> In the short term I think Zheng Li (cc:d) may have a prototype patch to work 
>> around this issue. Is this right, Zheng?
>> 
>> Cheers,
>> Dave
>> 
>>> 
>>>> -----Original Message-----
>>>> From: xen-devel-bounces@xxxxxxxxxxxxx
>>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Joe Jin
>>>> Sent: Friday, August 08, 2014 3:01 PM
>>>> To: David Scott; Luis R. Rodriguez; Ian Jackson
>>>> Cc: xen-devel
>>>> Subject: [Xen-devel] Lots of connections led oxenstored stuck
>>>> 
>>>> Hi,
>>>> 
>>>> During internal testing on Xen-4.3-stable we found that sometimes, after
>>>> restarting Xen, it gets stuck and stops responding to requests.
>>>> xenstored.log fills up with entries like this:
>>>> [20140702T21:00:41.564Z|error|xenstored] caught exception
>>>> Unix.Unix_error(15, "accept", "")
>>>> 
>>>> I wrote a reproducer that opens 2000 connections to oxenstored. After
>>>> running it, "xm list --long" hangs and oxenstored no longer responds.
>>>> The same test case passes when xenstored is used instead. Any input
>>>> would be appreciated!
>>>> 
>>>> /*
>>>> * This program reproduces the oxenstored "stuck connections" issue.
>>>> * Compile it with:
>>>> *  gcc -o client client.c -lpthread
>>>> */
>>>> #include <stdio.h>
>>>> #include <sys/socket.h>
>>>> #include <sys/un.h>
>>>> #include <unistd.h>
>>>> #include <string.h>
>>>> #include <pthread.h>
>>>> #include <stdlib.h>
>>>> #include <errno.h>
>>>> 
>>>> 
>>>> /* Thread body: open one connection to xenstored and hold it open. */
>>>> void *main_thread(void *arg)
>>>> {
>>>>    struct sockaddr_un address;
>>>>    int socket_fd;
>>>>    int i;
>>>> 
>>>>    /* Copy the thread index out of the heap buffer, then release it. */
>>>>    memcpy(&i, arg, sizeof(i));
>>>>    free(arg);
>>>> 
>>>>    socket_fd = socket(PF_UNIX, SOCK_STREAM, 0);
>>>>    if (socket_fd < 0) {
>>>>            fprintf(stderr, "socket() %dth failed, errno=%d\n", i, errno);
>>>>            return NULL;
>>>>    }
>>>>    fprintf(stderr, "socket() %dth ok!\n", i);
>>>> 
>>>>    /* start with a clean address structure */
>>>>    memset(&address, 0, sizeof(struct sockaddr_un));
>>>> 
>>>>    address.sun_family = AF_UNIX;
>>>>    /* Bound the copy by the real size of sun_path, not 1024. */
>>>>    snprintf(address.sun_path, sizeof(address.sun_path),
>>>>             "/var/run/xenstored/socket");
>>>> 
>>>>    if (connect(socket_fd,
>>>>                (struct sockaddr *) &address,
>>>>                sizeof(struct sockaddr_un)) != 0) {
>>>>            fprintf(stderr, "connect() %dth failed, errno=%d\n", i, errno);
>>>>            close(socket_fd);
>>>>            return NULL;
>>>>    }
>>>>    fprintf(stderr, "connect() %dth ok!\n", i);
>>>> 
>>>>    /* Hold the connection open until the process exits. */
>>>>    while (1)
>>>>            sleep(1);
>>>> 
>>>>    return NULL; /* not reached */
>>>> }
>>>> 
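>>>> int main(void)
>>>> {
>>>>    int i;
>>>>    for (i = 0; i < 2000; i++) {
>>>>            pthread_t thread;
>>>>            int rc;
>>>>            /* Each thread gets its own heap copy of the index. */
>>>>            int *arg = malloc(sizeof(i));
>>>>            if (arg == NULL) {
>>>>                    perror("malloc");
>>>>                    break;
>>>>            }
>>>>            *arg = i;
>>>>            /* pthread_create returns the error code; it does not set errno. */
>>>>            rc = pthread_create(&thread, NULL, main_thread, arg);
>>>>            if (rc != 0) {
>>>>                    fprintf(stderr, "pthread_create: %s\n", strerror(rc));
>>>>                    free(arg);
>>>>                    break;
>>>>            }
>>>>    }
>>>>    /* Give the threads time to connect; returning from main() then
>>>>     * ends the process, which terminates them and closes the sockets. */
>>>>    sleep(3);
>>>>    return 0;
>>>> }
>>>> /* end */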
>>>> 
>>>> Thanks,
>>>> Joe
>>>> 
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxx
>>>> http://lists.xen.org/xen-devel
>> 
>> 

