
Re: [Xen-devel] Debug event_channel.c



On Thu, Sep 29, 2011 at 9:52 PM, Ian Campbell
<Ian.Campbell@xxxxxxxxxxxxx> wrote:
> On Thu, 2011-09-29 at 13:36 +0100, Daniel Castro wrote:
>> On Thu, Sep 29, 2011 at 8:30 PM, Ian Campbell
>> <Ian.Campbell@xxxxxxxxxxxxx> wrote:
>> > On Thu, 2011-09-29 at 12:09 +0100, Daniel Castro wrote:
>> >> On Tue, Sep 27, 2011 at 9:09 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> 
>> >> wrote:
>> >> > On Tue, 2011-09-27 at 12:36 +0100, Daniel Castro wrote:
>> >> >> On Tue, Sep 27, 2011 at 8:33 PM, Andrew Cooper
>> >> >> <andrew.cooper3@xxxxxxxxxx> wrote:
>> >> >> > On 27/09/11 12:09, Daniel Castro wrote:
>> >> >> >> Hello All,
>> >> >> >>
>> >> >> >> I am trying to debug event_channel.c for this I have filled the
>> >> >> >> functions with gdprintk(XENLOG_WARNING, "..."); yet the messages are
>> >> >> >> not displayed on dmesg or /var/log/xen. Where could they be printed?
>> >> >> >> or should I use a different function?
>> >> >> >>
>> >> >> >> In grub I have loglvl=all to print all messages...
>> >> >> >>
>> >> >> >> Thanks for the answer,
>> >> >> >>
>> >> >> >> Daniel
>> >> >> >>
>> >> >> >
>> >> >> > gdprintk only gets set with guest debugging enabled (guest_loglvl on
>> >> >> > the command line).
>> >> >> >
>> >> >> > My suggestion would be to just use regular printks and look at the
>> >> >> > serial log.
>> >> >>
>> >> >> How can I look at the serial log from dom0?
>> >> >
>> >> > 'xl dmesg' (or using a serial cable of course ;-))
>> >> >
>> >> > You can also add XENCONSOLED_TRACE=hv in /etc/sysconfig/xencommons (or
>> >> > the equivalent on your distro, the effect should be to add --log=hv to
>> >> > the xenconsoled command line). Then the xen console will be logged
>> >> > under /var/log/xen somewhere.
>> >>
>> >> Ian, thanks for the info.
>> >>
>> >> This is the info I gathered:
>> >> (XEN) schedule.c:658:d1 DEBUG 1: START DO POLL port -32060 on sched_poll.nr_ports 1
>> >
>> > port == -32060 doesn't sound right...
>> >
>> >> (XEN) schedule.c:719:d1 DEBUG 1: DO POLL test bit on port 2 exit here -> if ( test_bit(port, &shared_info(d, evtchn_pending)) )
>> >> (XEN) schedule.c:746:d1 DEBUG 1: DO POLL GOTO out: check previus msg, return rc=0
>> >> (XEN) event_channel.c:606:d1 DEBUG 1: set_pending
>> >> (XEN) event_channel.c:628:d1 DEBUG 1 : evtchn_set_pending test_bit AND test_and_set_bit returned 0.
>> >> (XEN) event_channel.c:637:d1 DEBUG 1: evtchn_set_pending bitmap_empty return 0.
>> >>
>> >> In my code, test_and_clear_bit in the Xenstore ring_wait is in fact
>> >> returning 0 where I was expecting a 1, while do_poll finds the bit
>> >> set to 1 according to test_bit, right?
>> >> So the error is in my test_and_clear_bit. Am I reading it correctly?
>> >
>> > I'm not sure I follow what your debug messages are actually saying, but
>> > if do_poll is exiting because of the
>> >        if ( test_bit(port, &shared_info(d, evtchn_pending)) )
>> >            goto out;
>> > inside the "for ( i = 0; i < sched_poll->nr_ports; i++ )" loop then this
>> > indicates that the event channel is pending. If you aren't seeing this
>> > on the guest end then there is likely a problem somewhere on that end.
>> >
>> > In your current ring_wait function you have:
>> >        int wait = test_and_clear_bit(event, shinfo->evtchn_pending);
>> >        int ret = 1;
>> >        while (wait!=0 || ret!=0){
>> >                ret = hypercall_sched_op(SCHEDOP_poll, &poll);
>> >                wait = test_and_clear_bit(event, shinfo->evtchn_pending);
>> >                struct vcpu_info *vcpu = shinfo->vcpu_info;
>> >                dprintf(1,"DEBUG bit clear is %d and ret %d\n",wait,ret);
>> >                time = shinfo->vcpu_info[0].time;
>> >                dprintf(1,"TIME system %d timestamp %d\n",time.system_time,time.tsc_timestamp);
>> >        }
>> > }
>> >
>> > Isn't "wait!=0" backwards? Don't you want to succeed (i.e. fall out of
>> > the loop) when wait!=0 rather than keep looping?
>>
>> Yes, at some point I must have screwed that up, and later corrected
>> it... Sorry... Yet the problem remains, in the ring wait I get
>> stuck...
>>
>> What else could I check?
>
> Does shinfo actually point to the right thing?
>
> Looking at your *get_shared_info(void) you have:
>    xatp.gpfn  = malloc_high(sizeof(shared_info));
>    shared_info = (struct shared_info *)(xatp.gpfn << PAGE_SHIFT);
>
> but malloc_high returns a void * (i.e. a pointer) not an mfn.
>
> I suspect you want:
>    shared_info = malloc_high(sizeof(shared_info));
>    xatp.gpfn  = ((unsigned long)shared_info >> PAGE_SHIFT);
>
> At least here the compiler produces a clear warning about this issue:
>
>        src/xen.c: In function ‘get_shared_info’:
>        src/xen.c:157: warning: assignment makes integer from pointer without a cast
>
> The code in your seabios tree currently produces nearly a page of
> warnings. It is very good practice to get into the habit of taking care
> of all warnings as soon as they appear, more often than not they
> represent a real problem with the code. For example just from skimming
> them I can see that a bunch of your debug is not printing what you seem
> to think it is.

Thanks for the comments Ian, I have fixed most of them now. Also I am
getting results now, but after the first wait I get stuck again in the
wait...
First I had to change the function test_and_clear_bit to:
static inline int test_and_clear_bit(int nr, volatile void *addr)
{
    int oldbit;
    asm volatile (
        "lock; btrl %2,%1\n\tsbbl %0,%0"
        : "=r" (oldbit), "=m" (*(volatile long *)addr)
        : "Ir" (nr), "m" (*(volatile long *)addr) : "memory");
    return oldbit;
}
In order to check the bit value without changing it I am using:
static inline int test_bit(int nr, const volatile void *addr)
{
    int oldbit;
    // was btl, changed to btrl
    asm volatile (
        "btrl %2,%1\n\tsbbl %0,%0"
        : "=r" (oldbit)
        : "m" (addr), "Ir" (nr) : "memory" );
    return oldbit;
}
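
A note on the test_bit above: btrl is "bit test and reset", so as posted it clears the bit it inspects, and the "m" (addr) operand names the pointer variable itself rather than the word it points to. A minimal portable sketch of the two intended behaviours is below. It assumes GCC or Clang for the __atomic_fetch_and builtin; the *_sketch function names and BITS_PER_WORD are hypothetical and do not come from the SeaBIOS or Xen trees.

```c
#include <limits.h>

/* Hypothetical helper: number of bits in one bitmap word. */
#define BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Read-only test: loads the word and masks the bit; memory is never written. */
static inline int test_bit_sketch(unsigned int nr, const volatile void *addr)
{
    const volatile unsigned long *word =
        (const volatile unsigned long *)addr + nr / BITS_PER_WORD;
    return (int)((*word >> (nr % BITS_PER_WORD)) & 1UL);
}

/* Atomic test-and-clear: returns 1 if the bit was set beforehand, else 0. */
static inline int test_and_clear_bit_sketch(unsigned int nr, volatile void *addr)
{
    volatile unsigned long *word =
        (volatile unsigned long *)addr + nr / BITS_PER_WORD;
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    /* Assumption: compiler builtin, not an API from the original thread. */
    unsigned long old = __atomic_fetch_and(word, ~mask, __ATOMIC_SEQ_CST);
    return (old & mask) != 0;
}
```

In the inline-asm version the equivalent change would probably be to go back to btl, which only copies the bit into the carry flag, and to pass "m" (*(volatile long *)addr) as the memory operand, as test_and_clear_bit already does.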
The first funny thing is that test_and_clear_bit returns -1, with bit
representation 11111111 11111111 11111111 11111111; I was expecting
something like 0...01...
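
On the -1 itself: "sbbl %0,%0" computes oldbit - oldbit - CF, so the result is 0 when the carry flag is clear and -1 (all ones) when the bit was found set. Callers that only test for non-zero are unaffected. The sketch below emulates that arithmetic in plain C and shows one way to normalise the result to 0 or 1; both function names are hypothetical.

```c
/* Emulates "sbbl %0,%0": reg - reg - CF, which is 0 or -1 whatever reg held. */
static int sbb_same_register_sketch(int reg, int carry_flag)
{
    return reg - reg - (carry_flag ? 1 : 0);
}

/* Collapses the 0 / -1 convention to a strict 0 / 1 boolean. */
static int normalise_bit_sketch(int oldbit)
{
    return oldbit != 0;
}
```

A loop condition written as "wait != 0" therefore behaves the same for -1 as it would for 1; only a comparison such as "wait == 1" would be caught out.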

Now my problem is that after the first wait exits, the second gets
stuck in the same predicament I had before...

Does anyone see any problems in the above code?


>
> Ian.
>
>>
>> >
>> > Ian.
>> >
>> >>
>> >> Thanks all,
>> >>
>> >> Daniel
>> >> >
>> >> > Ian.
>> >> >
>> >> >>
>> >> >> >
>> >> >> > --
>> >> >> > Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
>> >> >> > T: +44 (0)1223 225 900, http://www.citrix.com
>> >> >> >
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > Xen-devel mailing list
>> >> >> > Xen-devel@xxxxxxxxxxxxxxxxxxx
>> >> >> > http://lists.xensource.com/xen-devel



-- 
+-=====---------------------------+
| +---------------------------------+ | This space intentionally blank
for notetaking.
| |   | Daniel Castro,                |
| |   | Consultant/Programmer.|
| |   | U Andes                         |
+-------------------------------------+

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

