Xen project Mailing List

Re: NULL pointer dereference in xenbus_thread->...

From: Jason Andryuk <jandryuk@xxxxxxxxx>

Date: Sat, 3 May 2025 11:08:59 -0400

Cc: Jason Andryuk <jason.andryuk@xxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Sat, 03 May 2025 15:09:23 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Apr 30, 2025 at 11:44 AM Jürgen Groß <jgross@xxxxxxxx> wrote: > I have looked at this issue multiple times now. > > Just some remarks what IMO could go wrong (I didn't find any proof that > this really happened, though), in case someone wants to double check: > > The most probably candidate for something going wrong is a use-after-free > of a struct xb_req_data element (normally named "req" in the related code). > > Some words about the not really obvious locking scheme used for those > elements: A "req" is owned by a thread as long as it isn't in any of the > lists it can live in (xs_reply_list or xb_write_list). Putting it into one > of the lists or removing it again requires to hold the xb_write_mutex. While I see how the ownership is transferred between lists and the static variables in process_writes(), xs_wait_for_reply()/read_reply() is looking at req, too. > A "req" needs to be in a certain state when either in one of the lists or > when being owned by a worker thread. > > I'm wondering whether it could happen that a thread waiting for a "req" > could be woken up and the "req" is being freed before the waiting thread > can react. Normally this shouldn't be possible, but "never say never". > What catched my eye today is the test of req->state == xb_req_state_wait_reply > in process_msg() just after dropping the xb_write_mutex. This looks a little > bit fishy, but OTOH the request has been just removed from the xs_reply_list, > so no mutex should be needed for that test. > > Possible candidates for such an "impossible" scenario include a wrap of > xs_request_id (not very probable, though, as having 4 billion Xenstore > requests "in flight" is rather unlikely IMHO). If read_reply() exits because req->err is set, and req->state is not queued or wait_reply, the req will be freed in xs_wait_for_reply(), but not removed from the list. If it is on the xs_reply_list, req->cb(req) could be called, but that would need req->state == wait_reply. I don't see how this could happen, but it came to mind. Put another way, kfree() in xs_wait_for_reply() doesn't know if the req is on the list or not. I wonder if it would be better to just have if (req->state == xb_req_state_got_reply) kfree(req); else req->state = xb_req_state_aborted; Maybe also some sort of assertion that req is not on a list, too. This would at least make it clear exactly when xs_wait_for_reply() is supposed to free req. Given these states: enum xb_req_state { xb_req_state_queued, xb_req_state_wait_reply, xb_req_state_got_reply, xb_req_state_aborted }; Being explicit on who frees each state would be good. Or if all list manipulations have to be protected by xb_write_mutex, then just let the "caller" xs_wait_for_reply() always unlink and kfree()? Yet it's unclear how req->cb(req) leads to a use-after-free. req->cb() was valid to make the call, but then the waitqueue was (probably) zeroed? That seems like a race. ... or is wake_up() faulting because xs_wait_for_reply() woke up and freed the req before wake_up() finished with it? __wake_up_common (kernel/sched/wait.c:85) is: list_for_each_entry_safe_from(curr, next, &wq_head->head, entry) { I'l need to look more into the wake_up() implementation... Regards, Jason

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.