This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] Re: Domgrps/SchedGrps Merge RFC: domgrps-vmm

To: Chris <hap10@xxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: Domgrps/SchedGrps Merge RFC: domgrps-vmm
From: "Mike D. Day" <ncmike@xxxxxxxxxx>
Date: Mon, 11 Feb 2008 17:07:38 -0500
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 11 Feb 2008 14:08:06 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <1CB9F0D9-ADBD-4368-836C-94CAB62296B2@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: IBM Linux Technology Center
References: <1CB9F0D9-ADBD-4368-836C-94CAB62296B2@xxxxxxxxxxxxxx>
Reply-to: ncmike@xxxxxxxxxx
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.15+20070412 (2007-04-11)
Sorry for the lateness of this review. My opinion is that sched groups
should (as implemented for credit sharing to support stub domains)
should go upstream for 3.3 without being merged with domain groups. 

I think domain groups are more powerful and will eventually assimilate
the focused scheduling group patchset. At the same time, more thought
needs to go into domain groups. To that end, here are a few of my
belated thoughts and questions.

On 04/02/08 14:14 -0500, Chris wrote:
>diff -urN xen-unstable/xen/arch/powerpc/powerpc64/hypercall_table.S 
>--- xen-unstable/xen/arch/powerpc/powerpc64/hypercall_table.S  2007-08-06 
>17:59:54.000000000 -0400
>+++ xen-unstable-domgrps/xen/arch/powerpc/powerpc64/hypercall_table.S  
>2007-11-19 18:42:00.000000000 -0500
>@@ -41,6 +41,7 @@
>         .quad 0 /* do_hvm_op */
>         .quad do_sysctl             /* 35 */
>         .quad do_domctl
>+        .quad do_domgrpctl
>         .rept NR_hypercalls-((.-__hypercall_table)/8)
>         .quad do_ni_hypercall
>         .endr

Rather than creating a new hypercall, it may be best to implement all
the grouping and grouping control as part of a sub-command of the
domctrl hypercall. Then we don't need to add this extra hypercall into
the table of every platform.

>+    if (d->group)
>+        grp = d->group;
>+    else
>+        grp = find_grp_by_id(NULL_GROUP_ID);

Wondering why we have to search for the null group. Why can't the null
group be statically initialized and then always referred to by a
pointer? In fact, what is the role of the null group? I see that it is
a real domain group and it is created automatically. Also it can't be
created by a hypercall.

All domains that do not belong to a group actually belong to the null
group. Could it also be just as easy for each domain to have an
identity group (itself), to which it belongs in the absence of any
other group membership? Or is there a specific purpose to the null

>+#define SERIALIZE_LINKED_LIST(op_name, list_name)             \
>+    rcu_read_lock(&grp->op_name##_read_lock);                 \
>+    memset(&info->op_name, 0, sizeof(domid_t)*MAX_GROUP_SIZE);        \
>+    if (!list_empty(&grp->op_name)) {                         \
>+        i = 0;                                                        \
>+        list_for_each_entry(dom, &grp->op_name, list_name) {  \
>+            info->op_name[i] = dom->domain_id;                        \
>+                  i++;                                                \
>+      }                                                       \
>+    } else                                                    \
>+      info->op_name[i] = NULL_GROUP_ID;                       \
>+    rcu_read_unlock(&grp->op_name##_read_lock);

This should really be cleaned up and made into an inline function. It
also refers to a variable declared outside of the macro and parameters
(info). It's not clear what's happening inside the macro.

>+#define RM_DOM_FROM_LIST(list_name, entry)    \
>+    spin_lock(&grp->list_name##_update_lock); \
>+    if (!list_empty(&grp->list_name))                 \
>+        list_del(&dom->entry);                        \
>+    spin_unlock(&grp->list_name##_update_lock);

Another good candidate for inlining.

>+int add_dom_to_grp(struct domain *dom, dgid_t dgid)


>+      /* skip it if dom is already a member */
>+      if (dom_in_member_list(dom->domain_id, grp))
>+              goto out;
>+      /* remove dom from old group */
>+      if (dom->group != NULL)
>+              del_dom_from_grp(dom);

So a domain can only belong to one group at a time. Would it ever be
useful for a domain to belong to more than one group? This type of
restriction seems to work against the idea of a general grouping
concept. For example, all domains that belong to a scheduling group
(assuming we eventually merge domgrp and schedgrp) would automatically
be removed from a scheduling group if added to any other type of

This argues for keeping different types of groups under different
grouping infrastructures unless we can figure out a way for a domain
to have multiple group memberships. It may be too complicated to do so.

>+int pause_grp(dgid_t dgid)
>+      int ret = 0;
>+      struct domain *member;
>+      struct domain_group *grp;
>+      if (dgid == 0) {
>+              ret = -EPERM;
>+              goto out;
>+      }
>+      grp = find_grp_by_id(dgid);
>+      if (grp == NULL) {
>+              ret = -ESRCH;
>+              goto out;
>+      }
>+      spin_lock_recursive(&grp->grp_big_lock);
>+      rcu_read_lock(&grp->member_list_read_lock);
>+      /* could ignore interupts during this loop to increase atomicity */
>+      list_for_each_entry(member, &grp->member_list, member) {
>+              if (member != current->domain && member->domain_id != 0)
>+                      domain_pause_by_systemcontroller(member);
>+      }
>+      rcu_read_unlock(&grp->member_list_read_lock);
>+      spin_unlock_recursive(&grp->grp_big_lock);
>+      out:
>+      return ret;

Groups are paused in the order in which they are added to the group,
right? (FIFO). What do we do when we have groups which are dependant
upon each other in certain ways. 

For example, the stub domain and HVM domain. At first glance, it seems
that you would want to pause the hvm domain before pausing its stub
domain. And then the reverse on resume: unpause the stub domain, then
unpause the hvm domain. (I could have it backward). But the point is,
group operations may have inter-domain dependencies that are not
accounted for simply by the order in which the domains are added to a
group.  This is true for pause, unpause, start, migrate, etc., on down
the line of possible group operations.


Mike D. Day
Cell: 919 412-3900
Sametime: ncmike@xxxxxxxxxx AIM: ncmikeday  Yahoo: ultra.runner
PGP key: http://www.ncultra.org/ncmike/pubkey.asc

Xen-devel mailing list