
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.



On 17-04-13 19:11:54, Yi Sun wrote:
> On 17-04-13 04:58:06, Jan Beulich wrote:
> > >>> On 13.04.17 at 12:49, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > > On 17-04-13 03:41:44, Jan Beulich wrote:
> > >> >>> On 13.04.17 at 10:11, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > >> > On 17-04-12 06:42:01, Jan Beulich wrote:
> > >> >> >>> On 12.04.17 at 14:23, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > >> >> > On 17-04-12 03:09:56, Jan Beulich wrote:
> > >> >> >> >>> On 12.04.17 at 07:53, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > >> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote:
> > >> >> >> >> >>> On 01.04.17 at 15:53, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do
> > >> >> >> >> in the context of a CPU_UP_CANCELED / CPU_DEAD notification: If
> > >> >> >> >> you have a few thousand VMs, the loop above may take a while.
> > >> >> >> >> 
> > >> >> >> > Hmm, that may be a potential issue. I have two proposals below.
> > >> >> >> > Could you please help to check which one you prefer? Or provide
> > >> >> >> > another solution?
> > >> >> >> > 
> > >> >> >> > 1. Start a tasklet in free_socket_resources() to restore
> > >> >> >> >    'psr_cos_ids[socket]' of all domains. The action is protected
> > >> >> >> >    by 'ref_lock' to avoid conflicts with 'psr_set_val'. We can
> > >> >> >> >    reduce 'info->cos_ref[cos]' in the tasklet, or memset the
> > >> >> >> >    array to 0 in free_socket_resources().
> > >> >> >> > 
> > >> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and
> > >> >> >> >    change the index from 'socket' to 'domain_id'. So we keep all
> > >> >> >> >    domains' COS IDs per socket and can memset the array to 0
> > >> >> >> >    when the socket goes offline. But there is an issue: we do
> > >> >> >> >    not know how many members this array should have, and I
> > >> >> >> >    cannot find a macro like 'DOMAIN_MAX_NUMBER'. So, I prefer to
> > >> >> >> >    reallocate in 'psr_alloc_cos' if the newly created domain's
> > >> >> >> >    id is bigger than the current array size.
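
[Aside, to make the shape of proposal 2 concrete: a rough sketch of the
chunked growth is below. The 'cos_ids'/'cos_ids_len' fields, the 100-entry
chunk and the helper name are purely illustrative, not code from this series;
locking is omitted, and as Jan notes further down, order > 0 runtime
allocations must be avoided, hence vzalloc()/vfree() here.]

/* Sketch only: grow a per-socket COS-ID array in chunks of 100 entries.    */
/* Assumed context: xen/vmap.h (vzalloc/vfree), xen/string.h, xen/errno.h.  */
#define COS_IDS_CHUNK 100

static int grow_cos_ids(struct psr_socket_info *info, unsigned int domid)
{
    unsigned int need = (domid / COS_IDS_CHUNK + 1) * COS_IDS_CHUNK;
    unsigned int *new_ids;

    if ( domid < info->cos_ids_len )
        return 0;                          /* already large enough */

    new_ids = vzalloc(need * sizeof(*new_ids));
    if ( !new_ids )
        return -ENOMEM;

    if ( info->cos_ids )
    {
        memcpy(new_ids, info->cos_ids,
               info->cos_ids_len * sizeof(*new_ids));
        vfree(info->cos_ids);
    }
    info->cos_ids = new_ids;
    info->cos_ids_len = need;

    return 0;
}
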
> > >> >> >> 
> > >> >> >> The number of domains is limited by the special DOMID_* values.
> > >> >> >> However, allocating an array with 32k entries doesn't sound very
> > >> >> >> reasonable.
> > >> >> > 
> > >> >> > I think 32K entries is the extreme case. I can allocate e.g. 100
> > >> >> > entries when the first domain is created. If a new domain's id
> > >> >> > exceeds 100, reallocate another 100 entries. The total number of
> > >> >> > entries allocated should be less than 32K. This is a functional
> > >> >> > requirement which cannot be avoided. What do you think?
> > >> >> 
> > >> >> So how many entries would your array have once I start the 32,000th
> > >> >> domain (having at any one time at most a single one running, besides
> > >> >> Dom0)?
> > >> >> 
> > >> > In such a case, we have to keep a 32K array because the domain_id is
> > >> > the index used to access the array. But this array is per socket, so
> > >> > the total memory used should not be too much.
> > >> 
> > >> We carefully avoid any runtime allocations of order > 0, so if you
> > >> were to set up such an array, you'd need to use vmalloc()/vzalloc().
> > >> But I continue to be unconvinced that we want such a large array
> > >> in the first place.
> > >> 
> > >> > After considering this issue more, I think the original code might
> > >> > not be so unacceptable. To my knowledge, the Intel Xeon Phi chip can
> > >> > support at most 288 CPUs. So, I think the number of domains running
> > >> > at the same time may in reality not be very large (not enough
> > >> > resources for more). If this hypothesis is right, a loop writing
> > >> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time.
> > >> > If I am wrong, please correct me. Thanks!
> > >> 
> > >> What relationship does the number of CPUs have to the number of
> > >> domains on a host? There could be thousands with just a few dozen
> > >> CPUs, provided none or very few of them have high demands on
> > >> CPU resources. Additionally please never forget that system sizes
> > >> basically only ever grow. Plus we wouldn't want a latent issue here
> > >> in case we ever end up needing to widen domain IDs beyond 16 bits.
> > >> 
> > > How about a per-socket array like this:
> > > uint32_t domain_switch[1024];
> > > 
> > > Every bit represents a domain id. Then, we can handle this case as below:
> > > 1. In 'psr_cpu_init()', clear the array to 0. I think this place is
> > >    enough to cover the socket offline case. We do not need to clear it
> > >    in 'free_socket_resources'.
> > > 
> > > 2. In 'psr_ctxt_switch_to()', call test_and_set_bit(domain_id,
> > >    domain_switch) to set the bit corresponding to domain_id to 1. If
> > >    the old value is 0 and 'psr_cos_ids[socket]' is not 0, restore
> > >    'psr_cos_ids[socket]' to 0.
> > > 
> > > 3. In 'psr_set_val()', call test_and_set_bit(domain_id, domain_switch)
> > >    to set the bit to 1 too. Then, update 'psr_cos_ids[socket]' according
> > >    to the find/pick flow.
> > > 
> > > Then, we only use 4KB per socket.
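
[Aside: a minimal sketch of what the above could look like. The
'domain_switch' member and the helper names are illustrative only, not code
from this series; only 'psr_cos_ids[socket]' and the per-socket info
structure follow the existing discussion.]

/* Sketch only: one bit per domain id; 1024 * 4 bytes = 4KB per socket.     */
/* Assumed context: xen/bitops.h (test_and_set_bit), xen/string.h (memset). */
struct psr_socket_info {
    /* ... existing members: feature pointers, cos_ref[], ref_lock, ... */
    uint32_t domain_switch[1024];
};

/* psr_cpu_init() (socket coming online): forget every domain seen before. */
static void reset_domain_switch(struct psr_socket_info *info)
{
    memset(info->domain_switch, 0, sizeof(info->domain_switch));
}

/* psr_set_val(): mark the domain before the find/pick COS flow, so a later
 * context switch does not wrongly reset its 'psr_cos_ids[socket]'. */
static void mark_domain_seen(struct psr_socket_info *info, domid_t domid)
{
    test_and_set_bit(domid, info->domain_switch);
}
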
> > 
> > This looks to come closer to something I'd consider acceptable, but
> > I may not understand your intentions in full yet: For one, there's
> > nowhere you clear the bit (other than presumably during socket
> > cleanup). 
> 
> Actually, clearing the array in 'free_socket_resources' has the same effect.
> I can move the clearing action there.
> 
> > And then I don't understand the test_and_ parts of the
> > constructs above, i.e. you don't clarify what the return values
> > would be used/needed for.
> > 
> Sorry, 0 means this domain has not been scheduled to the socket yet. If
> test_and_ returns 0, that is the first time the domain runs on the socket
> (the first time the socket is online). So, we need to restore
                  ^ missed 'after': I mean the first time the domain is
                    scheduled to the socket after the socket is online.
> 'psr_cos_ids[socket]' to 0 in 'psr_ctxt_switch_to()'.
> 
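
To make the intended use of the test_and_ return value more concrete, the
context-switch side I have in mind is roughly the sketch below (illustrative
only; the helper name is made up, and the 'psr_cos_ids' placement follows
this series as I understand it):

/* Sketch: lazy restore on the first schedule of a domain on this socket
 * after the socket came (back) online. Called from psr_ctxt_switch_to(). */
static void psr_sync_cos_on_switch(struct psr_socket_info *info,
                                   struct domain *d, unsigned int socket)
{
    /* An old bit value of 0 means this is the first time the domain runs
     * here since the socket came online, so the recorded COS ID is stale. */
    if ( !test_and_set_bit(d->domain_id, info->domain_switch) &&
         d->arch.psr_cos_ids[socket] )
        d->arch.psr_cos_ids[socket] = 0;
}
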

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel