Hi, Dario:
        
        
        Thanks for starting this topic. I have limited experience
          with industry, so I'll provide some input from the academic
          side. Please correct me if I am wrong.
        
        
        1. A real-time CPU scheduler would be great. That was
          actually our motivation for starting the RT-Xen project.
        
        
        Scheduling in a virtualized environment maps to a
          two-level scheduling hierarchy in real-time systems, so we can
          use hierarchical scheduling theory to analyze it formally.
          One key assumption of this theory is a formally defined
          'server' that represents each VCPU. We implemented and
          compared different servers in RT-Xen, and published the
          results in:
        
        
        S. Xi, J. Wilson, C. Lu and C.D. Gill, RT-Xen: Towards
          Real-time Hypervisor Scheduling in Xen, ACM International
          Conference on Embedded Software (EMSOFT'11), October 2011.
        
        
        
        
          J. Lee, S. Xi, S. Chen, L.T.X. Phan, C. Gill, I. Lee, C. Lu
          and O. Sokolsky, Realizing Compositional Scheduling through
          Virtualization, IEEE Real-Time and Embedded Technology and
          Applications Symposium (RTAS'12), April 2012.
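To make the 'server' idea a bit more concrete, here is a minimal sketch of the supply bound function of a periodic server, following the periodic resource model used in compositional scheduling analysis. It is not taken from RT-Xen; the function name `periodic_sbf` and the example numbers are only assumptions for illustration. It computes the least CPU time that a VCPU with a given budget per period is guaranteed in any window of length t.

```python
def periodic_sbf(t, period, budget):
    """Least CPU time guaranteed in ANY window of length t by a periodic
    server that supplies `budget` time units every `period` time units.

    Illustrative sketch only (integer time units assumed).
    """
    if not 0 < budget <= period:
        raise ValueError("need 0 < budget <= period")
    if t <= 0:
        return 0

    gap = period - budget                   # idle time inside one period
    k = max(-(-(t - gap) // period), 1)     # integer ceil((t - gap) / period)

    # Inside this interval the supply grows with slope 1;
    # outside it the supply is flat at (k - 1) full budgets.
    lo = (k + 1) * period - 2 * budget
    hi = (k + 1) * period - budget
    if lo <= t <= hi:
        return t - (k + 1) * gap
    return (k - 1) * budget
```

For example, with a budget of 4 every 10 time units, `periodic_sbf(12, 10, 4)` is 0: in the worst case the VCPU gets no CPU for 2 * (10 - 4) = 12 time units, and the first full budget is only guaranteed by t = 16. A schedulability test for the guest compares the demand of its tasks against this supply.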
        
        
        
        
        
        2. An appropriate cache management scheme would be great.
        Current CPU architectures have both dedicated caches (usually
          L1 and L2) and a shared cache (L3).
        
        
        a) For the dedicated caches, the existing credit1 scheduler uses
          partitioned scheduling with load balancing, while credit2 uses
          modified global scheduling with migration
          resistance/compensation. If the user runs cache-sensitive
          applications, I think a partitioned scheduler is the better
          choice.
        
        
        b) For the shared cache, the 'noisy neighbor' problem can
          happen: one guest OS runs a cache-intensive application and
          every other guest suffers. I have seen several papers that try
          to solve it, but I don't know whether they will be integrated
          into Xen or not.
        <1> If there are multiple last-level caches (LLCs), each shared
          by a subset of PCPUs, this paper proposes a dynamic cluster
          scheme:
        Min Lee, Karsten Schwan. "Region Scheduling: Efficiently Using
        the Cache Architectures via Page-level Affinity." ASPLOS 2012,
        London, UK, March 3-7, 2012.
        
          <2> If there is one large shared LLC, partitioning the cache
          by domain seems to be a solution. These two papers have
          explored it:
        
        
        
        
        3. Deterministic network latency through Domain-0 would
          be great.
        Currently Xen does not support packet prioritization. Users
          can achieve something similar with the Linux Traffic
          Control tool in Domain-0, but priority inversion can still
          happen.
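As a toy illustration of how priority inversion can survive a priority-aware queue, the sketch below models two stages that each forward one packet per tick. The second stage always sends its most urgent packet; the first stage is either plain FIFO or priority-aware. This is only an assumed model, not a description of Xen's actual data path, and the names `delivery_ticks`, `bulk1` and `urgent` are made up for the example.

```python
import heapq
from collections import deque

def delivery_ticks(packets, prioritize_first_stage):
    """Toy two-stage pipeline. `packets` is a list of (name, priority) in
    arrival order, all queued at tick 0; a lower number is more urgent.
    Returns {name: tick at which the packet leaves the second stage}.
    """
    if prioritize_first_stage:
        # Hypothetical priority-aware first stage: most urgent first,
        # arrival order breaks ties.
        order = sorted(enumerate(packets), key=lambda it: (it[1][1], it[0]))
        stage1 = deque(pkt for _, pkt in order)
    else:
        stage1 = deque(packets)             # shared FIFO first stage

    stage2 = []                             # heap of (priority, seq, name)
    done, tick, seq = {}, 0, 0
    while stage1 or stage2:
        tick += 1
        # Second stage sends the most urgent packet it already holds.
        if stage2:
            _, _, name = heapq.heappop(stage2)
            done[name] = tick
        # First stage forwards one packet to the second stage.
        if stage1:
            name, prio = stage1.popleft()
            heapq.heappush(stage2, (prio, seq, name))
            seq += 1
    return done

packets = [("bulk1", 5), ("bulk2", 5), ("bulk3", 5), ("urgent", 0)]
fifo = delivery_ticks(packets, prioritize_first_stage=False)
aware = delivery_ticks(packets, prioritize_first_stage=True)
```

In this model the urgent packet leaves at tick 5 with the FIFO first stage and at tick 2 with the priority-aware one: prioritizing only at the last queue does not help a packet that is stuck behind low-priority traffic earlier in the path.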
        
        
        We did some work on prioritizing inter-domain communication
          in Xen, and published it in:
        S. Xi, C. Li, C. Lu and C. Gill, Prioritizing Local Inter-Domain
        Communication in Xen, ACM/IEEE International Symposium on
        Quality of Service (IWQoS'13), June 2013.
        
        
        We are now working on actual network traffic through the
          NIC.
        
        
        Thanks and I'd love to hear any
          feedback/comments/suggestions on RT-Xen.
        
        
        Sisu