1. 20 Mar, 2018 1 commit
  2. 19 Mar, 2018 2 commits
    • qmp: support out-of-band (oob) execution · cf869d53
      Peter Xu authored
      Having "allow-oob":true for a command does not mean that this command
      will always be run in out-of-band mode.  The out-of-band quick path will
      only be executed if we specify the extra "run-oob" flag when sending the
      QMP request:
      
          { "execute":   "command-that-allows-oob",
            "arguments": { ... },
            "control":   { "run-oob": true } }
      
      The "control" key is introduced to store this extra flag.  "control"
      field is used to store arguments that are shared by all the commands,
      rather than command specific arguments.  Let "run-oob" be the first.
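
As a rough illustration of the request layout described above, the following Python sketch builds such a request as a dictionary and serializes it to JSON. The helper name `build_qmp_request` and its parameters are hypothetical and not part of QEMU; only the "execute", "arguments", "control", and "run-oob" keys come from the text above.

```python
import json


def build_qmp_request(command, arguments=None, run_oob=False):
    """Build a QMP request dict; hypothetical helper, not a QEMU API."""
    request = {"execute": command}
    if arguments is not None:
        request["arguments"] = arguments
    if run_oob:
        # "control" holds flags shared by all commands; "run-oob" is the
        # only flag described in this commit message.
        request["control"] = {"run-oob": True}
    return request


if __name__ == "__main__":
    # The command name is the placeholder used in the commit message.
    request = build_qmp_request("command-that-allows-oob", {}, run_oob=True)
    print(json.dumps(request))
```

Leaving `run_oob` at its default omits the "control" key entirely, which matches the statement that a command marked "allow-oob" still runs in-band unless the flag is sent.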
      
      Note that in the patch I exported qmp_dispatch_check_obj() to be used to
      check the request earlier, and at the same time allowed "id" field to be
      there since actually we always allow that.
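
The kind of early check mentioned here could look roughly like the following Python sketch. It is not the C implementation of qmp_dispatch_check_obj(); the function name `check_request`, the error strings, and the exact type rules are assumptions made for illustration. Only the key names come from this commit message.

```python
# Keys named in the commit message; everything else is rejected here.
ALLOWED_KEYS = {"execute", "arguments", "control", "id"}


def check_request(request):
    """Return None if the request looks well-formed, else an error string."""
    if not isinstance(request, dict):
        return "request must be a JSON object"
    if not isinstance(request.get("execute"), str):
        return "'execute' must be a string"
    for key in request:
        if key not in ALLOWED_KEYS:
            return "unexpected key '%s'" % key
    # Assumption: both "arguments" and "control" must be JSON objects.
    for key in ("arguments", "control"):
        if key in request and not isinstance(request[key], dict):
            return "'%s' must be an object" % key
    return None
```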

      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20180309090006.10018-19-peterx@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [eblake: rebase to qobject_to(), spelling fix]
      Signed-off-by: Eric Blake <eblake@redhat.com>
    • monitor: let suspend/resume work even with QMPs · e3e977d4
      Peter Xu authored
      This patch allows QMP monitors to be suspended/resumed.
      
      One thing to mention is that for QMPs that are using IOThreads, we need
      an explicit kick for the IOThread in case it is sleeping.
      
      Meanwhile, we need to take special care with non-interactive HMPs.
      Currently only gdbserver is using that.  For these monitors, we still
      don't allow suspend/resume operations.
      
      While at it, add trace events for the operations.

      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20180309090006.10018-14-peterx@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Eric Blake <eblake@redhat.com>
  3. 12 Mar, 2018 1 commit
  4. 19 Feb, 2018 1 commit
    • trace: avoid SystemTap "char const" warnings · 7f1d87ab
      Stefan Hajnoczi authored
      SystemTap's dtrace(1) produces the following warning when it encounters
      "char const" instead of "const char":
      
        Warning: /usr/bin/dtrace:trace-dtrace-root.dtrace:66: syntax error near:
        probe flatview_destroy_rcu
      
        Warning: Proceeding as if --no-pyparsing was given.
      
      This is a limitation in current SystemTap releases.  I have sent a patch
      upstream to accept "char const" since it is valid C:
      
        https://sourceware.org/ml/systemtap/2018-q1/msg00017.html
      
      In QEMU we still wish to avoid warnings in the current SystemTap
      release.  It's simple enough to replace "char const" with "const char".
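
The replacement is mechanical. A minimal Python sketch of such a rewrite is shown below; the function name and the sample event declaration in the usage note are made up for illustration, and this is not the script (if any) that was used for the actual patch.

```python
import re

# Match "char const" as whole words.  "char * const" (a constant pointer,
# which means something different) is not matched because of the "*".
_CHAR_CONST = re.compile(r"\bchar\s+const\b")


def normalize_char_const(line):
    """Rewrite 'char const' as the equivalent 'const char'."""
    return _CHAR_CONST.sub("const char", line)
```

For example, a hypothetical declaration `example_event(char const *name, int value)` becomes `example_event(const char *name, int value)`.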
      
      I'm not changing the documentation or implementing checks to prevent
      this from occurring again in the future.  The next release of SystemTap
      will hopefully resolve this issue.
      
      Cc: Daniel P. Berrange <berrange@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
      Message-id: 20180201162625.4276-1-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  5. 16 Jan, 2018 1 commit
  6. 18 Dec, 2017 1 commit
  7. 21 Sep, 2017 1 commit
  8. 01 Aug, 2017 1 commit
  9. 17 Jul, 2017 1 commit
    • trace: [trivial] Statically enable all guest events · 5caa262f
      Lluís Vilanova authored
      The existing optimizations make it feasible to have them available on all
      builds.
      
      Some quick'n'dirty numbers with 400.perlbench (SPECcpu2006) on the train input
      (medium size - suns.pl) and the guest_mem_before event:
      
      * vanilla, statically disabled
      real    0m2,259s
      user    0m2,252s
      sys     0m0,004s
      
      * vanilla, statically enabled (overhead: 2.18x)
      real    0m4,921s
      user    0m4,912s
      sys     0m0,008s
      
      * multi-tb, statically disabled (overhead: 0.99x) [within noise range]
      real    0m2,228s
      user    0m2,216s
      sys     0m0,008s
      
      * multi-tb, statically enabled (overhead: 0.99x) [within noise range]
      real    0m2,229s
      user    0m2,224s
      sys     0m0,004s
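
The overhead factors quoted above are the ratio of each run's "real" time to the vanilla, statically disabled baseline. The Python sketch below reproduces that arithmetic for the perlbench figures; the helper names are hypothetical, and the parser only assumes the `XmY,ZZZs` format seen in the listings (comma as decimal separator).

```python
import re

# Matches time(1)-style output such as "0m2,259s" or "0m2.259s".
_TIME = re.compile(r"(\d+)m(\d+)[.,](\d+)s")


def parse_time(text):
    """Convert a 'XmY,ZZZs' string into seconds as a float."""
    match = _TIME.fullmatch(text.strip())
    if match is None:
        raise ValueError("unrecognized time format: %r" % text)
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(seconds + "." + fraction)


def overhead(measured, baseline):
    """Ratio of a measured run time to the baseline run time."""
    return parse_time(measured) / parse_time(baseline)


if __name__ == "__main__":
    baseline = "0m2,259s"  # vanilla, statically disabled
    runs = [
        ("vanilla, statically enabled", "0m4,921s"),
        ("multi-tb, statically disabled", "0m2,228s"),
        ("multi-tb, statically enabled", "0m2,229s"),
    ]
    for label, real in runs:
        print("%-32s %.2fx" % (label, overhead(real, baseline)))
```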
      
      Now enabling all events when booting an ARM system that immediately shuts down
      (https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg04085.html):
      
      * vanilla, statically disabled
      real	0m32,153s
      user	0m31,276s
      sys	0m0,108s
      
      * vanilla, statically enabled (overhead: 1.35x)
      real	0m43,507s
      user	0m42,680s
      sys	0m0,168s
      
      * multi-tb, statically disabled (overhead: 1.03x)
      real	0m32,993s
      user	0m32,516s
      sys	0m0,104s
      
      * multi-tb, statically enabled (overhead: 1.00x) [within noise range]
      real	0m32,110s
      user	0m31,176s
      sys	0m0,156s
      
      And finally enabling all events using Emilio's dbt-bench
      (where orig == vanilla, new == multi-tb):
      
                                                              NBench score; higher is better
      
        180 +-+--------+----------+----------+---------+----------+----------+----------+----------+----------+---------+----------+--------+-+
            |                                                                                                                                 |
            |                                      *** $$$$%%                                                                    orig         |
        160 +-+....................................*.*.$..$.%............................................................orig-enabled       +-+
            |                                      * * $  $ %                                                                     new         |
        140 +-+....................................*.*.$..$.%............................................................new-disabled.......+-+
            |                                      * * $  $ %                                                                                 |
            |                                      * * $  $ %                                                                                 |
        120 +-+....................................*.*.$..$.%...............................................................................+-+
            |                                      * * $  $ %                                                                                 |
            |                                      * * $  $ %                                                                                 |
        100 +-+....................................*.*.$..$.%.....$$$%%%....................................................................+-+
            |                                      * * $  $ % *** $ $  % *** $$$%%                                                            |
         80 +-+....................................*.*.$..$.%.*.*.$.$..%.*.*.$.$.%..........................................................+-+
            |                                      * * $  $ % * * $ $  % * * $ $ %                                                            |
            |                                      * * $  $ % * * $ $  % * * $ $ %                                                            |
         60 +-+.........................***..$$$%%.*.*##..$.%.*.*.$.$..%.*.*.$.$.%..***.$$$%%...............................................+-+
            |                **** $$$%% * *  $ $ % * * #  $ % * *## $  % * * $ $ %  * * $ $ %                                                 |
            |                *  * $ $ % * *  $ $ % * * #  $ % * * # $  % * *## $ %  * * $ $ %                                                 |
         40 +-+..............*..*.$.$.%.*.*..$.$.%.*.*.#..$.%.*.*.#.$..%.*.*.#.$.%..*.*.$.$.%...............................................+-+
            |                *  * $ $ % * *  $ $ % * * #  $ % * * # $  % * * # $ %  * *## $ %                                  *** $$$%%%     |
         20 +-+....***.$$$%%.*..*##.$.%.*.*###.$.%.*.*.#..$.%.*.*.#.$..%.*.*.#.$.%..*.*.#.$.%..................................*.*.$.$..%...+-+
            |      * *## $ % *  * # $ % * *  # $ % * * #  $ % * * # $  % * * # $ %  * * # $ %                                  * *## $  %     |
            |      * * # $ % *  * # $ % * *  # $ % * * #  $ % * * # $  % * * # $ %  * * # $ %            ***###$$%% ***##$$$%% * * # $  %     |
          0 +-+----***##$$%%-****##$$%%-***###$$%%-***##$$$%%-***##$$%%%-***##$$%%--***##$$%%-****##$$%%-***###$$%%-***##$$$%%-***##$$%%%---+-+
           NUMERIC SORT  STRING SORT  BITFIELD  FP EMULATION  ASSIGNMENT  IDEA  HUFFMAN  FOURIER  NEURAL NET  LU DECOMPOSITION  gmean
      png: http://imgur.com/a/8XG5S

      Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-id: 149915849243.6295.4484103824675839071.stgit@frigg.lan
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  10. 15 Jun, 2017 3 commits
  11. 13 Jun, 2017 2 commits
  12. 06 Jun, 2017 1 commit
  13. 02 Jun, 2017 1 commit
  14. 23 May, 2017 1 commit
    • shutdown: Add source information to SHUTDOWN and RESET · cf83f140
      Eric Blake authored
      Time to wire up all the call sites that request a shutdown or
      reset to use the enum added in the previous patch.
      
      It would have been less churn to keep the common case with no
      arguments as meaning guest-triggered, and only modify the
      host-triggered code paths, via a wrapper function, but then we'd
      still have to audit that I didn't miss any host-triggered spots;
      changing the signature forces us to double-check that I correctly
      categorized all callers.
      
      Since command line options can change whether a guest reset request
      causes an actual reset vs. a shutdown, it's easy to also add the
      information to reset requests.

      Signed-off-by: Eric Blake <eblake@redhat.com>
      Acked-by: David Gibson <david@gibson.dropbear.id.au> [ppc parts]
      Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [SPARC part]
      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x parts]
      Message-Id: <20170515214114.15442-5-eblake@redhat.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
  15. 25 Apr, 2017 2 commits
  16. 05 Mar, 2017 1 commit
    • qmp: Drop duplicated QMP command object checks · 104fc302
      Markus Armbruster authored
      qmp_check_input_obj() duplicates qmp_dispatch_check_obj(), except the
      latter screws up an error message.  handle_qmp_command() runs first
      the former, then the latter via qmp_dispatch(), masking the screwup.
      
      qemu-ga also masks the screwup, because it also duplicates checks,
      just differently.
      
      qmp_check_input_obj() exists because handle_qmp_command() needs to
      examine the command before dispatching it.  The previous commit got
      rid of this need, except for a tracepoint, and a bit of "id" code that
      relies on qdict not being null.
      
      Fix up the error message in qmp_dispatch_check_obj(), drop
      qmp_check_input_obj() and the tracepoint.  Protect the "id" code with
      a conditional.

      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <1488544368-30622-9-git-send-email-armbru@redhat.com>
  17. 21 Feb, 2017 1 commit
  18. 31 Jan, 2017 2 commits
  19. 16 Jan, 2017 2 commits
  20. 03 Jan, 2017 2 commits
    • aio: self-tune polling time · 82a41186
      Stefan Hajnoczi authored
      This patch is based on the algorithm for the kvm.ko halt_poll_ns
      parameter in Linux.  The initial polling time is zero.
      
      If the event loop is woken up within the maximum polling time it means
      polling could be effective, so grow polling time.
      
      If the event loop is woken up beyond the maximum polling time it means
      polling is not effective, so shrink polling time.
      
      If the event loop makes progress within the current polling time then
      the sweet spot has been reached.
      
      This algorithm adjusts the polling time so it can adapt to variations in
      workloads.  The goal is to reach the sweet spot while also recognizing
      when polling would hurt more than help.
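
The grow/shrink rules above can be sketched as a pure function. This is a Python illustration, not the C code from the patch: the function name, the doubling/halving factor, and the 4000 ns starting step are assumptions chosen for the example; the commit message only specifies when the polling time grows, shrinks, or stays put.

```python
def adjust_poll_ns(poll_ns, block_ns, poll_max_ns, factor=2, first_step_ns=4000):
    """Return the next polling time given how long the loop waited.

    poll_ns       current polling time (starts at zero)
    block_ns      how long the event loop waited before being woken up
    poll_max_ns   maximum polling time
    factor, first_step_ns  hypothetical tuning values for this sketch
    """
    if block_ns <= poll_ns:
        # Progress was made within the current polling time: sweet spot.
        return poll_ns
    if block_ns > poll_max_ns:
        # Woken up beyond the maximum: polling is not effective, shrink.
        return poll_ns // factor
    # Woken up within the maximum: polling could be effective, grow,
    # but never beyond the maximum.
    grown = poll_ns * factor if poll_ns else first_step_ns
    return min(grown, poll_max_ns)
```

Starting from zero, a wakeup after 1000 ns with a 32000 ns maximum moves the polling time to the first step; a later wakeup after 50000 ns halves it again.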
      
      Two new trace events, poll_grow and poll_shrink, are added for observing
      polling time adjustment.

      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 20161201192652.9509-13-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • aio: add polling mode to AioContext · 4a1cba38
      Stefan Hajnoczi authored
      The AioContext event loop uses ppoll(2) or epoll_wait(2) to block until
      file descriptors become ready or a timer expires.  In cases like virtqueues, Linux
      AIO, and ThreadPool it is technically possible to wait for events via
      polling (i.e. continuously checking for events without blocking).
      
      Polling can be faster than blocking syscalls because file descriptors,
      the process scheduler, and system calls are bypassed.
      
      The main disadvantage to polling is that it increases CPU utilization.
      In a classic polling configuration a full host CPU thread might run at
      100% to respond to events as quickly as possible.  This patch implements
      a timeout so we fall back to blocking syscalls if polling detects no
      activity.  After the timeout no CPU cycles are wasted on polling until
      the next event loop iteration.
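
The poll-then-block behaviour described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the AioContext code: `run_poll_handlers`, `wait_for_events`, the handler callables, and the `blocking_wait` callback are all hypothetical names, and a real event loop would pass a timer-derived timeout to the blocking call.

```python
import time


def run_poll_handlers(handlers, max_ns, clock=time.monotonic_ns):
    """Busy-poll until a handler reports progress or max_ns elapses."""
    deadline = clock() + max_ns
    while True:
        progress = False
        for handler in handlers:
            # Run every handler each round; each returns True on progress.
            if handler():
                progress = True
        if progress:
            return True
        if clock() >= deadline:
            return False


def wait_for_events(handlers, blocking_wait, max_ns, clock=time.monotonic_ns):
    """Poll first; fall back to a blocking wait if polling sees nothing."""
    if max_ns > 0 and run_poll_handlers(handlers, max_ns, clock):
        return "polled"
    # Polling is disabled or timed out: stop burning CPU and block.
    blocking_wait()
    return "blocked"
```

The `clock` parameter exists only so the timeout path can be exercised with a fake clock; with the default, `time.monotonic_ns()` is used.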
      
      The run_poll_handlers_begin() and run_poll_handlers_end() trace events
      are added to aid performance analysis and troubleshooting.  If you need
      to know whether polling mode is being used, trace these events to find
      out.
      
      Note that the AioContext is now re-acquired before disabling notify_me
      in the non-polling case.  This makes the code cleaner since notify_me
      was enabled outside the non-polling AioContext release region.  This
      change is correct since it's safe to keep notify_me enabled longer
      (disabling is an optimization) but potentially causes unnecessary
      event_notifier_set() calls.  I think the chance of performance regression
      is small here.

      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 20161201192652.9509-4-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  21. 31 Oct, 2016 1 commit
    • memory: Don't use memcpy for ram_device regions · 4a2e242b
      Alex Williamson authored
      With a vfio assigned device we lay down a base MemoryRegion registered
      as an IO region, giving us read & write accessors.  If the region
      supports mmap, we lay down a higher priority sub-region MemoryRegion
      on top of the base layer initialized as a RAM device pointer to the
      mmap.  Finally, if we have any quirks for the device (i.e. address
      ranges that need additional virtualization support), we put another IO
      sub-region on top of the mmap MemoryRegion.  When this is flattened,
      we now potentially have sub-page mmap MemoryRegions exposed which
      cannot be directly mapped through KVM.
      
      This is as expected, but a subtle detail of this is that we end up
      with two different access mechanisms through QEMU.  If we disable the
      mmap MemoryRegion, we make use of the IO MemoryRegion and service
      accesses using pread and pwrite to the vfio device file descriptor.
      If the mmap MemoryRegion is enabled and results in one of these
      sub-page gaps, QEMU handles the access as RAM, using memcpy to the
      mmap.  Using either pread/pwrite or the mmap directly should be
      correct, but using memcpy causes us problems.  I expect that not only
      does memcpy not necessarily honor the original width and alignment in
      performing a copy, but it potentially also uses processor instructions
      not intended for MMIO spaces.  It turns out that this has been a
      problem for Realtek NIC assignment, which has such a quirk that
      creates a sub-page mmap MemoryRegion access.
      
      To resolve this, we disable memory_access_is_direct() for ram_device
      regions since QEMU assumes that it can use memcpy for those regions.
      Instead we access through MemoryRegionOps, which replaces the memcpy
      with simple de-references of standard sizes to the host memory.
      
      With this patch we attempt to provide unrestricted access to the RAM
      device, allowing byte through qword access as well as unaligned
      access.  The assumption here is that accesses initiated by the VM are
      driven by a device specific driver, which knows the device
      capabilities.  If unaligned accesses are not supported by the device,
      we don't want them to work in a VM by performing multiple aligned
      accesses to compose the unaligned access.  A down-side of this
      philosophy is that the xp command from the monitor attempts to use
      the largest available access width, unaware of the underlying
      device.  Using memcpy had this same restriction, but at least now an
      operator can dump individual registers, even if blocks of device
      memory may result in access widths beyond the capabilities of a
      given device (RTL NICs only support up to dword).

      Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
  22. 12 Oct, 2016 2 commits
  23. 28 Sep, 2016 5 commits
  24. 27 Sep, 2016 4 commits