Configuring your system for cloud gaming with VFIO, Parsec and AMD
This project continues the previous one, where I covered setting up VFIO in general. For many, that will be enough, but my goal is still a powerful gaming configuration that I can connect to from the PC in my living room over 20 meters of twisted pair.
Cloud gaming is becoming more and more popular as a number of companies offer gameplay streaming services to your device. Such configurations usually rely on a powerful server in a data center that runs the game itself and sends a compressed video stream to the user’s device. This technology works surprisingly well, but because of the high cost and slow Internet connections in many regions of the world, this option is not suitable for everyone.
However, if you already have a relatively powerful gaming PC and want to use it to run your existing games over the network, there are many solutions for that.
Here are some options:
- Steam remote play: Works for games launched via Steam.
- Parsec: can stream the entire desktop, including games.
- Moonlight: the same, but only works on hosts with Nvidia GPUs, since the implementation is based on the Nvidia GameStream protocol.
I’ve had good results with Steam remote play in the past, but the limiting factor has been my dependence on Steam. I recently bought a couple of games outside Steam, which you can probably play remotely by running them through Steam, but I don’t feel like messing around with that.
For this particular configuration, I preferred to use Parsec. It is not perfect, but it will do for our purposes. The video card used, namely the Radeon RX570, also introduces its own limitations, since it cannot work with Moonlight.
First, Parsec needs to be installed on the client and host machines. My client is a Lenovo ThinkPad X230 brought back from the dead. It is not fast, but it can still decode H.264 in hardware and draws only 12W at idle, making it an ideal candidate for such testing. I also connected both machines to the local network via Ethernet to rule out speed drops caused by WiFi.
VFIO, games and you
The more serious problem was the performance of games inside the VM. Once the VFIO setup was complete and everything worked, I did not optimize the configuration any further. When it came time to actually use it, I ran into a number of performance issues, in particular freezes and low frame rates. After all, latency requirements are much stricter for gaming than for other workloads.
Thanks to the excellent Arch guide, I found many options that could potentially improve the performance of this virtual machine for gaming.
Before starting, I ran a few benchmarks to get a better understanding of the specific level of performance available. I used GTA V as a baseline for testing each change, as this game is good at identifying performance issues.
What I ended up doing:
- Set the CPU model to
- Used dynamic isolation of the CPU cores pinned to the VM, to prevent these cores from being used by the host OS and other VMs.
- Configured the CPU governor (the CPU frequency scaling policy) to run the isolated cores in performance mode, eliminating problems with low clock speeds and switching between idle and active states:
echo performance | tee /sys/devices/system/cpu/cpu[4-7]/cpufreq/scaling_governor
- Enabled a static HugePages pool to avoid slow memory access inside the VM.
- Disabled SMT in UEFI settings, removing one extra variable.
- Installed a 64GB DDR4-3600 kit to allocate more memory for the VM (16GB), leaving enough for the host and other services.
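The governor and HugePages steps from the list above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names, the optional sysfs-root argument (handy for dry runs against a scratch directory), and the default 2 MiB huge page size are mine, not taken from the original setup.

```shell
#!/bin/sh
# Sketch only: helper names and the optional root argument are hypothetical.

# set_governor GOVERNOR FIRST LAST [SYSFS_CPU_ROOT]
# Writes GOVERNOR into scaling_governor for every CPU from FIRST to LAST.
# Needs root when pointed at the real /sys tree.
set_governor() {
    governor=$1
    cpu=$2
    last=$3
    root=${4:-/sys/devices/system/cpu}
    while [ "$cpu" -le "$last" ]; do
        echo "$governor" > "$root/cpu$cpu/cpufreq/scaling_governor" || return 1
        cpu=$((cpu + 1))
    done
}

# hugepages_needed VM_GIB [PAGE_KIB]
# Prints how many huge pages of PAGE_KIB KiB are needed to back VM_GIB GiB
# of guest RAM. Assumes 2048 KiB (2 MiB) pages unless told otherwise, and
# that the VM size is a whole multiple of the page size.
hugepages_needed() {
    vm_gib=$1
    page_kib=${2:-2048}
    echo $(( vm_gib * 1024 * 1024 / page_kib ))
}
```

For the configuration described here, that would be `set_governor performance 4 7` run as root, and `hugepages_needed 16` to size the pool for the 16GB guest; the resulting number can then be given to `sysctl vm.nr_hugepages=...` or to the `hugepages=` kernel parameter.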
The golden rule when debugging a configuration is to change one parameter at a time and compare the results with the baseline. Not every change I made was positive. For example, when I configured the VM to use 4 cores and 8 threads instead of just 4 cores, the frame rate in GTA V dropped by half. My guess is that the VM treated the SMT threads as real cores, confusing the Windows scheduler.
By experimenting with this setting and applying the techniques above, I was able to get rid of most of the problems and get smoother gameplay. Playing in the virtual machine finally felt like it had 4 cores, 16GB of RAM and an RX570 under the hood.
The result of the CPU performance test in Passmark before improvements
The same Passmark test after revisions
Memory performance test result in Passmark before revisions
The same Passmark test after revisions
At this point, I would consider the current config fantastic for someone looking to play using VFIO. However, since I was using it over the network through Parsec, I soon ran into other problems, this time with the video card.
Graphics card problems
Parsec and other similar solutions encode the image into a video stream using an encoder on the video card itself. In the settings, this is usually called hardware encoding. Before assembling this configuration, I did not know that the encoders of AMD video cards are comparatively slow:
If the host has a card from AMD, then encoding is usually much slower than on Nvidia and even Intel cards. Although, at low resolutions, problems should not be observed. If all guests support the H.265 codec, it will perform better when enabled.
And it is palpable. When I tried to stream games at 1080p, the result was an inconsistent, laggy mess. Imagine trying to play at about 30fps while the frame rate graph resembles a heart rate monitor: that’s how it felt. At 720p, the process was much smoother. The image quality suffers in this case, but the games become playable.
I also decided to try changing the codec from H.264 to H.265. I tested the result in Dirt Rally, because this game has a looping benchmark mode. With H.264, the encoding delay was about 10ms. With H.265, it dropped to 8ms. Not very impressive, but still 20% better. The downside of this trick for my configuration was that the client laptop simply does not support hardware-accelerated H.265 decoding, which Intel integrated GPUs have supported starting with 7th generation processors.
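Before switching the codec, it can help to check whether a client can decode H.265 in hardware at all. The sketch below assumes a Linux client with the `vainfo` tool from libva-utils installed; the article does not say which OS the ThinkPad runs, so treat this as one possible way to check rather than the method used here. The helper names are hypothetical. The second helper simply reproduces the 10ms to 8ms comparison as a percentage.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# supports_hevc_decode
# Reads `vainfo` output on stdin and succeeds if any HEVC (H.265) profile
# is listed with the VLD entrypoint, which is the decode entrypoint in VA-API.
supports_hevc_decode() {
    grep -Eq 'VAProfileHEVC[A-Za-z0-9]*[[:space:]]*:[[:space:]]*VAEntrypointVLD'
}

# percent_faster OLD_MS NEW_MS
# Prints the integer percentage by which NEW_MS is lower than OLD_MS.
percent_faster() {
    echo $(( ($1 - $2) * 100 / $1 ))
}
```

A typical call would be `vainfo 2>/dev/null | supports_hevc_decode && echo "H.265 decode available"`, and `percent_faster 10 8` gives the 20% figure quoted above.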
In addition, the video card also causes problems with Parsec, which periodically drops the connection and displays a host encoder error message. As a rule, this is fixed by rebooting the VM, which gets annoying after just a couple of failures. In other cases, Parsec would simply freeze during gameplay and make the client freeze along with it at 100% CPU load. None of this makes for a smooth experience.
Another issue, seen with MSI Afterburner, was an unstable frame rate in some games, for example GTA V. After all the fixes and improvements, small freezes still appeared, even with vsync turned on. I decided to go through the AMD Radeon settings to see whether any of the driver features were causing a side effect.
I had the Gaming mode selected, because that is what the video card was used for, but in the end I decided to change it to Standard. And then the freezes magically disappeared! I suspect the Radeon Anti-Lag option could be the reason, as it is one of the main settings that turned out to be disabled after switching to Standard.
Is it worth it?
I love technical testing, and going through all these steps to squeeze more and more performance out of the configuration was an interesting experience. However, others may not share my enthusiasm. If you still want to go this way and gain knowledge along the way, you can safely use my project as a guide. It is also worth reading about other people’s experiments and evaluating your own results after each change to see whether it has the desired effect.
Well, for those who just want to play games without bothering too much with getting the expected performance from the machine, I still recommend building a separate gaming PC.
If you are preparing the machine specifically for such a workload, then I would advise you to change the following:
- Replace the APU with a regular CPU such as the Ryzen 9 5950X. Thanks to its physically separate groups of cores, you can assign one group to the game VM and leave the rest to the host. These processors also have a large L3 cache, which comes in handy for latency-sensitive workloads such as games.
- If you are building for streaming, it is better to start with an Nvidia graphics card. Error 43 is no longer an issue, so these GPUs are now quite usable for solutions like Moonlight.
- Use a board with more SATA or M.2 slots. This greatly simplifies forwarding storage devices.
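To decide which cores to give to the game VM on a CPU with separate core groups, one option is to ask the kernel which logical CPUs share an L3 cache. The sketch below reads the standard Linux sysfs cache topology files; the function name and the optional root argument (for testing against a scratch directory) are hypothetical, and the output depends on the actual CPU and on whether SMT is enabled.

```shell
#!/bin/sh
# Sketch only: helper name and optional root argument are hypothetical.

# l3_groups [SYSFS_CPU_ROOT]
# Prints each distinct set of logical CPUs that share a level-3 cache,
# one set per line, in the kernel's own list notation (for example 0-7).
l3_groups() {
    root=${1:-/sys/devices/system/cpu}
    for cache in "$root"/cpu[0-9]*/cache/index*; do
        # Skip entries without a readable level file (or an unmatched glob).
        [ -r "$cache/level" ] || continue
        if [ "$(cat "$cache/level")" = "3" ]; then
            cat "$cache/shared_cpu_list"
        fi
    done | sort -u
}
```

On a part with two core groups, I would expect `l3_groups` to print two lines; pinning the VM to the CPUs from one line keeps the guest within a single L3 domain and leaves the other group to the host.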
My adventure doesn’t end there. I recently decided to look for an Nvidia graphics card to compare with my RX570. In the end, I came across a GTX1060, which comes from the same era and has roughly similar performance characteristics. I think it will be a good project to compare these cards in the Parsec vs Parsec and Parsec vs Moonlight scenarios.
I also hope to move to a more modern client PC that supports H.265 and higher output resolutions. Yes, 4K at 60Hz is too much to ask of a laptop from 2012. Considering that we are talking about ThinkPads, this will happen sometime after 2025.
As for the storage configuration, a friend advised me to set up a NAS virtual machine, share it over Samba, and place the Steam library and other game files on it. In testing, the transfer speed over the virtual LAN reached 2-3Gbps. And given that ZFS 2.0 has a persistent L2ARC cache, I can take advantage of that too (if I need L2ARC at all, since ARC is very efficient). For now, the NAS and gaming virtual machines are simply synchronized through Syncthing, so at least I have a basic backup of all games.