Why mice that are too fast can break FPS in games

Reason for writing this article

When developing or porting a game for PC, you have to deal with user input, which usually comes from three kinds of sources: mouse, keyboard, and gamepads.

At first, it may seem that mouse and keyboard are the easy part, but in reality that is not the case, at least on Windows. Many very popular AAA games have shipped with serious input problems on high-end mice, and some popular engines still have this problem.

In this article, we will explore the reasons for this and build a solution that works but is unsatisfying. I suspect there is a whole extra layer of complexity involved in properly supporting accessories like steering wheels, joysticks, and other simulation devices, but I haven't yet worked on a game that requires that kind of input, so we won't cover it here.

While the bulk of this article is about mouse input, we recently discovered something very interesting about XInput performance that we'll share towards the end.

Introduction – Raw Input

There are many ways to receive input from the user in Windows. The most traditional one is to receive Windows messages sent to your application's message queue. This is how we receive keyboard and mouse input in a typical Windows application. However, when it comes to gaming, this method has some disadvantages.

The most important of these shortcomings is that you cannot get accurate, unmodified mouse input through the message queue, which matters a great deal in a game where the mouse controls a 3D camera. Traditional input is designed to drive the cursor, so before it reaches your application the system applies acceleration and other transformations to it. In addition, it does not provide sub-pixel precision.

If your game is controlled only by the cursor, for example, if it is a strategy or point-and-click adventure game, then you can probably ignore this article and the standard Windows messages will be enough.

The solution to this problem is the Raw Input API, which lets you receive input from devices such as mice and keyboards in raw, unmodified form. This is the API that most games use to receive mouse input; the linked article provides a good introduction to its use, which I won't repeat here.

So why the mournful tone of this article? Oh, we've only just begun.

Razer Viper mouse with an 8 kHz polling rate. I'm guessing all those people in the picture are looking at it with bewilderment, because using it in some games drops the frame rate by 100 FPS.

Working with Raw Input

If you're familiar with the Raw Input API, or have just read the linked documentation, you might think I'm going to talk about the importance of using buffered input instead of handling individual events. But that alone wouldn't be so bad, and it wouldn't be worth writing an article about. The real problem is that things are far from that simple: as far as I know, there is no general way to do this.

Let's back up a little. There are two ways to receive raw input from a device:

  1. Standard reads from the device. This is the easiest way: you just receive additional messages of type WM_INPUT, which you can then process.

  2. Buffered reads: you fetch all pending raw input events at once by calling GetRawInputBuffer.

As you might expect, the latter method was intended to perform better, since handling individual event messages through the message queue is not particularly efficient.

Doing this, and doing it correctly, is not as easy as it should be (or maybe I'm missing something). As far as I know, to avoid "losing" messages that arrive at an awkward moment when you process raw input only in batches, you need to do something like this:

processRawInput(); // everything related to `GetRawInputBuffer` happens here

// peek at all messages *except* WM_INPUT
// exception: when the application does not have focus, we peek at all messages so that we wake up at the right moment
MSG msg{};
auto peekNotInput = [&] {
  if(!g_window->hasFocus()) {
    return PeekMessage(&msg, NULL, 0, 0, PM_REMOVE);
  }
  auto ret = PeekMessage(&msg, NULL, 0, WM_INPUT-1, PM_REMOVE);
  if (!ret) {
    ret = PeekMessage(&msg, NULL, WM_INPUT+1, std::numeric_limits<UINT>::max(), PM_REMOVE);
  }
  return ret;
};

while (peekNotInput()) {
  TranslateMessage(&msg);
  DispatchMessage(&msg);
}

runOneFrame(); // the game logic lives here

As the snippet above shows, you need to peek at all messages except WM_INPUT, so that you don't lose any messages that arrive between the batched raw input processing and the handling of "regular" messages. The documentation does not spell this out very clearly, and the API does not make the task particularly easy either, but a couple of extra lines of code solve the problem.

Even so, this wouldn't be much of a problem. It's the normal amount of fuss you'd expect from an operating system that has to maintain several decades of backward compatibility. So let's get to the real problem.

The real problem

Let's say you did all this correctly and you now receive raw, buffered input from the mouse as intended. You might think the problem is solved, but no. We are actually still just getting started.

Frame-time chart comparison between no mouse movement (top) and mouse movement (bottom); everything else is the same.

Above is a comparison of the frame-time graph for the same scene. The only difference is that in the bottom chart the mouse is being shaken violently, and not just any ordinary mouse, but an expensive one with a polling rate of 8 kHz. As you can see, merely moving the mouse wrecks performance, dropping it from the soft frame rate cap (around 360 FPS) to around 133 FPS with very unstable frame times. All of that comes from nothing more than moving the mouse vigorously.

You might be thinking, "Aha, he's showing this example to demonstrate the importance of batch processing!" Alas, no. What is shown above is, unfortunately, the game's performance with batched raw input processing. Let's figure out why this happens and what to do about it.

The Curse of Legacy Input

In short, the problem lies in so-called "legacy input". When you register a device for raw input with RegisterRawInputDevices, you can set the RIDEV_NOLEGACY flag. This flag prevents the system from generating legacy messages such as WM_MOUSEMOVE. And that is our problem: if you do not set this flag, the system generates both raw input messages and legacy messages, and the latter still clog the message queue.

So why am I complaining about this? You can just turn off legacy input, right? That does solve the performance problem, provided you do everything else correctly, as shown above.

You congratulate yourself on a job well done and move on to the next task. A few days later, the build is sent to beta testers and you receive a bug report stating that the game window can no longer be moved. And then you realize that you have disabled the system's ability to move the window, because this is done using legacy input.

Disabling legacy input disables all types of input interactions that are normally handled by the system.

What can we do about it? Here's a short list of everything I've tried or even fully implemented; all this either did not work, or was not possible, or was simply stupid in terms of complexity:

  1. Use a separate message-only window and a thread for input processing. This seemed like a good solution, so I implemented it. Essentially, it means creating a completely separate, invisible window and registering raw input with it. It's a little convoluted, but it seemed to me that it would solve the problem, and solve it "correctly". Alas, the system still kept generating legacy messages for the main window at a high frequency, even though the raw input device was registered with another window.

    Raw input affects the entire process, even though the API receives a window handle.

  2. Disable legacy input only in full-screen modes. That would at least solve the problem for the vast majority of users, but as far as I know it can't be done. It seems you are not allowed to switch between legacy input and raw input once it has been enabled. You might think RIDEV_REMOVE would help, but it completely removes all input generated by the device, both legacy and raw.

    You cannot switch between legacy input and raw input once it is enabled.

  3. Use a separate process to forward raw input. It's a pretty stupid idea, but I think it would actually work. You can create a separate process that receives the raw input and then use some kind of IPC to pass it to the main process. This would be very convoluted and I don't want to maintain something like it, but I'm pretty sure it would work.

  4. Disable legacy input and generate custom legacy input events at a low frequency. Another idea from the "stupid, but should work" category, but there are many legacy messages, and maintaining all this would also be a real nightmare.

  5. Move everything else off the thread that processes the main message queue. I would definitely try this approach if I were starting from scratch, but implementing it would have required huge changes to the existing codebase. And even then, one thread would still spend a lot of time on pointless processing of input messages.

Options 1 and 2 looked realistic enough, but the first did not work, and the second turned out to be impossible. The rest, in my opinion, are too stupid to research for use in a finished game, or are not feasible for porting.

So now you understand why AAA games ship on PC with broken behavior on 8 kHz mice, and why I am a little upset by the situation. So what did we do?

Our solution

Our current solution is very stupid and shouldn't seem to work, or at least should have serious consequences, but so far it seems to be working well and not causing any problems. It's kind of a hack, but it's the best we can come up with so far.

In this solution, legacy input remains enabled, but the actual game input uses batched raw input. The stupid trick is this: we prevent the performance degradation by simply not processing more than N message queue events per frame.

For now we use N=5, but that is a fairly arbitrary choice. When I tried this solution, I had plenty of questions. What if a pile of messages accumulates? What if the window stops responding? I'm not worried about the game input itself, because we get all the buffered raw input events quickly and with very low latency, but a message backlog could make the window unresponsive to interaction.

After a lot of testing with an 8 kHz mouse, none of these problems showed up, even though we tried hard to provoke them.
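The per-frame budget can be sketched in portable C++ as follows. This is a minimal illustration, not our production code: `Message`, `pumpMessagesCapped`, and the `peek` and `dispatch` callbacks are hypothetical names. In a real Win32 build, `peek` would wrap the filtered PeekMessage call from the earlier snippet and `dispatch` would wrap TranslateMessage plus DispatchMessage.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for a Win32 MSG; only here to keep the sketch portable.
struct Message { unsigned id; };

// Dispatches at most maxPerFrame messages and returns how many were handled.
// `peek` removes the next message from the queue and returns false when the
// queue is empty. The budget is checked *before* peeking, so a message is
// never removed from the queue without being dispatched.
inline std::size_t pumpMessagesCapped(
    const std::function<bool(Message&)>& peek,
    const std::function<void(const Message&)>& dispatch,
    std::size_t maxPerFrame)
{
    assert(maxPerFrame > 0 && "a budget of zero would starve the window");
    std::size_t handled = 0;
    Message msg{};
    while (handled < maxPerFrame && peek(msg)) {
        dispatch(msg);
        ++handled;
    }
    return handled; // anything left over stays queued for the next frame
}
```

Calling `pumpMessagesCapped(peek, dispatch, 5)` once per frame, in place of the unbounded `while (peekNotInput())` loop, corresponds to the N=5 setting described above. Leftover messages are not dropped, only deferred.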

This is the situation we find ourselves in: a completely unsatisfying solution that seems to work well and delivers raw 8 kHz input with no performance hit and no impact on legacy window interactions. If you know how to solve the problem properly, then leave a comment on the original post, send me an email, or find me on the street and tell me. At the very least, send a carrier pigeon. I will be very grateful.

A note about XInput

This section is completely unrelated to the rest of the article, but I found it interesting and it may be news to some readers. You might think that when using the XInput API to work with gamepads, it's almost impossible to make a mistake. It is an extremely simple API, and for the most part we just call XInputGetState. However, there is an interesting note in the documentation that is very easy to miss:

For performance reasons, don't call XInputGetState for an 'empty' user slot every frame. We recommend that you space out checks for new controllers every few seconds instead.

This is not just an empty phrase: in extremely CPU-bound cases, we saw a performance drop of 10-15% simply from calling XInputGetState for every controller slot each frame while no controllers were connected!

I have no idea why the API is designed this way, or why it doesn't have some kind of internal event-based tracking that would make calls on disconnected controller slots practically "free", but that's the situation. You will have to implement your own fallback mechanism to avoid this performance hit, because there is no alternative API (at least in pure XInput) that tells you whether a controller is connected.

This is another area where the existing API is quite inconvenient – we usually want to ensure that no Nth frame takes longer than its neighbors, so we need to move it all to another thread. But this is still much easier to deal with than the problem of raw mouse input and high polling rates.
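One possible shape for such a mechanism is sketched below. It is an illustration under stated assumptions rather than our actual implementation: `ThrottledPadPoller`, `PadSlot`, and the `poll` callback are hypothetical names, and `poll(slot)` stands in for a check such as `XInputGetState(slot, &state) == ERROR_SUCCESS`. The sketch only covers the throttling; it runs on whatever thread calls `update` and does not move the probe to another thread.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>

// XInput exposes four user slots (XUSER_MAX_COUNT in the real API).
constexpr std::size_t kSlotCount = 4;
using Clock = std::chrono::steady_clock;

// Per-slot bookkeeping (hypothetical helper type).
struct PadSlot {
    bool connected = false;
    bool probed = false;            // false until the first poll of this slot
    Clock::time_point lastProbe{};
};

class ThrottledPadPoller {
public:
    ThrottledPadPoller(std::function<bool(std::size_t)> poll,
                       Clock::duration retryInterval)
        : poll_(std::move(poll)), retryInterval_(retryInterval) {}

    // Call once per frame. Connected slots are polled every frame; empty
    // slots are only re-probed once retryInterval has elapsed.
    void update(Clock::time_point now) {
        for (std::size_t slot = 0; slot < kSlotCount; ++slot) {
            PadSlot& s = slots_[slot];
            if (!s.connected && s.probed && now - s.lastProbe < retryInterval_) {
                continue; // skip the expensive call on an empty slot
            }
            s.connected = poll_(slot);
            s.lastProbe = now;
            s.probed = true;
        }
    }

    bool isConnected(std::size_t slot) const {
        assert(slot < kSlotCount);
        return slots_[slot].connected;
    }

private:
    std::function<bool(std::size_t)> poll_;
    Clock::duration retryInterval_;
    std::array<PadSlot, kSlotCount> slots_{};
};
```

With a retry interval of a couple of seconds, a newly plugged-in controller is picked up after at most that delay, while empty slots cost one call per interval instead of one call per frame.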

Conclusion

Game input on Windows is far from perfect. I hope this article saves someone some time and, as I said above, I would love to know how to solve this problem the right way.

And we haven't even talked about keyboard layouts yet! If you're a QWERTZ user, you've probably wondered why the default bindings to the Z, X, and C keys are so illogical on your keyboard. But that's a story for another article.
