WebSocket session counter is not exactly same as VU #885

gcg · 2018-12-27T00:20:29Z

Hey there, first I really appricate k6 and thank you for this :)

I am trying to write some tests with websockets but for some reason k6 never opens the connection count that I want with -u.

If I give -u 10000 it usually opens 8k socket connections. If I give the -u 5000, it's somewhere around 4k. I am not sure if I am missing something here.

Here is the code version for the same .js file on the screen shot

import ws from "k6/ws";
import { check, fail } from "k6";
import { Counter } from "k6/metrics";

let sessionCounter = new Counter('session_counter');
let msgCounter = new Counter('msg_counter');
let errCounter = new Counter('err_counter');
let closeCounter = new Counter('close_counter');
let pingCounter = new Counter('ping_counter');

export function setup() {
  let trivia_id = __ENV.trivia_id;

  return {
    trivia_id
  }
}

export default function (data) {
  let url = `ws://gcg.me:3000/trivia-status/${data.trivia_id}`;
  let con = ws.connect(url, null, function( socket ) {
    socket.on('open', function () {
      sessionCounter.add(1);
      socket.setInterval(function timeout() {
        socket.ping();
        pingCounter.add(1);
      }, 1000);
    });

    socket.on('close', function () {
      console.log(`Connection closed on ${__VU} on ${__ITER}`);
      closeCounter.add(1);
    });
    socket.on('error', function(e) {
      if (e.error() != "websocket: close sent") {
        console.log('An unexpected error occured: ', e.error());
      }
      errCounter.add(1);
    });

    socket.on('message', function () {
      msgCounter.add(1);
    });
  });

  check(con, {
    "status is 101": (r) => r && r.status === 101
  });
}

As you can see I am also logging error and socket close events and since they are not showing up on the results I am assuming there was no error and no server side close issues. I also add active socket connection count to my backend and tracked that number real time and it's the same with the sessionCounter. So how come I can't open all the connections I want?

And from the documantation I was hoping to see default ws metrics to be on the result screen but for some reason I am not seeing, that is the reason I am counting every event manually.

And as an extra question, I also want to handle the socket connection errors to re-open that connection after an error to keep the session going. But I didn't see on the documantation page. I want to make my tests as real as possible and if a client loses its socket connection, it usually re-opens the connection.

Thank you.

na-- · 2018-12-31T12:48:24Z

Thanks for creating this issue and the great description!

I don't know what the reason for the low session_counter value is 😕 It might be a bug in k6, but it might also be because of a congested network or an overloaded machine. And it can be either the source or the destination machine. After all, 5000/10000 VUs (i.e. JavaScript runtimes) and concurrent network connections are quite a lot. Something that is indicative of this is that with 5000 VUs, k6 managed to open 4907 sessions (98.1%), while with 10000 VUs it opened only 8676 (86.7%).

Some questions to help debug this and suggestions what to try:

What are the CPU/RAM/networks specs of the machine you ran those tests on?
When you launched k6 with both tests, what percentage of those resources was "eaten" by it?
What happens when you increase the test duration? The current dialing timeout in k6 is 30 seconds, so that, added to any delays and roundtrips in the actual websocket communication, means that those 2 minutes may very well pass before an actual connection is established. That way the VU may be running, but session_counter won't be incremented.
What happens when you try the same test with 3000 VUs?
What happens when instead of starting all of the VUs simultaneously, you ramp them up gradually via the stages option? Something like this:
```
export let options = {
  stages: [
      { duration: "2m", target: 5000 }, // Linearly ramp up from 1 to 5000 VUs over 2 minutes
      { duration: "2m", target: 5000 }, // Stay at 5000 VUs for 2 more minutes
  ]
};
```
Or maybe add a random sleep between 1 and 10 seconds before the ws.connect() call, so each VU doesn't simultaneously try to talk to the remote server. After all, 10000 simultaneous connection attempts is bound to get messy.
When you say "active socket connection count to my backend and tracked that number real time", do you mean that you tracked the number of actual network connections at the OS level, or the number of established websocket connections at the application level?

And as an extra question, I also want to handle the socket connection errors to re-open that connection after an error to keep the session going. But I didn't see on the documantation page. I want to make my tests as real as possible and if a client loses its socket connection, it usually re-opens the connection.

Your current code should already do that 🙂 k6 works by having each active VU loop through the exported default function over and over again, until the test iterations or duration or stages run out and the test ends. So in your case, if the ws.connect() call ends, the next script iteration will start, making that VU connect to the server again. Although there's currently a bug in the websocket handling code that causes k6 to get stuck in some situations...

And from the documantation I was hoping to see default ws metrics to be on the result screen but for some reason I am not seeing, that is the reason I am counting every event manually.

This is the reason why I tagged the issue as bug/refactoring. It is partially caused by your code clashing with some k6 semantics, but it's also because of an unfinished refactoring on my part. Some background:

As I mentioned above, k6 currently works through looping VUs that iterate over the default function.
All HTTP requests and websocket connections in k6 are measured and different metrics are emitted. Those are then used to calculate the summary results at the end of the test.
k6 intentionally discards any metrics that were generated after the test duration (2 minutes in this case) has passed.
Your script doesn't finish even a single iteration (notice the No data generated, because no script iterations finished warning) because for the whole script duration, each VU is stuck inside of the ws.connect() call, until k6 kills it at the end of the test.
Websocket metrics are unfortunately emitted only after the websocket is closed... In your case, that means that they're either not emitted at all (if socket.done wasn't triggered) or sent after the test duration has passed (and thus discarded).

That last one is definitely a bug, and it's something I overlooked when I was refactoring the k6 metrics to be emitted in real-time in this pull request. All of the websocket metrics except ws_session_duration should actually be emitted as soon as they happen, not when the socket is closed...

gcg · 2018-12-31T13:58:26Z

Wow, thank you @na-- for the explanations, I will go thru them point by point and share with you the results.

closes grafana#885

na-- added bug refactor labels Dec 31, 2018

na-- mentioned this issue Feb 13, 2019

Implement non application error codes fix #895 #907

Merged

mstoykov closed this as completed in 4cacb81 Mar 23, 2020

mstoykov added a commit to mstoykov/k6 that referenced this issue Mar 25, 2020

Push websocket metric samples as they are produced (grafana#1367)

d20648c

closes grafana#885

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WebSocket session counter is not exactly same as VU #885

WebSocket session counter is not exactly same as VU #885

gcg commented Dec 27, 2018

na-- commented Dec 31, 2018

gcg commented Dec 31, 2018

WebSocket session counter is not exactly same as VU #885

WebSocket session counter is not exactly same as VU #885

Comments

gcg commented Dec 27, 2018

na-- commented Dec 31, 2018

gcg commented Dec 31, 2018