-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Description
Description of the bug/issue
Missing Test Results on Concurrency Runner
When running tests with the concurrency runner, test results are occasionally lost. This issue appears to be a race condition related to the timing of inter-process communication (IPC) and promise resolution in the main and worker processes. The problem seems to happen about 1 in 10 times.
Detailed Description
In the worker process, the testSuiteFinished function calls this.sendReportToParentWorker(), which uses process.send() to transmit the test report. This is a non-blocking "fire and forget" operation.
Simultaneously, the main process is waiting on a promise (piscina.run().then(...)) that resolves when the worker task completes. The core of the problem is that this promise resolves immediately after the worker task finishes, but before the IPC message with the test report has had a chance to arrive.
This creates a race:
The worker process sends the report via process.send().
The piscina.run() promise in the main process resolves.
The main process proceeds with its logic, assuming all data has been received.
The IPC message with the report arrives too late to be processed by the main process's listening logic, causing the report to be lost.
Logs:
I added some logs to runWorkerTask when the message of testsuite_finished arrives and the promise solves
The problem seems to be, that sometimes, the promise solves before the arrival of the message from ICP (10 ms) and this make the report step have less results than runned tests.
NORMAL BEHAVIOR:
[testsuite_finished - TRIGGER - before] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.086Z
[testsuite_finished - TRIGGER - after] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.089Z
[THEN] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.089Z
┌ ────────────────── ✖ chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task chrome: tests/TEST_MENU/TEST_CASE.ts finished with SUCCESS 2025-09-04T14:25:48.333Z
...report
ISSUE:
[THEN] Task chrome: tests/TEST_MENU/TEST_CASE.ts 2025-09-04T14:25:48.307Z
┌ ────────────────── ✖ chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task chrome: tests/TEST_MENU/TEST_CASE.ts finished with SUCCESS 2025-09-04T14:25:48.333Z
after promise on runMultipleTestWorkers 2025-09-04T14:25:48.333Z
Before reportResults 2025-09-04T14:25:48.336Z
Running reportResults 2025-09-04T14:25:48.336Z
✨ PASSED. 52 total assertions (1m 30s)
[testsuite_finished - TRIGGER - before] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:25:48.343Z
Wrote HTML report file to: /home/will/nightwatch/PROJECT-NAME/reports/nightwatch-html-report/index.html
[testsuite_finished - TRIGGER - after] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.350Z
After reportResults 2025-09-04T14:25:48.457Z
The runWorkerTask set promises and receive the message of testsuite_finished
The green line shows the flow that Node does, the two red ones, shows the race condition, because Node keep running after the ICP message is sent, but it still not arrived

Steps to reproduce
Since we cannot reproduce the issue on demand due to its nature, its a race condition and there is no way to delay the ICP message, we've instrumented the test runner with detailed logs to capture the behavior when the problem occurs. Based on these logs, we've verified the following:
Promise and async Behavior are not the cause: We have fixed all known async and Promise-related issues, including ensuring all await calls are correctly placed. The problem persists, indicating it's not a standard test-execution-flow issue.
The Test Completes Successfully: The logs show that the test case itself passes and all test hooks (afterEach, after) execute and complete successfully. This proves that the test is not hanging or failing before the reporting stage.
I tried force unfinished promises, but even with this, Nightwatch just go to next steps when the promise is solved.
Log Evidence
Below are examples of two scenarios where reports are missing, but the logs show the test has fully completed in the worker process.
Scenario 1: All Logs Return, but Report is Missing
The logs show a full test run, including all commands and hooks, completing successfully. The worker process successfully deletes the browser session and completes its [after] hook. The parent process logs a PASSED result, but the report for that specific test suite is not generated or is missing.
chrome ✔ chrome [Test case Three] Get email (15.167s) │
│ chrome → Running [after]: │
│ chrome → Running command: end () │
│ chrome → Running command: session ('delete', [Function]) │
│ chrome Request DELETE http │
│ chrome Response 200 DELETE http: │
│ chrome { value: null } │
│ chrome → Completed command: end () (312ms) │
│ chrome → Completed command: session ('delete', [Function]) (310ms) │
│ chrome → Completed [after]. │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
✨ PASSED. 51 total assertions (1m 38s)
Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html
Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.json
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.xml
Missing the third test
Scenario 2: Logs are Cut Off, but Container Shows Full Execution
In this scenario, the returned process's log output from worker stops abruptly after the Request DELETE http line, this made me think that even the messages with logs from child worker someties arrive after the promise end. However, we can confirm from the container logs that the worker thread continued to execute and completed the test and teardown.
chrome → Completed command: url
chrome → Running [afterEach]:
chrome → Completed [afterEach].
chrome ✔ chrome [Test case two] Step (7.49s)
chrome → Running [after]:
chrome BEFORE BROWSER END - test_case_two
chrome → Running command: end ()
chrome → Running command: session ('delete', [Function])
chrome Request DELETE http
│└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
✨ PASSED. 52 total assertions (1m 30s)
Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html
Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.xml
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.json
Conclusion from Investigation
The data from the logs strongly suggests the problem is not within the test execution itself but in the communication phase. Specifically, the test worker process successfully completes its work and sends the test report message. However, the parent process does not receive this message in time, leading to the reports been generated without one or more tests. The missing assertions in the final report (should be 83 assertions) further support the hypothesis that some test results never reach the parent process. 🕵️♂️
Running local on my machine, with 3 tests, I achieved to reproduce this running a lot of times.
In the schedulers, with 30+ tests, running all day each hour, 2~3 times a day we see 1 test missing
In the scheduler with 80+ tests, we already saw more then 4 tests missed.
Sample test
Command to run
Verbose Output
NORMAL BEHAVIOR:
[testsuite_finished - TRIGGER - before] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.086Z
[testsuite_finished - TRIGGER - after] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.089Z
[THEN] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.089Z
┌ ────────────────── ✖ chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task chrome: tests/TEST_MENU/TEST_CASE.ts finished with SUCCESS 2025-09-04T14:25:48.333Z
...report
ISSUE:
[THEN] Task chrome: tests/TEST_MENU/TEST_CASE.ts 2025-09-04T14:25:48.307Z
┌ ────────────────── ✖ chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task chrome: tests/TEST_MENU/TEST_CASE.ts finished with SUCCESS 2025-09-04T14:25:48.333Z
after promise on runMultipleTestWorkers 2025-09-04T14:25:48.333Z
Before reportResults 2025-09-04T14:25:48.336Z
Running reportResults 2025-09-04T14:25:48.336Z
✨ PASSED. 52 total assertions (1m 30s)
[testsuite_finished - TRIGGER - before] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:25:48.343Z
Wrote HTML report file to: /home/will/nightwatch/PROJECT-NAME/reports/nightwatch-html-report/index.html
[testsuite_finished - TRIGGER - after] Task chrome: tests/TEST_MENU/TEST-CASE.ts 2025-09-04T14:12:22.350Z
After reportResults 2025-09-04T14:25:48.457Z
Before the logs on the Nightwatch, I reproduce the issue twice:
One:
chrome ✔ chrome [Test case Three] Get email (15.167s) │
│ chrome → Running [after]: │
│ chrome → Running command: end () │
│ chrome → Running command: session ('delete', [Function]) │
│ chrome Request DELETE http │
│ chrome Response 200 DELETE http: │
│ chrome { value: null } │
│ chrome → Completed command: end () (312ms) │
│ chrome → Completed command: session ('delete', [Function]) (310ms) │
│ chrome → Completed [after]. │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
✨ PASSED. 51 total assertions (1m 38s)
Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html
Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.json
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.xml
Two:
chrome → Completed command: url
chrome → Running [afterEach]:
chrome → Completed [afterEach].
chrome ✔ chrome [Test case two] Step (7.49s)
chrome → Running [after]:
chrome BEFORE BROWSER END - test_case_two
chrome → Running command: end ()
chrome → Running command: session ('delete', [Function])
chrome Request DELETE http
│└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
✨ PASSED. 52 total assertions (1m 30s)
Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html
Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.xml
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.json
Nightwatch Configuration
module.exports = {
src_folders: "tests",
globals_path: "globals.js",
selenium: {
version2: true,
host: baseFile.selenoid_host,
port: baseFile.selenoid_port,
},
test_workers: {
enabled: true,
workers: "auto",
},
test_settings: test_settings,
};
var test_settings = {
default: {
globals: variablesFile,
desiredCapabilities: {
"selenoid:options": {
enableVNC: true,
enableVideo: false,
use_ssl: false,
proxy: baseFile.proxy,
},
},
output_folder: "reports/",
persist_globals: true,
screenshots: {
enabled: true,
on_failure: true,
path: "screenshots",
},
skip_testcases_on_fail: true,
silent: true,
},
end_session_on_fail: false,
chrome: chromeBrowser([
"no-sandbox",
"disable-gpu",
"headless",
"window-size=1920,1080",
]),
firefox: firefoxBrowser(),
safari: safariBrowser(),
edge: edgeBrowser(),
iPhone: genericIphoneDevice(),
android: genericAndroidDevice(),
};
module.exports.test_settings = test_settings;
Nightwatch.js Version
nightwatch-3.12.2
Node Version
v22.18.0
Browser
Chrome 139
Operating System
Linux
Additional Information
Proposed Solution: Acknowledgment Handshake
The proposed solution is to implement a two-way acknowledgment handshake between the child worker and the parent process. This ensures that the child worker only proceeds after it has received a confirmation message from the parent, guaranteeing the test report has been received and processed.
Here is how the implementation would work:
The child worker sends the test report with a unique message ID.
The child worker then waits for a confirmation message from the parent with that same ID.
The parent process, upon receiving the report message, immediately sends back an acknowledgment message.
Once the child worker receives the acknowledgment, it can safely finish its process. A timeout is also included to prevent the child from hanging if the parent process fails.
Code Examples
- In the Child Worker Process
This code modifies the child process to send the report and await a confirmation.
sendReportToParentWorker = async () => {
return new Promise((resolve, reject) => {
const messageId = Date.now();
const ackListener = (msg) => {
if (msg.type === 'report_received' && msg.id === messageId) {
process.removeListener('message', ackListener);
resolve(); // Resolve the promise once acknowledged
}
};
process.on('message', ackListener);
process.send({
type: 'test_report',
id: messageId,
payload: this.reporter.testResults.eventDataToEmit
});
setTimeout(() => {
process.removeListener('message', ackListener);
}, 1000);
});
};
The testSuiteFinished method is then updated to await this new function.
testSuiteFinished = async (failures) => {
// ...
if (Concurrency.isWorker()) {
await this.sendReportToParentWorker(); // The child now waits here
}
NightwatchEventHub.emit(TestSuiteHook.finished, this.reporter.testResults.eventDataToEmit);
return failures;
};
By adding the await, the testSuiteFinished function (and the piscina runner) will not complete and resolve its promise until the parent acknowledges receipt of the message, thereby eliminating the race condition.
- In the Parent Process
This code adds a listener to the parent process that handles the incoming report and sends an acknowledgment back.
switch (result.type) {
case 'testsuite_finished':
if (result.id) {
port2.postMessage({
type: 'report_received',
id: result.id
});
}
result.itemKey = this.label;
this.emit('message', result);
break;
case 'stdout':
this.writeToStdOut(result.data);
break;
}
};
This change is minimal on the parent's side, as it only needs to react to the incoming message and send a reply, keeping its existing promise-based logic intact.