Missing test results on concurrency runner

### Description of the bug/issue

# Missing Test Results on Concurrency Runner
### When running tests with the concurrency runner, test results are occasionally lost. This issue appears to be a race condition related to the timing of inter-process communication (IPC) and promise resolution in the main and worker processes. The problem seems to happen about 1 in 10 times.

## Detailed Description
In the worker process, the testSuiteFinished function calls this.sendReportToParentWorker(), which uses process.send() to transmit the test report. This is a non-blocking "fire and forget" operation.

Simultaneously, the main process is waiting on a promise (piscina.run().then(...)) that resolves when the worker task completes. The core of the problem is that this promise resolves immediately after the worker task finishes, but before the IPC message with the test report has had a chance to arrive.

### This creates a race:

The worker process sends the report via process.send().

The piscina.run() promise in the main process resolves.

The main process proceeds with its logic, assuming all data has been received.

The IPC message with the report arrives too late to be processed by the main process's listening logic, causing the report to be lost.


## Logs:
### I added some logs to runWorkerTask when the message of testsuite_finished arrives and the promise solves
The problem seems to be, that sometimes, the promise solves before the arrival of the message from ICP (10 ms) and this make the report step have less results than runned tests.

```bash
NORMAL BEHAVIOR:
[testsuite_finished - TRIGGER - before] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.086Z
[testsuite_finished - TRIGGER - after] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.089Z
[THEN] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.089Z
┌ ────────────────── ✖  chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task  chrome: tests/TEST_MENU/TEST_CASE.ts  finished with SUCCESS 2025-09-04T14:25:48.333Z
...report

ISSUE:
[THEN] Task  chrome: tests/TEST_MENU/TEST_CASE.ts  2025-09-04T14:25:48.307Z
┌ ────────────────── ✖  chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task  chrome: tests/TEST_MENU/TEST_CASE.ts  finished with SUCCESS 2025-09-04T14:25:48.333Z
after promise on runMultipleTestWorkers 2025-09-04T14:25:48.333Z
Before reportResults 2025-09-04T14:25:48.336Z
Running reportResults 2025-09-04T14:25:48.336Z

  ✨ PASSED. 52 total assertions (1m 30s)
[testsuite_finished - TRIGGER - before] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:25:48.343Z
 Wrote HTML report file to: /home/will/nightwatch/PROJECT-NAME/reports/nightwatch-html-report/index.html
[testsuite_finished - TRIGGER - after] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.350Z

After reportResults 2025-09-04T14:25:48.457Z
```
## The runWorkerTask set promises and receive the message of testsuite_finished
## The green line shows the flow that Node does, the two red ones, shows the race condition, because Node keep running after the ICP message is sent, but it still not arrived

<img width="1372" height="1298" alt="Image" src="https://github.com/user-attachments/assets/114cc2e8-2809-4a1a-a5c7-549b245408f4" />

### Steps to reproduce

### Since we cannot reproduce the issue on demand due to its nature, its a race condition and there is no way to delay the ICP message, we've instrumented the test runner with detailed logs to capture the behavior when the problem occurs. Based on these logs, we've verified the following:

Promise and async Behavior are not the cause: We have fixed all known async and Promise-related issues, including ensuring all await calls are correctly placed. The problem persists, indicating it's not a standard test-execution-flow issue.

The Test Completes Successfully: The logs show that the test case itself passes and all test hooks (afterEach, after) execute and complete successfully. This proves that the test is not hanging or failing before the reporting stage.

I tried force unfinished promises, but even with this, Nightwatch just go to next steps when the promise is solved.

## Log Evidence
Below are examples of two scenarios where reports are missing, but the logs show the test has fully completed in the worker process.

## Scenario 1: All Logs Return, but Report is Missing
The logs show a full test run, including all commands and hooks, completing successfully. The worker process successfully deletes the browser session and completes its [after] hook. The parent process logs a PASSED result, but the report for that specific test suite is not generated or is missing.

```bash
chrome ✔ chrome [Test case Three] Get email (15.167s)                                                                                                                                                           │
│   chrome → Running [after]:                                                                                                                                                                                                        │
│   chrome → Running command: end ()                                                                                                                                                                                                                 │
│   chrome → Running command: session ('delete', [Function])                                                                                                                                                                                         │
│   chrome   Request DELETE http                                                                                                             │
│   chrome   Response 200 DELETE http:                                                                                                   │
│   chrome { value: null }                                                                                                                                                                                                                           │
│   chrome → Completed command: end () (312ms)                                                                                                                                                                                                       │
│   chrome → Completed command: session ('delete', [Function]) (310ms)                                                                                                                                                                               │
│   chrome → Completed [after].                                                                                                                                                                                                                      │
│                                                                                                                                                                                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

  ✨ PASSED. 51 total assertions (1m 38s)
 Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html

 Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.json
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.xml
```
### Missing the third test

## Scenario 2: Logs are Cut Off, but Container Shows Full Execution
In this scenario, the returned process's log output from worker stops abruptly after the Request DELETE http line, this made me think that even the messages with logs from child worker someties arrive after the promise end. However, we can confirm from the container logs that the worker thread continued to execute and completed the test and teardown.

```bash
chrome → Completed command: url          
chrome → Running [afterEach]:                                                                                                                                                                                                   
chrome → Completed [afterEach].                                                                                                                                                                                                 
chrome ✔ chrome [Test case two] Step (7.49s)                                                                                                                                             
chrome → Running [after]:                                                                                                                                                                                                       
chrome BEFORE BROWSER END  - test_case_two                                                                                                                                                                                                    
chrome → Running command: end ()                                                                                                                                                                                                
chrome → Running command: session ('delete', [Function])                                                                                                                                                                        
chrome   Request DELETE http                                                                                                
                                                                                                                                                                                                                                │└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

  ✨ PASSED. 52 total assertions (1m 30s)
 Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html

 Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.xml
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.json
```


## Conclusion from Investigation
The data from the logs strongly suggests the problem is not within the test execution itself but in the communication phase. Specifically, the test worker process successfully completes its work and sends the test report message. However, the parent process does not receive this message in time, leading to the reports been generated without one or more tests. The missing assertions in the final report (should be 83 assertions) further support the hypothesis that some test results never reach the parent process. 🕵️‍♂️

Running local on my machine, with 3 tests, I achieved to reproduce this running a lot of times.
In the schedulers, with 30+ tests, running all day each hour, 2~3 times a day we see 1 test missing
In the scheduler with 80+ tests, we already saw more then 4 tests missed.

### Sample test

```node

```

### Command to run

```bash

```

### Verbose Output

```fundamental
NORMAL BEHAVIOR:
[testsuite_finished - TRIGGER - before] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.086Z
[testsuite_finished - TRIGGER - after] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.089Z
[THEN] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.089Z
┌ ────────────────── ✖  chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task  chrome: tests/TEST_MENU/TEST_CASE.ts  finished with SUCCESS 2025-09-04T14:25:48.333Z
...report

ISSUE:
[THEN] Task  chrome: tests/TEST_MENU/TEST_CASE.ts  2025-09-04T14:25:48.307Z
┌ ────────────────── ✖  chrome: tests/TEST_MENU/TEST-CASE.ts
...test logs
[END] Task  chrome: tests/TEST_MENU/TEST_CASE.ts  finished with SUCCESS 2025-09-04T14:25:48.333Z
after promise on runMultipleTestWorkers 2025-09-04T14:25:48.333Z
Before reportResults 2025-09-04T14:25:48.336Z
Running reportResults 2025-09-04T14:25:48.336Z

  ✨ PASSED. 52 total assertions (1m 30s)
[testsuite_finished - TRIGGER - before] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:25:48.343Z
 Wrote HTML report file to: /home/will/nightwatch/PROJECT-NAME/reports/nightwatch-html-report/index.html
[testsuite_finished - TRIGGER - after] Task  chrome: tests/TEST_MENU/TEST-CASE.ts  2025-09-04T14:12:22.350Z

After reportResults 2025-09-04T14:25:48.457Z




Before the logs on the Nightwatch, I reproduce the issue twice:
One:
chrome ✔ chrome [Test case Three] Get email (15.167s)                                                                                                                                                           │
│   chrome → Running [after]:                                                                                                                                                                                                        │
│   chrome → Running command: end ()                                                                                                                                                                                                                 │
│   chrome → Running command: session ('delete', [Function])                                                                                                                                                                                         │
│   chrome   Request DELETE http                                                                                                             │
│   chrome   Response 200 DELETE http:                                                                                                   │
│   chrome { value: null }                                                                                                                                                                                                                           │
│   chrome → Completed command: end () (312ms)                                                                                                                                                                                                       │
│   chrome → Completed command: session ('delete', [Function]) (310ms)                                                                                                                                                                               │
│   chrome → Completed [after].                                                                                                                                                                                                                      │
│                                                                                                                                                                                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

  ✨ PASSED. 51 total assertions (1m 38s)
 Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html

 Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.json
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_two.xml


Two:
chrome → Completed command: url          
chrome → Running [afterEach]:                                                                                                                                                                                                   
chrome → Completed [afterEach].                                                                                                                                                                                                 
chrome ✔ chrome [Test case two] Step (7.49s)                                                                                                                                             
chrome → Running [after]:                                                                                                                                                                                                       
chrome BEFORE BROWSER END  - test_case_two                                                                                                                                                                                                    
chrome → Running command: end ()                                                                                                                                                                                                
chrome → Running command: session ('delete', [Function])                                                                                                                                                                        
chrome   Request DELETE http                                                                                                
                                                                                                                                                                                                                                │└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

  ✨ PASSED. 52 total assertions (1m 30s)
 Wrote HTML report file to: /home/will/nightwatch/e2e/reports/nightwatch-html-report/index.html

 Wrote Rerun Json report file to: /home/will/nightwatch/e2e/reports/minimal_report.json
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.xml
 Wrote XML report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.xml
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_one.json
 Wrote JSON report file to: /home/will/nightwatch/e2e/reports/CHROME_139.0.7258.66__test_case_three.json
```

### Nightwatch Configuration

```node
module.exports = {
  src_folders: "tests",
  globals_path: "globals.js",
  selenium: {
    version2: true,
    host: baseFile.selenoid_host,
    port: baseFile.selenoid_port,
  },
  test_workers: {
    enabled: true,
    workers: "auto",
  },
  test_settings: test_settings,
};

var test_settings = {
  default: {
    globals: variablesFile,
    desiredCapabilities: {
      "selenoid:options": {
        enableVNC: true,
        enableVideo: false,
        use_ssl: false,
        proxy: baseFile.proxy,
      },
    },
    output_folder: "reports/",
    persist_globals: true,
    screenshots: {
      enabled: true,
      on_failure: true,
      path: "screenshots",
    },
    skip_testcases_on_fail: true,
    silent: true,
  },
  end_session_on_fail: false,
  chrome: chromeBrowser([
    "no-sandbox",
    "disable-gpu",
    "headless",
    "window-size=1920,1080",
  ]),
  firefox: firefoxBrowser(),
  safari: safariBrowser(),
  edge: edgeBrowser(),
  iPhone: genericIphoneDevice(),
  android: genericAndroidDevice(),
};

module.exports.test_settings = test_settings;
```

### Nightwatch.js Version

nightwatch-3.12.2

### Node Version

v22.18.0

### Browser

Chrome 139

### Operating System

Linux

### Additional Information

## Proposed Solution: Acknowledgment Handshake
### The proposed solution is to implement a two-way acknowledgment handshake between the child worker and the parent process. This ensures that the child worker only proceeds after it has received a confirmation message from the parent, guaranteeing the test report has been received and processed.

### Here is how the implementation would work:

The child worker sends the test report with a unique message ID.

The child worker then waits for a confirmation message from the parent with that same ID.

The parent process, upon receiving the report message, immediately sends back an acknowledgment message.

Once the child worker receives the acknowledgment, it can safely finish its process. A timeout is also included to prevent the child from hanging if the parent process fails.

## Code Examples
1. In the Child Worker Process
This code modifies the child process to send the report and await a confirmation.

```javascript
sendReportToParentWorker = async () => {
    return new Promise((resolve, reject) => {
        const messageId = Date.now(); 
        
        const ackListener = (msg) => {
            if (msg.type === 'report_received' && msg.id === messageId) {
                process.removeListener('message', ackListener); 
                resolve(); // Resolve the promise once acknowledged
            }
        };

        process.on('message', ackListener);
        
        process.send({
            type: 'test_report',
            id: messageId,
            payload: this.reporter.testResults.eventDataToEmit
        });
        
        setTimeout(() => {
            process.removeListener('message', ackListener);
                 }, 1000); 
    });
};
```


The testSuiteFinished method is then updated to await this new function.

```javascript
testSuiteFinished = async (failures) => {
    // ...
    if (Concurrency.isWorker()) {
        await this.sendReportToParentWorker(); // The child now waits here
    }
    NightwatchEventHub.emit(TestSuiteHook.finished, this.reporter.testResults.eventDataToEmit);
    return failures;
};
```

By adding the await, the testSuiteFinished function (and the piscina runner) will not complete and resolve its promise until the parent acknowledges receipt of the message, thereby eliminating the race condition.

2. In the Parent Process
This code adds a listener to the parent process that handles the incoming report and sends an acknowledgment back.



```javascript

   switch (result.type) {
        case 'testsuite_finished':
            if (result.id) {
                port2.postMessage({
                    type: 'report_received',
                    id: result.id
                });
            }
            result.itemKey = this.label;
            this.emit('message', result);
            break;

        case 'stdout':
            this.writeToStdOut(result.data);
            break;
    }
};
```

This change is minimal on the parent's side, as it only needs to react to the incoming message and send a reply, keeping its existing promise-based logic intact.


Missing test results on concurrency runner #4416

Description

Description of the bug/issue

Missing Test Results on Concurrency Runner

When running tests with the concurrency runner, test results are occasionally lost. This issue appears to be a race condition related to the timing of inter-process communication (IPC) and promise resolution in the main and worker processes. The problem seems to happen about 1 in 10 times.

Detailed Description

This creates a race:

Logs:

I added some logs to runWorkerTask when the message of testsuite_finished arrives and the promise solves

The runWorkerTask set promises and receive the message of testsuite_finished

The green line shows the flow that Node does, the two red ones, shows the race condition, because Node keep running after the ICP message is sent, but it still not arrived

Steps to reproduce

Since we cannot reproduce the issue on demand due to its nature, its a race condition and there is no way to delay the ICP message, we've instrumented the test runner with detailed logs to capture the behavior when the problem occurs. Based on these logs, we've verified the following:

Log Evidence

Scenario 1: All Logs Return, but Report is Missing

Missing the third test

Scenario 2: Logs are Cut Off, but Container Shows Full Execution

Conclusion from Investigation

Sample test

Command to run

Verbose Output

Nightwatch Configuration

Nightwatch.js Version

Node Version

Browser

Operating System

Additional Information

Proposed Solution: Acknowledgment Handshake

The proposed solution is to implement a two-way acknowledgment handshake between the child worker and the parent process. This ensures that the child worker only proceeds after it has received a confirmation message from the parent, guaranteeing the test report has been received and processed.

Here is how the implementation would work:

Code Examples

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions