feat(workflow): enforce coverage #1709

ulfgebhardt · 2025-09-26T17:11:17Z

🍰 Pullrequest

enforce coverage

Enforce a certain percentage of coverage. If this requirement is not met the test fails.

You can see it fail here, as the requirement is set to 100%.

You can see it succeed here.

This PR also fixes broken unit tests.

Issues

As the author of phpunit does not recognize the requirement Add minimum coverage option to fail if coverage falls below a threshold sebastianbergmann/phpunit#1877 (comment) additional scripts seem to be needed.
Use phpunit-coverage-check instead

Todo

Unit tests are not enforced to pass before merge

Example:

Discovered ~~flaky~~ broken tests - fixed in 118f8f1

coverage requirement on the right place

more fixes

find clover.xml find clover.xml find clover.xml found clover.xml remove clutter

ulfgebhardt · 2025-09-26T19:40:51Z

@marclaporte Please have a look. I am done here.

As this fixes the unit tests I would like to call priority on this. If you have doubts about the coverage enforcement, I would separate the fix from the feature.

Also I would like to have a talk about enforcing some workflows, so broken tests can't happen anymore and the coverage does not dwindle further.

ulfgebhardt · 2025-09-26T21:42:07Z

tests/phpunit/modules/core/message_list_functions.php

    public function test_message_controls() {
        $mod = new Hm_Output_Test(array('msg_controls_extra' => 'foo', 'foo' => 'bar', 'bar' => 'foo'), array('bar'));
-        $this->assertEquals('<a class="toggle_link" href="#"><i class="bi bi-check-square-fill"></i></a><div class="msg_controls fs-6 d-none gap-1 align-items-center"><div class="dropdown on_mobile"><button type="button" class="btn btn-outline-success btn-sm dropdown-toggle" id="coreMsgControlDropdown" data-bs-toggle="dropdown" aria-haspopup="true" aria-expanded="true">Actions</button><ul class="dropdown-menu" aria-labelledby="coreMsgControlDropdown"><li><a class="dropdown-item msg_read core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="read">Read</a></li><li><a class="dropdown-item msg_unread core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unread">Unread</a></li><li><a class="dropdown-item msg_flag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="flag">Flag</a></li><li><a class="dropdown-item msg_unflag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unflag">Unflag</a></li><li><a class="dropdown-item msg_delete core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="delete">Delete</a></li><li><a class="dropdown-item msg_archive core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="archive">Archive</a></li><li><a class="dropdown-item msg_junk core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="junk">Junk</a></li></ul></div><a class="msg_read core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="read">Read</a><a class="msg_unread core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unread">Unread</a><a class="msg_flag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="flag">Flag</a><a class="msg_unflag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unflag">Unflag</a><a class="msg_delete core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="delete">Delete</a><a class="msg_archive core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="archive">Archive</a><a class="msg_junk core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="junk">Junk</a>foo</div>', message_controls($mod));
+        $this->assertEquals('<a class="toggle_link" href="#"><i class="bi bi-check-square-fill"></i></a><div class="msg_controls fs-6 d-none gap-1 align-items-center"><div class="dropdown on_mobile"><button type="button" class="btn btn-outline-secondary btn-sm dropdown-toggle" id="coreMsgControlDropdown" data-bs-toggle="dropdown" aria-haspopup="true" aria-expanded="true">Actions</button><ul class="dropdown-menu" aria-labelledby="coreMsgControlDropdown"><li><a class="dropdown-item msg_read core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="read">Read</a></li><li><a class="dropdown-item msg_unread core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unread">Unread</a></li><li><a class="dropdown-item msg_flag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="flag">Flag</a></li><li><a class="dropdown-item msg_unflag core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="unflag">Unflag</a></li><li><a class="dropdown-item msg_delete core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="delete">Delete</a></li><li><a class="dropdown-item msg_archive core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="archive">Archive</a></li><li><a class="dropdown-item msg_junk core_msg_control btn btn-sm btn-light text-black-50" href="#" data-action="junk">Junk</a></li></ul></div><a class="msg_read core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="read">Read</a><a class="msg_unread core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unread">Unread</a><a class="msg_flag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="flag">Flag</a><a class="msg_unflag core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="unflag">Unflag</a><a class="msg_delete core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="delete">Delete</a><a class="msg_archive core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="archive">Archive</a><a class="msg_junk core_msg_control btn btn-sm btn-light no_mobile border text-black-50" href="#" data-action="junk">Junk</a>foo</div>', message_controls($mod));


The diff is hard to read here.

The difference in all failing tests is a html class.

btn-outline-success becomes btn-outline-secondary

kroky · 2025-09-29T05:17:27Z

@marclaporte Please have a look. I am done here.

As this fixes the unit tests I would like to call priority on this. If you have doubts about the coverage enforcement, I would separate the fix from the feature.

Also I would like to have a talk about enforcing some workflows, so broken tests can't happen anymore and the coverage does not dwindle further.

Our coverage is not good and enforcing a percent will further deepen the problem with tests - it will require developers to add tests just for the sake of coverage instead of thinking what really should be tested. Imagine adding some simple code that drops the coverage from 70.5% to 69.5%. You are forced to add some tests to cover the 70% threshold but the code you added is simple and doesn't need testing. You are forced to either add tests for other code (which done in the same MR has a number of issues by itself) or add tests to cover your code which is sometimes counter-productive. Having too many tests is also not something desirable - we have more and more code to maintain and having many tests doesn't mean they are quality tests - i.e. they are really catching bugs when they should.

Thus, I am not a fan of coverage reports and am against enforcing any percent here. Instead, we should think about critical or complicated code blocks that should be tested. We can add a requirement for adding proper unit or integration tests for any new feature added or major refactoring. We can and should enforce passing the test suite before merging but this should be done when the tests are stable - recently, we have had too many intermittent selenium test failures, for example. Overall, a common sense should be used when thinking about what items to test and what tests to add - a digit percentage is not suitable enough to make this decision.

ulfgebhardt · 2025-09-29T13:52:08Z

@marclaporte Please have a look. I am done here.
As this fixes the unit tests I would like to call priority on this. If you have doubts about the coverage enforcement, I would separate the fix from the feature.
Also I would like to have a talk about enforcing some workflows, so broken tests can't happen anymore and the coverage does not dwindle further.

Our coverage is not good and enforcing a percent will further deepen the problem with tests - it will require developers to add tests just for the sake of coverage instead of thinking what really should be tested. Imagine adding some simple code that drops the coverage from 70.5% to 69.5%. You are forced to add some tests to cover the 70% threshold but the code you added is simple and doesn't need testing. You are forced to either add tests for other code (which done in the same MR has a number of issues by itself) or add tests to cover your code which is sometimes counter-productive. Having too many tests is also not something desirable - we have more and more code to maintain and having many tests doesn't mean they are quality tests - i.e. they are really catching bugs when they should.

Thus, I am not a fan of coverage reports and am against enforcing any percent here. Instead, we should think about critical or complicated code blocks that should be tested. We can add a requirement for adding proper unit or integration tests for any new feature added or major refactoring. We can and should enforce passing the test suite before merging but this should be done when the tests are stable - recently, we have had too many intermittent selenium test failures, for example. Overall, a common sense should be used when thinking about what items to test and what tests to add - a digit percentage is not suitable enough to make this decision.

I disagree on all points with you here.

Forcing Developers to write tests is good to ensure they also learn and cover this part of software development.
Even a Snapshot render test can proof useful for other developers as they see that something changed where they did not expect it
Unit tests are very cheap (compared to integration & fullstack tests)
The projects I work with all use this mechanic and all have coverage of 90%+ consistently.
I can also observe when implementing this mechanic and starting even with a low coverage like 5-10% it will lead to increased coverage over time
You can always lower the coverage requirement

marclaporte · 2025-09-29T15:40:46Z

Instead, we should think about critical or complicated code blocks that should be tested.

Is there a way to help detect this or it's pretty much based on the feedback of experienced developers?

kroky · 2025-09-29T15:41:41Z

You think that quantitative measure is good for ensuring a quality test suite? Forcing developers write tests will lead to bloatware if developers are not careful enough and if they are - they will already write quality tests - then, why do we force them? Sorry but I don't see the point here.

kroky · 2025-09-29T15:46:10Z

@ulfgebhardt have you actually compared the tests we have against the tons of module code we have - I am not even including the complicated JS we have? I think the actual "coverage" of the code is closer to 30% than 78% at the moment just from common judgement based on the fact that:

I have worked on cypht's codebase in the past several years
I have done pretty major refactorings and have reviewed most of the testing code

This percentage enforcement is very misleading.

kroky · 2025-09-29T15:48:33Z

Instead, we should think about critical or complicated code blocks that should be tested.

Is there a way to help detect this or it's pretty much based on the feedback of experienced developers?

I am not sure there is an automated way to detect this but we can establish strong quidelines and make PR reviewers check them besides the constributors.

ulfgebhardt · 2025-09-29T17:04:32Z

@kroky
I did not compare what is covered and what not - and its not my task to do this. There might be portions of the software which are not covered (e.g. all javascript as this are php unit tests) and not detected as to be covered, but that is something you would notice during development and change rather then me actively checking this with an unknown codebase.

My method to verify it works: Coverall (a tool you already use) reports 80%, this PR shows 78%, which is sufficiently close in my opinion.

This also gives the developer new tools for discovering uncovered code. Here is an example from a different project, but this can also be made available to developers in this project if #1710 is solved.

See overview what files are covered and how much

Detect uncovered lines and branches:

quoting @kroky

I think the actual "coverage" of the code is closer to 30% than 78% at the moment just from common judgement based on the fact that:

I have worked on cypht's codebase in the past several years
I have done pretty major refactorings and have reviewed most of the testing code

I do not doubt your competence regarding cypht. I just want to provide a common solution to a problem the project currently faces and I happen to know that this method has achieved its goal on other repositories after being implemented.

The metric of a "good review" is obviously flawed, as you have broken unit tests in your master and coverage has decreased over time from 95% to now 78% (assuming the number is actually correct).

This approach gives a structural solution to exactly this problem; it is the standard all around github; it is proven to work and makes software better and more reliable.

And while I understand that it is very annoying in the beginning to be forced to write tests, it is how it is done. I started with the same mindset, but that has changed - now I actually like ensuring things work with tests and I often discover flaws and shortcomings or edge cases I have not thought about when implementing the tests.

I propose this, because the software currently requires a lot of manual testing - which I am doing and am reported to be one of the few - discovering a lot of flaws.

This solves this problem so I am no longer needed and software quality increases.

kroky · 2025-09-30T11:07:27Z

@ulfgebhardt thanks for your notes and advises! I appreciate you taking the time to do all this manual testing, reporting bugs and dealing with consequences. I can assure you there are others doing the same. Major part of the current situation with broken tests in master and dealing with a lot of bugs (which, btw, are not caught by the existing unit tests) is the lack of reviews that Cypht used to have for at least a year. I have requested to do a review of all code merged in master for a month or so in order to make it stable again and release stable versions. Things are progressing smoothly but maybe more slowly than we anticipated.

I can give you several PR request examples that contained tests that would have increased the coverage but which tests actually tested almost nothing - they contained copy-pasted chunks of code from the modules and tested code running in the tests themselves rather than the module code. This needed a manual review and advice from a more senior developer, so that tests become useful and actually contribute to the coverage. Additionally, the developers gained experience and learned how to deal with such situations in the future, so this manual review resulted in many benefits and I can't consider it as "flawed". My main point here is that numeric percentage is not the one we need here to deal with the current situation. It might be harmful and misleading if we have to deal with sloppy written tests. That's why I am against forcing it.

However, I would definitely want to:

add a requirement for unit and integration tests to pass before being able to merge code in master
show code coverage report in the CI and warn/advise developer to consider updating their code if the coverage drops. It will at least be a nice reminder to add tests to something you have forgotten to do
establish commit and code quality guidelines that includes do's and don'ts for automated tests

kroky · 2025-10-01T05:59:36Z

An interesting article illustrating my points: https://stackoverflow.blog/2025/09/29/making-your-code-base-better-will-make-your-code-coverage-worse/

ulfgebhardt changed the title ~~feat(workflow): enforce coverage~~ [WIP] feat(workflow): enforce coverage Sep 26, 2025

enforce coverage

2d748b5

coverage requirement on the right place

ulfgebhardt force-pushed the enforce-coverage branch from 98f441e to 2d748b5 Compare September 26, 2025 17:14

ulfgebhardt added 2 commits September 26, 2025 20:05

rregeer/phpunit-coverage-check

c03c575

use xml file to configure phpunit coverage

a2d387e

ulfgebhardt force-pushed the enforce-coverage branch from 696de46 to a2d387e Compare September 26, 2025 18:23

ulfgebhardt self-assigned this Sep 26, 2025

ulfgebhardt requested a review from marclaporte September 26, 2025 18:32

ulfgebhardt added bug something is wrong in Cypht tests test related labels Sep 26, 2025

fix unit tests

118f8f1

more fixes

ulfgebhardt force-pushed the enforce-coverage branch from fbdebbb to 118f8f1 Compare September 26, 2025 18:34

fix coverage enforcment script

43e72d8

find clover.xml find clover.xml find clover.xml found clover.xml remove clutter

ulfgebhardt force-pushed the enforce-coverage branch from 89f91f4 to 43e72d8 Compare September 26, 2025 19:23

ulfgebhardt changed the title ~~[WIP] feat(workflow): enforce coverage~~ feat(workflow): enforce coverage Sep 26, 2025

coverage requirement variable, reduce coverage to 78%

0c3a678

ulfgebhardt force-pushed the enforce-coverage branch from 4eff8da to 0c3a678 Compare September 26, 2025 19:57

ulfgebhardt commented Sep 26, 2025

View reviewed changes

marclaporte requested a review from kroky September 27, 2025 02:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(workflow): enforce coverage #1709

feat(workflow): enforce coverage #1709

Uh oh!

ulfgebhardt commented Sep 26, 2025 •

edited

Loading

Uh oh!

ulfgebhardt commented Sep 26, 2025

Uh oh!

ulfgebhardt Sep 26, 2025

Uh oh!

kroky commented Sep 29, 2025

Uh oh!

ulfgebhardt commented Sep 29, 2025 •

edited

Loading

Uh oh!

marclaporte commented Sep 29, 2025

Uh oh!

kroky commented Sep 29, 2025

Uh oh!

kroky commented Sep 29, 2025 •

edited

Loading

Uh oh!

kroky commented Sep 29, 2025

Uh oh!

ulfgebhardt commented Sep 29, 2025 •

edited

Loading

Uh oh!

kroky commented Sep 30, 2025

Uh oh!

kroky commented Oct 1, 2025

Uh oh!

Uh oh!

feat(workflow): enforce coverage #1709

Are you sure you want to change the base?

feat(workflow): enforce coverage #1709

Uh oh!

Conversation

ulfgebhardt commented Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🍰 Pullrequest

Issues

Todo

Uh oh!

ulfgebhardt commented Sep 26, 2025

Uh oh!

ulfgebhardt Sep 26, 2025

Choose a reason for hiding this comment

Uh oh!

kroky commented Sep 29, 2025

Uh oh!

ulfgebhardt commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

marclaporte commented Sep 29, 2025

Uh oh!

kroky commented Sep 29, 2025

Uh oh!

kroky commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kroky commented Sep 29, 2025

Uh oh!

ulfgebhardt commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kroky commented Sep 30, 2025

Uh oh!

kroky commented Oct 1, 2025

Uh oh!

Uh oh!

ulfgebhardt commented Sep 26, 2025 •

edited

Loading

ulfgebhardt commented Sep 29, 2025 •

edited

Loading

kroky commented Sep 29, 2025 •

edited

Loading

ulfgebhardt commented Sep 29, 2025 •

edited

Loading