
Conversation

greenEkatherine
Contributor

There are 3980 generated test cases in the SniHelper_TruncatedData_Fails scenario; they exceed the XUnit limits for subresults and therefore are not reported properly.
The same change was made in YARP's tests: dotnet/yarp#1432

@ghost

ghost commented Dec 7, 2021

Tagging subscribers to this area: @dotnet/ncl, @vcsjones
See info in area-owners.md if you want to be subscribed.


// moving inside one test because there are more than 3000 cases and they overflow subresults
foreach ((int id, byte[] clientHello) in InvalidClientHelloDataTruncatedBytes())
{
    InvalidClientHello(clientHello, id, shouldPass: false);
Member


So now if one of the tests fails we won't know which one, and the rest of them won't get executed either.
If we're fine with this, no problem. I'm just making sure all the side-effects are understood and agreed on.

Contributor Author


Thank you for pointing this out. I may add additional info to identify the failed test if it helps; we still have an id.
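
For illustration, a minimal sketch of how that id could stay visible once all cases run inside a single test; the try/catch wrapper here is my assumption, while InvalidClientHello and InvalidClientHelloDataTruncatedBytes are the existing helpers from this test file:

// Sketch only: surface the failing case id in the test output even though
// the cases no longer appear as separate xunit results.
foreach ((int id, byte[] clientHello) in InvalidClientHelloDataTruncatedBytes())
{
    try
    {
        InvalidClientHello(clientHello, id, shouldPass: false);
    }
    catch (Exception ex)
    {
        throw new Xunit.Sdk.XunitException($"Truncated ClientHello case {id} failed: {ex.Message}");
    }
}

As written this still stops at the first failure; collecting the failing ids into a list and asserting it is empty at the end would avoid that, at the cost of slightly noisier code.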

@stephentoub
Member

Is this actually coming from xunit itself? Do we see this locally or only in CI? Seems like it's actually coming from AzDo, misspellings and all. Is it causing problems? I'd be inclined to keep it as is and get the local benefits of better debuggability, unless there's a real issue.

@stephentoub
Member

stephentoub commented Dec 7, 2021

That said, 3980 test cases seems like a lot for this. Are they all valuable? Sometimes theories make it a little too easy to test the same conditions over and over :-)

@greenEkatherine
Contributor Author

Is this actually coming from xunit itself? Do we see this locally or only in CI? Seems like it's actually coming from AzDo, misspellings and all. Is it causing problems? I'd be inclined to keep it as is and get the local benefits of better debuggability, unless there's a real issue.

It is only in CI. The possible problem is that test results are ignored after the first 1000 cases.

@greenEkatherine
Contributor Author

That said, 3980 test cases seems like a lot for this. Are they all valuable? Sometimes theories make it a little too easy to test the same conditions over and over :-)

That was my question to @wfurt 😃 There are no duplicates, so my suggestion was to stop generating data once we reach broken bytes. That should reduce the number of cases.
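
For context, the cases come from truncating a valid ClientHello at successive lengths, so a single message fans out into thousands of theory cases. A rough, illustrative sketch of that shape of generator (the method and parameter names are mine, not the actual test code; requires using System and System.Collections.Generic):

// Illustrative sketch: one theory case per truncation point of a valid ClientHello.
// The pruning suggested above would add a stopping condition to a loop like this one.
public static IEnumerable<object[]> TruncatedVariants(byte[] validClientHello)
{
    for (int length = 0; length < validClientHello.Length; length++)
    {
        byte[] truncated = new byte[length];
        Array.Copy(validClientHello, truncated, length);
        yield return new object[] { length, truncated };
    }
}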

@stephentoub
Member

stephentoub commented Dec 7, 2021

The possible problem is that test results are ignored after the first 1000 cases.

Does it:

  • not run them in the first place
  • ignore only successes
  • ignore failures as well

?

We have many large theories throughout runtime. I'd be surprised and worried if this was actually ignoring failures in CI.

@greenEkatherine
Contributor Author

The possible problem is that test results are ignored after the first 1000 cases.

Does it:

  • not run them in the first place
  • ignore only successes
  • ignore failures as well

?

We have many large theories throughout runtime. I'd be surprised and worried if this was actually ignoring failures in CI.

I double-checked: failures should be counted in the final result; the issue is only in the reporting of the separate cases: https://developercommunity.visualstudio.com/t/VSTest-test-publication-miscounts-test-c/909375#T-ND914320

@stephentoub
Member

Thanks. That thread makes it sound like there isn't actually a problem here, and it's just grouping all cases under the parent?

@greenEkatherine
Contributor Author

Thanks. That thread makes it sound like there isn't actually a problem here, and it's just grouping all cases under the parent?

I think not; it's related to similar issues where not all test results are published. The parent should show if any test fails; without that I cannot find confirmation that CI will not ignore failures if they happen after the limit.

}

- public static IEnumerable<object[]> InvalidClientHelloDataTruncatedBytes()
+ public static IEnumerable<Tuple<int, byte[]>> InvalidClientHelloDataTruncatedBytes()
Member


Nit, as a style thing: we seem (I think?) to most often use the (int, byte[]) syntax (possibly with names) rather than the old Tuple<X, Y> syntax.
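
For illustration, the value-tuple form the nit is pointing at would look roughly like this (a sketch; the element names and the placeholder case are mine, not the PR's):

// Value tuples instead of Tuple<int, byte[]>; named elements keep call sites readable,
// and the foreach deconstruction in the new test body works with either form.
public static IEnumerable<(int id, byte[] clientHello)> InvalidClientHelloDataTruncatedBytes()
{
    yield return (0, new byte[] { 0x16, 0x03, 0x03 });   // placeholder case, not real test data
}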

@greenEkatherine
Contributor Author

I found out that it is a publishing issue only; a failing test should still be indicated correctly in CI. I don't even see the warnings in the runtime pipeline that appeared in the YARP pipeline, because the publishing steps are different. Taking this and the feedback into account, I would rather decline this change.

@stephentoub
Member

Taking this and the feedback into account, I would rather decline this change.

Sounds good. Thanks.

ghost locked as resolved and limited conversation to collaborators Jan 8, 2022
@wfurt
Member

wfurt commented Feb 15, 2022

I'm sorry for commenting on an old and closed issue, but I completely missed it for some reason until @rzikm brought it to my attention.

This was added by dotnet/corefx#28278 a very long time ago, and it feels like a simple form of fuzzing, e.g. feeding the parser variants of invalid input and checking that the parsing still fails. It is unlikely IMHO that we will ever break it, and if we do, it should be easy to figure out the details from a local test run.

There are two reasons why I wanted to fix this (since we already made the same change for YARP):

  • When running tests locally, this kills the scroll buffer, so often it is not possible to see other test failures directly.
  • It creates thousands of extra entries for each build on each platform, which puts an unnecessary burden on CPU and storage. Not huge, but we seem to be in a constant battle for resources.

I'm wondering if we could reconsider this, @stephentoub, and take it as a small improvement.

@stephentoub
Member

I'm wondering if we could reconsider this, @stephentoub, and take it as a small improvement.

If we don't believe the tests are valuable, it's fine to get rid of them or simplify them. My pushback was on removing them or changing the structure of them purely to work around the cited limitation, which didn't seem to be correct.

@wfurt
Member

wfurt commented Feb 15, 2022

I think there is some value in feeding in invalid data, since SslStream operates on untrusted inputs. But I'm not sure what a good simplification would be. I added another invalid-data test in #63184, but it seems difficult to come up with a somewhat more complete set. For that reason I thought moving this chunk into a single test may be useful. We can try to come up with better ways to fuzz or generate invalid or incomplete data.
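
As a hedged example of what other fuzzing-style input generation could look like beyond truncation (purely a sketch, nothing planned here), single-bit mutations of a valid ClientHello are cheap to produce; note that a mutated message is not guaranteed to be invalid, so such cases would need filtering or a different assertion than shouldPass: false:

// Sketch: derive variants by flipping one bit at a pseudo-random position of a valid
// ClientHello; a fixed seed keeps the generated cases reproducible across runs.
// Requires using System and System.Collections.Generic.
public static IEnumerable<byte[]> BitFlippedVariants(byte[] validClientHello, int count, int seed = 42)
{
    var random = new Random(seed);
    for (int i = 0; i < count; i++)
    {
        byte[] mutated = (byte[])validClientHello.Clone();
        int index = random.Next(mutated.Length);
        mutated[index] ^= (byte)(1 << random.Next(8));
        yield return mutated;
    }
}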

karelz added this to the 7.0.0 milestone Apr 8, 2022