Skip to content

Conversation

StephanTLavavej
Copy link
Member

In general, the STL cannot avoid all stack overflows. (We don't know how much stack has been consumed when the user calls us, so thinking any thoughts might be too much.) However, <regex> tries to throw exceptions instead of stack overflowing due to excessive recursion. We've had a long-standing struggle with exactly where to set this threshold during matching; right now our threshold detects most potential overflows before they happen, but not all.

However, we don't do anything during regex construction. It turns out that requesting an excessive number of capture groups can easily trigger a stack overflow. We even had commented-out legacy code, indicating that our predecessors thought about this scenario somewhat. This PR adds a hardcoded limit arbitrarily set at 1000, which is more than generous enough for realistic code, yet comfortably far away from where I begin to observe stack overflows.

Alternatively, we could have more comprehensive logic to track the number of nested function calls during regex construction (which I believe would be entirely ABI-safe), but I'd like to go with this targeted change for now.

@StephanTLavavej StephanTLavavej added the enhancement Something can be improved label Mar 5, 2024
@StephanTLavavej StephanTLavavej requested a review from a team as a code owner March 5, 2024 05:41
@fsb4000
Copy link
Contributor

fsb4000 commented Mar 5, 2024

Will your patch fix the issue: #997 ?

@StephanTLavavej
Copy link
Member Author

No, that occurs during matching.

@StephanTLavavej StephanTLavavej self-assigned this Mar 6, 2024
@StephanTLavavej
Copy link
Member Author

I'm speculatively mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

Copy link
Contributor

@barcharcraz barcharcraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine, although I find checking for stack overflows in this way somewhat dubious.

@StephanTLavavej StephanTLavavej merged commit d6efe94 into microsoft:main Mar 8, 2024
@StephanTLavavej StephanTLavavej deleted the max-capture-groups branch March 8, 2024 05:38
@StephanTLavavej StephanTLavavej added the regex meow is a substring of homeowner label Jun 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Something can be improved regex meow is a substring of homeowner
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants