Closed
Labels
bug (Something isn't working), fixed (Something works now, yay!), regex (meow is a substring of homeowner)
Description
The regular expression [\w\s] fails to match whitespace characters with code points > 255.
Test case
#include <iostream>
#include <regex>

using namespace std;

int main() {
    const wregex re1(LR"([\s])");
    const wregex re2(LR"([\w\s])");

    cout << R"(U+0020 SPACE is matched by "[\s]": )" << regex_match(L" ", re1) << '\n';
    cout << R"(U+0020 SPACE is matched by "[\w\s]": )" << regex_match(L" ", re2) << '\n';
    cout << R"(U+2028 LINE SEPARATOR is matched by "[\s]": )" << regex_match(L"\u2028", re1) << '\n';
    cout << R"(U+2028 LINE SEPARATOR is matched by "[\w\s]": )" << regex_match(L"\u2028", re2) << '\n';
}
https://godbolt.org/z/oEdTs3Th4
This prints:
U+0020 SPACE is matched by "[\s]": 1
U+0020 SPACE is matched by "[\w\s]": 1
U+2028 LINE SEPARATOR is matched by "[\s]": 1
U+2028 LINE SEPARATOR is matched by "[\w\s]": 0
Expected result
This should print:
U+0020 SPACE is matched by "[\s]": 1
U+0020 SPACE is matched by "[\w\s]": 1
U+2028 LINE SEPARATOR is matched by "[\s]": 1
U+2028 LINE SEPARATOR is matched by "[\w\s]": 1
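The expectation above amounts to an invariant: a bracket expression that combines two character classes should match a character exactly when at least one of the classes matches it on its own. The following is a minimal sketch of a check for that invariant. The helper names `matches_one` and `union_is_consistent` are hypothetical and are not part of the original report or of the library. Whether a given non-ASCII character counts as whitespace at all depends on the implementation and locale, so the sketch only compares the combined class against its parts rather than assuming a particular classification.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the single character `ch` is fully
// matched by the wide-character regular expression `pattern`.
bool matches_one(const std::wstring& pattern, wchar_t ch) {
    const std::wregex re(pattern);
    return std::regex_match(std::wstring(1, ch), re);
}

// Hypothetical helper: [\w\s] should match `ch` exactly when
// [\w] or [\s] matches it individually.
bool union_is_consistent(wchar_t ch) {
    const bool either = matches_one(LR"([\w])", ch) || matches_one(LR"([\s])", ch);
    const bool combined = matches_one(LR"([\w\s])", ch);
    return either == combined;
}
```

With the behavior reported here, `union_is_consistent(L'\u2028')` would be expected to return false, since `[\s]` matches U+2028 while `[\w\s]` does not.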
Additional remarks
The underlying cause is #5242. While I consider a fix for #5242 to be ABI-breaking, I think this issue can be fixed without breaking ABI.