@WojciechMula
Created June 14, 2019 09:20
Demonstration of regex and string_view problems
/*
Demonstration of regex and string_view problems
$ g++ -std=c++17 -Wall -Wextra regex.cpp -o test
$ ./test
wrong()
- fixed
- fi
- fixed
correct()
- kitten
- 42
- fixed
*/
#include <regex>
#include <string_view>
#include <iostream>
#include <vector>

std::regex re{"name=([a-z]+), number=([0-9]+), fixed=(fixed)"};
std::string_view input{"name=kitten, number=42, fixed=fixed"};

std::vector<std::string_view> wrong()
{
    std::vector<std::string_view> result;
    std::match_results<std::string_view::const_iterator> match;
    if (std::regex_match(input.cbegin(), input.cend(), match, re)) {
        for (size_t i=1; i < match.size(); i++)
            // BUG (intentional, this is what the gist demonstrates):
            // str() returns a temporary std::string, so the string_view
            // stored here dangles as soon as the statement ends.
            result.push_back(match[i].str());
    }
    return result;
}

std::vector<std::string_view> correct()
{
    std::vector<std::string_view> result;
    std::match_results<std::string_view::const_iterator> match;
    if (std::regex_match(input.cbegin(), input.cend(), match, re)) {
        for (size_t i=1; i < match.size(); i++) {
            // Build the view from the sub-match boundaries, which point
            // into `input` itself, so nothing is copied and nothing dangles.
            const char* first = match[i].first;
            const char* last = match[i].second;
            result.push_back({first, static_cast<std::size_t>(last - first)});
        }
    }
    return result;
}

void dump(const std::vector<std::string_view>& v) {
    for (const auto& sv: v)
        std::cout << "- " << sv << '\n';
}

int main() {
    auto r1 = wrong();
    std::cout << "wrong()" << '\n';
    dump(r1);

    auto r2 = correct();
    std::cout << "correct()" << '\n';
    dump(r2);
}
mmatrosov commented Jan 18, 2021

Thanks for this useful example! Note that in C++20 you can also use this simpler correct form:

      result.push_back({first, last});
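
For context, here is a minimal sketch of how the C++20 form above might look inside a complete function. The name `capture_views` and its parameters are hypothetical and do not appear in the gist. The `#if` guard is only there so the sketch still compiles under C++17, where it falls back to a pointer-plus-length construction like the one in correct().

```cpp
#include <cstddef>
#include <regex>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of the gist): returns the capture groups
// of a full match as views into `input`. No characters are copied.
std::vector<std::string_view> capture_views(std::string_view input,
                                            const std::regex& re)
{
    std::vector<std::string_view> result;
    std::match_results<std::string_view::const_iterator> match;
    if (std::regex_match(input.cbegin(), input.cend(), match, re)) {
        for (std::size_t i = 1; i < match.size(); i++) {
#if __cplusplus >= 202002L
            // C++20: string_view can be constructed from an iterator pair.
            result.emplace_back(match[i].first, match[i].second);
#else
            // C++17 fallback: pointer + length, computed from offsets so it
            // does not assume that const_iterator is a raw pointer.
            result.emplace_back(
                input.data() + (match[i].first - input.cbegin()),
                static_cast<std::size_t>(match[i].length()));
#endif
        }
    }
    return result;
}
```

Called as `capture_views(input, re)` with the gist's globals, this should yield the same three views as correct(). The views stay valid only as long as the buffer behind `input` does.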

@WojciechMula (Author)

@mmatrosov thanks for the C++20 pointer, I wasn't aware of it.

Glad you found this snippet helpful, I wasted a lot of time trying to figure out why a simple regex produces crazy results. :)
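
For readers wondering where the "crazy results" come from: `match[i].str()` returns a `std::string` by value, and wrong() stores a `std::string_view` of that temporary, which is destroyed at the end of the statement. When copying the captures is acceptable, one way to sidestep the problem is to keep the strings themselves. The sketch below assumes a hypothetical helper named `capture_strings`, which is not in the gist.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of the gist): returns owning copies of the
// capture groups, so the result does not depend on the lifetime of `input`.
std::vector<std::string> capture_strings(std::string_view input,
                                         const std::regex& re)
{
    std::vector<std::string> result;
    std::match_results<std::string_view::const_iterator> match;
    if (std::regex_match(input.cbegin(), input.cend(), match, re)) {
        for (std::size_t i = 1; i < match.size(); i++)
            // The temporary std::string is moved into the vector, which
            // then owns it; no view of a dead temporary is kept.
            result.push_back(match[i].str());
    }
    return result;
}
```

The trade-off is one allocation per capture (short strings may avoid it thanks to small-string optimization), in exchange for results that can outlive the input buffer.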

wich commented Apr 4, 2021

Thanks for the gist. The definitions of first and last are not necessary; one can also use std::vector::emplace_back (with #include <algorithm> for std::for_each):

std::for_each(match.begin()+1, match.end(), [&result](const auto& m) {result.emplace_back(m.first, m.second);});

or if you prefer the discrete loop:

for (size_t i = 1; i < match.size(); ++i)
    result.emplace_back(match[i].first, match[i].second);

@WojciechMula (Author)

@wich Thank you for the comment. Your version uses the C++20 constructor of string_view, which accepts iterators. The snippet was written in the good old days of C++17 :)
