@Filnor
Last active March 8, 2018 15:06
Regex to improve Beli's captures
From https://github.com/SOBotics/Belisarius/blob/433477414b13d516ed6de369d1ef1880be404f81/ini/BlackListedAnswerWords.txt converted to txt
'help me':
(?i)help\W?me[^\w]
'post (a) (working) solution', extended to also catch the past tense 'posted':
(?i)(post|posted)\W?(a|)\W?(working|)\W?solution
'solution':
(?i)solution
'have another problem' in multiple forms:
(?i)(have|had|got)\W?(another|other)\W?(new|fresh|)\W?(problem|issue)
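A minimal sketch for trying the four expressions above against single lines in Java. The class name `RegexCheck`, the map keys, and the sample lines are hypothetical and chosen only for illustration; the patterns themselves are copied unchanged from the list.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class RegexCheck {
    // The four expressions from the list above, keyed by the phrase they target.
    // The keys are illustrative labels, not names used by Belisarius.
    static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("help me", Pattern.compile("(?i)help\\W?me[^\\w]"));
        PATTERNS.put("posted solution",
                Pattern.compile("(?i)(post|posted)\\W?(a|)\\W?(working|)\\W?solution"));
        PATTERNS.put("solution", Pattern.compile("(?i)solution"));
        PATTERNS.put("another problem",
                Pattern.compile("(?i)(have|had|got)\\W?(another|other)\\W?(new|fresh|)\\W?(problem|issue)"));
    }

    // Returns true if the named expression matches anywhere in the line.
    static boolean hits(String name, String line) {
        return PATTERNS.get(name).matcher(line).find();
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from real posts.
        String[] samples = {
            "I posted a working solution here",
            "I posted the solution",
            "I have other issue",
            "help method"
        };
        for (String sample : samples) {
            for (String name : PATTERNS.keySet()) {
                System.out.println(name + " | " + sample + " | " + hits(name, sample));
            }
        }
    }
}
```

Note that 'I posted the solution' is only caught by the plain 'solution' expression, because the 'posted solution' expression allows just 'a' and 'working' between the verb and the noun.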
============
TESTS
============
RegEx: (?i)help\W?me[^\w]
Test Lines:
pls help me - match
please help me fix this - match
help method - no match
assume help was not - no match
Help me, I'm stuck - match
PlEaSE HeLP mE - match
just a common text that's containing help or me - no match
waffle - no match
Test lines after replacing matches with a * (using the regex replace function of Notepad++):
pls *
please *fix this
help method
assume help was not
* I'm stuck
PlEaSE *
just a common text that's containing help or me
waffle
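The replacement above can be reproduced in Java with `Matcher.replaceAll`, as in the sketch below. One detail worth noting: `[^\w]` consumes one character after 'me', so a 'help me' that ends the input with no trailing character (not even a line break) is not matched. The `LOOKAHEAD` variant using `(?!\w)` is an assumption added here for comparison, not part of the original list, and the class name `ReplaceDemo` is hypothetical.

```java
import java.util.regex.Pattern;

public class ReplaceDemo {
    // Original expression from the test section.
    static final Pattern ORIGINAL = Pattern.compile("(?i)help\\W?me[^\\w]");
    // Hypothetical variant: the negative lookahead only checks that no word
    // character follows, without consuming anything.
    static final Pattern LOOKAHEAD = Pattern.compile("(?i)help\\W?me(?!\\w)");

    // Replaces every match of the pattern with a single '*'.
    static String star(Pattern pattern, String text) {
        return pattern.matcher(text).replaceAll("*");
    }

    public static void main(String[] args) {
        // The last line deliberately has no trailing line break.
        String text = "please help me fix this\nhelp method\nHelp me, I'm stuck\npls help me";
        System.out.println(star(ORIGINAL, text));
        System.out.println("----");
        System.out.println(star(LOOKAHEAD, text));
    }
}
```

With the original expression the character after 'me' is swallowed by the replacement (hence 'please *fix this'), while the lookahead variant leaves it in place.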
package org.sobotics.belisarius;

import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    public static void main(String[] args) {
        try {
            // Read the test strings from the test file.
            // No toLowerCase() needed: (?i) already makes the regexes case-insensitive.
            String string = readFile("teststrings.txt", StandardCharsets.UTF_8);

            // Add one matcher per regex to the matcher list
            List<Matcher> matcherList = new ArrayList<>();
            matcherList.add(Pattern.compile("(?i)(have|had|got)\\W?(?:an)?other\\W?(new|fresh|)\\W?(problem|issue)").matcher(string));
            matcherList.add(Pattern.compile("(?i)help\\W?me\\W").matcher(string));

            // Find and print every match of each matcher
            for (int i = 0; i < matcherList.size(); i++) {
                System.out.println("\nMatcher " + (i + 1) + "\n");
                Matcher matcher = matcherList.get(i);
                while (matcher.find()) {
                    System.out.println(matcher.group());
                }
            }
        } catch (IOException e) {
            // println(e.getStackTrace()) would only print the array reference
            e.printStackTrace();
        }
    }

    // Reads the whole file into a String using the given encoding
    static String readFile(String path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return new String(encoded, encoding);
    }
}
but now have another problem
but had another problem
got another issue
have another new problem
I have another issue
I have other issue
please help me
help me
help method
assume help was not
HeLP mE
waffle
Filnor commented Mar 8, 2018

I've now created a Repository for this: https://github.com/pbdevch/BeliRegex
