Skip to content

Instantly share code, notes, and snippets.

@cikasfm
Created May 28, 2013 13:08
Show Gist options
  • Save cikasfm/5662633 to your computer and use it in GitHub Desktop.
Save cikasfm/5662633 to your computer and use it in GitHub Desktop.
/**
* The problem:
*
* http://stackoverflow.com/questions/15314616/removing-minimal-letters-from-a-
* string-a-to-remove-all-instances-of-string-b
*
* <p>
* If we have string A of length N and string B of length M, where M < N, can I
* quickly compute the minimum number of letters I have to remove from string A
* so that string B does not occur as a substring in A?<br/>
* <br/>
* If we have tiny string lengths, this problem is pretty easy to brute force:
* you just iterate a bitmask from 0 to 2^N and see if B occurs as a substring
* in this subsequence of A. However, when N can go up to 10,000 and M can go up
* to 1,000, this algorithm obviously falls apart quickly. Is there a faster way
* to do this?<br/>
* <br/>
* Example: A=ababaa B=aba. Answer=1.Removing the second a in A will result in
* abbaa, which does not contain B.<br/>
* <br/>
* Edit: User n.m. posted a great counter example: aabcc and abc. We want to
* remove the single b, because removing any a or c will create another instance
* of the string abc.
* </p>
*
* @author zvilutis
*/
public class RemovingMinimalNumberOfLetters {
/**
* @param args
*/
public static void main( String[] args ) {
String A = "aaaabbbbbccccabcabcabcababaaaabbbabcacbabca";
// to make it more difficult
try {
for ( int i = 0; i < 10; i++ ) {
A += A;
}
}
catch ( OutOfMemoryError e ) {
RuntimeException re = new RuntimeException( "A.length=" + A.length(), e );
re.printStackTrace();
}
long nanoTime = System.nanoTime();
String B = "ab";
String acopy = A.toString();
int lettersRemoved = 0;
for ( int i = 0; i < acopy.length(); i++ ) {
boolean matchFound = true;
for ( int j = 0; j < B.length(); j++ ) {
int index = i + j;
if ( index < acopy.length() && acopy.charAt( index ) != B.charAt( j ) ) {
matchFound = false;
break;
}
}
if ( matchFound ) {
// OK OK this is not optimal :)
acopy = acopy.substring( 0, i ) + ( i + 1 < acopy.length() ? acopy.substring( i + 1 ) : "" );
lettersRemoved++;
if ( i > 1 ) {
// in case of A="aaabbb" and B="ab"
i -= 2;
} else if ( i > 0 ) {
// when we remove a char - we need to decrement it's index
i--;
}
}
}
System.out.println( "time = " + ( System.nanoTime() - nanoTime ) + "ns");
System.out.println( "total = " + A.length() );
System.out.println( "removed = " + lettersRemoved );
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment