@dantewang
Forked from anonymous/makeRandom.java
Created November 6, 2012 09:41

Description copied from http://blog.zhaojie.me/2012/11/how-to-generate-typoglycemia-text.html

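Typoglycemia is a newly coined word for an interesting phenomenon in how people read text: as long as the first and last letters of each word are correct, the letters in between can be completely scrambled and the text can still be understood. Take this passage, for example: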

I cdnuol't blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg: the phaonmneel pweor of the hmuan mnid. Aoccdrnig to a rseearch taem at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Scuh a cdonition is arppoiatrely cllaed Typoglycemia.

Amzanig huh? Yaeh and you awlyas thguoht slpeling was ipmorantt.

We can in fact recover the original text fairly easily:

I couldn't believe that I could actually understand what I was reading: the phenomenal power of the human mind. According to a research team at Cambridge University, it doesn't matter in what order the letters in a word are, the only important thing is that the first and last letter be in the right place. The rest can be a total mess and you can still read it without a problem. This is because the human mind does not read every letter by itself, but the word as a whole. Such a condition is appropriately called Typoglycemia.

Amazing, huh? Yeah and you always thought spelling was important.

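In fact, Chinese has a similar property: a sentence whose characters are out of order, such as 文字序乱是不响影正阅常读的, can still be read normally.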

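So how can we generate a Typoglycemia text (the scrambled passage) from a correctly spelled text (the recovered passage)? This is actually an interview question I set today. Put simply, the task is to implement the following method: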

string MakeTypoglycemia(string text); 

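The rules are simple:

Keep every non-letter character in its original position.
Keep the first and last letter of each word unchanged, and shuffle the characters in between.

For the "shuffle", any array-shuffling algorithm found online will do. It does not have to guarantee that the word actually changes (for example, a word with only two letters between its first and last letter may occasionally come out unchanged, which is not a problem), nor that every character ends up away from its original position. However, assume this code will be called very frequently, so it should be as efficient as possible and use as little memory as possible.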
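The rules leave the choice of shuffle open. One common option is a Fisher-Yates shuffle restricted to a sub-range of a char array: it works in place and allocates nothing per word, which fits the efficiency and memory goals stated above. The following is a minimal sketch, not the gist author's code; the class name `RangeShuffle` and the helper `shuffleRange` are hypothetical names introduced here for illustration.

```java
import java.util.Random;

public class RangeShuffle {
    // Hypothetical helper: shuffles chars[from..to] (both inclusive) in place
    // with the Fisher-Yates algorithm. Elements outside the range are untouched.
    public static void shuffleRange(char[] chars, int from, int to, Random random) {
        for (int i = to; i > from; i--) {
            // Pick a position in [from, i] uniformly and swap it into slot i.
            int j = from + random.nextInt(i - from + 1);
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
    }

    public static void main(String[] args) {
        char[] word = "important".toCharArray();
        // Shuffle only the interior, leaving the first 'i' and last 't' in place.
        shuffleRange(word, 1, word.length - 2, new Random());
        System.out.println(new String(word));
    }
}
```

Because the range is passed as indices, the same character array can be reused for every word of the text, so the only allocation is the single array for the whole input.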

// Sure enough, I spectacularly left out the last word...
package im.dante.typoglycemia;

import java.util.Objects;
import java.util.Random;

/**
 * @author dante wang
 */
public class Typoglycemia {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws Exception {
        Typoglycemia r = new Typoglycemia();
        String content = "I couldn't believe that I could actually understand "
                + "what I was reading: the phenomenal power of the human mind. "
                + "According to a research team at Cambridge University, it "
                + "doesn't matter in what order the letters in a word are, the "
                + "only important thing is that the first and last letter be "
                + "in the right place. The rest can be a total mess and you "
                + "can still read it without a problem. This is because the "
                + "human mind does not read every letter by itself, but the "
                + "word as a whole. Such a condition is appropriately called "
                + "Typoglycemia. Amazing, huh? Yeah and you always thought "
                + "spelling was important.";
        System.out.println(r.produceTypoglycemia(content));
    }
    private String produceTypoglycemia(String content) {
        Objects.requireNonNull(content);
        // A word needs at least four letters to have an interior worth shuffling.
        if (content.length() <= 3) {
            return content;
        }
        // One array with the same length as the original string; every word
        // is shuffled in place inside it.
        char[] contentChars = content.toCharArray();
        Random random = new Random();
        int i = 0;
        while (i < contentChars.length) {
            // Non-letters (spaces, punctuation, apostrophes) keep their position.
            if (!Character.isLetter(contentChars[i])) {
                i++;
                continue;
            }
            // A word is treated as a maximal run of letters: [indexStart, i).
            int indexStart = i;
            while (i < contentChars.length && Character.isLetter(contentChars[i])) {
                i++;
            }
            // A run that ends at the array boundary is handled the same way,
            // so the first and the last word are not skipped.
            randomize(contentChars, indexStart, i - 1, random);
        }
        // Finally one string with the same length as the array.
        return String.valueOf(contentChars);
    }

    /**
     * Shuffles the letters strictly between indexStart and indexEnd, which are
     * the indices of the first and last letter of a word. Both stay in place.
     */
    private void randomize(
            char[] contentChars, int indexStart, int indexEnd, Random random) {
        char temp;
        int first = indexStart + 1; // first interior letter
        int last = indexEnd - 1;    // last interior letter
        int length = last - first + 1;
        if (length == 2) {
            // Shortcut for 2 chars inside the word: swap them, avoid using Random.
            temp = contentChars[first];
            contentChars[first] = contentChars[last];
            contentChars[last] = temp;
        } else if (length > 2) {
            // Fisher-Yates shuffle: position i swaps with a uniformly chosen
            // position in [first, i], so every permutation is equally likely.
            for (int i = last; i > first; i--) {
                int randPos = first + random.nextInt(i - first + 1);
                temp = contentChars[i];
                contentChars[i] = contentChars[randPos];
                contentChars[randPos] = temp;
            }
        }
    }
}
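
Since the output is random, it cannot be compared against a fixed string, but the two rules from the description can be checked directly. The following is a minimal sketch of such a check, not part of the original gist; the class `TypoglycemiaCheck` and the method `isValid` are hypothetical names. It assumes that a word is a maximal run of letters, so an apostrophe splits "couldn't" into "couldn" and "t". This is stricter than the sample passage, where "cdnuol't" moves the "n".

```java
import java.util.Arrays;

public class TypoglycemiaCheck {
    // Hypothetical helper: returns true if 'scrambled' is a valid Typoglycemia
    // rendering of 'original', assuming a word is a maximal run of letters.
    public static boolean isValid(String original, String scrambled) {
        if (original.length() != scrambled.length()) {
            return false;
        }
        int i = 0;
        while (i < original.length()) {
            char c = original.charAt(i);
            if (!Character.isLetter(c)) {
                // Rule 1: non-letters must stay exactly where they were.
                if (scrambled.charAt(i) != c) {
                    return false;
                }
                i++;
                continue;
            }
            int start = i;
            while (i < original.length() && Character.isLetter(original.charAt(i))) {
                i++;
            }
            int end = i - 1; // index of the last letter of this word
            // Rule 2: the first and last letters are unchanged...
            if (scrambled.charAt(start) != original.charAt(start)
                    || scrambled.charAt(end) != original.charAt(end)) {
                return false;
            }
            // ...and the word as a whole is a permutation of the same letters.
            char[] a = original.substring(start, end + 1).toCharArray();
            char[] b = scrambled.substring(start, end + 1).toCharArray();
            Arrays.sort(a);
            Arrays.sort(b);
            if (!Arrays.equals(a, b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("I could believe", "I cluod blveiee")); // true
        System.out.println(isValid("I could believe", "I could ebleive")); // false
    }
}
```

The second call fails because "ebleive" no longer starts with "b", which violates the rule that the first letter stays in place.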