Skip to content

Instantly share code, notes, and snippets.

@Davidebyzero Davidebyzero/gist:9090628
Last active Dec 12, 2018

Embed
What would you like to do?
Solutions to teukon's Draft Regex Golf Bonus Levels (SPOILERS! SPOILERS!) and discussion of mathematical regexes
@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

1.0) Subtraction

182 points: ^(x*)(.+)- \2= \1$
180 points: ^(x+)(x+) - \2 = \1$

2.0) Anyway

171 points (robust): ^((^|[aeiouy])([^aeiou]|$))*$
161 points (robust): ^[^aeiou]?([aeiouy][^aeiou])*[aeiouy]?$
176 points (exact): ^[ae]?(.[aeiouy])*[^e]?$
174 points (exact): ^.?([aeiouy][^e])*[aeoy]?$

3.0) Addition

177 points: ^(.*)(.*)(.*) = \1\3\2$ or ^(.*)(.*)(.*) = \2\1\3$
160 points: ^(x*)(x*)( . )(x*)(x*) = \1\3?\2\4\3?\5$

4.0) Coprime

182 points: ^(?!(xx+)\1* \1+$)

5.0) Typist

189 points: ^[^yuh-p]+$

6.1) Dominoes

172 points: ^.?((.).? (\2|(?=.\2)))*\S+$
152 points: ^((.)(.) (?=(.?)(\2|\3)(.?)($| .?(\4\6).?)))*..$

11.1) Dominoes 2

172 points: ^.?((.).? (\2|(?=.\2)))*\S+$

7.0) Modulus

170 points: ^(?=.*% (x+))(?=\1*(x*)).* \2$
169 points: ^(?=.* (x+) = (x*))\1*(?!\1)\2
167 points: ^(?=.* (x+) )\1*(?!\1)(x*) .* \2$
161 points: ^(?=.* (x+) )(?=(\1*))\2(x*) % \1 = \3$
152 points: ^(x+)((?=(\1*))\3(x*) % \1 = \4| % \1(x+) = \1)$

8.0) Euclid

174 points: (?=\((x+)\1*, \1*\)).* \1$
173 points: (?=(\((x+)(\2|\W)+= ))\1\2$
172 points: (?=(\W(x+)\2*, \2+\W+))\1\2$
170 points: \((?=((x+)\2*, \2+\)))\1 = \2$

9.0) Latin squares

159 points (robust): ((.)(?!(.{5})*.{4}\2)(?=\S* \S*\2)| ){15}
153 points (robust): ^(?!.*(\S)(.{5})*.{4}\1)((.)(?=(.* .*\4){3}))+
141 points (robust): ^(?!.*(\S)((.{5})*.{4}|\S*)\1)(.)(.)(.)(.)( |\4|\5|\6|\7)+$
134 points (robust): ^(?!.*(\S)(.{5})*.{4}\1)(.)(.)(.)(.)( ((\3|\4|\5|\6)(?!\w*\9))+)+$
143 points (exact): ^(.)(.)(.)(.)( ((\1|\2|\3|\4)(?!(\w*|(.{5})*.{4})\7))+)+$

10.1) Matryoshka

168 points: ^((\w*)(\w*) (?=\2\w+\3\b))*\w+$

12.0) Xor

162 points: ^(?!.*(.).{12}(0.{10}(?!\1)|1.{10}\1))
160 points: ((.)(?=.{12}(\2.{10}0|(?!\2).{11}1))){8}

13.0) Tic-tac-toe

169 points: (\w)(\1|..(\1|.\1.|..\1..)..)\1

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

Nice work.

I have a 171-point robust solution for Anyway and a 165 point robust solution for Addition. I can post them if you want them but first wanted to let you know that improvements are possible.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

I'd like to try to find those solutions myself. I do have two questions regarding Anyway, though:

  1. Am I right about the conditions for robustness?
  2. If a y is phonologically a consonant, does that disqualify it from fitting into a vowel slot?
  3. Would may, honey, holiday, aye, etc., belong on the left or right? (i.e., if a y is phonologically a vowel, does that disqualify it from fitting into a consonant slot?)

I'm guessing since you put anyway and key on the left side, the answer to #3 would be "on the left", and by extension, the answer to #2 would be the same. (Also the fact that it'd be impossible to solve with a regex otherwise.) But I thought it was an interesting point to bring up.

Edit: I think there may actually be no words that satisfy #2.

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

Yes that was the intent. Do you think the example strings need work?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

I think I've beaten your Addition solution. Pretty sure ^(.*)(.*)(.*) = \1\3\2$ (177) is robust.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

Anyway: ^((^|[aeiouy])([^aeiou]|$))*$ (171)

Cool trick :)

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

Beautiful solution to Addition. Well done!

I had ^(x*)(x*) . (x*)(x*) = \1\3 . \2\4$.

Yes, that's what I have for Anyway. I think it's worth keeping just for the pretty golf.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

Dominos: ^((.)(.) (?=(.?)(\2|\3)(.?)($| .?(\4\6).?)))*..$ (152)

I have a feeling the solution you have in mind is shorter...

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

Modulus: ^(x+)((?=(\1*))\3(x*) % \1 = \4| % \1(x+) = \1)$ (152)

For now, it seems like I have to treat a divisor greater than the dividend as a special case... I wonder if there is a way around that.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

Euclid: \((?=((x+)\2*, \2+\)))\1 = \2$ (170)

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

You've been busy I see.

I have 158 156 for Dominoes. My solution is very similar to yours so you can gain some points with a few tweaks.

I'm sure you can do much better than that with Modulus.

Your Euclid solution is excellent. I completely overlooked using greediness in this way and ended up with only 157 points. Combining the tricks of my solution with yours allows me to score 173 points.

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

It's been a while so I'll state that the intended robust solution to Typist was ^[a-gq-tv-xz]*$ (185 points). These are the letters under the left hand on the qwerty keyboard layout. All the words to be matched are difficult to touch-type on a qwerty keyboard.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

I started out with ^[a-gq-tv-xz]*$ for Typist, and then inverted it, assuming that the domain was alphabetical characters only. I thought the inversion was what you intended.

Actually I started out with ^[qwertasdfgzxcvb]*$, then sorted it, collapsed the contiguous ranges, then inverted it.

Your Euclid solution is excellent. I completely overlooked using greediness in this way and ended up with only 157 points. Combining the tricks of my solution with yours allows me to score 173 points.

Oooh.

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

Doh! Somehow I had it in my mind that inverting made it longer. You're quite right and your solution is better than mine.

You may assume that all strings to be tested in Typist are of the form ^[a-z]+$ so your solution is robust.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 19, 2014

Latin squares: ^(.)(.)(.)(.)( ((\1|\2|\3|\4)(?!(\w*|(.{5})*.{4})\7))+)+$ (143)

@teukon

This comment has been minimized.

Copy link

commented Feb 19, 2014

Nice solution to Latin squares. My solution was 12 characters longer:

^(?!.*((\w)\w*(\2| (?!\w*\2))|\b(\w)(.)(.)(.).* (\4|.\5|..\6|...\7)))

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Thanks!

Modulus improvement: ^(?=.* (x+) )(?=(\1*))\2(x*) % \1 = \3$ (161)

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Much better. There's still plenty of room for improvement though. My current solution scores 169 (and ends with a space character).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Modulus: ^(?=.* (x+) = (x*))\1*(?!\1)\2 (169)

gist.github.com's Markdown can't handle code blocks beginning or ending with spaces. It ate the trailing space. I had to use <code>...</code>

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Well done.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Thanks. You too!

Matryoshka: ^((\w*)(\w*) (?=\2\w+\3\b))*\w+$ (168)

I'm having a lot of trouble getting the Euclid solution to 173 points.
Can't seem to improve on (?=(\W(x+)\2*, \2+\W+))\1\2$ (172)

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Latin squares (robust): ^(?!.*(\S)(.{5})*.{4}\1)(.)(.)(.)(.)( ((\3|\4|\5|\6)(?!\w*\9))+)+$ (134)

9 characters longer. So sad.

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

My 173-point solution is fairly similar to your 172-pointer. Would you like my solution? or a hint?

My Matryoshka solution is practically identical to yours. It's not so difficult to solve, but it was a royal pain to construct.

I noticed that neither of our Latin squares solutions were robust so I tweaked the solution strings. I spent some time looking for a good robust solution (164 points), tweaking to stop cheating as I went.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Still can't get 173 on Euclid, but don't want a hint.
Found a silly alternative 172, though: (?=(\((x+)(\2|\W+)+= ))\1\2$ - would make a good cat toy.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Euclid: (?=(\((x+)(\2|\W)+= ))\1\2$ (173)

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

My Matryoshka solution is practically identical to yours.

Not 100% identical, really? I don't see anything that can differ, except for the last two * and +, and changing some or all of the \w to \S.

It's not so difficult to solve, but it was a royal pain to construct.

The hard part is creating the right column, right? You've got to put in lots of entries that are almost, but not quite, matches.

Anyway, I've really enjoyed your challenges so far. Very nicely done.

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Not 100% identical, really? I don't see anything that can differ, except for the last two * and +, and changing some or all of the \w to \S.

Exactly. You know I like *.

Anyway, I've really enjoyed your challenges so far. Very nicely done.

Thanks. And thank you for helping me debug them.

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Well done on making it to 173 points. Interestingly, my solution was significantly different.

Euclid (173): \((?=((x+)(\2*. ){3}))\1\2$

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

My cat helped me get 141 on Latin squares. ^(?!.*(\S)((.{5})*.{4}|\S*)\1)(.)(.)(.)(.)( |\4|\5|\6|\7)+$

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

My other cat pitched in and we came up with ^(?!.*(\S)(.{5})*.{4}\1)((.)(?=(.* .*\4){3}))+ (153)

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Here's an exact solution for 9.2 that only shows its flaws in 9.0 and 9.1:
^(((\S)(?!(.{5})*.{4}\3)(?=.* .*\3))+ )+\S+$ (156 on 9.2, 136 on 9.0 and 9.1)

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Nice progress, although it does seem that "other cat" is carrying the team.

I really need to be careful with the strings in Latin squares. I'll study your latest solution and see if I can tweak things without breaking other natural flawed solutions.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Why not allow yourself more than 20 test cases per column? I have space on my monitor for at least 53 test cases per column, at the default font size and landscape orientation. Not that I'm saying you should be that lavish (maybe even 25, 30 or 40 would be fine), but even those who don't have enough space on their monitor can scroll down or zoom out if need be.

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

It's not about whether or not you'd have to scroll (I'm already scrolling at 20). I just picked a nice number and decided to stick to it. Even with 50 test cases I'd miss things.

On looking at the problem with your almost-robust solution, I discovered that mine had a similar bug. A fix cost me 5 characters (so 159 points).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Latin squares: ((.)(?!(.{5})*.{4}\2)(?=\S* \S*\2)| ){15} (159)

Note that ((.)(?!(.{5})*.{4}\2)(?=\S* \S*\2)| ){13} and ^((.)(?!(.{5})*.{4}\2)(?=\S* \S*\2)| ){11} also work. Not sure if you want to bother fixing that, because they don't score any better that with {15}. But this makes me wonder if you came up with the same solution.

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

My solution was ((.)(?=\w* \w*\2)(?!.{4}(.{5})*\2) ?){12} which is very similar to yours. I believe this and your {15} solution really are robust.

Thanks for the note, and yes, I'm not so worried about the {11} and {13} exact solutions. There are probably many ways of breaking things like that, e.g. ((.)(?!.{4}\2|.{9}\2)(?=\S* \S*\2)| ){15}.

How are you getting on with the latest problems? I guess Dominoes 2 is a more appropriate difficulty for you (you've just stormed through the easier ones). I'm of half a mind to challenge you with what I believe is roughly a 9-star problem.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 20, 2014

Hope you don't mind if I do these a little out-of-order.

Xor: ((.)(?=.{12}(\2.{10}0|(?!\2).{11}1))){8} (160)
Tic-tac-toe: (\w)(\1|..(\1|.\1.|..\1..)..)\1 (169)

Now for Dominoes 2...

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Not at all.

Your Tic-tac-toe solution is character-for-character identical to mine. I have 162 points for Xor.

I changed the difficulty for Dominoes 2 to 6 stars because my solution is long. I only have 89 points. Hopefully you can do better.

@teukon

This comment has been minimized.

Copy link

commented Feb 20, 2014

Just an update. I'm making slow but steady progress on Dominoes 2: 92 95 97 98 points.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

Did you model Dominoes 2 after a real set of Dominoes rules that is actually played? Whatever they are, they seem to be really convoluted. Really, 64 46 46 66 26 26 26 32 23 23 23 is a match but 56 66 26 26 26 62 22 62 62 62 is not? Whaaaaat.

The closest I've been able to get is ^((.)(.) (?!(?:\3\2|\2\3) (?!\3\2)\2\3 ((?!\2)).\3|(?:\2\3|\3\2) (?!\2\3)\3\2 ((?!\3)).\2)(?=(.?)(\2|\3)(.?)($| .?(\6\8).?)))*..$, and that is merely trying to find what the rules are, not bothering to optimize yet.

Also, perhaps you should add more entries to the right side that are false positives when the regular Dominoes regex is used. Right now, your best Dominoes regex cheatily scores higher (118) on Dominoes 2 than your best robust one (98), because there are only four false positives. (Even my Dominoes regex scores higher, at 112.)

Your Tic-tac-toe solution is character-for-character identical to mine.

This is awesome.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

Ah, I didn't realise that the pattern was not clear.

The idea of the dominoes problem was to determine if a given string of dominoes could be transformed to a valid dominoes position (with one natural simplifying assumption for how doubles are sometimes treated; no spinners) by rotating pieces in place.

For example: 01 21 20 should match because the 21 can be rotated so that each domino touches its neighbours with matching values. 01 12 21 20 should not match because there is no way of rotating the pieces in place to make a valid dominoes position.

I hadn't realised that the Dominoes solution scores better in Dominoes 2 with false positives than my Dominoes 2 solution. This really is ugly. Combining this with how subtle the parity issue is makes it clear that the strings need to be completely redone. I'll focus on more short strings at the beginning, drawing attention to the parity issue, and build very gently in complexity, making sure there are plenty of false positives for natural Dominoes solutions.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

The idea of the dominoes problem was to determine if a given string of dominoes could be transformed to a valid dominoes position (with one natural simplifying assumption for how doubles are treated) by rotating pieces in place.

That... actually only confuses me even more. By "rotating a piece" I assume you mean flipping the first and second digits? That has absolutely no effect on whether a sequence is valid Dominoes or not, whether it's a single piece that's rotated, or all the pieces, or an arbitrary subset of pieces.

For example: 01 21 20 should match because the 21 can be rotated so that each domino touches its neighbours with matching values. 01 12 21 20 should not match because there is no way of rotating the pieces in place to make a valid dominoes position.

What the hell. My Dominoes regex already works this way, and it had to, otherwise it would have failed to be a robust or even an exact solution for Dominoes. If this were the rule for Dominoes 2, then those would be exactly the same rules as Dominoes.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

That... actually only confuses me even more. By "rotating a piece" I assume you mean flipping the first and second digits? That has absolutely no effect on whether a sequence is valid Dominoes or not, whether it's a single piece that's rotated, or all the pieces, or an arbitrary subset of pieces.

Yes, perhaps "flipping" is a better word than rotating. I consider a valid dominoes sequence as being something like: 01 15 56. Each piece is touching its neighbours at only matching values. 01 15 65 is not legal in real dominoes, but it would be legal if the 65 piece was flipped. This was the idea behind the Dominoes puzzle: can the given string be transformed into a legal dominoes position just by flipping some of the pieces.

With Dominoes 2, I introduced the idea that the same domino could appear twice (in the first puzzle, each sequence is just a subset of a standard 28-piece dominoes set; each domino is different).

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

What the hell. My Dominoes regex already works this way, and it had to, otherwise it would have failed to be a robust or even an exact solution for Dominoes. If this were the rule for Dominoes 2, then those would be exactly the same rules as Dominoes.

Your solution to Dominoes matches both 01 12 21 10 and 01 12 21 20, even though the second string cannot be resolved into a valid domino chain. You can test this here

I've hand-crafted a set of strings for Dominoes 2 now to help make things clearer. The first set was computer generated.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

Oh, I understand now. In Dominoes 1, only every window of 3 consecutive pieces must be a valid sequence. I didn't even realize at the time that pieces further away could influence whether a sequence was valid, so my Dominoes solution wasn't actually even robust for what I thought the rules for Dominoes 1 were.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I see, and I'm glad you worked this out.

As the one creating the problems, I have an idea of what I want to do but find it difficult to convey my thoughts with a title and example strings alone.

Do you think Dominoes 2 can be salvaged or should it just be scrapped? Are the new new strings better? Are there example strings you would like to see added to the set? Perhaps Dominoes 1 could be tweaked/scrapped. I could give the puzzles more helpful names or I could include textual hints à la Erling's original set.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

I think I should defer answering that question until I figure out what's involved in solving Dominoes 2.

Right now I can't see how it can be done without faking recursion, which means setting a maximum run length. I got a whopping 6 points! Yay. ^((.)(.) (?=(.?)(\2|\3)(.?)($| (.?)(\4\6)(.?)($| (.?)(\8\10)(.?)($| (.?)(\12\14)(.?)($| (.?)(\16\18)(.?)($| (.?)(\20\22)(.?)($| (.?)(\24\26)(.?)($| (.?)(\28\30)(.?)($| .?(\32\34).?))))))))))*..$

Oh, and for your Dominoes 2 update, I get 27 points. ^((.)(.) (?=(.?)(\2|\3)(.?)($| (.?)(\4\6)(.?)($| (.?)(\8\10)(.?)($| (.?)(\12\14)(.?)($| (.?)(\16\18)(.?)($| (.?)(\20\22)(.?)($| (.?)(\24\26)(.?)($| .?(\28\30).?)))))))))*..$

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

Great. Making progress.

Just to clarify: Recursion is not required. But, like Powers from the original set, it might be difficulty to find a way around it. I gave this problem 6 stars for a reason.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

I think I've just pulverized your solution to a fine mist :) It even beats our previous Dominoes 1 solutions.

Dominoes 2: ^.?((.).? (\2|(?=.\2)))*\S+$ (172)
Dominoes 2: ^.?((.).?($| \2| (?=.\2)))*$ (172 alternative)

I'd be very curious to see what your solution was (including the optimization from 89 -> 92 -> 95 -> 97 -> 98, if you saved a record of that).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

I think solving the 3-piece-window version of Dominoes first actually made it harder to subsequently solve the general case of Dominoes — because it warped my expectations of how the solution was to be done. I'd imagine the same might apply to other people. But I totally had fun with having it happen that way, and it made solving the general form of Dominoes all the more satisfying!

So by all means, keep both versions! It's delightfully evil.

But I would indeed recommend putting in little textual hints. Like maybe, "Gaze through a window" in the first one, and "Gaze into infinity" in the second one. Or something like that.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

Wow, I'll have to look at that. I've just been working on my Dominoes 2 solution and worked it up to 118 points, but it seems I've missed something.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

Ha! how elegant. How it didn't occur to me to do it this way is beyond me. This realisation makes me want to scrap the first problem and downgrade the second one to 5 stars.

It hardly seems worth posting now but here's my old solution.

Dominoes 2 (118): ^((.)(.) (?=((.?)(\2|\3)(\d?))(?=((( (\4|\7\6\5)?){2})*))\8($|\d?\6| .?\5\7)))*..$

While it employs a number of tricks, it's undoubtedly ugly compared to your solution. Well done indeed!

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

As you expressed interest in my progress:

(89) ^((.)(.) (?=((.?)(\2|\3)(.?))( \4| \7\6\5)?(( \4| \7\6\5){2})*($| (?!\4|\7\6\5)((?=\8).?\5\7|(?!\8).?\6))))*..$
(92) ^((.)(.) (?=((.?)(\2|\3)(.?))(?=((( \4| \7\6\5){2})*( \4| \7\6\5)?))\8($| ((?=\11).?\5\7|(?!\11).?\6))))*..$
(95) ^((.)(.) (?=((.?)(\2|\3)(.?))(?=(((( \4| \7\6\5)?){2})*))\8($| (\10(?!\8)).?\6| ((?!\10)|\8).?\5\7)))*..$
(97) ^((.)(.) (?=((.?)(\2|\3)(.?))(?=((( \4| \7\6\5){2})*( \4| \7\6\5)?))\8($| \11.?\5\7| (?!\11).?\6)))*..$
(97) ^((.)(.) (?=(.?)(\2|\3)(.?)(?=(((( \4\5\6| \6\5\4)?){2})*))\7($| (?!\8)\9.?\5| (\8|(?!\9)).?\4\6)))*..$
(98) ^((.)(.) (?=(.?)(\2|\3)(.?)(?=(((( \4\5\6| \6\5\4)?){2})*))\7($| .?((?!\8)\9\5|(\8|(?!\9))\4\6))))*..$
(110) ^((.)(.) (?=(.?)(\2|\3)(.?)(?=$| )(?=((( (\4\5\6|\6\5\4)?){2})*))\7($|\d?\5| .?\4\6)))*..$
(116) ^((.)(.) (?=(.?)(\2|\3)(\d?)(?=((( (\4\5\6|\6\5\4)?){2})*))\7($|\d?\5| .?\4\6)))*..$
(118) ^((.)(.) (?=((.?)(\2|\3)(\d?))(?=((( (\4|\7\6\5)?){2})*))\8($|\d?\6| .?\5\7)))*..$

It's almost a shame that your solution works; there's a lot of fun golf here.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I think I have everything I need now to publish a decent complete set of bonus levels. I'll be cutting some stuff and re-ordering what's left. If you have a great idea for a puzzle which you think would go well with the set we have, now would be the time to share it.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

Here is one of my puzzle ideas: Powers 2

My solution has md5sum bdf0e3052c867d1a884cea5751ab3c88.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

May I ask how many points that scores?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

To say so might be a spoiler.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

Hmm... ok
I've got 48 points. I can't hit your md5sum though.

Edit: Also, my computer pauses for nearly 2 seconds every time I try to change my regex. :(

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

Nice, you beat mine. I got ^((?=(x+)(\2{5})$)\3)*x$ (46)

Would your technique beat 93 in Powers 1?

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I had ^((x+)\2{4}(?=\2$))*x$ so I guess it's just a bit tweaked.

Anyway, nice problem! Counter-attack: Prime powers

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

Nice! But I tried that in Powers 1 and it didn't work, so I didn't think of trying it on Powers 2...

I don't understand why ^((x+)(?=\2$))*x$ is not a solution for Powers 1. Can you explain it?

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

Your solution has a pretty way of using the {5} twice. I tried to tweak it to beat my own but I can't seem to.

Also, I just checked, and my solution, tailored to Powers, just scores 93 (same as the usual method).

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I don't understand why ^((x+)(?=\2$))*x$ is not a solution for Powers 1. Can you explain it?

What Powers 1 are you looking at? It works for me. Maybe the site doesn't like you.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

Holy crap, looks like I found a major bug in Opera 12.16's ECMAScript engine. ^((x+)(?=\2$))*x$ does not work as a solution for Powers. I tried it on another browser and it did work.

Do you already have a solution in mind for Prime powers? If so have you implemented it yet?

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I have a solution for Prime powers now. md5sum: 58bfe7c876dcf4d53aee8b121eecb556. I'll try and tweak it.

Imagine a problem which asks you to separate square numbers from non-squares. >:D

Update: Yes, I was working on it while you were fighting with Opera. It's not too hard really.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

I don't know how you did Prime powers so fast. I'm having a lot of trouble stemming from the fact that there's no non-atomic lookahead.

BTW, did you see my message about how my solution for Dominoes 2 gave me the idea to use that technique on Powers (and non-prime powers)? It was up for a short time, and then I deleted it because it occurred to me that I could use that as a puzzle for you.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I didn't see that comment no. I did see one comment which disappeared but it didn't mention anything about powers.

Actually, I thought your problem was more difficult. I started on it immediately, thinking of Powers, Modulus, and Euclid. This wasn't getting me anywhere but I very quickly realised that I could do Prime Powers.

I spent a few more minutes thinking about the powers of 6, and asked for your score at this point (wondering if the solution was really long) when I somehow stumbled into your intended approach. I guess this was because I'd just been studying your solution to Dominoes 2 and it put me in the right frame of mind.

While I was waiting for your replies, I put a very rough Prime Powers problem together (this takes 2 minutes thanks to OEIS and three lines in the python interpreter). But I didn't have time to try and solve it before you responded.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I don't know how you did Prime powers so fast. I'm having a lot of trouble stemming from the fact that there's no non-atomic lookahead.

I guess it all depends on your frame of mind. Look what happened to me with Dominoes 2.

What does non-atomic lookahead mean?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 21, 2014

What does non-atomic lookahead mean?

Lookahead with backtracking. Such that the regex engine tries all possible matches to the lookahead group until it find one that causes the expression after the lookahead group to match.

@teukon

This comment has been minimized.

Copy link

commented Feb 21, 2014

I see. While I see how useful that would be in general, I'm interested in an approach to Prime powers that would use this. Hmm...

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

I've just been looking at Powers 2 again and it got me wondering: was there some special reason for preferring 6 over 4?

I was having trouble playing around with it because of the 46656 character string so I built a leaner set and just thought I'd share it. Despite only having 5 strings to match, and 20 to miss, I can't find an exact solution which is shorter than the robust one.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

Well, I wanted it to be too hard to do the test by using prime divisibility tests. Power of 4 could be tested just by seeing if it was divisible by 2 an even number of times, and no other primes. But power of 6 would have to be tested by having the number of times it's divisible by 3 equal to the number of times it's divisible by 2, and divisible by no other primes. The latter seems more complex than the former.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

I see. However, the "even number of times" bit seems quite hard to me. Is there some trick I'm missing.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

I don't know how to do either, but "equal number of times" seemed like it'd be harder. After you asked me about it 45 minutes ago, though, I had second thoughts. Now I think "even number of times" might actually be the harder one.

P.S. Maybe on a subconscious level, I wanted to see what you'd come up with if you were crazy enough, and brilliant enough, to try for the "equal number of times" test ;)

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

I finally solved Prime Powers. But you made a mistake and put 1 in the "not prime powers" column.

Here is a correctly done Prime powers.

And here is my solution: ^(?!((xx+)(?=\2*$)x+)(?=\1*$)\1*(?!$)\2*(?!\2|$)) (151)

Now I'm dying to see what yours is. Or, at least having a hint, and/or your score.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Here's mine: ^(?=(xx+?)\1*$)(?!(xx+)\2*(?!\1*$)\2$) (162).

Since you've added 1 to the left-side, it only scores 152. However, I've never regarded 1 to be a prime power myself. Some mathematicians take 1 as a prime power but it's unusual in my experience. Not that these sources count for much, but both Wikipedia and Wolfram start from 2, and while OEIS does start with 1, the first comment on the page is that 1 is sometimes omitted.

Perhaps it would be best to omit 1 from both sides.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

That's very nice. I see you used the same basic tactic that I ended up using. But I'd like to see your md5sum 58bfe7c876dcf4d53aee8b121eecb556 one please. The one you had 4 hours ago.

Also, combining a trick from yours with mine, ^(?!((xx+)(?=\2*$)x+)(?=\1*$)\1*(?!\2*$)\1$) (156)

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

That's the one. Unless I'm making some horrible error.

Maybe I just don't know what I'm doing on the command line. I'm running

echo '^(?=(xx+?)\1*$)(?!(xx+)\2*(?!\1*$)\2$)' | md5sum

in bash.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

Oh, you didn't use echo -n. You did the md5sum with the newline at the end.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Doh! I can be such a noob at times. :p

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

I still haven't managed to work out how your solution works. Can you help me out?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

It's almost the same as yours:

PrimePower(N):
It is not true that for any two factors of N, a and b, such that a<b, b is not divisible by a.

The main difference is that I'm explicitly making the two factors different, and not explicitly making the smaller factor the smallest existing factor.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

^(?!((xx+)(?=\2*$)x+)\1*(?!\2*$)\1$) (164)

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

^(?!((xx+)(?=\2*$)x+)\1*(?!\2*$)\1$) (164)

Very nice!

It heartens me to see that a simpler (shorter) solution than both of our original ones made the match with 1 fall in line. But I still think you're probably right and 1 should probably be omitted from both columns. If for no other reason than to potentially see more variety in people's solutions.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Aha, I'm with you. Nice approach!

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Well, I've really got to go to bed. I have a load of work to get done tomorrow.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

Goodnight! Today was a good day.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

It heartens me to see that simplying your solution made the match with 1 fall in line. But I still think you're probably right and 1 should probably be omitted from both columns. If for no other reason than to potentially see more variety in people's solutions.

Indeed. I think these two different approaches emulate how different people think about prime powers, and hence why there is ambiguity with 1 in the first place.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Goodnight! Today was a good day.

Yes, the most fun I've had in a while.

Cheers!

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

Interestingly, it works even without excluding 1 as a factor: ^(?!((x+)(?=\2*$)x+)\1*(?!\2*$)\1$) (165)

Heck, it works even without explicitly excluding 0 as a factor, and without explicitly making the second factor larger.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Very nice. It probably wouldn't have occured to me to allow 0 as a factor, but then again, my name's not Davidebyzero.

It would be a shame to have made so much progress and leave my primary task unfinished, so I took an hour to clean up the problem set I'd like to publish. You can see it here.

I felt that things were getting a bit mathematics-heavy so I've not included our most recent stuff and have cut out Coprimes, Xor, and the evil Dominoes warm-up. Please feel free to reserve the first couple of comments if you'd like to present these extras and/or your own problems (I liked the "Powers of six" problem), or would like to post md5sums of your solutions. It would be good to get a bit of padding so that people with lots of vertical resolution don't see other's solutions when they visit the gist.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

Imagine a problem which asks you to separate square numbers from non-squares. >:D

I've done more than imagine it! Perfect squares

My solution has md5sum 7d0406080811edd44523b03247bcf201. I won't tell you its exact score yet, but it's between 80 and 220.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

I felt that things were getting a bit mathematics-heavy

Aw, but that's the best part. And I still haven't matched or beaten your Xor score, which speaks well for that problem. But I definitely don't think it'd be complete without Coprimes.

Also, I thought that when your problem set was ready, you'd make a clone page of http://regex.alf.nu with your problems on it, so that people's scores are saved and they don't have to open a separate page for each problem.

I really like that you included the domain for each problem, though! That's very cool.

(I liked the "Powers of six" problem)

Actually I think you were right, powers of four would be better.

It would be good to get a bit of padding so that people with lots of vertical resolution don't see other's solutions when they visit the gist.

Good thinking. But should I put md5sums or scores? Or both?

Maybe it'd be a good idea to post this on a forum that supports spoiler tags? Perhaps xkcd forums, though I'm not sure what subforum it should go under (Logic Puzzles? Computer Science? Coding?). You've probably seen the Regex Golf comic.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

I've done more than imagine it! Perfect squares

My solution has md5sum 7d0406080811edd44523b03247bcf201. I won't tell you its exact score yet, but it's between 80 and 220.

Ha! I thought this was possible but taxing. Alas, I won't have the time to look at this until at least Thursday of next week. Well done.

Can you imagine how "Squares" would have been received if it were included alongside "Primes" and "Powers" in the original set. I think many people would have been claiming the problem impossible to do without cheating (heck, some were claiming that of Powers).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

BTW, x{7560} and x{7920} break pcregrep, when testing my Perfect Squares regex, as do all perfect squares x{5625} or higher.

bash-3.1$ printf '%*s' 7920 ''|tr ' ' x|pgrep 'CENSORED'
pcregrep: pcre_exec() gave error -8 while matching text that starts:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

pcregrep: Error -8, -21 or -27 means that a resource limit was exceeded.
pcregrep: Check your regex for nested unlimited loops.

Another version, running on a different machine, says grep: exceeded PCRE's backtracking limit at the same thresholds, so it's probably being limited by a built-in limit, not available memory.

EDIT: Using the --match-limit option allows preventing this error from occurring.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Aw, but that's the best part. And I still haven't matched or beaten your Xor score, which speaks well for that problem.

Really? I thought you'd just forgotten about it. You still have some golf to do then. Also, did you go back and find robust solutions for the original set of problems (the possible ones at least).

This is a plus in favour of Xor but, in comparison with the other problems, the solution seemed a bit clunky/ugly.

Also, I thought that when your problem set was ready, you'd make a clone page of http://regex.alf.nu with your problems on it, so that people's scores are saved and they don't have to open a separate page for each problem.

I don't know how and I lack the time to find out. I just wanted to publish something clean and playable before I got to work. If it's easy to do, please be my guest.

I really like that you included the domain for each problem, though! That's very cool.

Thanks. I've been thinking in this way for a while but decided it was worth codifying to help people resolve disputes over "whether the empty string should match in Matyroshka" for example.

(I liked the "Powers of six" problem)

Actually I think you were right, powers of four would be better.

I agree, but the idea is great. I must admit I found it more attractive when I realised that, even with just 5 strings on the left, the shortest solution I could come up with was the robust one.

Good thinking. But should I put md5sums or scores? Or both?

Whatever you think is best. Would you have preferred solving my problems if I'd have given my score on the problem page? Would knowing possible scores have made the problems easier (I'm thinking of Dominoes).

Maybe it'd be a good idea to post this on a forum that supports spoiler tags? Perhaps xkcd forums, though I'm not sure what subforum it should go under (Logic Puzzles? Computer Science? Coding?). You've probably seen the Regex Golf comic.

Certainly. Unfortunately, I'm not very well connected. I don't have an account on xkcd (nor reddit, twitter, facebook, et cetera). Indeed, I only made this github account to comment on the Regex Golf and give my solution to Alphabetical. If you have such connections then please do let others know.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

Okay, a naive exact solution to Perfect squares as I presented it is ^(x|x{4}|x{9}|x{16}|x{25}|x{36}|x{49}|x{64}|x{81}|x{100}|x{121}|x{144}|x{169}|x{196}|x{225}|x{256}|x{289}|x{324}|x{361}|x{400}|x{441}|x{484}|x{529}|x{576}|x{625}|x{676}|x{729}|x{784}|x{841}|x{900})$ which is worth 102 points.

So I will reveal that my robust solution scores between 160 and 220.

Edit: I really need to improve the right column. ^(x|x{4}|x{9}|x{16}|x{25}|x{36}|x{49}|x{64}|x{81}|x{100}|x{121}x*)$ scores 233.

Edit: Here we go. Perfect squares, fixed. Now scores 20 points higher with robust solutions. My unchanged solution now scores between 180 and 240.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Nice to see that you managed to address your problem. It's also nice to see that you're not using any really long strings. I look forward to tackling it myself.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 22, 2014

This is a plus in favour of Xor but, in comparison with the other problems, the solution seemed a bit clunky/ugly.

Hmm. Xor: ^(?!.*(.).{12}(0.{10}(?!\1)|1.{10}\1)) (162)

Don't know why I didn't do it that way immediately. Well, I still think it would still make a good easy-level one, but I'm no longer broken-up about seeing it dropped.

@teukon

This comment has been minimized.

Copy link

commented Feb 22, 2014

Character-for-character identical again!

I don't think Xor is a bad problem. I was just cutting what I felt was silver to maximise the visibility of the gold. I wanted to make the bonus set as fun as possible so that more people feel motivated to make their own problems.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

Well, I wrote a program to simulate the algorithm my perfect-square regex uses. You see, I had a long version of it which I was sure was robust, but then I found that removing a huge portion of the string seemed not to compromise its robustness — but I was no longer sure it was truly robust with this portion removed.

Turned out I was right to wonder. The short version indeed isn't robust. It gives false positives on two numbers greater than 9500 — too large for the regex engine to test. (I'm not revealing exactly which numbers, to avoid giving too many spoilers about how my algorithm works.) I've tested up to 170,000 so far, and it's still just those two false positives, but I've every reason to expect that as it goes to infinity there will be an infinite number of false positives.

So I will go back to saying it scores between 100 and 240 points... hopefully I'll be able to find a robust way of shortening it.

Also, I'm going to need to create a smaller test sample. My latest one (with 32 perfect squares on the left, and 100 non-matches on the right) can get pretty laggy.

Edit: Oh dear! The larger of the two false positives happens even in the long version... I'm beginning to wonder if maybe this is impossible.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Edit: Oh dear! The larger of the two false positives happens even in the long version... I'm beginning to wonder if maybe this is impossible.

Really? I'm quite confident it's possible, just very difficult. Ah, if only I had the time. :(

If you do discover that the problem is impossible, please don't tell me.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

Actually I have another idea now :) It might work...

Though I if it does, I'm not sure I'll be able to prove it. It'd probably help if I knew more number theory.

Edit: Oh wow. My idea made the algorithm much faster and much more robust (up to 28,400,000 so far, and counting [EDIT: as of 17 days after this comment was made]). And it's "short" again. Though I'm still going to have to prove that it's truly robust (or isn't). Which I have no idea how to do. For all I know it might be as hard to prove as Fermat's Last Theorem.

Edit: Made a regex string out of this improved algorithm. It allows the regex engine to work up to larger numbers than before, with the same --match-limit.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Right. I got up 2 hours early this morning to have a go at this problem. I was going around in circles for a long time but finally found what I think is a solution. md5sum: feda71258877293543e5cad3896bb13c (no trailing newline this time).

I've got to move now but I expect this will prey on my mind and I'll want to come back to test it's validity further and/or tweak it for length/speed.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

I think this is of sufficient complexity that we're highly unlikely to have character-identical strings. But FWIW, my shortest candidate is 27a3f7250f873fdf097d902fd81b89a3. I see no reason not to share scores now; this solution is 101 characters long and scores 219 points on this test set (the last one I posted).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

Reduced it to 91 characters (229 points) with the algorithm identical as before: 69aafe0481051e148a6c8eafa25dd3fe

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

It looks like I'm about to have my revenge for Dominoes.

My solution was 65 characters long, but I've managed to compress it to 52 characters (268 points): d0532dd568a9189c32477bfce3d6d659

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

Nice!!

And funny how simply seeing a score can give hints. I now have an idea for an obviously robust way of doing it! :) Now all that remains is implementation...

Edit: Er, no. That won't work.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Thanks.

I've just realised that your test set starts with 1 rather than 0. If I'm allowed to ignore the empty string, I can save one more character.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

Are you sure your solution is robust? i.e. have you only been able to test it empirically, or is it obviously robust?

BTW, my regex in its minimal form matches zero. It would have to be made longer to not match zero.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

I'm not certain, no. It seems like the algorithm should work to me, but I've not proven that it works.

It certainly copes with the test set, but you already mentioned that you've experienced problems with some 4-digit numbers, so it's possible mine has such a flaw. It would be good to know what these numbers were so I could try them.

I've got a bit of time now so I'll try and prove that my algorithm works. I'll get my laptop to check some larger numbers in the background.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

I'm not certain, no. It seems like the algorithm should work to me, but I've not proven that it works.

In that case perhaps you shouldn't examine my solution yet. I've edited it out.

But I will tell you, some potential problem numbers are 5265, 17920, 216513, 393472, 690625, 1184625, 1792000, if your algorithm is anything like mine.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Ah, my algorithm doesn't take a 2n+1 style route. I did spend some time thinking about biting off pieces of x in such a way that the property of being a square was preserved at each step, but my own grasp of number theory was insufficient for me to make much progress.

When I say I'm not certain that mine works, I'm just being very cautious. The mathematical part is straight-forward, but there may be something in the regex that doesn't behave as I expect. A careful inspection of my solution has largely convinced me that it is robust.

My regex doesn't match the first few of your problem numbers. It works for the first 10000 numbers at least.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

When I say I'm not certain that mine works, I'm just being very cautious. The mathematical part is straight-forward, but there may be something in the regex that doesn't behave as I expect.

Ah, well that's very different from my situation. I'm not sure my mathematics are sound, but I'm 100% sure that the regex is working how I think it is.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

I did catch a glimpse of your solution with the algebraic robustness condition. Could you post that up again given that our algorithms are different. I'd like to work out what it's doing.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

^(x$|(x(x+))(?=\2+$)(?=\3+$)(?=x(x\2)*x$)(?!(x+)(\2\5)+(?=\6$)(?!(xx+)(?=\7*$)\5\7*$))\3)*$

The robustness of this hinges on a question I haven't solved yet. Does there exist a b = n(n(nd+1)-d) such that its factor n is not coprime with any larger factor of b, such that there exists no e>n that is also a factor of b and for which b-1 is divisible by both e-1 and e+1 and e is not coprime with any factor of b larger than e, where b-2n+1 is a perfect square? (Where all variables are of course all natural numbers.) If it does exist, then my solution is not robust.

Edit: d does not have to be an integer. It can be of the form f/2 where f is an integer, iff the resulting b is divisible by n.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

I don't understand. If n is not coprime with any larger factor of b then, in particular, it's not coprime with ((n^2-1)d+n). But ((n^2-1)d+n) is larger than n, so n cannot be the largest such factor.

I'm probably just not parsing this right

Edit: Don't worry, you cleared up my confusion with your edit.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

b=216513, n=33, d=6 is an example that satisfies all the above conditions except for b-2n+1 being a perfect square.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Hmm... I have no intuition about whether or not such a collection of numbers exists, although it does seem plausible that if such numbers do exist that the smallest example is quite large.

As for a proof that such numbers don't exist, might you start by ruling out b being a prime power (to kill the "not coprime" elements and focus on the rest)?

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Edit: d does not have to be an integer. It can be of the form f/2 where f is an integer, iff the resulting b is divisible by n.

The iff doesn't seem necessary here. I think that if d is f/2 for some integer f then n will divide b in all cases.

My mistake. Iff n is odd.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

As for a proof that such numbers don't exist, might you start by ruling out b being a prime power (to kill the "not coprime" elements and focus on the rest)?

I don't know what you mean.

Here's a list of my search for possible robustness-breaking numbers so far:

b0=5265, n=15, d=3/2, b1=5236
b0=216513, n=33, d=6, b1=216448
b0=17920, n=10, d=18, b1=17901
b0=393472, n=58, d=2, b1=393357
b0=690625, n=65, d=5/2, b1=690496
b0=7997440, n=110, d=6, b1=7997221
b0=22576401, n=111, d=33/2, b1=22576180
b0=142277472, n=228, d=12, b1=142277017
b0=365056000, n=230, d=30, b1=365055541
b0=439238656, n=332, d=12, b1=439237993
b0=308093625, n=345, d=15/2, b1=308092936
b0=294299136, n=366, d=6, b1=294298405
b0=1016961561, n=231, d=165/2, b1=1016961100
b0=1480161600, n=282, d=66, b1=1480161037
b0=1686970881, n=321, d=51, b1=1686970240
b0=937086976, n=406, d=14, b1=937086165
b0=172896256, n=442, d=2, b1=172895373

If b1 were a perfect square in any of these, it would mean my algorithm was not robust.

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

As for a proof that such numbers don't exist, might you start by ruling out b being a prime power (to kill the "not coprime" elements and focus on the rest)?

I don't know what you mean.

Sorry, that wasn't clear. I just meant, if you aren't able to prove it directly, could you prove a similar result where, in addition, b always had to be a prime power. That's where I would start, but then again, I don't know much number theory. I'm guessing you're better equipped to tackle this problem than I.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 23, 2014

But this is golf. I want the string to be as short as possible, and if it's robust as it is, why make it longer but adding extra conditions? :)

I don't know much number theory either, BTW.

Edit: Actually I still don't understand you. Most perfect squares are not prime powers, and b0 is the number being tested for perfect-squareness (and n would be the square root if b were a perfect square, and b1 is the next candidate for perfect square that it passes back to itself pseudo-recursively by eating away the rest). So how does it help testing if b is a prime power?

@teukon

This comment has been minimized.

Copy link

commented Feb 23, 2014

Edit: Actually I still don't understand you. Most perfect squares are not prime powers, and b0 is the number being tested for perfect-squareness (and n would be the square root if b were a perfect square, and b1 is the next candidate for perfect square that it passes back to itself pseudo-recursively by eating away the rest). So how does it help testing if b is a prime power?

Oh, it doesn't. I'm not suggesting a modification to the algorithm, or even that such an assumption would be at all useful in a proof of the condition. I was just thinking out loud, about where I'd start in looking for a proof (I'd need to build up my understanding of how everything fits together first). Alas, I'm mired in my own mathematics at the moment and am so tired (shouldn't have gotten up so early), otherwise I'd have a crack at proving this.

Would you like me to post my solution?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 24, 2014

I'd like to spend some more time thinking about alternative solutions. Maybe I can come up with yours myself. Have to research what other properties perfect squares have. Seems to me you must have used a property that I'm not aware of.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 24, 2014

Okay, here's one that is mathematically very simple (and thus obviously robust), and 83 characters long (8 characters shorter than my extremely-hard-to-prove one): ^(x$|(?=((x+)\3+)\2+$)(?!(x+)(?=(\4*)\4$)(?!\5(xx+)\6+$|.*(?=\2$)\4+$)).*(?=\3$))*$

This can be easily modified to match other powers; can yours, too?
Examples: (cubes, fourth powers, fifth powers)
^(x$|(?=(((x+)\4+)\3+)\2+$)(?!(x+)(?=(\5*)\5$)(?!\6(xx+)\7+$|.*(?=\3$)\5+$)).*(?=\4$))*$
^(x$|(?=((((x+)\5+)\4+)\3+)\2+$)(?!(x+)(?=(\6*)\6$)(?!\7(xx+)\8+$|.*(?=\4$)\6+$)).*(?=\5$))*$
^(x$|(?=(((((x+)\6+)\5+)\4+)\3+)\2+$)(?!(x+)(?=(\7*)\7$)(?!\8(xx+)\9+$|.*(?=\5$)\7+$)).*(?=\6$))*$
I don't see any way to generalize it to match all powers, though.

I still have no clue how you could have matched perfect squares with only 52 characters.

@teukon

This comment has been minimized.

Copy link

commented Feb 24, 2014

I'd like to spend some more time thinking about alternative solutions. Maybe I can come up with yours myself. Have to research what other properties perfect squares have. Seems to me you must have used a property that I'm not aware of.

Ok, good luck!

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 24, 2014

Thanks.

Fifth powers and higher are shorter with this generalization. Execution is also faster, because backtracking is reduced.
 5: ^(x$|((?=(x+)(\3+)$)(?!(x+)(?=(\5*)\5$)(?!\6(xx+)\7+$|.*(?=\3$)\5+$))\4){4}(?=(x+)(\8+)$)\9)*$
10: ^(x$|((?=(x+)(\3+)$)(?!(x+)(?=(\5*)\5$)(?!\6(xx+)\7+$|.*(?=\3$)\5+$))\4){9}(?=(x+)(\8+)$)\9)*$

@teukon

This comment has been minimized.

Copy link

commented Feb 24, 2014

Clever.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 24, 2014

I think I've hit your original 65 character solution. Would be pretty cool if this were character-for-character identical.
^(x$|(?=((x+)\3+)\2+$)(?=(xx+?)\4+$)(?=.*(?=\2$)\4+$).*(?=\3$))*$ (65 characters)

Except, you said you didn't do it in such a way that eats away the string while keeping a perfect square at each step.

I did spend some time thinking about biting off pieces of x in such a way that the property of being a square was preserved at each step, but my own grasp of number theory was insufficient for me to make much progress.

Hmmm. You also said removing the requirement that zero be matched made your string shorter. So I guess this is not your string.

Optimized some more:
^(x$|(?=((x+)(\3+))(\2+)$)(?=(xx+?)\6+$)\5(?=\6+$)\4)*$ (55 characters)
Generalized for higher powers, where N is the power minus one:
^(x$|((?=(x+)(\3+)$)(?=(xx+?)\5+$)\4(?=\5+$)){N}(?=(x+)(\6+)$)\7)*$ (67 characters)

teukon, please show me your 52 character string now. I'm still stumped as to how you did it, and am dying to see.

@teukon

This comment has been minimized.

Copy link

commented Feb 24, 2014

Except, you said you didn't do it in such a way that eats away the string while keeping a perfect square at each step.

I did spend some time thinking about biting off pieces of x in such a way that the property of being a square was preserved at each step, but my own grasp of number theory was insufficient for me to make much progress.

Would you like me to clarify this? I don't want to give hints if you don't want them.

Also, I have a 50 character solution: c58c03a671fda234c3fb15b504ff6236

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 24, 2014

Sure, that'd be fun. Please give me hints as to how you did it and I'll see if I can do what you did.

My solution is currently just 5 characters longer than yours, but it sounds like you did it a completely different way.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Phew.

Except, you said you didn't do it in such a way that eats away the string while keeping a perfect square at each step.

I did spend some time thinking about biting off pieces of x in such a way that the property of being a square was preserved at each step, but my own grasp of number theory was insufficient for me to make much progress.

My sincere apologies, I didn't express myself well here at all. I certainly didn't mean to mislead you.

Ah, my algorithm doesn't take a 2n+1 style route. I did spend some time thinking about biting off pieces of x in such a way that the property of being a square was preserved at each step, but my own grasp of number theory was insufficient for me to make much progress.

I was speaking in the context of thinking about how two squares differ by 2n - 1. I wanted to try something along the lines of your 91-character solution but I didn't have the number theory to back this up. My solution does indeed bite pieces off, preserving perfect squareness at each step.

Now, for a proper hint:

Our algorithms are essentially the same. It was one of your recent posts that caused me to slap my forehead and make an easy 2-character saving. You can save at least 5 characters through good golf.

Let me know if you'd like another hint.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Well, now I see how you dropped the zero match. I don't like it, but I suppose it's good golf.
^((?=((x+)(\3+))(\2+)$)(?=(xx+?)\6+$)\5(?=\6+$)\4)*x$ (53 characters)
Though it's possible to keep the zero match and still save a character.
^((?=((x+)(\3+))(\2+)$)(?=(xx+?)\6+$)\5(?=\6+$)\4)*x?$ (54 characters)

Edit: Generalized powers optimization
^((?=(xx+?)\2+$)((?=(x+)(\4+)$)\5(?=\2+$)){N}(?=(x+)(\6+)$)\7)*x?$ (66 characters)

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

I didn't drop the zero match. If I did, I'd be at 49 characters (which would be cool).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Well, I've got to say, well done, well done indeed. You pared this down to a very small form much faster than I did. Not to mention hitting upon this way of doing it much faster than I did (I went down the blind alley of trying the 2n-1 approach).

Of course I'm going to try to get it down to 50 myself too.

I didn't drop the zero match. If I did, I'd be at 49 characters (which would be cool).

Because 49 is a perfect square? :D

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Not to mention hitting upon this way of doing it much faster than I did (I went down the blind alley of trying the 2n-1 approach).

Once again, my cowardliness is to my credit.

Of course I'm going to try to get it down to 50 myself too.

Respect.

Because 49 is a perfect square? :D

You know it.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

AAAHHH!!

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

What happened?

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

I have a gift for you.

I think you're really going to like it. :)

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

You made a regex that matches all powers >=2?

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Oh, much better than that.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

You found a way to divide the current number by a backreference and get the quotient as the new current number?

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Hmm... not quite that good.

Edit: You may well disagree.

Nah. It's better!

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Well, can I have a hint? :D

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Triples (524):

(?=((.*?[147]){3})*((.*?[147]|){2}))(?=((.*?[258]){3})*((.*?[258]|){2}))^.*$(\3\7|(?!\3|\7)\4\8|(?!\4|\8))
@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

HOLY SHIT

I THOUGHT IT COULDN'T BE DONE

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Happy Birthday (I hope it's your birthday).

1 character better than the best FSM solution.

Naturally, feel free to rearrange terms to your taste.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

I THOUGHT IT COULDN'T BE DONE

It couldn't. Well, not by mortals anyway.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

All I can seem to do at the moment is this cute but crappy single character shortening of the higher-powers version:
^((?=(xx+?)\2+$)((?=(x+)(\4+)$)\5(?=\2+$)){N}(?=(x+)(\6+)$)\7)*x?$ from this
^((?=(xx+?)\2+$)((?=(\2*(x*))(\4+)$)\6){N}(?=(x+)(\7+)$\5)\8)*x?$ to this
It results in more than an order of magnitude of slowdown. Hardly worth it.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

That's a bit ugly. Although with golf, I feel length has to be taken as absolute, and a situation like this just suggests a bad problem.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

The fact that this happens with an obviously elegant problem suggests that that is a bad bad-problem schema ;-)

But what concerns me is: when your best golf string is generalized for higher powers, does it beat my ugly slow one? And does incorporating this nasty length-optimization result in the shortest length even for your golf string?

And a more general question — is your 50 character squares string slower than my 54 character string?

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Looking for more hints eh?

yes, no, and no.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

I had another thought about your powers of 6 problem. Have you considered powers of 3?

I know this can be solved by checking that every factor is a multiple of 3, but I don't think there's a way of doing this that is as short as the loop (imagine: "How the hell are you able to score 181 on Powers of three!?!"). I think I prefer powers of 4 to the other options, but just wanted to share this thought.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Oh, and have you noticed? Someone's trying the bonus levels.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Looking for more hints eh?

Not exactly. I was feeling demoralized from coming up with the ugly slowness-inducing length-optimization, and was very worried that the only way to get any shorter was that same sort of thing.

yes, no, and no.

Now that is very encouraging.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

I had another thought about your powers of 6 problem. Have you considered powers of 3?

(imagine: "How the hell are you able to score 181 on Powers of three!?!")

Ooh, I like that. Sort of like the evil Dominoes 1. And an added bonus is that there can be more left-column matches in the problem set.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Modulus (170): ^(?=.*% (x*))(?=\1*(x*)).* \2$
Euclid (174): (?=\((x+)\1*, \1*\)).* \1$

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Wow, you're cleaning up. Literally. Those solutions are a lot cleaner. Very nice.

I hope you don't mind if I change the divisor to x+ in your Modulus. Even though my nickname is Davidebyzero, I still like a change that gives the regex engine a little less work, and/or rejects strings that don't fit the logic of the problem (even for strings that are outside the problem's stated domain, as long as doing so doesn't change the length of the regex). This does both, rejecting the pattern ^ *% += +$ which would otherwise be accepted, and short-circuiting a non-match of the pattern ^x+ % += x*$ sooner.

Modulus (170): ^(?=.*% (x+))(?=\1*(x*)).* \2$

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Sure, style them as you like.

I certainly appreciate the extra motivations of runtime efficiency and extra robustness. Personally, I place my own idea of simplicity/readability over both of these, but each to his own. I recall Peter Norvig being pleased with his solution f.o to "Plain strings", because it used less ink than foo.

I believe the total of the best known robust scores for the 10 bonus problems is now 1729 (a seemingly dull number). Just one more point and we'll have the average solution length down to 27 characters.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Your problem set is proving to be quite hardy in the face of our raw maneuvering on.

@teukon

This comment has been minimized.

Copy link

commented Feb 25, 2014

Your problem set is proving to be quite hardy in the face of our raw maneuvering on.

Sorry, I don't understand.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 25, 2014

Your problem set is proving to be quite hardy in the face of our raw maneuvering on.

Sorry, I don't understand.

I hope that's not an unfavourable omen.

@teukon

This comment has been minimized.

Copy link

commented Feb 26, 2014

It's the "raw maneuvering on" that threw me. How does that reference Ramanujan?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 26, 2014

"Raw ma new zhan" vs. "Raw ma neuvring on". Not quite the right number of syllables, but it's the best I could manage. It's hard to turn his name into a sequence of words that make sense, and I thought it wouldn't be fair to use only Hardy's name.

Edit: Just checked out it's pronounced, and I guess I wasn't doing it right. I was saying "Ra mah NEW zhan", but apparently it's pronounced "Raw MAH new zhan". Is that the reason you didn't get it?

@teukon

This comment has been minimized.

Copy link

commented Feb 26, 2014

Aha. I see. :D

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 26, 2014

Also, I meant "raw maneuvering on" to idiomatically mean "forging onward", i.e., our efforts at coming up with the best solutions — where "raw" means "no holds barred", i.e. we're not holding back. It is "hardy" in the face of our efforts, in that it took more than 6 days before you found a way to improve two solutions by 1 character each.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 27, 2014

Hahahaha it's so obvious now that I realize what it is.

^((?=(xx+?)\2+$)((?=\2+$)(?=(x+)(\4+)$)\5){N})*x?$ (50 characters)

So awesome. The best golf optimization automatically generalizes it for higher powers. And makes it faster than the former perfect square golf string (by reducing backtracking). And the power doesn't have to have 1 subtracted from it.

@teukon

This comment has been minimized.

Copy link

commented Feb 27, 2014

Well done indeed! Mine was practically identical, I just preferred * to + in a few places.

^((?=(xx+?)\2*$)((?=\2*$)(?=(x*)(\4+)$)\5){2})*x?$

It may seem simple now, but I maintain that this is a very difficult problem, one likely to make people argue its possibility (supposing they're not given warm-up problems such as Powers of four).

Another bonus for there being such a short squares string is that a problem doesn't need a lot of strings for the robust solution to also be the highest scoring solution. I can't construct particularly short regexes which match the first few square numbers and miss all non-squares; for example, I used 41 characters for squares 0 through 10.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 27, 2014

I think the reason the last length-optimization didn't occur to me for almost 3 days might be that the speed-optimizer in me automatically discarded the idea of putting a test at the beginning of an inner loop that'd be useless every first iteration. But it turns out that in practice, the difference in speed is negligible (the short version is on the order of being a few percent slower for large squares, which is a sacrifice I make gladly; for non-squares, the difference is less than a percent).

May I see your 65 and 52 character ones too please? Your 52 was the one with md5sum d0532dd568a9189c32477bfce3d6d659.

@teukon

This comment has been minimized.

Copy link

commented Feb 27, 2014

Sure.

65 points. md5sum: feda71258877293543e5cad3896bb13c

^((?=(xx+?)\2+$)(?=((xx+)\4*)\4$)\3(?=\2+$)(?=((x+)\6*)\6$)\5)*x$

52 point. md5sum: d0532dd568a9189c32477bfce3d6d659

^((?=(xx+?)\2*$)((?=\2*$)(?=((x*)\5*)\5$)\4){2})*x?$

As you can see, my first solution collapsed to 50 characters pretty easily. The last 2 characters were inspired by one of your solutions (although, I felt pretty foolish for not seeing it myself).

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Feb 27, 2014

Thanks!

I found a mild curiosity. (?=(x?x+?)\2*$) finds the smallest prime factor, but if the number is 1, it uses 1 instead of a prime. So far I can't think of any use for this, though.

@teukon

This comment has been minimized.

Copy link

commented Feb 27, 2014

Interesting.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

Hi, teukon!

I just completed a regex that matches correct multiplication statements, in the domain ^x+\*x+=x+$. It's very long. Would you like to tackle this problem yourself, or would you like me to share my regex?

If you'd like to try it yourself, the rules are: Must be ECMAScript-compatible, but must not use ECMAScript non-participating capture group behavior.

Here is a script for testing the regex:

#!/bin/bash

regex=$(tr -d '[:space:]' < "regex for matching multiplication.txt")

for (( a=1; a<=12; a++ )); do
for (( b=1; b<=12; b++ )); do
for (( c=1; c<=144; c++ )); do
    result=$(
        perl -E "print 'x' x $a; print '*'; print 'x' x $b; print '='; print 'x' x $c" |
            grep -P "$regex"
    )
    if [ -n "$result" ]; then
        echo $a \* $b = $c
    fi
done
done
done
@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

Great, that seems like fun. I'll give it a go.

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

Ok, I think I have a solution. md5sum 7ee3f71ba9c611066072ae9ac281d087.

I'm not using non-participating group capture.

I ran it through your test and it printed the multiplication tables (eventually :p).

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

I've managed to shave off 4 characters but can't find a way to make my regex any more efficient. md5sum 871497593e9561b06e508fc8e984abd9.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

Holy crap that was fast. It took me days to do this — were you thinking about the problem before? I started working on it at 2014-02-27 23:25 GMT, and was thinking about it days before that.

My solution has md5sum 399e7a50c9feecc15639eded1f5303ff BTW, or 929be506fd5faafad13693bdc12dd670 with whitespace. I have not yet done any optimization of its length since I got the first fully working string (instead, I went to bed). The length without whitespace is 913 characters. What's the length of yours?

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

I had briefly considered a multiplication problem like this before, but didn't dwell on it because I didn't want example strings longer than about 40 characters (long example strings are ok for things like squares because no essential information is lost when the problem is presented). This was long before we got to thinking about squares and at the time I thought it was probably impossible anyway. Your post reminded me of this musing, but it wasn't really much of a head start.

It did take me longer to solve than squares.

Are you sure you want the precise length of my solution? To be safe, I'll just say that it's shorter than 913 characters and try to assure myself that it really is robust.

Edit: I've checked my logic and I'm fairly confident that my solution is robust. I can post the length or just a range if you prefer.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

Sure, a range would be nice.

There's a decent amount of room for optimization in mine. I've gotten it down to 899 characters so far; this has md5sum 756e7246b706f94608ed0550fa5300c4 (and git commit hash be7edce04c30ac94dd0ce4bf099873078935d924).

And here is the current version of my test script:

#!/bin/bash

# Trim whitespace and (?#...) style regex comments, allowing use of them without affecting length
regex=$(tr -d '[:space:]' < "regex for matching multiplication.txt" | sed -e 's/(?#[^)]*)//g')

echo Regex length = $(echo -n "$regex"|wc -c)
echo Regex md5sum = $(echo -n "$regex"|md5sum)
echo

for (( a=3; a<=12; a++ )); do
for (( b=1; b<=12; b++ )); do
for (( c=1; c<=144; c++ )); do
    result=$( (
        time (perl -E "print 'x' x $a; print '*'; print 'x' x $b; print '='; print 'x' x $c" |
            grep -P "$regex")
    ) 2>&1 )
    grep_result="${result%%
*}"
    time_result="${result#*
}"
    time_result="${time_result%%
*}"
    correct=$(($a * $b == $c))
    if [ -n "${grep_result}" ]; then
        echo -n $time_result: $a \* $b = $c
        [[ $correct -ne 0 ]] && echo "" || echo " - FALSE POSITIVE!"
    elif [[ $correct -ne 0 ]]; then
        echo $time_result: $a \* $b != $c " - FALSE NEGATIVE!"
    fi
done
done
done
@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

You scared me for a second there with your pgrep switch.

My solution is between 40 and 800 characters in length. I'll tighten the bounds on request.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

Please tighten that. Actually I see no reason for you not to just tell me the length. The only reason I kept the length of my perfect square regex while you were working on yours was that I didn't want to influence your development of a working string. Once you had a working one, I told you the length of mine (and I think I was planning to tell you even if yours was longer).

Also, if yours is a great deal shorter than mine, there's no reason for me not to just share my regex with you, right?

I gave pcregrep (which is equivalent to grep -P) the alias pgrep on my system. Forgot to make it portable at first, when pasting in the new version of my script.

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

Ok, no problem. My solution is 59 characters long.

I gave pcregrep (which is equivalent to grep -P) the alias pgrep on my system. Forgot to make it portable at first, when pasting in the new version of my script.

Aha. I was beginning to wonder what was going on. pgrep is a standard linux command.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

Wow. You've obviously thought of a way to do it that didn't occur to me.

Does it handle the domain ^x*\*x*=x*$ without needing to be augmented in functionality?

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

I'd very much like to see your 899 character solution. I really can't imagine what the algorithm is.

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

Does it handle the domain ^x*\*x*=x*$ without needing to be augmented in functionality?

No.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

I'd very much like to see your 899 character solution. I really can't imagine what the algorithm is.

I'm glad to hear that. I'm currently commenting my regex, then I will put it up on github.

BTW, I'm curious about your original Dominoes 2 algorithm too, and intend to investigate it, but haven't gotten around to that yet. I feel similarly about my multiplication regex as you did about that, I think.

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

BTW, I'm curious about your original Dominoes 2 algorithm too, and intend to investigate it, but haven't gotten around to that yet. I feel similarly about my multiplication regex as you did about that, I think.

I was thinking the same thing. I'll post a short, simple explanation for your.

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

Consider the domino chains:

01 12 20
01 12 21 20
01 12 21 12 20
01 12 21 12 21 20

You'll notice that the first and third entries fit the Dominoes pattern while the second and fourth do not.

My solution uses the window method. Without duplicates, it suffices to check all the 3-piece windows. With duplicates, one must check windows larger than 3, but only if all but the end pieces are the same, and in this case, we need only reference the parity of the window's length.

Here is a spaced and simplified version of my solution (explanation below):

^((.)(.) (?=
    (.?)(\2|\3)(.?)
        ( \4\5\6| \6\5\4)?(( \4\5\6| \6\5\4){2})*
        (
            $
        |
            (?! \4\5\6| \6\5\4)((?=\7).?\4\6|(?!\7).?\5)
        )
))*..$

For each domino, it considers the next domino, (.?)(\2|\3)(.?). This domino is \4\5\6 and its flipped form is \6\5\4. Next, we read as many duplicate dominoes as we can, storing the parity in \7. If we reach a non-duplicate, (?! \4\5\6| \6\5\4), we check that it can fit with everything from (.?)(\2|\3)(.?) by referring to the parity. We then close the window and loop.

This solution scores 79. To reach 118 points requires lots of sneaky golf, akin to our work on Triples. The bulk of the savings came from storing the parity in the spaces between the dominoes, making the final domino checking much shorter.

^((.)(.) (?=((.?)(\2|\3)(\d?))(?=((( (\4|\7\6\5)?){2})*))\8($|\d?\6| .?\5\7)))*..$

Finally, you posted something about "pulverized" and "fine mist", along with

^.?((.).? (\2|(?=.\2)))*\S+$

and I cried.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 1, 2014

Thank you! I will read that now.

And, here is my regex on github.
Here is a direct link to it.

@teukon

This comment has been minimized.

Copy link

commented Mar 1, 2014

Very clever! I did consider the idea of checking that, for a prime p, the product of maximal powers of p dividing a and b respectively (p^s and p^t say), always divided c (pattern: a*b=c); however, I had difficulty trying to multiply these powers. I filed this under "hard but maybe possible" and started thinking about other methods.

It hadn't occurred to me to consider p^s + p^t. This is brilliant, and I imagine it was this insight that caused you to switch from "thinking about" to "working on" the problem.

@teukon

This comment has been minimized.

Copy link

commented Mar 2, 2014

Multiplication (58): md5sum: c021190bfddef85a072e16643e777893

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 2, 2014

Wow, that Dominoes 2 algorithm you used sounds really cool. And storing parity in the spaces between the dominoes? That's some really clever thinking. (Actually it reminds me very much of how ^.?((.).? (\2|(?=.\2)))*\S+$ works.) Do you think there's another problem that could use that technique and couldn't be optimized away to almost nothing like Dominoes 2?

Your pretty-printed Dominoes 2 string, however, does not work when the indentation whitespace is removed, and doesn't match any of the past versions of your string that you showed me. It is also longer than any of the past versions of the string you showed me. BTW, doesn't (\4|\7\6\5)? rely on non-participating capture group behavior? Couldn't it be made portable by changing it to (\4|\7\6\5|)? i.e. ^((.)(.) (?=((.?)(\2|\3)(\d?))(?=((( (\4|\7\6\5|)){2})*))\8($|\d?\6| .?\5\7)))*..$

It hadn't occurred to me to consider p^s + p^t . This is brilliant, and I imagine it was this insight that caused you to
switch from "thinking about" to "working on" the problem.

Thanks! And indeed, that is when I switched.

I still have no idea what algorithm your Multiplication uses, but I have reduced my former 913 character regex down to 667 characters so far. The golf in this is really fun (I see again how you must've felt with Dominoes 2). And I've made the test script much faster.

And... woah. 58 characters?!

@teukon

This comment has been minimized.

Copy link

commented Mar 2, 2014

Do you think there's another problem that could use that technique and couldn't be optimized away to almost nothing like Dominoes 2?

Perhaps. Maybe something could be done by adding spaces to a binary, little-endian adder: [0 0 1 0 1 1 0 0] + [0 1 1 0 0 0 0 1] = [0 1 0 1 1 1 0 1].

Your pretty-printed Dominoes 2 string, however, does not work when the indentation whitespace is removed

Sorry, I was trying to be clever when formatting the string. There should be a space between (?! \4\5\6| \6\5\4) and ((?=\7).?\4\6|(?!\7).?\5)). Here's a working version (77 points):

^((.)(.) (?=(.?)(\2|\3)(.?)( \4\5\6| \6\5\4)?(( \4\5\6| \6\5\4){2})*($|(?! \4\5\6| \6\5\4) ((?=\7).?\4\6|(?!\7).?\5))))*..$

and doesn't match any of the past versions of your string that you showed me.

Yes, I intentionally expanded my longest string to make it easier to understand.

BTW, doesn't (\4|\7\6\5)? rely on non-participating capture group behavior? Couldn't it be made portable by changing it to (\4|\7\6\5|)?

No, I don't refer to this group. Both forms should work.

I still have no idea what algorithm your Multiplication uses, but I have reduced my former 913 character regex down to 667 characters so far. The golf in this is really fun (I see again how you must've felt with Dominoes 2).

Wow! Where did these savings come from?

And... woah. 58 characters?!

Thanks. This took a lot of work. The regex looks very different from the 59-character one.

@teukon

This comment has been minimized.

Copy link

commented Mar 2, 2014

Nice. The test script really is much faster.

I wonder why the older version was slower. Could it be in the time taken to compile the given regex 12^4 times?

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 2, 2014

Wow! Where did these savings come from?

See the commit history.

Unfortunately Github has this stupid width-limiting that can't be adjusted, and you have to click the ... on each commit message individually to view the full message. So I recommend installing git if you haven't already, and cloning the repository and viewing it offline using gitk.

I wonder why the older version was slower. Could it be in the time taken to compile the given regex 12^4 times?

Hmmm, I didn't consider that. I thought it was just the process spawning, but now that you mention it, it could be the compiling of the regex too.

@teukon

This comment has been minimized.

Copy link

commented Mar 2, 2014

See the commit history.

Yes, I started browsing after posting the question. You really have been busy.

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 2, 2014

Yes, I started browsing after posting the question. You really have been busy.

Indeed. Now have it down to 655 characters and it's much faster, too.

But another method occurred to me, and the first working implementation of it is 122 characters. It's much, much faster than the 655 character method. I guess I won't know if it's your method until I optimize it down and see how big it is then. md5sum = 03007e64e3fcbb1e3ef0aaaf1c531592 and git sha1 hash = 75bc23985c0cd27070124ec32cf0d9df577b4ef4.

Now I kinda wish I hadn't shown you my multiplication regex, because now that I know a shorter way to do multiplication, I might've been able to think of a problem that requires the ps + pt method.

@teukon

This comment has been minimized.

Copy link

commented Mar 2, 2014

md5sum: 120ee814ba4693231a3c6bf050370d3f

@Davidebyzero

This comment has been minimized.

Copy link
Owner Author

commented Mar 2, 2014

md5sum: 120ee814ba4693231a3c6bf050370d3f

What's that?

My guess: md5sum 62d4d95413bcaf67488b2c9d819e3719 (of a string explaining what I think it is)

@teukon

This comment has been minimized.

Copy link

commented Mar 2, 2014

But another method occurred to me, and the first working implementation of it is 122 characters.

Great! I hope you're able to reduce it further.

Now I kinda wish I hadn't shown you my multiplication regex, because now that I know a shorter way to do multiplication, I might've been able to think of